Syntactical characterization of a subset of Domain. Independent Formulas ONERA-CERT Toulouse FRANCE.

Syntactical characterization of a subset of Domain Independent Formulas Robert DEMOLOMBE ONERA-CERT 2 Av. E.Belin BP 4025 31055 Toulouse FRANCE demolomb@tls-cs.cert.fr January 17, 2003 Abstract A Domain Independent formula of First Order Predicate Calculus is a formula whose evaluation in a given interpretation does not change when we add a new constant to the interpretation domain. The formulas used to express queries, integrity constraints or deductive rules in the database eld which have an intuitive meaning are Domain Independent. That is the reason why this class is of great interest in practice. Unfortunately this class is not decidable, and the problem is to characterize new sub-classes, as large as possible, which are decidable. We provide a syntactic characterization of a class of formulas, the Evaluable formulas, which are proved to be Domain Independent. This class is dened only for function free formulas. It is also proved that the class of Evaluable formulas contains the other classes of syntactically characterized Domain Independent formulas usually found in the literature, namely : Range Separable formulas and Range Restricted formulas. Finally we show that the expressive power of Evaluable formulas is the same as that of Domain Independent formulas. That is, each Domain Independent formula admits an equivalent Evaluable one. An important advantage of this characterization is that, to check if a formula is Evaluable, it is not necessary to transform it to a normal form, as is the case for Range Restricted formulas. 1

1 Introduction In the Data Base eld the First Order Predicate Calculus language 1 is used by many researchers because it provides the framework of Mathematical Logic, and by users because it provides an easy way to express knowledge about the universe of an application. In this paper we consider only a subset of the language without functions. First of all this language has been used as a query language and was implemented, under dierent syntactical aspects, in Data Base Management Systems like QBE [21] and SYNTEX [2, 11]. Another application of the Predicate Calculus language is with rules that express integrity constraints on, or derive new information from, the information in the Data Base [13, 14]. A third application eld is for expressing properties of the relations, dening dierent kinds of dependencies and normal forms [12]. However some problems may arise with the Predicate Calculus language, because the meaning of some formulas of the language is not clear. For example the query \who are the x who do not teach English?" can be expressed by the formula : :Teach(x; English). The meaning of this query is not well-dened since it is not obvious that the set of possible values of the variable x is the set of persons, or the set of teachers, or the set of teachers who teach some language; likewise we cannot say if the sentence \All the x have a PhD", expressed by the formula : 8x Diploma(x; PhD), is true or false until the set to which the universal quantier is applied has been specied. These formulas have a formal meaning in a given interpretation, but this formal meaning can change in other interpretations even when they express the same positive knowledge relative to a given universe. From a practical point of view these queries cannot be translated into the Relational Algebra and cannot be evaluated by a Relational DBMS. Another example of a meaningless formula is : 8x8y(Language(y)! Teach(x; y)). If such a formula is used, for example, with Language(English), we can derive Teach(x,English) for any value of x; not only values which denote any teacher and any person, but also any entity like : English, PhD,..etc... 1 The Predicate Calculus language is also called Domain Calculus in [20]. 2

These problems have been tackled by many researchers. Some researchers have semantically characterized the set of formulas which have a clear intuitive meaning : J.L.Kuhns, who introduced in 1967 the concept of Denite Formula, and J.D.Ullman, who introduced in 1980 the concept of Safe Formula. It has been shown in [18] that if we extend the class of Safe Formulas by equivalence, we obtain the same class as the class of Denite Formulas. So these concepts are very close, and this class is usually called the class of Domain Independent Formulas [12, 16]. In 1969 R.A. Di Paola showed in [19] that the class of Domain Independent Formulas is not solvable. That is the reason why we cannot provide a syntactical characterization of the class of all Domain Independent Formulas. Some researchers have tried to dene the largest possible sovable subset of this class. In the eld of query languages J.L.Kuhns dened in [15] the Proper Formulas, A.Artaud and J.M.Nicolas dened in [1] the Acceptable Formulas, E.F.Codd, for the tuple Predicate Calculus, the Range Separable Formulas, and E.Cooper dened in [7] the Permissible Formulas. In the eld of integrity constraints J.M.Nicolas dened the Range Restricted Formulas. To express the general rules used to deduce information in the context of non-monotonic Logic, G.Bossu and P.Siegel [4], and P.Besnard [3] use Predened Variable Formulas. Finally, to characterize a large class of dependencies R.Fagin dened in [12] a subset of Horn clauses called implicational dependencies. In section 3 of this paper we syntactically dene the set of Evaluable Formulas. It will be proved in section 4 that Evaluable Formulas are Domain Independent. In section 5 we will show that this class of formulas is larger than the previous ones presented in the literature and in section 6 it will be shown that Evaluable Formulas have the same expressive power as Domain Independent Formulas. 2 Denition of Domain Independent Formulas 2.1 Denition of Predicate Calculus language We briey recall the denition of the Predicate Calculus language which will be used in the next sections. The vocabulary of the language is as follows : Constant names : a,b,c,... Variable names : x,y,z,... 3

Predicate names : P,Q,R,... Logical symbols : :; _; ^; 8; 9; ); ( We consider a Predicate Calculus language without function symbols. The well formed formulas (WFF's) of the language are dened by the following rules : 1. Each atomic formula is a WFF. Atomic formulas are Predicate names followed by a list of arguments in parentheses. The arguments may be constants or variables. 2. If F 1 and F 2 are WFF's then (F 1 ^ F 2 ), (F 1 _ F 2 ) are WFF's. 3. If F 1 is a WFF then (:F 1 ), (8xF 1 ) and (9xF 1 ) are WFF's. 4. All the WFF's are dened by the rules 1,2 and 3, modulo the following remarks. Remarks : When there is no risk of ambiguity some parentheses may be omitted. It is assumed that quantied variables are free in the scope of their quantiers. That is the formula 9xP(y) is not allowed. It is assumed that quantied variables cannot appear out of the scope of their quantiers. That is, a variable cannot be both free and quantied, and it is quantied at most once. For instance the formula : P(x; y) ^ 9x(Q(x; y) ^ R(x)) ^ 9xQ(x; x), has to be written in the form : P(x; y) ^ 9z(Q(z; y) ^ R(z)) ^ 9tQ(t; t). We do not permit use of the equality predicate; we will see later the reason for this restriction. An open formula ( resp. closed formula) is a formula having at least one free variable ( resp. having no free variable). 2.2 Interpretation of the language Let D be a non empty set : D = f a 1 ; a 2 ; a 3 ; ::: g. An interpretation I of the Predicate Calculus language is a mapping which assigns to each constant c i an element 4

a j of D, and which assigns to each predicate P i of rank n i a subset of D n i, called the relation R i. We will use the shorthand notation I =< D; R 1 ; R 2 ; : : : ; R n > for an interpretation I assuming that the constants and their corresponding elements in the domain have the same name. Valuation of a formula : let F(x 1 ; x 2 ; : : : ; x s ) be a formula, where x 1 ; x 2 ; : : : ; x s are the free variables of F, and let I be an interpretation, the valuation T(F,I) of the formula F in the interpretation I is : - if s > 0 : the set of the elements of D s which satisfy F, - if s = 0 : t or f, according to whether F is satised or not satised in I, where t and f respectively denote true and false. 2.3 Stable Formulas The anomalies of some formulas mentioned in the introduction can be formally dened using the concept of stable formulas. Let F be a formula in which only the predicate symbols P 1 ; P 2 ; : : :; P t occur, and I and I' be two interpretations where the predicates P 1 ; P 2 ; : : : ; P t are interpreted by the same relations R 1 ; R 2 ; : : : ; R t : I =< D; R 1 ; R 2 ; : : : ; R t ; R t+1 ; : : :; R n > I 0 =< D 0 ; R 1 ; R 2 ; : : : ; R t ; R 0 t+1 ; : : : ; R0 n > We say that I and I' are equivalent with regard to F, and express this in our notations as E(F,I,I'). Denition 1 (D1). Stable Formulas : A formula F is stable i : 8I8I 0 (E(F; I; I 0 ) ) T(F; I) = T(F; I 0 )) This property will be illustrated by means of a short example. Let us consider 5

a Data Base with three relations : Married, Speak and Teach, whose interpretations are : Married John Lucy Jack Marilyn Speak John English Lucy English Jack English Marilyn English Teach Lucy English This Data Base can be formalized as an interpretation I which has the domain D=fJohn,Lucy,Jack,Marilyn,Englishg and the relations Married, Speak and Teach, so that : I =< D; Married; Speak; Teach > Suppose now we add the new information to the Data Base : \Robert teaches French". The new relations are : Married John Lucy Jack Marilyn Speak John English Lucy English Jack English Marilyn English Teach' Lucy English Robert French The new interpretation associated to this Data Base will be : I 0 =< D 0 ; Married; Speak; Teach 0 > with : D 0 = fjohn; Lucy; Jack; Marilyn; English; Robert; Frenchg Let us now consider the formulas : F 1 = :Married(x; Marilyn) F 2 = Speak(x; English)^:Married(x; Marilyn) We have : 6

T(F 1 ; I) = fjohn; Lucy; Marilyn; Englishg T(F 1 ; I 0 ) = fjohn; Lucy; Marilyn; English; Robert; Frenchg so we have T(F 1 ; I) 6= T(F 1 ; I 0 ) but E(F,I,I'). Therefore F 1 is not stable. On the other hand F 2 is stable. Note that E(F,I,I') and : T(F 2 ; I) = fjohn; Lucy; Jackg F 2 is stable because for each < I; I 0 > T(F 2 ; I 0 ) = fjohn; Lucy; Jackg, E(F,I,I') will hold. We can make two remarks about F 1 : If we are dealing with Many Sorted Logic we nd the same anomaly T(F 1 ; I) 6= T(F 1 ; I 0 ). Suppose we have two sorts : Person and Language, and x is of the sort Person in F 1, then we have : T S (F 1 ; I) = fjohn; Lucy; Marilyng T S (F 1 ; I 0 ) = fjohn; Lucy; Marilyn; Robertg that is T S (F 1 ; I) 6= T S (F 1 ; I 0 ), where T S (F; I) is the valuation of F in the Many Sorted interpretation I. If the domain of I is innite then T(F 1 ; I) is an innite set. 2.4 Safe Formulas Safe Formulas were introduced by J.D.Ullman [20] for reasons similar of those of as J.L.Kuhns. We will rst dene the concept of \Domain of a formula". Let F be a formula. We dene D F (x), the domain of F, by : If F is an atomic formula P(t 1 ; t 2 ; : : : ; t n ), where the t i are variable symbols or constant symbols, then : D F (x) = 9x 2 9x 3 : : : 9x n P(x; x 2 ; x 3 ; : : : ; x n )_ 9x 1 9x 3 : : : 9x n P(x 1 ; x; x 3 ; : : : ; x n )_... 9x 1 9x 2 : : : 9x n 1 P(x 1 ; x 2 ; : : : ; x n 1 ; x)_ x = t i1 _ x = t i2 _ : : : _ x = t ip 7

where the t ij are all the constant symbols in F. If F is not an atomic formula, and A 1 ; A 2 ; : : : ; A m are all the atomic formulas occurring in F, then : D F (x) = D A1 (x) _ D A2 (x) _ : : : _ D Am (x) For example, if F = P(a; x) ^ Q(x; y) we have : D F (x) = 9x 2 P(x; x 2 ) _ 9x 1 P(x 1 ; x) _ x = a _ 9x 2 Q(x; x 2 ) _ 9x 1 Q(x 1 ; x) Denition 2 (D2). Safe Formulas : A formula F(x 1 ; x 2 ; : : : ; x r ) whose free variables are x 1 ; x 2 ; : : : ; x r is Safe i : 1. F(a 1 ; a 2 ; : : : ; a r ) ) D F (a 1 ) ^ D F (a 2 ) ^ : : : ^ D F (a r ) 2. For each sub-formula of F of the form 9x j F 0 (x 1 ; x 2 ; ::; x j ; ::; x s ) and for each a 1 ; a 2 ; ::; a j ; ::; a s, we have : F 0 (a 1 ; a 2 ; ::; a j ; ::; a s ) ) D F (a j ). 3. For each sub-formula of F of the form 8x k F 00 (x 1 ; x 2 ; ::; x k ; ::; x t ) and for each a 1 ; a 2 ; ::; a k ; ::; a t, we have : :D F (a k ) ) F 00 (a 1 ; a 2 ; ::; a k ; ::; a t ). where the a i s are constant names. Remark : the denition given by Ullman does not consider the case of closed formulas; the equivalence of the expressive power of Safe Formulas and of Relational Algebra shown by Ullman is valid only for open formulas since truth values cannot be represented by formulas in the Relational Algebra. We note that for any Safe Formula F(x), 9xF(x) (resp. 8xF(x)) and 9x(D F (x)^ F(x)) (resp. 8x(D F (x)! F(x))) have always the same valuation. We note also that the property of a formula of being Safe is not entirely semantic. Indeed there are formulas, for example (9xP(x)) _ (9yQ(y)), which are Safe and which are such that there are equivalent formulas, for example (9x(9y(P(x) _ Q(y))), which are not Safe. 8

2.5 Denite Formulas The Denite Formulas were dened by J.L.Kuhns in [15]. Their denition refers to the concept of the -extension of an interpretation. Let I be the interpretation < D; R 1 ; R 2 ; : : :; R n >. The -extension of I, called I*, is an interpretation which has the same domain plus a new constant, *,which is dierent from all the elements of D, and where the predicates are interpretated by the same relations. That is : I =< D; R 1 ; R 2 ; : : :; R n > I =< D 0 ; R 1 ; R 2 ; : : : ; R n > D 0 = D [ fg Denition 3 (D3). Formulas) Denite Formulas ( or Domain Independent A formula is denite i : 8I (T(F; I) = T(F; I)) It is intuitively clear, and is formally proved in [17], that the class of stable formulas is identical to the class of denite formulas. Moreover this class is identical to the class of formulas which are equivalent to a Safe formula. In the sequel this class will be called the class of Domain Independent formulas, and for convenience we will use in the proofs the denition (D3) for the denition of Domain Independent formulas. Remark : the reason why we consider only formulas without the equality predicate in this paper is that the interpretation of this predicate is not free, in particular any interpretation must satisfy 8x(x = x). The consequence is that the equality interpretation cannot be the same in I and I*, because in I* we have =. The denitions and the results presented in the next section could be adapted to a language with equality, but the denitions would be more complex at the level of atomic formulas, because we might have to distinguish atomic formulas formed with equality. 9

3 Denition of Evaluable Formulas We will give a syntactic characterization 2 of a subset of Domain Independent formulas which is based on the concepts of Restricted, Unrestricted, Positive and Negative variables. We will use the following notations : FRes(x,F) : means that the free variable x is Restricted in formula F. FUnres(x,F) : means that the free variable x is Unrestricted in formula F. FPos(x,F) : means that the free variable x is Positive in formula F. FNeg(x,F) : means that the free variable x is Negative in formula F. Free(x,F) : means that the variable x is free in formula F. Before formally dening these concepts, we present the intuitive ideas behind them. The rst idea is to dene the property of Restrictedness in such a way that if a free variable x is restricted in a formula F, then for any tuple satisfying F the instantiation of x must belong to the interpretations of the predicates which appear in F. If a formula contains only free variables, and if this property holds for all its free variables, we can see that the truth value of F, for a given instantiation of its variables depends only on the predicate interpretations, and therefore F is Domain Independent. The second idea is that it is not necessary to impose on quantied variables conditions as strong as for free variables. For example on existentially quantied variables we impose the condition that they be Positive, which is weaker than requiring them to be Restricted. Positiveness guarantees that for the quantied subformula F 1 (x) there is at least one instantiation of the quantied variable x satisfying F 1 (x) in I i there is at least one instantiation of x satisfying F 1 (x) in I*. For example, for F = 9x(P(x) _ Q(a)), the number of x instantiations satisfying F 1 (x) may be dierent in I and I*, although the truth value of F is always the same in I and I*. The third idea is to dene recursively the properties imposed on the variables in terms of the logical operators, and in terms of the sub-formulas. The conditions imposed for each logical operator are not independent. In particular if we call Property one of the properties : FRes, FUnres, FPos or FNeg, and if we respectively call Property one of the properties : FUnres, FRes, FNeg or 2 Another presentation of Evaluable Formulas can be found in [8] or in [10]; the denition in [8] contains a slight error. 10

FPos, we will dene Property(x; :F) in terms of Property(x; F). That is, we impose the conditions : Property(x; :F), Property(x; F) Property(x; :F), Property(x; F) as a consequence we have : Property(x; ::F), Property(x; F) Another consequence of our denitions is that whenever we impose a Property on an existentially quantied variable, we impose Property on a universally quantied variable in the same situation, because we have 8x(F) equivalent to :9x(:F), and Property(x; :F) is equivalent to Property(x; F). We note also that Property(x; F 1 ^ F 2 ) and Property(x; F 1 _ F 2 ) are respectively dened in terms of Property(x; F 1 _ F 2 ) and Property(x; F 1 ^ F 2 ).For example we have : Property(x; F 1 ^ F 2 ), Property(x; (:F 1 ) _ (:F 2 )) because F 1 ^ F 2 is equivalent to :((:F 1 ) _ (:F 2 )). Thus we can get the denition of Property(x; F 1 ^ F 2 ) from the denition of Property(x; (:F 1 ) ^ (:F 2 )) by replacing Property(x; :F 1 ) and Property(x; :F 2 ), by Property(x; F 1 ) and Property(x; F 2 ), respectively. In conclusion the denitions of FUnres and FNeg can be derived from the denitions of FRes and FPos. Then to understand the intuitive meaning of FRes, FPos, FUnres and FNeg we can concentrate on FRes and FPos alone. The denition of FPos is not very intuitive and needs some comments in the case of a subformula of the form F 1 ^ F 2 where the variable x is free in both F 1 and F 2. In that case it is not enough to impose the condition on the variable x that it be Positive either in F 1 or in F 2 (as is the case for the FRes denition). Indeed we may have formulas like : (P(x) _ Q(y)) ^ :R(x) where x is Positive in P(x)_Q(y) and x is not Positive in :R(x), and 9y9x((P(x)_Q(y))^:R(x)) is not Domain Independent. However a formula like :9y9x((P(x)_Q(y))^(R(x)_S(y)), where x is Positive in both formulas P(x) _ Q(y) and R(x) _ S(y), is Domain 11

Independent. Another way to have a Domain Independent formula when x is free in both operands but not Positive in both operands, is to require x to be restricted in one operand. Indeed in that case x is Restricted in F 1 ^ F 2 and thus Positive in F 1 ^ F 2. For example the formula : 9x(P(x) ^ :R(x)) is Domain Independent, though x is free but not Positive in :R(x), because x is Restricted in P(x). Though it is intuitively easier to understand Denition 4 and Denition 5 when we consider only FRes and FPos and the logical operators ^; _; :; 8 and 9, from a technical point of view the proofs of the following theorems T1, T2 and T3 are simpler if we consider only the logical operators _; : and 9. For this reason the denitions (D4) and (D5) are presented with the logical operators _; : and 9, for the formal proofs, and they are extended to the operators ^ and 8, for intuitive understanding and in view of practical implementation. Of course the proofs given for formulas containing _; : or 9 are also valid for formulas containing ^ or 8 since these operators can be rewritten in terms of the former ones. Denition 4 (D4). Restricted and Unrestricted variables : FRes(x; F), Free(x; F) ^ FResCase(x; F) FResCase(x; F 1 _ F 2 ), FRes(x; F 1 ) ^ FRes(x; F 2 ) FResCase(x; 9yF 1 ), FRes(x; F 1 ) FResCase(x; :F 1 ), FUnres(x; F 1 ) (AtomicFormula(F) ) FResCase(x; F), True) FUnres(x; F), Free(x; F) ^ FUnresCase(x; F) FUnresCase(x; F 1 _ F 2 ), FUnres(x; F 1 ) _ FUnres(x; F 2 ) FUnresCase(x; 9yF 1 ), FUnres(x; F 1 ) FUnresCase(x; :F 1 ), FRes(x; F 1 ) (AtomicFormula(F) ) FUnresCase(x; F), False) If we accept in the language the ^ and 8 operators we have : FResCase(x; F 1 ^ F 2 ), FRes(x; F 1 ) _ FRes(x; F 2 ) FResCase(x; 8yF 1 ), FRes(x; F 1 ) FUnresCase(x; F 1 ^ F 2 ), FUnres(x; F 1 ) ^ FUnres(x; F 2 ) 12

FUnresCase(x; 8yF 1 ), FUnres(x; F 1 ) Denition 5 (D5). Positive and Negative variables : FPos(x; F), Free(x; F) ^ FPosCase(x; F) FPosCase(x; F 1 _ F 2 ), (Free(x; F 1 ) ) FPos(x; F 1 )) ^ (Free(x; F 2 ) ) FPos(x; F 2 )) FPosCase(x; 9yF 1 ), FPos(x; F 1 ) FPosCase(x; :F 1 ), FNeg(x; F 1 ) (AtomicFormula(F) ) FPosCase(x; F), True) FNeg(x; F), Free(x; F) ^ FNegCase(x; F) FNegCase(x; F 1 _ F 2 ), (Free(x; F 1 ) ^ Free(x; F 2 ) ) (FNeg(x; F 1 ) ^ FNeg(x; F 2 )) _ FUnres(x; F 1 ) _ FUnres(x; F 2 )) ^ (Free(x; F 1 ) ^ :Free(x; F 2 ) ) FNeg(x; F 1 )) ^ (Free(x; F 2 ) ^ :Free(x; F 1 ) ) FNeg(x; F 2 )) FNegCase(x; 9yF 1 ), FNeg(x; F 1 ) FNegCase(x; :F 1 ), FPos(x; F 1 ) (AtomicFormula(F) ) FNegCase(x; F), False) If we accept in the language the logical operators ^ and 8 we have : FPosCase(x; F 1 ^ F 2 ), (Free(x; F 1 ) ^ Free(x; F 2 ) ) (FPos(x; F 1 ) ^ FPos(x; F 2 )) _ FRes(x; F 1 ) _ FRes(x; F 2 )) ^ (Free(x; F 1 ) ^ :Free(x; F 2 ) ) FPos(x; F 1 )) ^ (Free(x; F 2 ) ^ :Free(x; F 1 ) ) FPos(x; F 2 )) FPosCase(x; 8yF 1 ), FPos(x; F 1 ) FNegCase(x; F 1 ^ F 2 ), (Free(x; F 1 ) ) FNeg(x; F 1 )) ^ (Free(x; F 2 ) ) FNeg(x; F 2 )) FNegCase(x; 8yF 1 ), FNeg(x; F 1 ) 13

Corollaries We can easily prove the following propositions from denitions D4 and D5. 1. FRes(x; F) ) FPos(x; F) 2. FUnres(x; F) ) FNeg(x; F) 3. FUnres(x; F), FRes(x; :F) 4. FNeg(x; F), FPos(x; :F) 5. A variable may be neither Restricted nor Unrestricted in a formula. For example the variable x in the formula : P(x) _ Q(y). 6. A variable may be neither Positive nor Negative in a formula. For example the variable x in the formula : P(x) _ :Q(x). Denition 6 (D6). Evaluable Formulas A formula F is evaluable i : each free variable of F is Restricted in F, each existentially quantied variable is Positive in the subformula which is in the range of its quantier, each universally quantied variable is Negative in the subformula which is in the range of its quantier. It is easy to check that the following formuals are Evaluable : ::Q(x), P(x) ^ :Q(x), (P(x) ^ :Q(x; y)) ^ (R(y) ^ :S(x; y)), P(x) _ Q(x) 9x(P(x) _ Q(a)), 9x9y(P(x) _ Q(y)), 8x(:(P(x) _ Q(a)) _ :R(x)), 8x(:R(x) _ P(x) _ Q(a)), 9y8x(:P(x) _ Q(x; y). The following formulas are not Evaluable : P(x) _ Q(y), 8x(:P(x) _ Q(x; y)), 8x(:(P(x) _ Q(a)) _ R(x)). 14

4 Evaluable Formulas are Domain Independent Before proving that Evaluable formulas are Domain Independent we need to prove two preliminary theorems. Notations : X =< x 1 ; x 2 ; : : : ; x n > A =< a 1 ; a 2 ; : : : ; a n > F(X) denotes that the free variables of F are < x 1 ; x 2 ; : : : ; x n > F(A) denotes F after replacing each variable x i by a i s(f) = number of logical symbols : ; _ ; 9 occurring in F Theorem 1 (T1) Let F(x 1 ; x 2 ; : : :; x n ) be a formula whose free variables are x 1 ; x 2 ; : : : ; x n. For each interpretation I, and for the -extension I* of I, if a tuple A =< a 1 ; a 2 ; : : : ; a n > satises F in I* (i.e. T(F(A); I) = t ), and if the variable x i is Restricted in F, then the value a i of x i is dierent from. Proof The proof is by induction on s(f). Induction hypothesis (H1). H1 can be written : (H1) FRes(x i ; F) and T(F(A); I) = t and s(f) < p + 1 ) a i 6= We have to prove that : (1) FRes(x i ; F) and (2) T(F(A); I) = t and (3) s(f) = p + 1 implies a i 6= We consider all the possible forms of F. Case F = F 1 _ F 2 : (D4),(1) ) (4) FRes(x i ; F 1 ) and FRes(x i ; F 2 ) 15

(2) ) T(F 1 (A 1 ; I)) = t or T(F 2 (A 2 ; I)) = t We suppose, for example, that : (5) T(F 1 (A 1 ; I) = t (4) ) (6) FRes(x i ; F 1 ) (H1); (5); (6); s(f) < p + 1 ) a i 6= Case F = 9x n+1 F 1 (X; x n+1 ) : (D4),(1) ) (4) FRes(x i ; F 1 ) (2) ) exists a n+1 such that (5) T(F 1 (A; a n+1 ; I) = t (H1); (4); (5); s(f 1 ) = p ) a i 6= Case F = :F 0 : Sub-case F 0 = :F 1 : (D4),(1) ) (4) FUnres(x i ; F 0 ) (D4),(4) ) (5) FRes(x i ; F 1 ) (2), F F 1 ) (6) T(F 1 (A; I) = t (H1); (5); (6); s(f 1 ) = p 1 ) a i 6= Sub-case F 0 = F 1 _ F 2 : (D4),(1) ) (4) FUnres(x i ; F 0 ) (D4),(4) ) (5) FUnres(x i ; F 1 ) or FUnres(x i ; F 2 ) (D4),(5) ) (6) FRes(x i ; :F 1 ) or FRes(x i ; :F 2 ) (2) ) (7) T(F 1 (A 1 ; I) = f and T(F 2 (A 2 ; I) = f (7) ) (8) T(:F 1 (A 1 ; I) = t and T(:F 2 (A 2 ; I) = t (H1); (8); (7); s(:f 1 ) < p + 1 and s(:f 2 ) < p + 1 ) a i 6= Sub-case F 0 = 9x n+1 F 1 (X; x n+1 ) : (D4),(1) ) (4) FUnres(x i ; F 1 ) (D4),(4) ) (5) FRes(x i ; :F 1 ) (2) ) (6) 8a n+1 2 D 0 T(F 1 (A; a n+1 ; I) = f (6) ) (7) 8a n+1 2 D 0 T(:F 1 (A; a n+1 ; I) = t (H1); (5); (7); s(:f 1 ) = p ) a i 6= 16

Sub-case F'= atomic formula : We do not have to consider this case because F'= atomic formula and FRes(x i ; :F 0 ) are contradictory. Therefore in all cases where s(f) = p + 1, (1) and (2) imply a i 6=. Basis of induction : For p = 0 F is an atomic formula. By denition of I there is no tuple in the relations of I containing *, therefore T(F(A); I) = t ) a i 6=. Lemma 1 (L1) If F is an open Evaluable Formula then : T(F; I) \ D n = T(F; I). Proof : Let X be the set of free variables of F. From (D6), for each x i in X we have Res(x i ; F). Let A be an element of D 0n. From (T1), if we have T(F(A),I*)=t, all the elements of A are dierent of *, then A 2 D n. Hence the valuation of F in I* is included in D n. Denition 7 (D7). Quasi-Evaluable Formulas A formula F is quasi-evaluable i : each free variable of F is Positive in F, each existentially quantied variable is Positive in the subformula which is in the range of its quantier, each universally quantied variable is Negative in the subformula which is in the range of its quantier. Notations : Let A be a tuple of D 0n. A(D) stands for the set of tuples obtained from A by substituting for any occurrence of * all possible elements of D. For example if D 0 = fa; b; g and A =< a; ; b; > then A(D) = f< a; a; b; a >; < a; b; b; a >; < a; a; b; b >; < a; b; b; b >g. Let X be a tuple of variables,let Y be a sub-tuple of X and let A be a tuple which is a value of X; A[Y] stands for the tuple obtained from A by substituting 17

for the variables of Y their corresponding values in A. For example if X =< x 1 ; x 2 ; x 3 ; x 4 >, Y =< x 1 ; x 3 > and A =< a 1 ; a 2 ; a 3 ; a 4 > then A[Y] =< a 1 ; a 3 >. Theorem 2 (T2) For each interpretation I, if F(X) is a quasi-evaluable formula and T(F(A); I) = t then for each A' in A(D) T(F(A 0 ); I) = t, where I is the -extension of I. Proof : the proof is by induction on s(f). Induction hypothesis (H2). H2 can be written : (H2) F(X) quasi evaluable and T(F(A); I) = t and s(f) < p + 1 ) 8A 0 2 A(D) T(F(A 0 ); I) = t We have to prove that : (1) F(X) quasi-evaluable and (2) T(F(A); I) = t and (3) s(f) = p + 1 implies 8A 0 2 A(D) T(F(A 0 ); I) = t We consider all the possible forms of F : Case F = F 1 _ F 2 : Notations : X 1 : tuple of free variables of F 1 X 2 : tuple of free variables of F 2 A 1 : projection of A on X 1 A 2 : projection of A on X 2 Let x i be a variable of X 1 : (D5),(1) ) (4) FPos(x i ; F 1 ) (4) ) (5) F 1 (X 1 ) quasi-evaluable We prove in the same way that : (6) F 2 (X 2 ) quasi-evaluable (2) ) T(F 1 (A 1 ); I) = t or T(F 2 (A 2 ); I) = t 18

We suppose that : (7) T(F 1 (A 1 ); I) = t (H2),(5),(7), s(f 1 ) < p + 1 ) (8) 8A 0 1 2 A 1(D) T(F 1 (A 0 1 ); I) = t (8) ) (9) 8A 0 2 A(D) T(F 1 (A 0 1); I) = t (9) ) (10) 8A 0 2 A(D) T(F 1 (A 0 1 ) _ F 2(A 0 2 ); I) = t (10) ) (11) 8A 0 2 A(D) T(F(A 0 ); I) = t Case F = 9x n+1 F 1 (X; x n+1 ) : (2) ) 9a n+1 2 D 0 (3) T(F 1 (A; a n+1 ); I) = t (D5),(1) ) (4) F 1 (X; a n+1 ) quasi-evaluable (H2),(3),(4),s(F 1 ) = p ) (5) 8A 0 2 A(D) T(F 1 (A 0 ; a n+1 ); I) = t (5) ) (6) 8A 0 2 A(D) T(9x n+1 F 1 (A 0 ; x n+1 ); I) = t (6) ) (7) 8A 0 2 A(D) T(F(A 0 ); I) = t Case F = :F 0 : Sub-case F 0 = :F 1 : Let x i be a variable of X : (D5),(1) ) (3) FNeg(x i ; F 0 ) (D5),(3) ) (4) FPos(x i ; F 1 ) (2) ) (6) T(F 1 (A); I) = t (H2),(4),(6), s(f 1 ) = p 1 ) (7) 8A 0 2 A(D) T(F 1 (A 0 ); I) = t (7),F F 1 ) (8) 8A 0 2 A(D) T(F(A 0 ); I) = t Sub-case F 0 = F 1 _ F 2 : We will use the notations : X 1 : tuple of free variables of F 1 Y 1 : tuple of negative variables of F 1 Z 1 : tuple of non-negative variables of F 1 A 1 = A[X 1 ] A 0 1 = A 0 [X 1 ] B 1 = A[Y 1 ] B 0 1 = A0 [Y 1 ] C 1 = A[Z 1 ] C 0 1 = A 0 [Z 1 ] 19

Similar notations will be used for F 2. Let x i be a variable of Z 1. From the notations we have : (4) not FNeg(x i ; F 1 ) (D5),(1) ) (5) FNeg(x i ; F 1 _ F 2 ) (D5),(4),(5),(Corollary 2) ) (6) FUnres(x i ; F 2 ) (6) ) (7) FRes(x i ; :F 2 ) (2) ) (8) T(:F 1 (A 1 ); I) = t and (9) T(:F 2 (A 2 ); I) = t (T1),(7),(9) ) (10) a i 6= Since (10) holds for any variable in Z 1 we have : (11) C 1 does not contain * In the same way we have : (12) C 2 does not contain * From the notations we have : (13) :F 1 (Y 1 ; C 1 ) quasi-evaluable (H2),(8),(13), s(:f 1 ) < p + 1 ) (14) 8B 0 1 2 B 1(D) T(:F 1 (B 0 1 ; C 1); I) = t (14),(11) ) (15) 8A 0 1 2 A 1 (D) T(:F 1 (A 0 1); I) = t In the same way we have : (16) 8A 0 2 2 A 2(D) T(:F 2 (A 0 2 ); I) = t (15),(16) ) (17) 8A 0 2 A(D) T((:F 1 (A 0 1)) ^ (:F 2 (A 0 2)); I) = t (17), :F 0 (:F 1 ) ^ (:F 2 ) ) 8A 0 2 A(D) T(:F 0 (A 0 ); I) = t Sub-case F 0 = 9x n+1 F 1 (X; x n+1 ) : (2) ) (4) 8a n+1 2 D 0 T(F 1 (A; a n+1 ); I) = f (4) ) (5) 8a n+1 2 D 0 T(:F 1 (A; a n+1 ); I) = t (D5),(1) ) (6) :F 1 (X; a n+1 ) quasi-evaluable (H2),(5),(6),s(:F 1 ) = p ) (7) 8a n+1 2 D 0 8A 0 2 A(D) T(:F 1 (A 0 ; a n+1 ; I) = t (7) ) (8) 8A 0 2 A(D) 8a n+1 2 D 0 T(F 1 (A 0 ; a n+1 ; I) = f (8) ) 8A 0 2 A(D) T(:9x n+1 F 1 (A 0 ; x n+1 ); I) = t Sub-case F'=atomic formula : we do not have to consider this case because F'= atomic formula and FPos(x i ; :F 0 ) are contradictory. 20

Basis of induction : For p=0, F is an atomic formula, so in all cases the variables of F are positive. If T(F(A); I) = t, by construction of I there can be no occurrence of * in A, therefore A(D)=fAg and T(F(A); I) = t ) 8A 0 2 A(D) T(F(A 0 ); I) = t. Theorem T3 (T3). If F is an Evaluable formula then F is Domain Independent. Proof : the proof is by induction on s(f). Induction hypothesis (H3). (H3) F Evaluable and s(f) < p + 1 ) F Domain Independent We have to prove that : (1) F Evaluable and (2) s(f) = p + 1 implies 8I T(F; I) = T(F; I) We consider all the possible forms of F : Let X be the tuple of variables which are free in F. Let x i be a variable of X. Case F = F 1 _ F 2 : (1), Free(x i ; F) ) (3) FRes(x i ; F) (D4),(3) ) (4) FRes(x i ; F 1 ) and FRes(x i ; F 2 ) (D4),(4) ) (5) Free(x i ; F 1 ) and Free(x i ; F 2 ) Each variable which is free in F is also free in F 1 and F 2 hence, if we call X 1 and X 2 the free variables of F 1 and F 2, then : (5) ) (6) X = X 1 = X 2 (6) ) (7) T(F; I) = T(F 1 ; I) [ T(F 2 ; I) (1),(4) ) (8) F 1 Evaluable (1),(4) ) (9) F 2 Evaluable (H3),(8), s(f 1 ) < p + 1 ) (10) T(F 1 ; I) = T(F 1 ; I) (H3),(9), s(f 2 ) < p + 1 ) (11) T(F 2 ; I) = T(F 2 ; I) (7),(10),(11) ) (12) T(F; I) = T(F 1 ; I) [ T(F 2 ; I) (12) ) (13)T(F; I) = T(F; I) If F is a closed formula, we have a similar proof replacing (7) by T(F; I) = 21

T(F 1 ; I) _ T(F 2 ; I). Case F = 9x n+1 F 1 (X; x n+1 ) : (1) ) (3) FRes(x i ; F) (D4),(3) ) (4) FRes(x i ; F 1 ) Since we have FRes(x i ; F 1 ) for each free variable in F 1 (X; a) we have : (D6),(1),(4) ) (5) F 1 (X; a) Evaluable (H3),(5), s(f 1 ) = p ) (6) T(F 1 (X; a); I) = T(F 1 (X; a); I) Until sentence (17) we assume A 2 D n. From the denition of T we have : (7) T(F(A); I) = t, 9a 2 D T(F 1 (A; a); I) = t (6),(7) ) (8) T(F(A); I) = t, 9a 2 D T(F 1 (A; a); I) = t (D4),(1) ) (9) FPos(x n+1 ; F 1 ) (9) ) (10) F 1 (A; x n+1 ) quasi-evaluable From the denition of T we have : (11) (9a 2 D 0 T(F 1 (A; a); I) = t), ((9a 2 D T(F 1 (A; a); I) = t) or T(F 1 (A; ); I) = t) (T2),(10) ) (12) T(F 1 (A; ); I) = t ) 8a 2 D T(F 1 (A; a); I) = t) (12) ) (13) T(F 1 (A; ); I) = t ) 9a 2 D T(F 1 (A; a); I) = t) (11),(13) ) (14) (9a 2 D 0 T(F 1 (A; a); I) = t), (9a 2 D T(F 1 (A; a); I) = t) (8),(14) ) (15) (T(F(A); I) = t), (9a 2 D 0 T(F 1 (A; a); I) = t) (15) ) (16) (T(F(A); I) = t), (T(F(A); I) = t) (16) ) (17) T(F; I) = T(F; I) \ D n (1),(L1),(17) ) (18)T(F; I) = T(F; I) If F is a closed formula the proof is very similar replacing (10) by F 1 (x n+1 ) evaluable, and F 1 (A; a) by F 1 (a). Case F = :F 0 : Sub-case F 0 = :F 1 : (D6),(1) ) (3) FRes(x i ; F) (D4),(3) ) (4) FUnres(x i ; F 0 ) (D4),(4) ) (5) FRes(x i ; F 1 ) (D6),(1),(5) ) (6) F 1 (X) evaluable (H3),(6), s(f 1 ) = p 1 ) (7) T(F 1 ; I) = T(F 1 ; I) (7), F F 1 ) (8) T(F; I) = T(F; I) 22

Sub-case F 0 = F 1 _ F 2 : We will use notations similar to those in the proof of Theorem 2. X : tuple of free variables of F X 1 : tuple of free variables of F 1 Y 1 : tuple of unrestricted variables of F 1 Z 1 : tuple of non unrestricted variables of F 1 A : instance of X A 1 = A[X 1 ] B 1 = A[Y 1 ] C 1 = A[Z 1 ] We have similar notations for X 2 ; Y 2 ; Z 2 ; A 2 ; B 2 ; and C 2, with respect to F 2. Let x i be a variable of Z 2 : (1) ) (3) FRes(x i ; :(F 1 _ F 2 )) (D4),(3) ) (4) FUnres(x i ; F 1 _ F 2 ) (D4),(4) ) (5) FUnres(x i ; F 1 ) or FUnres(x i ; F 2 ) Z 2 denition ) (6) FUnres(x i ; F 2 ) (6) ) (7) FRes(x i ; :F 2 ) Until sentence (17) we assume A 2 D n. From T denition we have : (8) T(F(A); I) = t, (9) T(:F 1 (B 1 ; C 1 ); I) = t and (10) T(:F 2 (B 2 ; C 2 ); I) = t (T1),(7),(10) ) (11) C 1 does not contain * (D4),(1) ) (12) :F 1 (Y 1 ; C 1 ) evaluable (H3),(12),(11),s(:F 1 ) < p + 1 ) (13) (T(:F 1 (B 1 ; C 1 ); I) = t, T(:F 1 (B 1 ; C 1 ); I) = t) In the same way we have : (14) (T(:F 2 (B 2 ; C 2 ); I) = t), T(:F 2 (B 2 ; C 2 ); I) = t) (8),(13),(14) ) (15) T(F(A); I) = t, T(:F 1 (A 1 ); I) = t and T(:F 2 (A 2 ); I) = t (15) ) (16) (T(F(A); I) = t, T(F(A); I) = t) (16) ) (17) T(F; I) \ D n = T(F; I) (1),(L1),(17) ) (18) T(F; I) = T(F; I) Sub-case F 0 = 9x n+1 F 1 (X; x n+1 ) : 23

(1) ) (3) Pos(x n+1 ; F 1 (X; x n+1 )) (3) ) (4) F 1 (A; x n+1 ) quasi-evaluable (1) ) (5) :F 1 (X; a) evaluable Until sentence (19) we assume A 2 D n. From T denition we have : (6) T(F(A); I) = t, (7)8a 2 D 0 T(:F 1 (A; a); I) = t (7), (8)8a 2 D T(:F 1 (A; a); I) = t and (9) T(:F 1 (A; ); I) = t (H3),(5),s(:F 1 ) = p ) (10) T(:F 1 (A; a); I) = T(:F 1 (A; a); I) Hence: (8), (11) 8a 2 D T(:F 1 (A; a); I) = t Therefore, from T denition : (8), (12) T(F(A); I) = t Let's assume that (9) is false, i.e. : (13) T(:F 1 (A; ); I) = f (13) ) (14) T(F 1 (A; ); I) = t (T2),(4),(14) ) (15) 8a 2 D T(F 1 (A; a); I) = t (15) ) (16) 8a 2 D T(:F 1 (A; a); I) = f (16) ) (17) 9a 2 D T(:F 1 (A; a); I) = f Since (17) contradicts (8), the assumption that (9) is false implies that (8) is false, then we have : (8) ) (9) Therfore we have : (6), (8) and (9), (8), (12) Hence we have : (6), (12), that is : (18) T(F(A); I) = t, T(F(A); I) = t (18) ) (19) T(F; I) \ D n = T(F; I) (1),(L1),(19) ) T(F; I) = T(F; I) In the above sub-cases we have very similar proofs if F is closed formula. Sub-case F'= atomic formula. If F' is an open formula then :F 0 cannot be evaluable and we do not have to consider this case. If F is a closed formula, from the denition of I, T(F 1 ; I) = T(F 1 ; I), therefore T(:F 1 ; I) = T(:F 1 ; I). Basis of induction : For s(f)=0 F is an atomic formula and from the denition of I, T(F; I) = T(F; I). 24

5 Comparison with other classes of formulas 5.1 Comparison with Range Restricted formulas Range Restricted formulas were introduced by J-M.Nicolas in [17] in order to characterize a subset of closed formulas in conjunctive prenex normal form, which are Domain Independent. We will extend this denition to open formulas and then to formulas in disjunctive prenex normal form. Notations : Formulas in prenex normal form will be noted F=QM where Q is the list of quantiers and M the matrix. For formulas in conjunction normal form we will use the notation : M = C 1 ^ C 2 ^ : : : ^ C n C i = L i1 _ L i2 _ : : : _ L ip where L ij is a positive atomic formula (form P ij ) or a negative atomic formula (form :P ij ). Denition 8 (D8). Range Restricted Formulas in conjunctive prenex normal form. A formula in conjunctive prenex normal form is a Range Restricted formula i : for each free variable x there is a conjunct C i such that all its literals L ij contain x and are positive, for each existentially quantied variable x, if x occurs in a negative literal, then there is a conjunct C i such that all its literals L ij contain x and are positive, for each universally quantied variable x, if x occurs in a conjunct C i, then one of its literals L ij contains x and is negative. Theorem 4 (T4) If F is a formula in conjunctive prenex normal form then F is Range Restricted i F is Evaluable. 25

Proof : Let F be a formula in conjunctive prenex normal form. Case of free variables. Let x be a free variable of F. (D6) ) (F Evaluable, FRes(x; F)) (D4) ) (FRes(x; F), FRes(x; M) (D4) ) (FRes(x; M), 9i FRes(x; C i )) (D4) ) (9i FRes(x; C i ), 9i8j FRes(x; L ij )) (D4) ) (9i8j FRes(x; L ij ), 9i(8jL ij is a positive literal in which x occurs)) Therefore with respect to free variables : F Evaluable, F Range Restricted. Case of existentially quantied variables. Let x be an existentially quantied variable of F. We assume F to be of the form Q 1 9xQ 2 M. We will call C 1 ; C 2 ; : : :; C m the C i s in which x occurs, and C m+1 ; : : : ; C n the C i s in which x does not occurs. (D6) ) (F Evaluable, FPos(x; Q 2 M)) (D5) ) (FPos(x; Q 2 M), (1) FPos(x; M)) (D5) ) ((1), (2) FPos(x; C 1 ^ C 2 ^ : : : ^ C m )) (D5) ) ((2), ((3) 8i 2 [1; m] FPos(x; C i ) or (4) 9i 2 [1; m] FRes(x; C i ))) (D5) ) (FPos(x; C i ), 8j (FPos(x; L ij ) or not Free(x; L ij ))) Therefore : (3), (5) 8i 2 [1; m] x occurs only in positive literals of C i By denition of m : (5), x occurs only in positive literals of M (D4) ) (4), (6) 9i 2 [1; m] 8j FRes(x; L ij ) (D4) ) (6), (7) 9i 2 [1; m] 8jL ij is a positive literal in which x occurs Therefore : (3) or (4), (5) or (7), If x occurs in a negative literal then 9i8jL ij is a positive literal in which x occur. Therefore, with respect to existentially quantied variables : F Evaluable, F Range Restricted 26

Case of universally quantied variables. Let x be a universally quanti- ed variable. We assume F to be of the form F = Q 0 1 8xQ0 2 M. The proof is very similar to the case of existentially quantied variables. The denition (D7) can easily be transformed for formulas in disjunctive normal form. We will use the notations : M = D 1 _ D 2 _ : : : _ D n D i = L i1 ^ L i2 ^ : : : ^ L ip Denition 9 (D9). Range Restricted Formulas in disjunctive prenex normal form. A formula in disjunctive prenex normal from is a Range Restricted formula i : for each free variablex, for each disjunct D i there is one of its literals L ij which contains x and is positive, for each existentially quantied variable x, for each disjunct D i, if x occurs in D i, then ther is at least one of its literal L ij which contains x and is positive, for each universally quantied variable x, ther is a disjunct D i such that all its literals L ij contain x and are negative. Theorem 5 (T5). If F is a formula in disjunctive prenex normal form then F is Range Restricted i F is Evaluable. Proof : the proof is quite similar to the proof of Theorem T4. From theorems T4 and T5 we can conclude that in the set of formulas in conjunctive or disjunctive prenex normal form, the subset of Evaluable formulas is the same as the subset of Range Restricted formulas. However there are formulas of this form which are Domain Independent and which are neither Evaluable nor Range Restricted. For example the formula 9x(P(x) _ :P(x)). 27

If we call Prenex Normal Form the formulas in conjunctive or in disjunctive prenex normal form, the structure of these classe of formulas is represented by the relations : Range Restricted Evaluable Domain Independent Range Restricted = Evaluable \ Prenex Normal Form We may note that the denition of Range Restricted formulas cannot be used to recognize Domian Independent formulas which are not in normal form.suppose we extend the denition of Range Restricted formulas as follows : \F is Range Restricted i when F is transformed in prenex normal form, F satises D8 or D9". Let's call C the class of formulas having this property. This class is not solvable because a given formula F may have an innite number of equivalent formulas which are in normal form; some of them may satisfy D8 or D9 and others formulas may not. Therefore denitions D8 and D9 can be used only for formulas which are in normal form. For example the formula : F = 9x9y( P(y) _ ((Q(x) _ R(x)) ^ : S(x)) ) yields the in disjunctive prenex normal form : F N1 = 9x9y( P(y) _ (Q(x) ^ :S(x)) _ (R(x) ^ :S(x)) ) and yields the in conjunctive normal form : F N2 = 9x9y( (P(y) _ Q(x) _ R(x)) ^ (P(y) _ :S(x)) ). The formula F N1 satises D9 but the formula F N2 does not satisfy D8. Note that the formula F is evaluable. 5.2 Comparison with Permissible formulas The denition of Permissible formulas introduced by E.C.Cooper in [7] is very close to the denition D8. However there are some dierences which unfortunately allow it to encompass formulas which are not Domain Independent. The condition imposed on existentially quantied variables is the same as in denition D8, but the same condition is imposed on free variables. That is why a formula like P(x) _ Q(y) satises the denition although it is not Domain Independent. The condition imposed on universally quantied variables is slightly dierent 28

from the denition D8. It can be expressed as follows : \for each universally quantied variable x there exists i such that x occurs in D i and for each j, if x occurs in L ij then L ij is a negative literal." This condition is satised by the formula 8x( (:P(x) ^ Q(a)) _ R(x) ) which is not Doamin Independent. Indeed in the interpretation where Q(a) is false the formula is equivalent to 8xR(x) which is not Domain Independent. 5.3 Comparison with Range Separable formulas E.F.Codd introduced in [6] the deniton of Range Separable formulas for tuple Relational Calculus, but the translation from the tuple Relational Calculus to Predicate Calculus is straightforward and can be found in [20]. In the following we assume that Codd's denition is translated into Predicate Calculus. In the rst step \Proper Range formulas" are dened. They are formulas without quantiers, with only one free variable, and either the : operator does not occur at all or it immediately follows an ^ operator. Moreover comparison predicates =; <; >; ldots are fobidden inthese formulas.the consequence is that all free variables of a Proper Range formula are Restricted. \Range Coupled Quantiers" are then dened; these are of the form : 9x( (x)^ (x)) or 8x(: (x)_(x)), where is a proper range formula. The consequence is that x is Restricted, and therefore Positive in (x) ^ (x), and x is Unrestricted and therefore Negative in : (x) _ (x). So, if all the quantiers are range coupled all the quantied variables satisfy the conditions imposed in the denition of Evaluable formulas. Finally Range Separable formulas are of the form : U 1 ^ U 2 ^ : : : ^ U n ^ V where all the U i are proper range formulas and each variable of V occurs in an U i ; moreover all the quantiers in V are range coupled and all the predicates in V are comparison predicates, save in the case of proper range formulas which are used to restrict the range of quantied variables. The consequence is that each free variable of a Range Separable formula is REstricted because it is Restricted in one of the U i. Therefore each Range Separable formula is Evaluable. The reverse is not true. There are Evaluable formulas, like (P(x; y)^q(x)) _ (R(x; y)^s(y)), which are not Range Separable. 29

5.4 Comparison with Predened Variable Formulas This class of formula was introduced by G.Bossu and P.Siegel in [4] to dene a decidable procedure to deduce information from Data Bases where the informationis is represented with generalized Horn clauses. Denition 10 (D10). Predened Variable Formulas. The Predined Variable Formulas are dened by the following rules : 1. If A 1 ; A 2 ; : : : ; A n are atomic formulas with free variables x 1 ; x 2 ; : : : ; x p, then 8x 1 8x 2 : : :8x p (:A 1 _ :A 2 _ : : : _ :A n ) is a Predened Variable formula. 2. If F 1 ; F 2 ; : : : ; F n are Predened Variable formulas then :(F 1 ^F 2^: : :^F n ) is a Predened Variable formula. 3. If A 1 ; A 2 ; : : : ; A n are atomic formulas with free variables x 1 ; x 2 ; : : : ; x p and F 1 ; F 2 ; : : : ; F m are Predened Variable fromulas, then 8x 1 8x 2 : : : 8x p (:A 1 _ :A 2 _ : : : _ :A n _ :(F 1 ^ F 2 ^ : : : ^ F m )) is a Predened Variable formula. 4. All the Predened Variable formulas are dened by the rules 1, 2 and 3. It is easy to prove that each Predened Variable formula is an Evaluable formula. Indeed formulas dened by rule 1 are Evaluable because all the variables are universally quantied and they are negative in the sub-formula which is in the range of the quantier. The formulas dened by rule 2 are Evaluable if F 1 ; F 2 ; : : :; F n are Evaluable. The formulas dened by the rule 3 are Evaluable if F 1 ; F 2 ; : : : ; F m, because each variable x i is negative in :A 1 _ :A 2 _ : : : _ :A n and is negative too in :A 1 _ :A 2 _ : : : _ :A n _ :(F 1 ^ F 2 ^ : : : ^ F m ) are Evaluable because, F 1 ; F 2 ; : : : ; F n are closed formulas. So all the Predened Variable formulas are Evaluable. However, the reverse is not true. For example the formulas 8x:(P(x) ^ Q(x)) and 9xP(x) are Evaluable but they are not Predened Variable formulas. 6 Expressive power of Evaluable Formulas We will show that Evaluable formulas have he same expressive power as Domain Independent formulas; that is, for each Domain Independent formula there is an equivalent Evaluable formula. 30

We have seen that Domain Independent formulas and Safe formulas have the same expressive power. J.D.Ullman has shown in [20] that Safe formulas and Relational Algebra have the same expressive power. So it is sucient to prove that for each Relational Algebra expression there exists an equivalent Evaluable formula. That can be done using the same proof as that of theorem 5-1 given by J.D.Ullman in [20], with Evaluable formulas playing the same role as Safe formulas. Indeed the only properties of Safe formulas which are used in this proof are the following : 1. Atomic formulas are Safe. 2. If F 1 (X) and F 2 (X) are Safe then F 1 (X) _ F 2 (X) is Safe. 3. If F 1 (X) and F 2 (X) are Safe then F 1 (X) ^ :F 2 (X) is Safe. 4. If F 1 (X) and F 2 (X) are Safe then F 1 (X) ^ F 2 (X) is Safe. 5. If F 1 (X) is Safe then 9x i F 1 (X) is Safe. 6. If F 1 (X) is Safe then F 1 (X) ^ (x i1 ; x i2 ; : : : ; x ip ) is Safe. where x i1 ; x i2 ; : : : ; x ip are variables of X, and is a boolean formula in which the only predicates are comparison predicates. We can easily check that these properties are also satised by Evaluable formulas. For example if F 1 (X) and F 2 (X) are Evaluable formulas then F 1 (X) _ F 2 (X) is also an Evaluable formula. 7 Conclusion We have dened a class of formulas which ahve been proved to be Domain Independent and we have shown that this class is larger than the previous ones encountered in the literature. Moreover, we have proved that this class of formulas has the same expressive power as the Domain Independent formulas. It is interesting to note that the denition of Evaluable formulas does not require the formulas to be put in a normal form. The implementation of a procedure which recognizes Evaluable formulas has been implemented in PROLOG. 31

The program is very simple and fast, and is close to the denitions given in this paper. We think that it is possible to nd a larger syntactic characterization of Domain Independent formulas, but such a characterization would be more complex. In particular, we could distinguish dierent cases of _ operator usage in the formulas. In many cases the operands of an _ operator are not linked to one another; that is the reason why a formula like P(x) _ Q(y) is not Domain Independent. If there an x value such that P(x) is true then y can take any value of the domain. However, in some cases the _ has the same meaning as an ^. Consider for example the formula 9tR(x; t) ^ 8z(:R(x; z) _ S(z; y)). If we denote by R x the set fz R(x,z)g, and by S y the set fz S(z,y)g, then the meaning of this formula can be expressed informally by 9tR(x; t) ^ (R x S y ), which shows that the variables x and y which appear in the operands of the _ are linked to one another as in the case of the operands of an ^. This kind of formula correspond to what we have called \generalised division" in the Relational Algebra [9]. On the other hand we could try to consider tautologies and contradictions in the definition. Indeed formulas like 9x(P(x) _ :P(x)) or 8x(P(x) ^ :P(x)) are Domain Independent but are not Evaluable. However it is well known that the problem of recognizing if a formula is a tautology or a contradiction is not decidable, so we cannot hope to extend the denition signicantly in this direction. Our feeling is that it is not easy to nd a simple and signicant extension to Evaluable formulas because there are relatively few formulas which are equivalent to an Evaluable formula and which are not Evaluable. Indeed we have shown in [10] that, if we consider equivalent formulas obtained from a given formula using the transformation rules corresponding to factorization or ditribution of the operators _; ^; :; 8 or 9, and assiciativity and commutativity of the operators ^ and _, then a variable's property of being Restricted or Unrestricted is preserved. Furthermore, a variable's property of being Positive or Negative is always preserved except in one very particular case. This case, for a Positive variable, is that of transforming F 1 _ (F 2 ^ F 3 ) into (F 1 _ F 2 ) ^ (F 1 _ F 3 ) when the following condition holds : ((FPos(x; F 1 ) ^ :FRes(x; F 1 )) _ :Free(x; F 1 )) ^ Free(x; F 2 ) ^ Free(x; F 3 ) ^ ((FRes(x; F 2 ) ^ :FPos(x; F 3 )) _ (:FPos(x; F 2 ) ^ FRes(x; F 3 ))). For example x is Positive in (P(x) _ Q(a)) _ (R(x) ^ :S(x)) but it is not Positive in (P(x) _ Q(a) _ R(x)) ^ (P(x) _ Q(a) _ :S(x)). There is a similar case for Negative variables. A possible extension to this work could be to consider a language with equality, or to consider specic problems arising with comparison predicates. 32