Armstrong Databases and Reasoning for Functional Dependencies and Cardinality Constraints over Partial Bags
Sven Hartmann¹, Henning Köhler², Sebastian Link³, and Bernhard Thalheim⁴

¹ Institut für Informatik, Technische Universität Clausthal, Germany
² N-Squared Software, Palmerston North, New Zealand
³ Department of Computer Science, University of Auckland, New Zealand
⁴ Institut für Informatik, Christian-Albrechts-University Kiel, Germany

Abstract. Data dependencies capture meaningful information about an application domain within the target database. The theory of data dependencies is largely a theory over relations. To make data processing more efficient in practice, partial bags are permitted as database instances to accommodate partial and duplicate information. However, data dependencies interact differently over partial bags than over the idealized special case of relations. In this paper, we study the implication problem of the combined class of functional dependencies and cardinality constraints over partial bags. We establish an axiomatic and an algorithmic characterization of the implication problem. These findings have important applications in database design and data processing. Finally, we investigate structural and computational properties of Armstrong databases for the class of data dependencies under consideration. These results can be utilized to consolidate and communicate the understanding of the application domain between different stakeholders of a database.

1 Introduction

Quality database schemata must capture both the structure and semantics of the underlying application domain. Data dependencies are classes of first-order formulae that can model semantically meaningful information in the target database. In the relational model of data, approximately 100 different classes of data dependencies have been studied [24].
Among those, functional dependencies and cardinality constraints represent two classes of data dependencies that are popular in database practice and theory. Cardinality constraints, in particular, have been studied extensively in Chen's Entity-Relationship model. In practice, however, relations represent idealized special cases in which all information is always available and no duplicate information can occur. In relational database management systems, database instances are partial bags. That is, duplicate rows can occur and columns may contain partial information in the form of null marker occurrences, unless they have been specified as NOT NULL.

(T. Lukasiewicz and A. Sali (Eds.): FoIKS 2012, LNCS 7153. © Springer-Verlag Berlin Heidelberg 2012)

In this paper we are concerned with the implication problem of the combined class of functional dependencies, cardinality constraints and NOT NULL constraints over
partial bags. The implication problem is to decide whether every partial bag that satisfies a given set of data dependencies also satisfies another given data dependency. The problem is essential in database design, and has found numerous applications in almost all data processing tasks. While different classes of data dependencies co-occur in practice, this co-occurrence is often the source of the intractability or even infeasibility of the associated implication problem. It is therefore a challenge to identify combined classes of data dependencies that can be reasoned about effectively and efficiently.

Example 1. Suppose that in designing an information system for a company the team of data engineers has established the following SQL table definition:

    CREATE TABLE Employment (
        Emp  VARCHAR NOT NULL,
        Dept VARCHAR,
        Mgr  VARCHAR NOT NULL);

Here, employees (Emp) work within a department (Dept) under a manager (Mgr). Null marker occurrences are only permitted in the column Dept. As interpretation of the null marker we choose the most primitive one, no information, i.e. a total value may not exist, or may exist but is currently unknown. The team of data engineers has started to think about the semantics of the application domain. So far, they have acquired the following business rules. Employees can work for at most one department, and departments have at most one manager. Moreover, every employee can be associated with at most 4 combinations of any department and any manager, every manager can be associated with at most 2 combinations of any employee and any department, and every combination of any employee and any manager must be unique. These business rules can be expressed as functional dependencies and cardinality constraints. The team of engineers would like to consult the experts of the application domain to find out whether their current perceptions about the semantics capture all the requirements necessary.
In order to validate their own understanding of the application domain and to facilitate the knowledge acquisition from the domain experts, the team plans, in particular, to create test data.

Example 1 illustrates how quality database designs require a deep understanding of the application domain's semantics. In particular, it is necessary to comprehend the interactions between different classes of data dependencies in the presence of partial and duplicate information. Such an advanced understanding can also lead to more efficient ways of data processing. For example, suppose that we want to retrieve all distinct combinations of an employee and a department from the current database instance. Since the business rules above are enforced on all instances, and since the constraint that every combination of an employee and a department is unique is implied by these business rules, it follows that the distinct clause in our query is superfluous. Query optimizers with built-in reasoning abilities for these constraints can therefore detect such opportunities effectively, and depending on the complexity of the associated implication
problem, even efficiently. For these reasons, an in-depth investigation of the associated implication problems is both challenging and in high demand.

Contributions. So far, the combined class of functional dependencies and cardinality constraints has only been considered over relations, i.e., in the idealized special case where no duplicate rows and no null marker occurrences are permitted. In this paper we make three major contributions. Firstly, we characterize axiomatically the implication problem for the combined class of functional dependencies, cardinality constraints and NOT NULL constraints over partial bags. Secondly, we characterize the implication problem also algorithmically. Our results show how reasoning about this combined class of constraints over partial bags can be done effectively and efficiently. For our third and final contribution we investigate the concept of Armstrong databases for the combined class under discussion. We establish structural and computational properties of Armstrong tables. In particular, we characterize the structure of Armstrong tables, i.e., provide sufficient and necessary conditions that allow us to test whether a given partial bag is Armstrong with respect to a given set of constraints in this class. This characterization enables us to derive further properties. For example, we characterize for which sets of constraints in this class Armstrong tables exist. We show that the problem of computing Armstrong tables for a given set of constraints in this class is precisely exponential in the size of the given set. Nevertheless, we are able to establish an algorithm that always computes an Armstrong table for a given set of constraints whenever one exists, and whose number of rows is at most quadratic in the number of rows of a minimum-sized Armstrong table and the number of given constraints.

Organization. We discuss related work in Section 2.
Subsequently, we introduce the data model in Section 3, which includes a definition of the syntax and semantics used in this paper. In Section 4 we characterize the implication problem axiomatically and algorithmically. The structural and computational properties of Armstrong tables are established in Section 5. Finally, we conclude in Section 6 where we also comment briefly on future work. Due to space limitations we have moved some of the proofs into the appendix.

2 Related Work

Data dependencies and Armstrong databases have been studied thoroughly in the relational model of data, cf. [1,9]. Dependencies are essential to the design of the target database, the maintenance of the database during its lifetime, and all major data processing tasks [1,26]. Armstrong databases are a useful design aid for data engineers that can help with the consolidation of data dependencies [16], the design of databases [21] and the creation of concise test data [6]. Armstrong [2] established the first axiomatization for functional dependencies. In general, axiomatizations can be applied by designers and administrators to validate the specification of explicit knowledge, to design and fine-tune databases or to optimize queries. An axiomatization ensures that all opportunities of utilizing implicit knowledge have been exploited. An analysis of the completeness
argument can provide invaluable hints for finding algorithms that efficiently decide the implication problem. The implication problem of functional dependencies can be decided in time linear in the input [8]. For relations, the structural and computational properties of Armstrong relations for the class of functional dependencies are well-studied [4,21]. Cardinality constraints have mostly been investigated in conceptual models under a relational semantics [10,17,19,25], and recently in XML [13,22].

One of the most important extensions of the basic relational model [5] is incomplete information [15]. This is mainly due to the high demand for the correct handling of such information in real-world applications. Approaches to deal with incomplete information comprise incomplete relations, or-relations or fuzzy relations. In this paper we focus on incomplete relations. In the literature many kinds of null markers have been proposed; for example, missing or value unknown at present, non-existence, inapplicable, no information and open. Several works on functional dependencies in incomplete relations exist. Levene and Loizou studied classes of functional dependencies with a weak and strong possible world semantics [18]. Atzeni and Morfuni established an axiomatization of functional dependencies in the presence of NOT NULL constraints under the no information interpretation [3]. In this context, Hartmann and Link established an equivalence of the implication problem for this class of functional dependencies and NOT NULL constraints to that of propositional Horn clauses in Cadoli and Schaerf's family of S-3 logics [14]. Both articles consider only instances where functional dependencies subsume uniqueness constraints, and consider neither tables with duplicate rows nor cardinality constraints.
In [11] structural and computational properties of Armstrong databases have been established for the combined class of functional dependencies and NOT NULL constraints. In the present paper, we draw from this body of research and establish fundamental results for the combined class of functional dependencies, cardinality constraints and NOT NULL constraints over partial bags.

3 The Data Model

Let $\mathfrak{H} = \{H_1, H_2, \ldots\}$ be a countably infinite set of symbols, called column headers or headers for short. A table schema is a finite non-empty subset $T$ of $\mathfrak{H}$. Each header $H$ of a table schema $T$ is associated with a countably infinite domain $\mathit{dom}(H)$ of the possible values that can occur in column $H$. To encompass partial information every column can have a null marker, denoted by $\mathrm{ni} \in \mathit{dom}(H)$. The intention of $\mathrm{ni}$ is to mean no information. We would like to stress that a null marker is different from a domain value. The inclusion of $\mathrm{ni}$ into the domain is a syntactic convenience.

For header sets $X$ and $Y$ we may write $XY$ for $X \cup Y$. If $X = \{H_1, \ldots, H_m\}$, then we may write $H_1 \cdots H_m$ for $X$. In particular, we may write simply $H$ to represent the singleton $\{H\}$. A row over $T$ ($T$-row or simply row, if $T$ is understood) is a function $r : T \to \bigcup_{H \in T} \mathit{dom}(H)$ with $r(H) \in \mathit{dom}(H)$ for all $H \in T$. The null marker occurrence $r(H) = \mathrm{ni}$ associated with a header $H$ in
a row $r$ means that there is no information about $r(H)$. That is, $r(H)$ may not exist, or $r(H)$ exists but is unknown. For $X \subseteq T$ let $r(X)$ denote the restriction of the row $r$ over $T$ to $X$. A table $t$ over $T$ is a finite multi-set (bag) of rows over $T$. We sometimes use the phrase partial bag to indicate that these bags can contain partial information in the form of null marker occurrences. In this paper, the terms table and partial bag can be used interchangeably. For a row $r$ over $T$ and a set $X \subseteq T$, $r$ is said to be $X$-total if for all $H \in X$, $r(H) \neq \mathrm{ni}$. Similarly, a table $t$ over $T$ is said to be $X$-total if every row $r$ of $t$ is $X$-total. A table $t$ over $T$ is said to be a total table if it is $T$-total.

Following Atzeni and Morfuni [3], a null-free subschema (NFS) over the table schema $T$ is an expression $\mathit{nfs}(T_s)$ where $T_s \subseteq T$. The NFS $\mathit{nfs}(T_s)$ over $T$ is satisfied by a table $t$ over $T$, denoted by $\models_t \mathit{nfs}(T_s)$, if and only if $t$ is $T_s$-total. SQL allows the specification of column headers as NOT NULL. NFSs occur in everyday database practice: the set of headers declared NOT NULL forms the single NFS over the underlying table schema.

Following Lien [20], a functional dependency (FD) over the table schema $T$ is a statement $X \to Y$ where $X, Y \subseteq T$. The FD $X \to Y$ over $T$ is satisfied by a table $t$ over $T$, denoted by $\models_t X \to Y$, if and only if for all $r_1, r_2 \in t$ the following holds: if $r_1(X) = r_2(X)$ and $r_1, r_2$ are $X$-total, then $r_1(Y) = r_2(Y)$. FDs of the form $\emptyset \to Y$ are called non-standard, otherwise FDs are called standard. The size $|\sigma|$ of an FD $\sigma = X \to Y$ is defined as $|X| + |Y|$.

We now introduce the concept of a cardinality constraint into databases with partial information. Let $\mathbb{N}$ denote the positive integers. A cardinality constraint (CC) over the table schema $T$ is a statement $\mathit{card}(X) \le b$ where $X \subseteq T$ and $b \in \mathbb{N}$.
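The CC $\mathit{card}(X) \le b$ over $T$ is satisfied by a table $t$ over $T$, denoted by $\models_t \mathit{card}(X) \le b$, if and only if for all $r_1, r_2, \ldots, r_{b+1} \in t$ the following holds: if $\forall i, j \in \{1, \ldots, b+1\}\,(r_i(X) = r_j(X))$ and $\forall i \in \{1, \ldots, b+1\}\,(r_i \text{ is } X\text{-total})$, then $\exists i, j \in \{1, \ldots, b+1\}\,(i \neq j \wedge r_i = r_j)$, where $r_i = r_j$ means that both denote the same occurrence of a row in the bag $t$. Informally, at most $b$ row occurrences of $t$ agree totally on $X$. CCs of the form $\mathit{card}(\emptyset) \le b$ are called non-standard, otherwise CCs are called standard. CCs subsume the concept of uniqueness constraints for the special case $\mathit{card}(X) \le 1$. The size $|\sigma|$ of a CC $\sigma = \mathit{card}(X) \le b$ is defined as $|X| + \log b$.

For a set $\Sigma$ of constraints over some table schema $T$, we say that a table $t$ over $T$ satisfies $\Sigma$, denoted by $\models_t \Sigma$, if $t$ satisfies every element of $\Sigma$. If for some $\sigma \in \Sigma$ the table $t$ does not satisfy $\sigma$ we sometimes say that $t$ violates $\sigma$ (in which case $t$ also violates $\Sigma$) and write $\not\models_t \sigma$ ($\not\models_t \Sigma$). The size $\|\Sigma\|$ of a set $\Sigma$ of FDs and CCs is defined as the sum of sizes over all elements of $\Sigma$. The cardinality $|\Sigma|$ of a finite set $\Sigma$ is defined as the number of its elements.

Example 2. The SQL table definition from Example 1 can be captured in our data model as follows. The table schema $T = \mathit{Employment}$ consists of the column headers $\mathit{Emp}$, $\mathit{Dept}$ and $\mathit{Mgr}$. The NFS $\mathit{nfs}(T_s)$ is defined by $T_s = \{\mathit{Emp}, \mathit{Mgr}\}$. The set $\Sigma$ consists of the FDs $\mathit{Emp} \to \mathit{Dept}$ and $\mathit{Dept} \to \mathit{Mgr}$, and the CCs $\mathit{card}(\mathit{Emp}) \le 4$, $\mathit{card}(\mathit{Mgr}) \le 2$ and $\mathit{card}(\mathit{Emp}, \mathit{Mgr}) \le 1$.

For the design, maintenance and applications of a relational database, data dependencies are identified as semantic constraints on the relations which are intended to be instances of the database schema. During the design process or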
lifetime of a database one usually needs to determine further dependencies which are logically implied by the given ones. In line with the literature on database constraints, we restrict our attention to the implication of constraints in some fixed class $\mathcal{C}$: FDs and CCs in the presence of an NFS. Let $T$ be a table schema, let $\mathit{nfs}(T_s)$ denote an NFS over $T$, and let $\Sigma \cup \{\varphi\}$ be a set of FDs and CCs over $T$. We say that $\Sigma$ implies $\varphi$ in the presence of $\mathit{nfs}(T_s)$, denoted by $\Sigma \models_{T_s} \varphi$, if every $T_s$-total table $t$ over $T$ that satisfies $\Sigma$ also satisfies $\varphi$. If $\Sigma$ does not imply $\varphi$ in the presence of $\mathit{nfs}(T_s)$ we may also write $\Sigma \not\models_{T_s} \varphi$. The implication problem for functional dependencies and cardinality constraints in the presence of a null-free subschema is to decide, given any table schema $T$, any NFS $\mathit{nfs}(T_s)$ over $T$, and any set $\Sigma \cup \{\varphi\}$ of FDs and CCs over $T$, whether $\Sigma \models_{T_s} \varphi$.

For the class of FDs and CCs, the sets $\Sigma \cup \{\varphi\}$ over a fixed table schema $T$ are not necessarily always finite. While for a fixed $T$ there are only finitely many FDs, there might be infinitely many CCs by taking arbitrarily large upper bounds $b \in \mathbb{N}$. However, for a fixed $X \subseteq T$ only the least $b \in \mathbb{N}$ that occurs is relevant. Therefore, we assume without loss of generality that these sets are finite. Note that for FDs and CCs (in the presence of an NFS) it does not matter whether we restrict our tables to those that are finite, i.e., the implication problem coincides with the finite implication problem where only finite tables are considered. For this reason, we will only speak of the implication problem.

For an FD set $\Sigma$ over a table schema $T$ and an NFS $\mathit{nfs}(T_s)$ over $T$, let the FD set $\Sigma^*_{T_s} = \{\varphi \mid \Sigma \models_{T_s} \varphi\}$ denote the semantic closure of $\Sigma$ and $\mathit{nfs}(T_s)$, and for a set $X \subseteq T$ let $X^*_{\Sigma,T_s} = \{H \in T \mid \Sigma \models_{T_s} X \to H\}$ denote the closure of $X$ under $\Sigma$ and $\mathit{nfs}(T_s)$. For a set $\Sigma$ of FDs and CCs over $T$ let $\Sigma[\mathrm{FD}] = \{X \to Y \mid X \to Y \in \Sigma\} \cup \{X \to T \mid \mathit{card}(X) \le 1 \in \Sigma\}$.
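The satisfaction notions of this section can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: rows are modelled as dictionaries, a table as a list (so duplicate rows are kept), the null marker ni as `None`, and all function names are hypothetical. The sample table uses the Employment schema of Example 2, with three rows whose Dept column is null.

```python
from itertools import combinations

NI = None  # stand-in for the null marker ni (an assumption of this sketch)

def is_total(row, headers):
    """True if the row carries no null marker on any of the given headers."""
    return all(row[h] is not NI for h in headers)

def satisfies_nfs(table, ts):
    """t satisfies nfs(T_s) iff every row of t is T_s-total."""
    return all(is_total(row, ts) for row in table)

def satisfies_fd(table, lhs, rhs):
    """X -> Y: X-total rows that agree on X must agree on Y (ni matches ni)."""
    for r1, r2 in combinations(table, 2):
        agree_on_lhs = all(r1[h] == r2[h] for h in lhs)
        if (agree_on_lhs and is_total(r1, lhs) and is_total(r2, lhs)
                and any(r1[h] != r2[h] for h in rhs)):
            return False
    return True

def satisfies_cc(table, xs, b):
    """card(X) <= b: at most b row occurrences agree totally on X."""
    counts = {}
    for row in table:
        if is_total(row, xs):
            key = tuple(row[h] for h in sorted(xs))
            counts[key] = counts.get(key, 0) + 1
    return all(c <= b for c in counts.values())

# Three rows over Employment; a list keeps duplicates, as a bag requires.
t = [
    {"Emp": "Sisyphus", "Dept": NI, "Mgr": "Trump"},
    {"Emp": "Sisyphus", "Dept": NI, "Mgr": "Gates"},
    {"Emp": "Sisyphus", "Dept": NI, "Mgr": "Jobs"},
]

# The constraint set of Example 2 together with its NFS.
sigma_holds = (
    satisfies_nfs(t, ["Emp", "Mgr"])
    and satisfies_fd(t, ["Emp"], ["Dept"])
    and satisfies_fd(t, ["Dept"], ["Mgr"])
    and satisfies_cc(t, ["Emp"], 4)
    and satisfies_cc(t, ["Mgr"], 2)
    and satisfies_cc(t, ["Emp", "Mgr"], 1)
)
print(sigma_holds)
print(satisfies_cc(t, ["Emp"], 2), satisfies_fd(t, ["Emp"], ["Mgr"]))
```

On this table the constraints of Example 2 hold, while $\mathit{card}(\mathit{Emp}) \le 2$ and $\mathit{Emp} \to \mathit{Mgr}$ are violated. Note that $\mathit{Dept} \to \mathit{Mgr}$ holds only because no row is $\mathit{Dept}$-total, which is exactly the effect of the totality requirement on the left-hand side.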
For a set $\Sigma \cup \{\varphi\}$ of FDs and CCs, an NFS $\mathit{nfs}(T_s)$, and a set $\mathfrak{R}$ of inference rules let $\Sigma \vdash_{\mathfrak{R}} \varphi$ denote an inference of $\varphi$ from $\Sigma$ by $\mathfrak{R}$. That is, there is some sequence $\gamma = [\sigma_1, \ldots, \sigma_n]$ of FDs and CCs such that $\sigma_n = \varphi$ and every $\sigma_i$ is an element of $\Sigma$ or results from an application of an inference rule in $\mathfrak{R}$ to some elements in $\{\sigma_1, \ldots, \sigma_{i-1}\}$. For a finite set $\Sigma$ of FDs and CCs let $\Sigma^+_{\mathfrak{R}} = \{\varphi \mid \Sigma \vdash_{\mathfrak{R}} \varphi\}$ denote the syntactic closure of $\Sigma$ under inferences by $\mathfrak{R}$. $\mathfrak{R}$ is said to be sound (complete) for the implication of FDs and CCs in the presence of an NFS if for every table schema $T$, for every NFS $\mathit{nfs}(T_s)$ over $T$ and for every set $\Sigma$ of FDs and CCs over $T$ we have $\Sigma^+_{\mathfrak{R}} \subseteq \Sigma^*_{T_s}$ ($\Sigma^*_{T_s} \subseteq \Sigma^+_{\mathfrak{R}}$). The (finite) set $\mathfrak{R}$ is said to be a (finite) axiomatization for the implication of FDs and CCs in the presence of an NFS if $\mathfrak{R}$ is both sound and complete for the implication of FDs and CCs in the presence of an NFS.

Example 3. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Then the following are examples of CCs implied by $\Sigma$ in the presence of $\mathit{nfs}(T_s)$: $\mathit{card}(\mathit{Dept}) \le 2$ and $\mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1$. However, neither the CC $\mathit{card}(\mathit{Emp}) \le 2$ nor the FD $\mathit{Emp} \to \mathit{Mgr}$ is implied by $\Sigma$ in the presence of $\mathit{nfs}(T_s)$. Indeed, the $T_s$-total table

    Emp       Dept  Mgr
    Sisyphus  ni    Trump
    Sisyphus  ni    Gates
    Sisyphus  ni    Jobs

satisfies $\Sigma$, but violates $\mathit{card}(\mathit{Emp}) \le 2$ and $\mathit{Emp} \to \mathit{Mgr}$.

4 Characterizations of the Implication Problem

The first target in our analysis is the establishment of an axiomatization for the implication of FDs and CCs in the presence of an NFS. The insights from the completeness proof will subsequently enable us to characterize the implication problem algorithmically.

4.1 Axiomatic Characterization

Let $\mathfrak{S}$ denote the set of inference rules from Table 1. It is our goal to show that $\mathfrak{S}$ forms a finite axiomatization. In our proof we will use the result by Atzeni and Morfuni that the set $\mathfrak{M}$, consisting of the reflexivity axiom, and the decomposition, union and null transitivity rule, forms a finite axiomatization for the implication of FDs [3].

Table 1. Axiomatization $\mathfrak{S}$ of FDs and CCs in the presence of an NFS

    Rule                Premises                               Side condition       Conclusion
    reflexivity         (none)                                 (none)               $XY \to X$
    decomposition       $X \to YZ$                             (none)               $X \to Y$
    union               $X \to Y$, $X \to Z$                   (none)               $X \to YZ$
    null transitivity   $X \to Y$, $Y \to Z$                   $Y \subseteq XT_s$   $X \to Z$
    weakening           $\mathit{card}(X) \le b$               (none)               $\mathit{card}(X) \le b+1$
    demotion            $\mathit{card}(X) \le 1$               (none)               $X \to T$
    null pullback       $X \to Y$, $\mathit{card}(Y) \le b$    $Y \subseteq XT_s$   $\mathit{card}(X) \le b$

Lemma 1. The weakening, demotion and null pullback rules are sound for the implication of FDs and CCs in the presence of an NFS.

Note that the soundness of the reflexivity axiom and the null pullback rule also imply the soundness of the superset rule, which infers $\mathit{card}(XY) \le b$ from $\mathit{card}(X) \le b$. Indeed, the trivial FD $XY \to X$ and the CC $\mathit{card}(X) \le b$ allow us to infer the CC $\mathit{card}(XY) \le b$ by an application of the null pullback rule since $X \subseteq XYT_s$.
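Example 4. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Then the following are examples of inferences from $\Sigma$ and $\mathit{nfs}(T_s)$ by $\mathfrak{S}$. An application of the null pullback rule to $\mathit{Dept} \to \mathit{Mgr}$, $\mathit{card}(\mathit{Mgr}) \le 2$, and $\mathit{Mgr} \in T_s$ results in the CC $\mathit{card}(\mathit{Dept}) \le 2$. That is,

$$\dfrac{\mathit{Dept} \to \mathit{Mgr} \qquad \mathit{card}(\mathit{Mgr}) \le 2}{\mathit{card}(\mathit{Dept}) \le 2}.$$

We now outline an inference of $\mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1$ from $\Sigma$ and $\mathit{nfs}(T_s)$ by $\mathfrak{S}$. Applications of the reflexivity axiom result in $\mathit{Emp}, \mathit{Dept} \to \mathit{Emp}$ and $\mathit{Emp}, \mathit{Dept} \to \mathit{Dept}$. An application of the null transitivity rule to $\mathit{Emp}, \mathit{Dept} \to \mathit{Dept}$ and $\mathit{Dept} \to \mathit{Mgr}$ as well as $\mathit{Dept} \in \{\mathit{Emp}, \mathit{Dept}, \mathit{Mgr}\}$ results in the FD $\mathit{Emp}, \mathit{Dept} \to \mathit{Mgr}$. An application of the union rule to $\mathit{Emp}, \mathit{Dept} \to \mathit{Emp}$ and $\mathit{Emp}, \mathit{Dept} \to \mathit{Mgr}$ results in the FD $\mathit{Emp}, \mathit{Dept} \to \mathit{Emp}, \mathit{Mgr}$. Finally, an application of the null pullback rule to $\mathit{Emp}, \mathit{Dept} \to \mathit{Emp}, \mathit{Mgr}$, $\mathit{card}(\mathit{Emp}, \mathit{Mgr}) \le 1$, and $\{\mathit{Emp}, \mathit{Mgr}\} \subseteq \{\mathit{Emp}, \mathit{Dept}, \mathit{Mgr}\}$ results in $\mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1$. The tree

$$\dfrac{\dfrac{\mathit{Emp}, \mathit{Dept} \to \mathit{Emp} \qquad \dfrac{\mathit{Emp}, \mathit{Dept} \to \mathit{Dept} \qquad \mathit{Dept} \to \mathit{Mgr}}{\mathit{Emp}, \mathit{Dept} \to \mathit{Mgr}}}{\mathit{Emp}, \mathit{Dept} \to \mathit{Emp}, \mathit{Mgr}} \qquad \mathit{card}(\mathit{Emp}, \mathit{Mgr}) \le 1}{\mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1}$$

illustrates this inference.

Before we turn to the completeness argument, we want to emphasize that any set of FDs alone can never imply any cardinality constraint.

Proposition 1. Let $T$ be a table schema, $\mathit{nfs}(T_s)$ an NFS, and $\Sigma$ a set of FDs over $T$. Then for all cardinality constraints $\mathit{card}(X) \le b$ over $T$ we have $\Sigma \not\models_{T_s} \mathit{card}(X) \le b$.

Proof. Let $t$ denote the table over $T$ that consists of $b+1$ rows which have for every column header of $T$ the same non-null value, i.e., $t$ consists of $b+1$ duplicate total rows. Clearly, $t$ satisfies $\Sigma$ and $\mathit{nfs}(T_s)$. Since $t$ violates $\mathit{card}(X) \le b$ it follows that $\Sigma \not\models_{T_s} \mathit{card}(X) \le b$.

Corollary 1. Let $T$ be a table schema. Then the FD $X \to T$ over $T$ does not imply the cardinality constraint $\mathit{card}(X) \le 1$.

For the completeness of $\mathfrak{S}$ the following lemma is central.

Lemma 2. Let $\Sigma$ be a set of FDs and CCs, and $\mathit{nfs}(T_s)$ be an NFS over table schema $T$. Then the following hold:
1. $\Sigma \models_{T_s} X \to Y$ if and only if $\Sigma[\mathrm{FD}] \models_{T_s} X \to Y$, and
2. $\Sigma \models_{T_s} \mathit{card}(X) \le b$ if and only if there is some $\mathit{card}(Y) \le b' \in \Sigma$ such that $b' \le b$ and $Y \subseteq XT_s \cap X^*_{\Sigma[\mathrm{FD}],T_s}$.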
For the second part of Lemma 2 consider the special case where $\Sigma$ consists of FDs only. Then no cardinality constraint can be implied by $\Sigma$ in the presence of the NFS, consistent with Proposition 1. We now have the means to verify that $\mathfrak{S}$ is a finite axiomatization for the implication of FDs and CCs in the presence of an NFS. Note that $\mathfrak{S}$ is indeed finite, since the rules apply to any given table schema $T$, any given sets $X, Y, Z, T_s \subseteq T$ of column headers, and any given $b \in \mathbb{N}$. In particular, the weakening rule applies to every given $b \in \mathbb{N}$.

Theorem 1. The set $\mathfrak{S}$ is a finite axiomatization for the implication of FDs and CCs in the presence of an NFS.

Proof. The soundness of $\mathfrak{S}$ follows from Lemma 1 and the soundness of the rules in $\mathfrak{M}$, established in previous work [3]. Let $\Sigma \cup \{\varphi\}$ denote a set of FDs and CCs, and $\mathit{nfs}(T_s)$ denote an NFS over table schema $T$. For the completeness of $\mathfrak{S}$ we need to show that $\Sigma \models_{T_s} \varphi$ implies $\Sigma \vdash_{\mathfrak{S}} \varphi$. We distinguish between two cases.

Firstly, let $\varphi$ denote the FD $X \to Y$. From $\Sigma \models_{T_s} X \to Y$ we conclude that $\Sigma[\mathrm{FD}] \models_{T_s} X \to Y$ holds by the first part of Lemma 2. The completeness of $\mathfrak{M}$ for the implication of FDs in the presence of an NFS shows that $\Sigma[\mathrm{FD}] \vdash_{\mathfrak{M}} X \to Y$ holds. Since the demotion rule is part of $\mathfrak{S}$ it follows that $\Sigma \vdash_{\mathfrak{S}} \sigma$ holds for every $\sigma \in \Sigma[\mathrm{FD}]$. From $\mathfrak{M} \subseteq \mathfrak{S}$ we therefore conclude that $\Sigma \vdash_{\mathfrak{S}} X \to Y$ holds indeed.

Secondly, let $\varphi$ denote the CC $\mathit{card}(X) \le b$. From the second part of Lemma 2 it follows that $\Sigma[\mathrm{FD}] \models_{T_s} X \to Y$, and that $\mathit{card}(Y) \le b' \in \Sigma$ for some $Y \subseteq XT_s$ and some $b' \le b$. The first case of this completeness proof shows that $\Sigma \vdash_{\mathfrak{S}} X \to Y$. An application of the null pullback rule yields $\Sigma \vdash_{\mathfrak{S}} \mathit{card}(X) \le b'$. Finally, applications of the weakening rule result in $\Sigma \vdash_{\mathfrak{S}} \mathit{card}(X) \le b$.

4.2 Algorithmic Characterization

In many situations it is not necessary to compute the set of all constraints implied by a given set. Instead, the question is whether a given fixed candidate constraint is implied by the given set of constraints.
We will now investigate an algorithmic characterization of the implication problem for the combined class of functional dependencies and cardinality constraints in the presence of an NFS. Lemma 2 reduces the implication problem for the combined class of FDs and CCs in the presence of an NFS to the implication problem for the class of FDs in the presence of an NFS. Indeed, $\Sigma \models_{T_s} X \to Y$ if and only if $Y \subseteq X^*_{\Sigma[\mathrm{FD}],T_s}$, and $\Sigma \models_{T_s} \mathit{card}(X) \le b$ if and only if $Y \subseteq X^*_{\Sigma[\mathrm{FD}],T_s}$ for some $\mathit{card}(Y) \le b' \in \Sigma$ such that $b' \le b$ and $Y \subseteq XT_s$. Therefore, the implication problem under consideration has been reduced to the computation of the closure $X^*_{\Sigma[\mathrm{FD}],T_s}$ of a given set $X$ of column headers with respect to a given FD set $\Sigma[\mathrm{FD}]$. This, however, has been done in previous work [3]. For reasons of completeness, we re-state the algorithm here.
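As a companion to the pseudocode, the reduction can be sketched in Python. This is a minimal sketch and not the authors' implementation: header sets are modelled as frozensets, FDs as pairs of frozensets, CCs as pairs of a frozenset and a bound, and all function and variable names are hypothetical. The closure loop follows the NFSClosure procedure, and the two tests follow Lemma 2. The data encodes Example 2.

```python
def nfs_closure(x, fds, ts):
    """Closure of X under an FD set in the presence of nfs(T_s)."""
    x = frozenset(x)
    usable = x | frozenset(ts)  # the set X T_s: headers known to be total
    closure = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # An FD fires only if its left-hand side lies in CLOSURE and in X T_s.
            if lhs <= (closure & usable) and not rhs <= closure:
                closure |= rhs
                changed = True
    return frozenset(closure)

def fd_part(fds, ccs, schema):
    """Sigma[FD]: the FDs of Sigma plus X -> T for every card(X) <= 1 in Sigma."""
    return list(fds) + [(y, frozenset(schema)) for y, b in ccs if b == 1]

def implies_fd(fds, ccs, ts, schema, lhs, rhs):
    """First part of Lemma 2: test Y against the closure of X under Sigma[FD]."""
    return frozenset(rhs) <= nfs_closure(lhs, fd_part(fds, ccs, schema), ts)

def implies_cc(fds, ccs, ts, schema, x, b):
    """Second part of Lemma 2: look for card(Y) <= b' in Sigma with b' <= b."""
    x = frozenset(x)
    closure = nfs_closure(x, fd_part(fds, ccs, schema), ts)
    reach = (x | frozenset(ts)) & closure
    return any(b2 <= b and y <= reach for y, b2 in ccs)

# Example 2: T = {Emp, Dept, Mgr}, T_s = {Emp, Mgr}.
T = frozenset({"Emp", "Dept", "Mgr"})
TS = frozenset({"Emp", "Mgr"})
FDS = [(frozenset({"Emp"}), frozenset({"Dept"})),
       (frozenset({"Dept"}), frozenset({"Mgr"}))]
CCS = [(frozenset({"Emp"}), 4),
       (frozenset({"Mgr"}), 2),
       (frozenset({"Emp", "Mgr"}), 1)]

print(sorted(nfs_closure({"Emp", "Dept"}, fd_part(FDS, CCS, T), TS)))
print(implies_cc(FDS, CCS, TS, T, {"Emp", "Dept"}, 1))
print(implies_cc(FDS, CCS, TS, T, {"Emp"}, 2))
print(implies_fd(FDS, CCS, TS, T, {"Emp"}, {"Mgr"}))
```

With this encoding the sketch reproduces the claims of Example 3: $\mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1$ and $\mathit{card}(\mathit{Dept}) \le 2$ are reported as implied, whereas $\mathit{card}(\mathit{Emp}) \le 2$ and $\mathit{Emp} \to \mathit{Mgr}$ are not. The latter fails because $\mathit{Dept}$ is neither in $X$ nor in $T_s$, so $\mathit{Dept} \to \mathit{Mgr}$ never fires from $\{\mathit{Emp}\}$.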
Algorithm 2 (NFSClosure($X$, $\Sigma$, $T_s$, $T$))
Input: set $X$ of column headers, FD set $\Sigma$, NFS $\mathit{nfs}(T_s)$ over table schema $T$
Output: closure $X^*_{\Sigma,T_s}$ of $X$ with respect to $\Sigma$ and $\mathit{nfs}(T_s)$
Method:
    (A0) CLOSURE := $X$;
    (A1) repeat
             OLDCLOSURE := CLOSURE;
             for all $U \to V \in \Sigma$ do
                 if $U \subseteq$ CLOSURE $\cap\, XT_s$ then
                     CLOSURE := CLOSURE $\cup\, V$;
                 endif;
             enddo;
         until OLDCLOSURE = CLOSURE;
    (A2) return CLOSURE;

Theorem 3. The implication problem $\Sigma \models_{T_s} \varphi$ over table schema $T$ can be decided in time $O(|T| \cdot \|\Sigma\|)$.

Example 5. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. We have shown in Example 4 that $\Sigma \models_{T_s} \mathit{card}(\mathit{Emp}, \mathit{Dept}) \le 1$. Alternatively, we could confirm this fact by using the second part of Lemma 2 and Algorithm 2. Indeed, it is true that $\mathit{card}(\mathit{Emp}, \mathit{Mgr}) \le 1 \in \Sigma$ and $\{\mathit{Emp}, \mathit{Mgr}\}$ is a subset of the union of $\{\mathit{Emp}, \mathit{Dept}\}$ and $\{\mathit{Emp}, \mathit{Mgr}\}$, as well as a subset of the closure of $\{\mathit{Emp}, \mathit{Dept}\}$ under $\Sigma[\mathrm{FD}]$ and $\mathit{nfs}(T_s)$. In fact, $\{\mathit{Emp}, \mathit{Dept}\}^*_{\Sigma[\mathrm{FD}],T_s} = \{\mathit{Emp}, \mathit{Dept}, \mathit{Mgr}\}$.

5 Armstrong Tables

In this section we explore the concept of Armstrong databases for the combined class of FDs, CCs and NOT NULL constraints over partial bags. $\mathcal{C}$-Armstrong databases are sample data that perfectly represent the set $\Sigma$ of constraints from the class $\mathcal{C}$ currently perceived meaningful. Indeed, they satisfy $\Sigma$ and violate every constraint in $\mathcal{C}$ not implied by $\Sigma$. As such, Armstrong databases are an effective means to consolidate and communicate the current perceptions of an application domain's semantics between various stakeholders of the database [11,21]. We will now extend recent results on Armstrong tables for the combined class of FDs and NOT NULL constraints over partial bags [11] by the class of cardinality constraints. Note that these results also extend early work on Armstrong relations for the class of FDs, pioneered by Demetrovics, Mannila, Räihä, Beeri, Dowd, Fagin and Statman [4,7,21].
5.1 Central Concepts

In a first step we will fix various notions required to establish results on the structural and computational properties of Armstrong tables. We begin with the concept most central to this section.
Definition 1. Let $T$ be a table schema, $\mathit{nfs}(T_s)$ an NFS, and $\Sigma$ a set of FDs and CCs over $T$. A table $t$ over $T$ is said to be Armstrong for $\Sigma$ and $\mathit{nfs}(T_s)$ if and only if for every FD and CC $\varphi$ over $T$: $t$ satisfies $\varphi$ if and only if $\Sigma \models_{T_s} \varphi$, and for every NFS $\mathit{nfs}(T_s')$ over $T$: $t$ satisfies $\mathit{nfs}(T_s')$ if and only if $T_s' \subseteq T_s$.

Example 6. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Then the following table

    Emp       Dept              Mgr
    Sisyphus  ni                Trump
    Sisyphus  ni                Gates
    Sisyphus  ni                Jobs
    Sisyphus  ni                Zuckerberg
    Gödel     Computer Science  Hilbert
    Church    Computer Science  Hilbert
    Newton    Physics           Gauss
    Leibniz   Mathematics       Gauss

is an Armstrong table for $\Sigma$ and $\mathit{nfs}(T_s)$.

For characterising the structure of Armstrong tables we need different notions of agreement between rows of a table. The different versions are motivated by the potential occurrence of null markers on the one hand, and the different classes of constraints we consider on the other hand. For functional dependencies it suffices to compare all pairs of distinct rows. Cardinality constraints, however, require us to compare any finite number of distinct rows, essentially up to the maximum bound that occurs in the given set of constraints.

Definition 2. Let $T$ be a table schema, $t$ a table over $T$, and $r_1, r_2$ two rows over $T$. The agree set of $r_1$ and $r_2$ is defined as $ag(r_1, r_2) = (X, Y)$ where $X = \{H \in T \mid r_1(H) = r_2(H) \wedge r_1(H) \neq \mathrm{ni}\}$, and $Y = \{H \in T \mid r_1(H) = r_2(H)\}$. The strong agree set of $r_1$ and $r_2$ is defined as $ag^s(r_1, r_2) = X$ where $ag(r_1, r_2) = (X, Y)$. The weak agree set of $r_1$ and $r_2$ is defined as $ag^w(r_1, r_2) = Y$ where $ag(r_1, r_2) = (X, Y)$. The agree set of $t$ is defined as $ag(t) = \{ag(r_1, r_2) \mid r_1, r_2 \in t \wedge r_1 \neq r_2\}$. The strong agree set of $t$ is defined as $ag^s(t) = \{X \mid (X, Y) \in ag(t)\}$. The weak agree set of $t$ is defined as $ag^w(t) = \{Y \mid (X, Y) \in ag(t)\}$. For $X \in ag^s(t)$ we define $w(X) = \bigcap \{Y \mid (X, Y) \in ag(t)\}$.
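For every positive integer $b > 1$ we define $ag^s_b(t) = \{\bigcap_{1 \le i < j \le b} ag^s(r_i, r_j) \mid r_1, \ldots, r_b \in t\,(\bigwedge_{1 \le i < j \le b} (r_i \neq r_j))\}$, $ag^s_1(t) = \{T\}$ and $ag^s_\infty(t) = \emptyset$.

Example 7. Let $t$ denote the table from Example 6 that is Armstrong for the set $\Sigma$ and the NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Let $r_1, r_2$ denote the first two rows of $t$, respectively. Then $ag(r_1, r_2) = (\{\mathit{Emp}\}, \{\mathit{Emp}, \mathit{Dept}\})$, $w(\mathit{Emp}) = \{\mathit{Emp}, \mathit{Dept}\}$ and $ag^s(t) = ag^s_2(t) = \{\emptyset, \{\mathit{Emp}\}, \{\mathit{Mgr}\}, \{\mathit{Dept}, \mathit{Mgr}\}\}$. Furthermore, $ag^s_3(t) = ag^s_4(t) = \{\emptyset, \{\mathit{Emp}\}\}$.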
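The agree sets of Definition 2 can be computed by brute force for small tables. The following Python sketch does so for the table of Example 6, under the same modelling assumptions as before (rows as dictionaries, a table as a list, `None` for ni); the function names are hypothetical, and distinct rows are read as distinct row occurrences in the bag.

```python
from itertools import combinations

NI = None  # stand-in for the null marker ni (an assumption of this sketch)
T = ("Emp", "Dept", "Mgr")

def agree(r1, r2):
    """ag(r1, r2) = (X, Y): strong agreement excludes ni, weak agreement allows it."""
    strong = frozenset(h for h in T if r1[h] == r2[h] and r1[h] is not NI)
    weak = frozenset(h for h in T if r1[h] == r2[h])
    return strong, weak

def strong_agree_sets(table, b=2):
    """ag^s_b(t): intersections of pairwise strong agree sets over b distinct rows."""
    if b == 1:
        return {frozenset(T)}
    result = set()
    for rows in combinations(table, b):
        common = frozenset(T)
        for r1, r2 in combinations(rows, 2):
            common &= agree(r1, r2)[0]
        result.add(common)
    return result

def w(table, x):
    """w(X): intersection of all weak agree sets paired with strong agree set X."""
    x = frozenset(x)
    weak_sets = [wk for r1, r2 in combinations(table, 2)
                 for st, wk in [agree(r1, r2)] if st == x]
    if not weak_sets:
        raise ValueError("w(X) is only defined for X in ag^s(t)")
    return frozenset.intersection(*weak_sets)

def row(emp, dept, mgr):
    return {"Emp": emp, "Dept": dept, "Mgr": mgr}

# The table of Example 6.
t = [
    row("Sisyphus", NI, "Trump"), row("Sisyphus", NI, "Gates"),
    row("Sisyphus", NI, "Jobs"), row("Sisyphus", NI, "Zuckerberg"),
    row("Gödel", "Computer Science", "Hilbert"),
    row("Church", "Computer Science", "Hilbert"),
    row("Newton", "Physics", "Gauss"),
    row("Leibniz", "Mathematics", "Gauss"),
]

for b in (2, 3, 4):
    print(b, sorted(sorted(s) for s in strong_agree_sets(t, b)))
print(sorted(w(t, {"Emp"})))
```

The enumeration over all $b$-element selections of rows is exponential in $b$ and is meant only to make the definitions tangible on the eight-row example, where it reproduces the sets listed in Example 7.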
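An Armstrong table must violate all the cardinality constraints not implied by the given set. It suffices, however, for any non-empty set $X$ to violate the cardinality constraint $\mathit{card}(X) \le b_X - 1$ where $b_X$ denotes the minimum positive integer for which $\mathit{card}(X) \le b_X$ is implied. Moreover, if there are two cardinality constraints $\mathit{card}(X) \le b_X$ and $\mathit{card}(Y) \le b_Y$ such that $b_X = b_Y$ and $Y \subseteq X$, then it suffices to violate $\mathit{card}(X) \le b_X - 1$. This motivates the following definitions.

Definition 3. Let $T$ be a table schema, $\mathit{nfs}(T_s)$ an NFS, and $\Sigma$ a set of FDs and CCs over $T$. For $X \subseteq T$ let

$$b_X = \begin{cases} \min\{b \in \mathbb{N} \mid \Sigma \models_{T_s} \mathit{card}(X) \le b\}, & \text{if } \{b \in \mathbb{N} \mid \Sigma \models_{T_s} \mathit{card}(X) \le b\} \neq \emptyset, \\ \infty, & \text{else.} \end{cases}$$

The set $\mathit{dup}_{\Sigma,T_s}(T)$ of duplicate sets is defined as $\mathit{dup}_{\Sigma,T_s}(T) = \{X \subseteq T \mid \forall H \in T - X\,(b_{XH} < b_X)\}$.

Note that by Lemma 2 we have $b_X = \min\{b \mid \mathit{card}(Y) \le b \in \Sigma \wedge Y \subseteq XT_s \cap X^*_{\Sigma[\mathrm{FD}],T_s}\}$, if $\{b \in \mathbb{N} \mid \Sigma \models_{T_s} \mathit{card}(X) \le b\} \neq \emptyset$.

Example 8. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Then we have $b_{\mathit{Emp}} = 4$, $b_{\mathit{Dept}} = b_{\mathit{Mgr}} = b_{\mathit{Dept},\mathit{Mgr}} = 2$, and $b_{\mathit{Emp},\mathit{Dept}} = b_{\mathit{Emp},\mathit{Mgr}} = b_{\mathit{Emp},\mathit{Dept},\mathit{Mgr}} = 1$. Therefore, $\mathit{dup}_{\Sigma,T_s}(T) = \{\{\mathit{Emp}\}, \{\mathit{Dept}, \mathit{Mgr}\}, \{\mathit{Emp}, \mathit{Dept}, \mathit{Mgr}\}\}$.

An Armstrong table must also violate all the functional dependencies not implied by the given set. However, it suffices for each column header $H$ to violate all FDs $X \to H$ where $X$ is maximal with the property that $X \to H$ is not implied. Furthermore, if $X$ is maximal for some $H$ in this sense, $X \in \mathit{dup}_{\Sigma,T_s}(T)$ and $X$ is not maximal for any $H \in T - T_s$, then we can violate $\mathit{card}(X) \le b_X - 1$ such that for all $H \in T_s - X$, $X \to H$ is also violated. These arguments motivate the following definitions.

Definition 4. Let $\Sigma$ be a set of FDs and let $\mathit{nfs}(T_s)$ be an NFS over table schema $T$. For a column header $H \in T$ we define the maximal sets $\mathit{max}_{\Sigma,T_s}(H)$ of $H$ with respect to $\Sigma$ and $\mathit{nfs}(T_s)$ as follows: $\mathit{max}_{\Sigma,T_s}(H) := \{X \subseteq T \mid \Sigma \not\models_{T_s} X \to H \wedge \forall H' \in T - X\,(\Sigma \models_{T_s} XH' \to H)\}$. The maximal sets of $T$ with respect to $\Sigma$ and $\mathit{nfs}(T_s)$ are defined as $\mathit{max}_{\Sigma,T_s}(T) = \bigcup_{H \in T} \mathit{max}_{\Sigma,T_s}(H)$.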
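If $\Sigma$ and $\mathit{nfs}(T_s)$ are clear from the context we may simply write $\mathit{max}(H)$ and $\mathit{max}(T)$, respectively. Finally, $\mathit{max}^{\mathit{red}}_{\Sigma,T_s}(T) := \mathit{max}_{\Sigma,T_s}(T) - \{X \in \mathit{dup}_{\Sigma,T_s}(T) \mid \forall H \in T - T_s\,(X \notin \mathit{max}_{\Sigma,T_s}(H))\}$.

Example 9. Consider the set $\Sigma$ with NFS $\mathit{nfs}(T_s)$ over table schema $T$ from Example 2. Recall from Example 8 that $\mathit{dup}_{\Sigma,T_s}(T) = \{\{\mathit{Emp}\}, \{\mathit{Dept}, \mathit{Mgr}\}, \{\mathit{Emp}, \mathit{Dept}, \mathit{Mgr}\}\}$. As maximal sets we compute $\mathit{max}_{\Sigma,T_s}(\mathit{Emp}) = \{\{\mathit{Dept}, \mathit{Mgr}\}\}$, $\mathit{max}_{\Sigma,T_s}(\mathit{Dept}) = \{\{\mathit{Mgr}\}\}$ and $\mathit{max}_{\Sigma,T_s}(\mathit{Mgr}) = \{\{\mathit{Emp}\}\}$. Therefore, $\mathit{max}^{\mathit{red}}_{\Sigma,T_s}(T) = \{\{\mathit{Mgr}\}\}$.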
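Examples 8 and 9 can be re-computed with a brute-force Python sketch over all subsets of $T$. It rests on several assumptions that are not stated in this form in the text: the bounds $b_X$ are obtained through the characterization via Lemma 2, implication of FDs is tested against $\Sigma[\mathrm{FD}]$ (Definition 4 is phrased for FD sets), the duplicate sets are restricted to non-empty $X$ so that the result matches Example 8, and all names are hypothetical.

```python
from itertools import combinations
from math import inf

# Example 2, with Sigma[FD] obtained by adding X -> T for each card(X) <= 1.
T = frozenset({"Emp", "Dept", "Mgr"})
TS = frozenset({"Emp", "Mgr"})
FDS = [(frozenset({"Emp"}), frozenset({"Dept"})),
       (frozenset({"Dept"}), frozenset({"Mgr"}))]
CCS = [(frozenset({"Emp"}), 4), (frozenset({"Mgr"}), 2),
       (frozenset({"Emp", "Mgr"}), 1)]
SIGMA_FD = FDS + [(y, T) for y, b in CCS if b == 1]

def closure(x):
    """Closure of X under Sigma[FD] in the presence of nfs(T_s)."""
    x = frozenset(x)
    usable = x | TS
    result = set(x)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in SIGMA_FD:
            if lhs <= (result & usable) and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def subsets(s, nonempty=False):
    items = sorted(s)
    start = 1 if nonempty else 0
    return [frozenset(c) for k in range(start, len(items) + 1)
            for c in combinations(items, k)]

def bound(x):
    """b_X via Lemma 2; infinity if no CC on X is implied."""
    reach = (x | TS) & closure(x)
    return min((b for y, b in CCS if y <= reach), default=inf)

def duplicate_sets():
    """dup(T), restricted to non-empty X (an assumption of this sketch)."""
    return {x for x in subsets(T, nonempty=True)
            if all(bound(x | {h}) < bound(x) for h in T - x)}

def max_sets(h):
    """max(H): sets X with X -> H not implied, but XH' -> H implied for all H'."""
    return {x for x in subsets(T)
            if h not in closure(x)
            and all(h in closure(x | {h2}) for h2 in T - x)}

def max_red():
    all_max = set().union(*(max_sets(h) for h in T))
    drop = {x for x in duplicate_sets()
            if all(x not in max_sets(h) for h in T - TS)}
    return all_max - drop

print({tuple(sorted(x)): bound(x) for x in subsets(T, nonempty=True)})
print(sorted(sorted(x) for x in duplicate_sets()))
print(sorted(sorted(x) for x in max_red()))
```

Enumerating all subsets of $T$ is exponential in $|T|$, which is acceptable for a three-column schema but is not meant as an efficient procedure.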
5.2 Characterization

We are now in a position to establish sufficient and necessary conditions when a given table is Armstrong for a given set $\Sigma$ of FDs and CCs and an NFS $\mathrm{nfs}(T_s)$. This generalises recent work from the special case where $\Sigma$ consists of FDs only [11]. In turn, that result had generalised a well-known result by Mannila, Räihä, Beeri, Dowd, Fagin and Statman for FDs over total relations [4,21]. In the following theorem, the first (third) condition ensures that all FDs (CCs) not implied by the given set are violated; and the second (fourth) condition ensures that all implied FDs (CCs) are satisfied. The final condition handles the NFS.

Theorem 4. Let $T$ be a table schema, $\Sigma$ a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over $T$. Then for all tables $t$ over $T$, $t$ is an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ if and only if all of the following conditions hold:

1. $\forall H \in T\ \forall X \in \max_{\Sigma,T_s}(H)\,(X \in \mathrm{ag}^s(t) \wedge H \notin w(X))$,
2. $\forall X \in \mathrm{ag}^s(t)\,(X^{+}_{\Sigma,T_s} \subseteq w(X))$,
3. $\forall X \in \mathrm{dup}_{\Sigma,T_s}(T)\ \exists Z \in \mathrm{ag}^s_{b_X}(t)\,(X \subseteq Z)$,
4. $\forall\, \mathrm{card}(X) \le b \in \Sigma\ \forall Z \in \mathrm{ag}^s_{b+1}(t)\,(X \not\subseteq Z)$,
5. $\mathrm{total}(t) = T_s$.

Example 10. The previous examples show that the table $t$ in Example 6 is Armstrong for the set $\Sigma$ and the NFS $\mathrm{nfs}(T_s)$ over table schema $T$ from Example 2. Indeed, the conditions of Theorem 4 are all satisfied by $t$.

5.3 Computation

For the computation of an Armstrong table, Theorem 4 suggests that the maximal and duplicate sets need to be computed. The computation of the maximal sets with respect to a set of standard FDs and an NFS $\mathrm{nfs}(T_s)$ over table schema $T$ has been studied in [11]. For the computation of duplicate sets we now outline an algorithm that is exponential in the size of $\Sigma$. While we leave optimizations of this algorithm for future work, we note that there are sets of FDs and there are sets of CCs, respectively, where every Armstrong table for this set is exponential in the size of $\Sigma$, cf. Theorem 7.
Let $\Sigma$ denote a set of standard FDs and standard CCs, $\mathrm{nfs}(T_s)$ an NFS, and $X$ a set of column headers over table schema $T$. We first compute $b_X$. We start with $b_X := \infty$ and compute $X^{+}_{\Sigma[FD],T_s}$ using Algorithm 2. Then, for each $\mathrm{card}(Y) \le b \in \Sigma$ such that $Y \subseteq XT_s \cap X^{+}_{\Sigma[FD],T_s}$ and $b < b_X$ we redefine $b_X := b$.

Next we compute the duplicate sets $\mathrm{dup}_{\Sigma,T_s}(T)$. We start with $\mathrm{dup}_{\Sigma,T_s}(T) = \{X \mid \emptyset \neq X \subseteq T\}$. Then for each $X \in \mathrm{dup}_{\Sigma,T_s}(T)$ and each $H \in T - X$ such that $b_{XH} = b_X$ we redefine $\mathrm{dup}_{\Sigma,T_s}(T) := \mathrm{dup}_{\Sigma,T_s}(T) - \{X\}$. Therefore, the time to compute $\mathrm{dup}_{\Sigma,T_s}(T)$ and $b_X$ for each $X \in \mathrm{dup}_{\Sigma,T_s}(T)$ is in $O(2^{|T|} \cdot |T| \cdot |\Sigma|)$.

Algorithm 5 shows the computation of an Armstrong table. The first two steps consist of the computations of the duplicate sets $X$ and their associated $b_X$, and the computation of the maximal sets covered in previous work. In step (A4), the algorithm generates for each duplicate set $X$ a block of $b_X$ rows that satisfies
$\mathrm{card}(X) \le b_X$, violates $\mathrm{card}(X) \le b_X - 1$, and violates every FD $X \to H$ where $H \in T_s - X$. Note that in this case there cannot be any $H \in T_s - X$ such that $X \to H$ is implied by $\Sigma$ and $\mathrm{nfs}(T_s)$. Otherwise, since $X$ is a duplicate set it would hold that $\mathrm{card}(XH) \le b < b_X$ is an implied cardinality constraint. Due to the soundness of the null pullback rule, $\mathrm{card}(X) \le b < b_X$ would be implied, too. This is a contradiction. In step (A5), the algorithm computes for each maximal set $X$ the set $Z$ of column headers in $T - T_s$ for which $X$ is maximal, and produces two rows which strongly agree on $X$ and disagree on each column header in $Z$; unless $X$ is also a duplicate set and each column header for which $X$ is maximal is in $T_s$. Finally, step (A6) introduces null marker occurrences in every column that is not in $T_s$, unless such a column already features a null marker occurrence.

Algorithm 5 (Armstrong table computation)

Input: a set $\Sigma$ of standard FDs and standard CCs, and an NFS $\mathrm{nfs}(T_s)$ over table schema $T$ such that $\forall H \in T\ \exists b \in \mathbb{N}\,(\Sigma \models_{T_s} \mathrm{card}(H) \le b)$

Output: an Armstrong table $t$ for $\Sigma$ and $\mathrm{nfs}(T_s)$

Method: let $c_{H,1}, c_{H,2}, \ldots \in \mathrm{dom}(H)$ be distinct

(A0) compute $\mathrm{dup}_{\Sigma,T_s}(T)$ by the procedure outlined above;

(A1) compute $\max_{\Sigma[FD],T_s}(H)$ for all $H \in T$ by Algorithm 8 in [11];

(A2) $t := \emptyset$;

(A3) $i := 1$;

(A4) for all $X \in \mathrm{dup}_{\Sigma,T_s}(T)$ where $b_X > 1$ do
   $t := t \cup \{r_i, \ldots, r_{i+b_X-1}\}$ where for all $j = i, \ldots, i + b_X - 1$ and all $H \in T$
   $$r_j(H) := \begin{cases} c_{H,i}, & \text{if } H \in X \\ c_{H,j}, & \text{if } H \in T_s - X \\ \mathtt{ni}, & \text{else;} \end{cases}$$
   $i := i + b_X$;

(A5) for all $X \in \max^{red}_{\Sigma,T_s}(T)$ do
   $Z := \{H \in T - T_s \mid X \in \max_{\Sigma,T_s}(H)\}$;
   $t := t \cup \{r_i, r_{i+1}\}$ where for all $H \in T$
   $$r_i(H) := \begin{cases} c_{H,i}, & \text{if } H \in XZT_s \\ \mathtt{ni}, & \text{else} \end{cases} \quad\text{and}\quad r_{i+1}(H) := \begin{cases} c_{H,i}, & \text{if } H \in X \\ c_{H,i+1}, & \text{if } H \in Z(T_s - X) \\ \mathtt{ni}, & \text{else;} \end{cases}$$
   $i := i + 2$;

(A6) $\mathrm{total}(t) := \{H \in T \mid \forall r \in t\,(r(H) \neq \mathtt{ni})\}$;
   if $\mathrm{total}(t) - T_s \neq \emptyset$, then return $t := t \cup \{r_i\}$ where for all $H \in T$
   $$r_i(H) := \begin{cases} \mathtt{ni}, & \text{if } H \in \mathrm{total}(t) - T_s \\ c_{H,i}, & \text{else;} \end{cases}$$
   else return $t$; endif;

Algorithm 5 works correctly.
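Steps (A2) to (A6) of Algorithm 5 can be traced with a short Python sketch. It is a hedged illustration rather than the authors' implementation: steps (A0) and (A1) are not computed but replaced by the values reported in Examples 8 and 9, $T_s = \{Emp, Mgr\}$ is an assumption consistent with those examples, the function name `armstrong_table` and the string encoding of the constants $c_{H,k}$ are hypothetical, and the duplicate sets are processed in the listed order.

```python
NI = "ni"  # stand-in for the null marker


def armstrong_table(T, T_s, dup, b, max_sets, max_red):
    """Steps (A2)-(A6); dup, b, max_sets and max_red are taken as given."""
    def c(H, k):
        return f"c_{H},{k}"  # hypothetical encoding of the constant c_{H,k}

    t, i = [], 1                                   # (A2), (A3)
    for X in dup:                                  # (A4)
        if b[X] <= 1:
            continue
        for j in range(i, i + b[X]):
            t.append({H: c(H, i) if H in X
                      else c(H, j) if H in T_s
                      else NI for H in T})
        i += b[X]
    for X in max_red:                              # (A5)
        Z = {H for H in T if H not in T_s and X in max_sets[H]}
        t.append({H: c(H, i) if (H in X or H in Z or H in T_s)
                  else NI for H in T})
        t.append({H: c(H, i) if H in X
                  else c(H, i + 1) if (H in Z or H in T_s)
                  else NI for H in T})
        i += 2
    total = {H for H in T if all(r[H] != NI for r in t)}   # (A6)
    extra = total - set(T_s)
    if extra:
        t.append({H: NI if H in extra else c(H, i) for H in T})
    return t


T = ("Emp", "Dept", "Mgr")
T_s = {"Emp", "Mgr"}  # assumption, see the lead-in
E, D, M = "Emp", "Dept", "Mgr"
dup = [frozenset({E}), frozenset({D, M}), frozenset({E, D, M})]
b = {dup[0]: 4, dup[1]: 2, dup[2]: 1}
max_sets = {E: {frozenset({D, M})}, D: {frozenset({M})}, M: {frozenset({E})}}
max_red = [frozenset({M})]

rows = [tuple(r[H] for H in T)
        for r in armstrong_table(T, T_s, dup, b, max_sets, max_red)]
for row in rows:
    print(row)
```

With these inputs the sketch produces eight rows: a block of four rows agreeing on Emp, a block of two rows agreeing on Dept and Mgr, and a pair of rows agreeing only on Mgr. No extra row is added in step (A6), because Dept already carries null markers.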
Theorem 6. For every input $(T, \Sigma, \mathrm{nfs}(T_s))$, where $\Sigma$ is a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ is an NFS over table schema $T$ such that for all $H \in T$ there is some $b \in \mathbb{N}$ such that $\Sigma \models_{T_s} \mathrm{card}(H) \le b$, Algorithm 5 computes an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$.

Corollary 2. Let $\Sigma$ be a set of standard FDs and CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Then there is a table over $T$ that is Armstrong for $\Sigma$ and $\mathrm{nfs}(T_s)$ if and only if for all $H \in T$ there is some $b \in \mathbb{N}$ such that $\Sigma \models_{T_s} \mathrm{card}(H) \le b$.

Proof. We show first that the condition is necessary for the existence of some Armstrong table. Assume, to the contrary, that there is some $H \in T$ such that for all $b \in \mathbb{N}$ we have $\Sigma \not\models_{T_s} \mathrm{card}(H) \le b$. Then there is some $H \in T$ such that $b_H = \infty$. Note that $\mathrm{dup}_{\Sigma,T_s}(T) \neq \emptyset$ since $T \in \mathrm{dup}_{\Sigma,T_s}(T)$, and $\mathrm{ag}^s_{\infty}(t) = \emptyset$. That is, the third condition of Theorem 4 is always violated. Hence, no Armstrong table over $T$ exists for $\Sigma$ and $\mathrm{nfs}(T_s)$. The condition is also sufficient. Indeed, under the hypothesis that the condition holds, Algorithm 5 produces an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$, as verified by Theorem 6.

Example 11. Consider the set $\Sigma$ with NFS $\mathrm{nfs}(T_s)$ over table schema $T$ from Example 2 as input to Algorithm 5. Then the algorithm generates the following Armstrong table

$$\begin{array}{lll} \text{Emp} & \text{Dept} & \text{Mgr} \\ \hline c_{Emp,1} & \mathtt{ni} & c_{Mgr,1} \\ c_{Emp,1} & \mathtt{ni} & c_{Mgr,2} \\ c_{Emp,1} & \mathtt{ni} & c_{Mgr,3} \\ c_{Emp,1} & \mathtt{ni} & c_{Mgr,4} \\ c_{Emp,5} & c_{Dept,5} & c_{Mgr,5} \\ c_{Emp,6} & c_{Dept,5} & c_{Mgr,5} \\ c_{Emp,7} & c_{Dept,7} & c_{Mgr,7} \\ c_{Emp,8} & c_{Dept,8} & c_{Mgr,7} \end{array}$$

for $\Sigma$ and $\mathrm{nfs}(T_s)$. Note that after suitable substitutions, this is the Armstrong table given in an earlier example.

5.4 Complexity Considerations

Corollary 3. Let $\Sigma$ be a set of standard FDs and CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. It can be decided in time $O(|T|^2 \cdot |\Sigma|)$ whether there is an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$.

Proof. For each $H \in T$ we need to check that there is some $\mathrm{card}(X) \le b \in \Sigma$ such that $X \subseteq HT_s \cap H^{+}_{\Sigma[FD],T_s}$, by Lemma 2.
This condition can be verified in time $O(|T| \cdot |\Sigma|)$.

We now analyse how close Algorithm 5 comes to the best that can be achieved in general. We say that an Armstrong table $t$ for $\Sigma$ and $\mathrm{nfs}(T_s)$
is minimum-sized if there is no Armstrong table $t'$ for $\Sigma$ and $\mathrm{nfs}(T_s)$ such that $|t'| < |t|$.

First of all, the problem of computing an Armstrong table for a given set $\Sigma$ of standard FDs and standard CCs and an NFS over some table schema is precisely exponential in the size of $\Sigma$. If $\Sigma$ consists of a set of standard FDs only, then this result is known, cf. [11, Proposition 2]. We recall what we mean by precisely exponential [4]. Firstly, it means that there is an algorithm for computing an Armstrong table, given a set $\Sigma$ of standard FDs and standard CCs and an NFS $\mathrm{nfs}(T_s)$, where the running time of the algorithm is exponential in the size of $\Sigma$. Secondly, it means that there is a set $\Sigma$ of standard FDs and CCs and an NFS $\mathrm{nfs}(T_s)$ in which the number of rows in each minimum-sized Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ is exponential in the size of $\Sigma$; thus, an exponential amount of time is required in this case simply to write down the table.

Theorem 7. The problem of computing an Armstrong table for a given set $\Sigma$ of standard CCs and an NFS $\mathrm{nfs}(T_s)$ over table schema $T$ is precisely exponential in the size of $\Sigma$.

For the remainder of this paper we show that Algorithm 5 is quite conservative in its use of time and space, even though the problem of computing Armstrong tables is computationally hard. We will show that Algorithm 5 always computes an Armstrong table whose number of rows is at most quadratic in the number of rows of a minimum-sized Armstrong table and the cardinality of the given constraint set.

Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. We say that a set $s_X$ of rows over $T$ is $X$-agreeing if all rows in $s_X$ strongly agree on $X$ and $|s_X| = b_X$. It follows from Theorem 4 that for every Armstrong table $t$ over $T$ for $\Sigma$ and $\mathrm{nfs}(T_s)$ and every duplicate set $X \in \mathrm{dup}_{\Sigma,T_s}(T)$ there is an $X$-agreeing set $s_X \subseteq t$.

Lemma 3.
Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Let $t$ be a table over $T$ satisfying $\Sigma$, and $X, Y \in \mathrm{dup}_{\Sigma,T_s}(T)$ with $X \neq Y$ and $b_X = b_Y = b_{X \cap Y}$. Let further $s_X, s_Y$ be $X$- and $Y$-agreeing subsets of $t$, respectively. Then $s_X \cap s_Y = \emptyset$.

Proof. Assume $r \in s_X \cap s_Y$. Then $s_X, s_Y$ both strongly agree with $r$ on $X \cap Y$, so $s_X \cup s_Y$ strongly agrees on $X \cap Y$. Since $b_X = b_Y = b_{X \cap Y}$ and $t$ satisfies $\Sigma$ we must have $|s_X \cup s_Y| \le b_X = b_Y$, and hence $s_X = s_Y$. This in turn implies that $s_X = s_Y$ strongly agrees on $X \cup Y$, and using again that $t$ satisfies $\Sigma$ we get $b_{X \cup Y} \ge b_X = b_Y$. From $X \neq Y$ it follows that $X \cup Y$ is a proper superset of $X$ and/or $Y$. Together with $b_{X \cup Y} \ge b_X = b_Y$ this contradicts $X, Y \in \mathrm{dup}_{\Sigma,T_s}(T)$.

Let $\mathrm{card}(Y) \le b \in \Sigma$ and $\Sigma[FD] \models_{T_s} X \to Y$ with $Y \subseteq XT_s$. In particular, $\mathrm{card}(X) \le b$ can be derived using the null-pullback rule. We say that $\mathrm{card}(Y) \le b$ is a source of $\mathrm{card}(X) \le b$. For a set $X \subseteq T$ we call $\mathrm{card}(Y) \le b$ a source of $X$ if $\mathrm{card}(Y) \le b$ is a source of $\mathrm{card}(X) \le b_X$.
Corollary 4. Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Every cardinality constraint over $T$ of the form $\mathrm{card}(X) \le b_X$ has a source in $\Sigma$.

Proof. By Lemma 2 there is some $\mathrm{card}(Y) \le b \in \Sigma$ with $b \le b_X$ and $Y \subseteq XT_s \cap X^{+}_{\Sigma[FD],T_s}$. The latter condition is equivalent to $\Sigma[FD] \models_{T_s} X \to Y$ and $Y \subseteq XT_s$. From this and $\mathrm{card}(Y) \le b \in \Sigma$ we can derive $\mathrm{card}(X) \le b$ using the null-pullback rule, and by definition of $b_X$ we have $b \ge b_X$. This shows $b = b_X$, so $\mathrm{card}(Y) \le b_X$ is a source of $X$ in $\Sigma$.

We denote by $\mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)$ the set of all duplicate sets for which $\mathrm{card}(Y) \le b$ is a source:

$$\mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b) := \{X \in \mathrm{dup}_{\Sigma,T_s}(T) \mid \mathrm{card}(Y) \le b \text{ is a source of } X\}.$$

Lemma 4. Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Let $t$ be an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$, and $\mathrm{card}(Y) \le b \in \Sigma$. Then $|t| \ge |\mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)| \cdot b$.

Proof. (1) For each $X \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)$ we have $Y \subseteq XT_s$ and $\Sigma[FD] \models_{T_s} X \to Y$. This implies $XY \subseteq XT_s$ and $\Sigma[FD] \models_{T_s} X \to XY$. By definition of $b_{XY}$ we have $\Sigma \models_{T_s} \mathrm{card}(XY) \le b_{XY}$. Using null-pullback we can derive $\mathrm{card}(X) \le b_{XY}$, so $b_X \le b_{XY}$. Since $X$ is a duplicate set, we must have $Y \subseteq X$.

(2) Table $t$ contains an $X$-agreeing set $s_X$ for every $X \in \mathrm{dup}_{\Sigma,T_s}(T)$ by Theorem 4, so in particular for every $X \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)$. For every pair of distinct duplicate sets $X_1, X_2 \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)$ we have $Y \subseteq X_1 \cap X_2$ by (1), and hence $b_{X_1} = b_{X_2} = b_Y = b_{X_1 \cap X_2}$. Thus, $s_{X_1}$ and $s_{X_2}$ are disjoint by Lemma 3. This gives us $|\mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)|$ disjoint sets $s_X \subseteq t$, each of which contains $b$ tuples, and shows the bound on $|t|$.

Corollary 5. Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Let $t$ be an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ over $T$ and $D := \mathrm{dup}_{\Sigma,T_s}(T)$. Then

$$|t| \ge \frac{\sum_{X \in D} b_X}{|\Sigma|}.$$

Proof.
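By Corollary 4 every $X \in D$ has a source in $\Sigma$, so

$$\sum_{X \in D} b_X \le \sum_{\mathrm{card}(Y) \le b \in \Sigma}\ \sum_{X \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)} b_X.$$

By Lemma 4, $|t| \ge \sum_{X \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)} b_X$ for any $\mathrm{card}(Y) \le b \in \Sigma$, and thus

$$|t| \cdot |\Sigma| \ge \sum_{\mathrm{card}(Y) \le b \in \Sigma}\ \sum_{X \in \mathrm{dup}_{\Sigma,T_s}(\mathrm{card}(Y) \le b)} b_X \ge \sum_{X \in D} b_X.$$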
Theorem 8. Let $\Sigma$ denote a set of standard FDs and standard CCs, and $\mathrm{nfs}(T_s)$ an NFS over table schema $T$. Let $t$ be an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ and $t_c$ the Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ constructed in Algorithm 5. Then $|t_c| \le |t| \cdot (|t| + |\Sigma|)$.

Proof. Denote by $t_{A4}$, $t_{A5}$ and $t_{A6}$ the subsets of $t_c$ constructed in steps (A4), (A5) and (A6) of Algorithm 5, respectively. Step (A4) adds $b_X$ rows for each duplicate set $X$ with $b_X > 1$, so by Corollary 5 we have

$$|t_{A4}| \le \sum_{X \in \mathrm{dup}_{\Sigma,T_s}(T)} b_X \le |t| \cdot |\Sigma|.$$

Steps (A5) and (A6) together construct a sub-table of that computed by Algorithm 10 in [11], thus giving us the bound (Corollary 5 in [11]) $|t_{A5} \cup t_{A6}| \le |t|^2$. Combining these results yields the theorem.

Corollary 6. Algorithm 5 computes an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ whose number of rows is at most quadratic in the number of rows of a minimum-sized Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ and the cardinality of $\Sigma$.

Finally, we show that, in general, there is no most concise way of representing the information inherent in a set of standard CCs and a null-free subschema. In fact, there are cases in which the size of a minimum-sized Armstrong table is exponential in the size of the constraint set, and there are other cases in which the size of an optimal cover of a constraint set is exponential in the size of a minimum-sized Armstrong table.

Theorem 9. Let $\mathcal{C}$ denote the class of FDs and CCs. There is some table schema $T$, some set $\Sigma$ of CCs and some NFS $\mathrm{nfs}(T_s)$ over $T$ such that $\Sigma$ has size $O(n)$, and the size of a minimum-sized $\mathcal{C}$-Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ is $O(2^n)$. There is some table schema $T$, some set $\Sigma$ of CCs and some NFS $\mathrm{nfs}(T_s)$ over $T$ such that there is a $\mathcal{C}$-Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ where the number of rows is in $O(n)$, and the optimal cover of $\Sigma$ with respect to $\mathrm{nfs}(T_s)$ has size $O(2^n)$.

Proof. Let $T = H_1 \cdots H_{2n}$, $T_s = T$ and let $\Sigma$ consist of the following standard CCs: for all $i = 1, \ldots, n$, $\mathrm{card}(H_{2i-1}H_{2i}) \le 1$, and for all $i = 1, \ldots, 2n$, $\mathrm{card}(H_i) \le 2$.
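For small $n$ the duplicate sets of this constraint set can be enumerated by brute force. The Python sketch below is an illustration under a stated assumption: since $\Sigma$ contains no FDs and $T_s = T$, it takes $b_X$ to be the smallest bound $b$ of a constraint $\mathrm{card}(Y) \le b \in \Sigma$ with $Y \subseteq X$, which is how the formula from Lemma 2 is read here for this special case. All function names are hypothetical.

```python
from itertools import combinations


def duplicate_sets_cc_only(columns, ccs):
    """Brute-force duplicate sets for CCs only, with T_s = T and no FDs."""
    def b(X):
        # assumption: smallest bound among constraints card(Y) <= b with Y ⊆ X
        return min((bound for Y, bound in ccs if Y <= X), default=float("inf"))

    subsets = [frozenset(combo) for k in range(1, len(columns) + 1)
               for combo in combinations(columns, k)]
    return {X for X in subsets
            if all(b(X | {H}) < b(X) for H in columns if H not in X)}


def theorem9_constraints(n):
    """card(H_{2i-1} H_{2i}) <= 1 for i = 1..n and card(H_i) <= 2 for i = 1..2n."""
    cols = [f"H{i}" for i in range(1, 2 * n + 1)]
    ccs = [(frozenset({f"H{2 * i - 1}", f"H{2 * i}"}), 1) for i in range(1, n + 1)]
    ccs += [(frozenset({H}), 2) for H in cols]
    return cols, ccs


def count_duplicate_sets(n):
    cols, ccs = theorem9_constraints(n)
    dup = duplicate_sets_cc_only(cols, ccs)
    # sets of size n in dup pick exactly one column from each pair
    return len(ccs), sum(1 for X in dup if len(X) == n), len(dup)


for n in (1, 2, 3):
    print(n, count_duplicate_sets(n))
```

Under this reading, the number of constraints grows as $3n$ while the number of duplicate sets of size $n$ doubles with each increment of $n$; the only other duplicate set found is $T$ itself.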
Then $\mathrm{dup}_{\Sigma,T_s}(T)$ contains the $2^n$ sets $X \subseteq T$ where for each $i = 1, \ldots, n$ either $H_{2i-1} \in X$ or $H_{2i} \in X$. According to Theorem 4 every Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ contains a number of rows that is exponential in $|\Sigma|$. A similar construction was used in [4] to show that the size of a minimum-sized Armstrong relation can be exponential in the size of a given FD set.

Let $T = H_1 H'_1 \cdots H_n H'_n$, $T_s = T$, and let $\Sigma$ consist of the following standard CCs: for all $i = 1, \ldots, n$, $\mathrm{card}(H_i) \le 3$ and $\mathrm{card}(H'_i) \le 3$, and for all $X = X_1 \cdots X_n$ where $X_i \in \{H_i, H'_i\}$, $\mathrm{card}(X) \le 2$. Then $\Sigma$ is its own optimal cover,
i.e. there is no equivalent set $\Sigma'$ of standard FDs and standard CCs such that $|\Sigma'| < |\Sigma|$. The size $|\Sigma|$ is in $O(2^n)$. Furthermore, $\mathrm{dup}_{\Sigma,T_s}(T)$ consists of the $n$ sets $T - H_iH'_i$ for $i = 1, \ldots, n$, and the set $T$, and $\max_{\Sigma,T_s}(T)$ consists of the $2n$ sets $T - H_i$ and $T - H'_i$ for $i = 1, \ldots, n$. Thus, Algorithm 5 computes an Armstrong table for $\Sigma$ and $\mathrm{nfs}(T_s)$ whose number of rows is in $O(n)$.

For these reasons we recommend the use of both representations. Indeed, the representation in form of constraint sets enables design teams to identify constraints they currently incorrectly perceive as semantically meaningful; and the representation in form of an Armstrong table enables design teams to identify constraints they currently incorrectly perceive as semantically meaningless.

6 Conclusion and Future Work

We have investigated the combined class of functional dependencies, cardinality constraints and NOT NULL constraints over partial bags. This framework applies to the structure of SQL tables. We have characterized the associated implication problem of this class axiomatically and algorithmically. Our results show how reasoning about this expressive class of constraints can be done effectively and efficiently. Moreover, we have established several structural and computational properties of Armstrong tables for this class of constraints. Our results show how Armstrong tables can be used effectively to consolidate the semantics of an application domain expressed by the class of constraints studied.

In future work we would like to implement our algorithms within a design tool. Such a tool may also be used to conduct empirical studies on the usefulness of Armstrong tables for the acquisition of semantically meaningful constraints in our class studied, very much along the lines of [16]. It seems desirable to extend our results to even more expressive classes of constraints, e.g.
classes of multivalued and inclusion dependencies. Another challenging problem would be an extension to classes of cardinality constraints that also enforce lower bounds. Properties of Armstrong databases should also be studied in probabilistic and graph databases, and the concept of informative Armstrong databases should be investigated in non-relational models [6]. It would also be interesting to analyse interactions of cardinality constraints and functional dependencies under different interpretations of the null marker [12,18,23].

Acknowledgement. This research is supported by the Marsden fund council from Government funding, administered by the Royal Society of New Zealand.

References

1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
2. Armstrong, W.W.: Dependency structures of database relationships. Information Processing 74 (1974)
3. Atzeni, P., Morfuni, N.: Functional dependencies and constraints on null values in database relations. Information and Control 70(1), 1-31 (1986)
4. Beeri, C., Dowd, M., Fagin, R., Statman, R.: On the structure of Armstrong relations for functional dependencies. J. ACM 31(1) (1984)
5. Codd, E.F.: A relational model of data for large shared data banks. Commun. ACM 13(6) (1970)
6. De Marchi, F., Petit, J.-M.: Semantic sampling of existing databases through informative Armstrong databases. Inf. Syst. 32(3) (2007)
7. Demetrovics, J.: On the equivalence of candidate keys with Sperner systems. Acta Cybern. 4 (1980)
8. Diederich, J., Milton, J.: New methods and fast algorithms for database normalization. ACM Trans. Database Syst. 13(3) (1988)
9. Fagin, R.: Armstrong databases. Technical Report RJ3440(40926), IBM Research Laboratory, San Jose, California, USA (1982)
10. Hartmann, S.: On the implication problem for cardinality constraints and functional dependencies. Ann. Math. Art. Intell. 33 (2001)
11. Hartmann, S., Kirchberg, M., Link, S.: Design by example for SQL table definitions with functional dependencies. The VLDB Journal (2011), doi: / s
12. Hartmann, S., Leck, U., Link, S.: On Codd families of keys over incomplete relations. The Computer Journal 54(7) (2011)
13. Hartmann, S., Link, S.: Numerical constraints on XML data. Inf. Comput. 208(5) (2010)
14. Hartmann, S., Link, S.: When data dependencies over SQL tables meet the Logics of Paradox and S-3. In: PODS Conference (2010)
15. Imielinski, T., Lipski Jr., W.: Incomplete information in relational databases. J. ACM 31(4) (1984)
16. Langeveldt, W.-D., Link, S.: Empirical evidence for the usefulness of Armstrong relations in the acquisition of meaningful functional dependencies. Inf. Syst. 35(3) (2010)
17. Lenzerini, M., Nobili, P.: On the satisfiability of dependency constraints in entity-relationship schemata. Inf. Syst. 15(4) (1990)
18. Levene, M., Loizou, G.: Axiomatisation of functional dependencies in incomplete relations. Theor. Comput. Sci. 206(1-2) (1998)
19. Liddle, S., Embley, D., Woodfield, S.: Cardinality constraints in semantic data models. Data Knowl. Eng. 11 (1993)
20. Lien, E.: On the equivalence of database models. J. ACM 29(2) (1982)
21. Mannila, H., Räihä, K.-J.: Design by example: An application of Armstrong relations. J. Comput. Syst. Sci. 33(2) (1986)
22. Sali, A., Schewe, K.-D.: Keys and Armstrong databases in trees with restructuring. Acta Cybern. 18(3) (2008)
23. Thalheim, B.: On semantic issues connected with keys in relational databases permitting null values. Elektronische Informationsverarbeitung und Kybernetik 25(1-2) (1989)
24. Thalheim, B.: Dependencies in relational databases. Teubner (1991)
25. Thalheim, B.: Fundamentals of Cardinality Constraints. In: Pernul, G., Tjoa, A.M. (eds.) ER. LNCS, vol. 645. Springer, Heidelberg (1992)
26. Thalheim, B.: Entity-Relationship modeling. Springer, Heidelberg (2000)
Design by Example for SQL Tables with Functional Dependencies
VLDB Journal manuscript No. (will be inserted by the editor) Design by Example for SQL Tables with Functional Dependencies Sven Hartmann Markus Kirchberg Sebastian Link Received: date / Accepted: date
More informationOn the logical Implication of Multivalued Dependencies with Null Values
On the logical Implication of Multivalued Dependencies with Null Values Sebastian Link Department of Information Systems, Information Science Research Centre Massey University, Palmerston North, New Zealand
More informationGuaranteeing No Interaction Between Functional Dependencies and Tree-Like Inclusion Dependencies
Guaranteeing No Interaction Between Functional Dependencies and Tree-Like Inclusion Dependencies Mark Levene Department of Computer Science University College London Gower Street London WC1E 6BT, U.K.
More informationJournal of Computer and System Sciences
Journal of Computer and System Sciences 78 (2012) 1026 1044 Contents lists available at SciVerse ScienceDirect Journal of Computer and System Sciences www.elsevier.com/locate/jcss Characterisations of
More informationOn a problem of Fagin concerning multivalued dependencies in relational databases
Theoretical Computer Science 353 (2006) 53 62 www.elsevier.com/locate/tcs On a problem of Fagin concerning multivalued dependencies in relational databases Sven Hartmann, Sebastian Link,1 Department of
More informationA CORRECTED 5NF DEFINITION FOR RELATIONAL DATABASE DESIGN. Millist W. Vincent ABSTRACT
A CORRECTED 5NF DEFINITION FOR RELATIONAL DATABASE DESIGN Millist W. Vincent Advanced Computing Research Centre, School of Computer and Information Science, University of South Australia, Adelaide, Australia
More informationRelational Database Design
Relational Database Design Jan Chomicki University at Buffalo Jan Chomicki () Relational database design 1 / 16 Outline 1 Functional dependencies 2 Normal forms 3 Multivalued dependencies Jan Chomicki
More informationPlan of the lecture. G53RDB: Theory of Relational Databases Lecture 10. Logical consequence (implication) Implication problem for fds
Plan of the lecture G53RDB: Theory of Relational Databases Lecture 10 Natasha Alechina School of Computer Science & IT nza@cs.nott.ac.uk Logical implication for functional dependencies Armstrong closure.
More informationSebastian Link University of Auckland, Auckland, New Zealand Henri Prade IRIT, CNRS and Université de Toulouse III, Toulouse, France
CDMTCS Research Report Series Relational Database Schema Design for Uncertain Data Sebastian Link University of Auckland, Auckland, New Zealand Henri Prade IRIT, CNRS and Université de Toulouse III, Toulouse,
More informationConceptual Treatment of Multivalued Dependencies
Conceptual Treatment of Multivalued Dependencies Bernhard Thalheim Computer Science Institute, Brandenburg University of Technology at Cottbus, PostBox 101344, D-03013 Cottbus thalheim@informatik.tu-cottbus.de
More informationFrom Constructibility and Absoluteness to Computability and Domain Independence
From Constructibility and Absoluteness to Computability and Domain Independence Arnon Avron School of Computer Science Tel Aviv University, Tel Aviv 69978, Israel aa@math.tau.ac.il Abstract. Gödel s main
More informationCopyright is owned by the Author of the thesis. Permission is given for a copy to be downloaded by an individual for the purpose of research and
Copyright is owned by the Author of the thesis. Permission is given for a copy to be downloaded by an individual for the purpose of research and private study only. The thesis may not be reproduced elsewhere
More informationConstraints: Functional Dependencies
Constraints: Functional Dependencies Fall 2017 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 42 Schema Design When we get a relational
More informationarxiv: v1 [cs.db] 17 Apr 2014
On Independence Atoms and Keys Miika Hannula 1, Juha Kontinen 1, and Sebastian Link 2 arxiv:14044468v1 [csdb] 17 Apr 2014 1 University of Helsinki, Department of Mathematics and Statistics, Helsinki, Finland
More informationArmstrong Relations for Ontology Design and Evaluation
Armstrong Relations for Ontology Design and Evaluation Henriette Harmse 1, Katarina Britz 2, and Aurona Gerber 1 1 CSIR Meraka Institute and Department of Informatics, University of Pretoria, South Africa
More informationOn Inferences of Weak Multivalued Dependencies
Fundamenta Informaticae 92 (2009) 83 102 83 DOI 10.3233/FI-2009-0067 IOS Press On Inferences of Weak Multivalued Dependencies Sven Hartmann Department of Informatics, Clausthal University of Technology
More informationOn Multivalued Dependencies in Fixed and Undetermined Universes
On Multivalued Dependencies in Fixed and Undetermined Universes Sebastian Link Information Science Research Centre, Dept. of Information Systems, Massey University, Palmerston North, New Zealand s.link@massey.ac.nz
More informationSchema Refinement and Normal Forms. Chapter 19
Schema Refinement and Normal Forms Chapter 19 1 Review: Database Design Requirements Analysis user needs; what must the database do? Conceptual Design high level descr. (often done w/er model) Logical
More informationSchema Refinement: Other Dependencies and Higher Normal Forms
Schema Refinement: Other Dependencies and Higher Normal Forms Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Higher Normal Forms 1 / 14 Outline 1
More informationarxiv: v1 [cs.db] 21 Sep 2016
Ladan Golshanara 1, Jan Chomicki 1, and Wang-Chiew Tan 2 1 State University of New York at Buffalo, NY, USA ladangol@buffalo.edu, chomicki@buffalo.edu 2 Recruit Institute of Technology and UC Santa Cruz,
More informationSchema Refinement & Normalization Theory
Schema Refinement & Normalization Theory Functional Dependencies Week 13 1 What s the Problem Consider relation obtained (call it SNLRHW) Hourly_Emps(ssn, name, lot, rating, hrly_wage, hrs_worked) What
More informationUVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information
Relational Database Design Database Design To generate a set of relation schemas that allows - to store information without unnecessary redundancy - to retrieve desired information easily Approach - design
More informationFunctional Dependencies and Normalization
Functional Dependencies and Normalization There are many forms of constraints on relational database schemata other than key dependencies. Undoubtedly most important is the functional dependency. A functional
More information5 Set Operations, Functions, and Counting
5 Set Operations, Functions, and Counting Let N denote the positive integers, N 0 := N {0} be the non-negative integers and Z = N 0 ( N) the positive and negative integers including 0, Q the rational numbers,
More informationFunctional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013
Functional Dependencies Functional Dependency Functional dependency describes the relationship between attributes in a relation. Eg. if A and B are attributes of relation R, B is functionally dependent
More informationRQL: a Query Language for Implications
RQL: a Query Language for Implications Jean-Marc Petit (joint work with B. Chardin, E. Coquery and M. Pailloux) INSA Lyon CNRS and Université de Lyon Dagstuhl Seminar 12-16 May 2014 Horn formulas, directed
More informationCorrigendum to On the undecidability of implications between embedded multivalued database dependencies [Inform. and Comput. 122 (1995) ]
Information and Computation 204 (2006) 1847 1851 www.elsevier.com/locate/ic Corrigendum Corrigendum to On the undecidability of implications between embedded multivalued database dependencies [Inform.
More informationTrichotomy Results on the Complexity of Reasoning with Disjunctive Logic Programs
Trichotomy Results on the Complexity of Reasoning with Disjunctive Logic Programs Mirosław Truszczyński Department of Computer Science, University of Kentucky, Lexington, KY 40506, USA Abstract. We present
More informationTree sets. Reinhard Diestel
1 Tree sets Reinhard Diestel Abstract We study an abstract notion of tree structure which generalizes treedecompositions of graphs and matroids. Unlike tree-decompositions, which are too closely linked
More informationGlobal Database Design based on Storage Space and Update Time Minimization
Journal of Universal Computer Science, vol. 15, no. 1 (2009), 195-240 submitted: 11/1/08, accepted: 15/8/08, appeared: 1/1/09 J.UCS Global Database Design based on Storage Space and Update Time Minimization
More informationOn the Intractability of Computing the Duquenne-Guigues Base
Journal of Universal Computer Science, vol 10, no 8 (2004), 927-933 submitted: 22/3/04, accepted: 28/6/04, appeared: 28/8/04 JUCS On the Intractability of Computing the Duquenne-Guigues Base Sergei O Kuznetsov
More information2. Prime and Maximal Ideals
18 Andreas Gathmann 2. Prime and Maximal Ideals There are two special kinds of ideals that are of particular importance, both algebraically and geometrically: the so-called prime and maximal ideals. Let
More informationConstraints: Functional Dependencies
Constraints: Functional Dependencies Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 32 Schema Design When we get a relational
More informationIntroduction to Metalogic
Philosophy 135 Spring 2008 Tony Martin Introduction to Metalogic 1 The semantics of sentential logic. The language L of sentential logic. Symbols of L: Remarks: (i) sentence letters p 0, p 1, p 2,... (ii)
More informationDesign Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein
Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Reference: A First Course in Database Systems, 3 rd edition, Chapter 3 Important Notices CMPS 180 Final Exam
More informationEquational Logic. Chapter Syntax Terms and Term Algebras
Chapter 2 Equational Logic 2.1 Syntax 2.1.1 Terms and Term Algebras The natural logic of algebra is equational logic, whose propositions are universally quantified identities between terms built up from
More informationKRIPKE S THEORY OF TRUTH 1. INTRODUCTION
KRIPKE S THEORY OF TRUTH RICHARD G HECK, JR 1. INTRODUCTION The purpose of this note is to give a simple, easily accessible proof of the existence of the minimal fixed point, and of various maximal fixed
More informationEnhancing the Updatability of Projective Views
Enhancing the Updatability of Projective Views (Extended Abstract) Paolo Guagliardo 1, Reinhard Pichler 2, and Emanuel Sallinger 2 1 KRDB Research Centre, Free University of Bozen-Bolzano 2 Vienna University
More informationSchema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1
Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Background We started with schema design ER model translation into a relational schema Then we studied relational
More informationHandbook of Logic and Proof Techniques for Computer Science
Steven G. Krantz Handbook of Logic and Proof Techniques for Computer Science With 16 Figures BIRKHAUSER SPRINGER BOSTON * NEW YORK Preface xvii 1 Notation and First-Order Logic 1 1.1 The Use of Connectives
More informationAxiomatizing Conditional Independence and Inclusion Dependencies
Axiomatizing Conditional Independence and Inclusion Dependencies Miika Hannula University of Helsinki 6.3.2014 Miika Hannula (University of Helsinki) Axiomatizing Conditional Independence and Inclusion
More informationAxiomatic set theory. Chapter Why axiomatic set theory?
Chapter 1 Axiomatic set theory 1.1 Why axiomatic set theory? Essentially all mathematical theories deal with sets in one way or another. In most cases, however, the use of set theory is limited to its
Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events
Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Massimo Franceschet Angelo Montanari Dipartimento di Matematica e Informatica, Università di Udine Via delle
Schema Refinement and Normal Forms
Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational
A Unit Resolution Approach to Knowledge Compilation. 2. Preliminaries
A Unit Resolution Approach to Knowledge Compilation Arindama Singh and Manoj K Raut Department of Mathematics Indian Institute of Technology Chennai-600036, India Abstract : Knowledge compilation deals
CMPS Advanced Database Systems. Dr. Chengwei Lei CEECS California State University, Bakersfield
CMPS 4420 Advanced Database Systems Dr. Chengwei Lei CEECS California State University, Bakersfield CHAPTER 15 Relational Database Design Algorithms and Further Dependencies Slide 15-2 Chapter Outline
Schema Refinement. Feb 4, 2010
Schema Refinement Feb 4, 2010 1 Relational Schema Design Conceptual Design name Product buys Person price name ssn ER Model Logical design Relational Schema plus Integrity Constraints Schema Refinement
Mathematics 114L Spring 2018 D.A. Martin. Mathematical Logic
Mathematics 114L Spring 2018 D.A. Martin Mathematical Logic 1 First-Order Languages. Symbols. All first-order languages we consider will have the following symbols: (i) variables v 1, v 2, v 3,... ; (ii)
Relational Design Theory
Relational Design Theory CSE462 Database Concepts Demian Lessa/Jan Chomicki Department of Computer Science and Engineering State University of New York, Buffalo Fall 2013 Overview How does one design a
Functional Dependencies & Normalization. Dr. Bassam Hammo
Functional Dependencies & Normalization Dr. Bassam Hammo Redundancy and Normalisation Redundant Data Can be determined from other data in the database Leads to various problems INSERT anomalies UPDATE
The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.
The Evils of Redundancy Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 Redundancy is at the root of several problems associated with relational
Relational-Database Design
CHAPTER 7 Relational-Database Design Exercises 7.2 Answer: A decomposition {R1, R2} is a lossless-join decomposition if R1 ∩ R2 → R1 or R1 ∩ R2 → R2. Let R1 = (A, B, C), R2 = (A, D, E), and R1
The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram
Schema Refinement and Normal Forms Chapter 19 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational
Topology Proceedings. COPYRIGHT © by Topology Proceedings. All rights reserved.
Topology Proceedings Web: http://topology.auburn.edu/tp/ Mail: Topology Proceedings Department of Mathematics & Statistics Auburn University, Alabama 36849, USA E-mail: topolog@auburn.edu ISSN: 0146-4124
Math 4603: Advanced Calculus I, Summer 2016 University of Minnesota Notes on Cardinality of Sets
Math 4603: Advanced Calculus I, Summer 2016 University of Minnesota Notes on Cardinality of Sets Introduction In this short article, we will describe some basic notions on cardinality of sets. Given two
Boolean Algebras. Chapter 2
Chapter 2 Boolean Algebras Let X be an arbitrary set and let P(X) be the class of all subsets of X (the power set of X). Three natural set-theoretic operations on P(X) are the binary operations of union
TORIC WEAK FANO VARIETIES ASSOCIATED TO BUILDING SETS
TORIC WEAK FANO VARIETIES ASSOCIATED TO BUILDING SETS YUSUKE SUYAMA Abstract. We give a necessary and sufficient condition for the nonsingular projective toric variety associated to a building set to be
Herbrand Theorem, Equality, and Compactness
CSC 438F/2404F Notes (S. Cook and T. Pitassi) Fall, 2014 Herbrand Theorem, Equality, and Compactness The Herbrand Theorem We now consider a complete method for proving the unsatisfiability of sets of first-order
COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago
COSC 430 Advanced Database Topics Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago Learning objectives and references You should be able to: define the elements of the relational
Constraint Acquisition You Can Chase but You Cannot Find
Constraint Acquisition You Can Chase but You Cannot Find Sven Hartmann Sebastian Link Thu Trinh Department of Information Systems, Information Science Research Centre Massey University, Palmerston North,
Data Dependencies in the Presence of Difference
Data Dependencies in the Presence of Difference Tsinghua University sxsong@tsinghua.edu.cn Outline Introduction Application Foundation Discovery Conclusion and Future Work Data Dependencies in the Presence
The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.
The Evils of Redundancy Schema Refinement and Normal Forms INFO 330, Fall 2006 1 Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update
Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19
Schema Refinement and Normal Forms [R&G] Chapter 19 CS432 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update
SCHEMA NORMALIZATION. CS 564- Fall 2015
SCHEMA NORMALIZATION CS 564- Fall 2015 HOW TO BUILD A DB APPLICATION Pick an application Figure out what to model (ER model) Output: ER diagram Transform the ER diagram to a relational schema Refine the
Characterization of Semantics for Argument Systems
Characterization of Semantics for Argument Systems Philippe Besnard and Sylvie Doutre IRIT Université Paul Sabatier 118, route de Narbonne 31062 Toulouse Cedex 4 France {besnard, doutre}@irit.fr Abstract
AN EXTENSION OF THE PROBABILITY LOGIC LPP2. Tatjana Stojanović 1, Ana Kaplarević-Mališić 1 and Zoran Ognjanović 2
Kragujevac J. Math. 33 (2010) 45-62. AN EXTENSION OF THE PROBABILITY LOGIC LPP2 Tatjana Stojanović 1, Ana Kaplarević-Mališić 1 and Zoran Ognjanović 2 1 University of Kragujevac, Faculty of Science,
Introduction to Data Management. Lecture #6 (Relational Design Theory)
Introduction to Data Management Lecture #6 (Relational Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v HW#2 is
On Incomplete XML Documents with Integrity Constraints
On Incomplete XML Documents with Integrity Constraints Pablo Barceló 1, Leonid Libkin 2, and Juan Reutter 2 1 Department of Computer Science, University of Chile 2 School of Informatics, University of
Exercises 1 - Solutions
Exercises 1 - Solutions SAV 2013 1 PL validity For each of the following propositional logic formulae determine whether it is valid or not. If it is valid prove it, otherwise give a counterexample. Note
Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004
Schema Refinement and Normal Forms CIS 330, Spring 2004 Lecture 11 March 2, 2004 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage,
Metainduction in Operational Set Theory
Metainduction in Operational Set Theory Luis E. Sanchis Department of Electrical Engineering and Computer Science Syracuse University Syracuse, NY 13244-4100 Sanchis@top.cis.syr.edu http://www.cis.syr.edu/
Chapter 2 Axiomatic Set Theory
Chapter 2 Axiomatic Set Theory Ernst Zermelo (1871-1953) was the first to find an axiomatization of set theory, and it was later expanded by Abraham Fraenkel (1891-1965). 2.1 Zermelo-Fraenkel Set Theory
Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms
Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms Database Group http://liris.cnrs.fr/ecoquery/dokuwiki/doku.php?id=enseignement: dbdm:start March 1, 2017 Exemple
1. Propositional Calculus
1. Propositional Calculus Some notes for Math 601, Fall 2010 based on Elliott Mendelson, Introduction to Mathematical Logic, Fifth edition, 2010, Chapman & Hall. 2. Syntax ( grammar ). 1.1, p. 1. Given:
Introduction to Real Analysis Alternative Chapter 1
Christopher Heil Introduction to Real Analysis Alternative Chapter 1 A Primer on Norms and Banach Spaces Last Updated: March 10, 2018 © 2018 by Christopher Heil Chapter 1 A Primer on Norms and Banach Spaces
SPECIAL ATTRIBUTES FOR DATABASE NORMAL FORMS DETERMINATION
STUDIA UNIV. BABEŞ-BOLYAI, INFORMATICA, Volume LVII, Number 1, 2012 SPECIAL ATTRIBUTES FOR DATABASE NORMAL FORMS DETERMINATION VITALIE COTELEA Abstract. The article deals with the relational schemes defined
Functional Dependencies. Getting a good DB design Lisa Ball November 2012
Functional Dependencies Getting a good DB design Lisa Ball November 2012 Outline (2012) SEE NEXT SLIDE FOR ALL TOPICS (some for you to read) Normalization covered by Dr Sanchez Armstrong's Axioms other
BCNF revisited: 40 Years Normal Forms
Full set of slides BCNF revisited: 40 Years Normal Forms Faculty of Computer Science Technion - IIT, Haifa janos@cs.technion.ac.il www.cs.technion.ac.il/ janos 1 Full set of slides Acknowledgements Based
Restricted versions of the Tukey-Teichmüller Theorem that are equivalent to the Boolean Prime Ideal Theorem
Restricted versions of the Tukey-Teichmüller Theorem that are equivalent to the Boolean Prime Ideal Theorem R.E. Hodel Dedicated to W.W. Comfort on the occasion of his seventieth birthday. Abstract We
A New and Useful Syntactic Restriction on Rule Semantics for Tabular Data
A New and Useful Syntactic Restriction on Rule Semantics for Tabular Data Marie Agier 1,2, Jean-Marc Petit 3 1 DIAGNOGENE SA, 15000 Aurillac, FRANCE 2 LIMOS, UMR 6158 CNRS, Univ. Clermont-Ferrand II, FRANCE
Equivalence of SQL Queries In Presence of Embedded Dependencies
Equivalence of SQL Queries In Presence of Embedded Dependencies Rada Chirkova Department of Computer Science NC State University, Raleigh, NC 27695, USA chirkova@csc.ncsu.edu Michael R. Genesereth Department
Computational Tasks and Models
1 Computational Tasks and Models Overview: We assume that the reader is familiar with computing devices but may associate the notion of computation with specific incarnations of it. Our first goal is to
Schema Refinement and Normalization
Schema Refinement and Normalization Schema Refinements and FDs Redundancy is at the root of several problems associated with relational schemas. redundant storage, I/D/U anomalies Integrity constraints,
TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica. Final exam Logic & Set Theory (2IT61) (correction model)
TECHNISCHE UNIVERSITEIT EINDHOVEN Faculteit Wiskunde en Informatica Final exam Logic & Set Theory (2IT61) (correction model) Thursday November 4, 2016, 9:00-12:00 hrs. (2) 1. Determine whether the abstract
Generalized hashing and applications to digital fingerprinting
Generalized hashing and applications to digital fingerprinting Noga Alon, Gérard Cohen, Michael Krivelevich and Simon Litsyn Abstract Let C be a code of length n over an alphabet of q letters. An n-word
Composing Schema Mappings: Second-Order Dependencies to the Rescue
Composing Schema Mappings: Second-Order Dependencies to the Rescue RONALD FAGIN IBM Almaden Research Center PHOKION G. KOLAITIS IBM Almaden Research Center LUCIAN POPA IBM Almaden Research Center WANG-CHIEW
CONSTRUCTION OF THE REAL NUMBERS.
CONSTRUCTION OF THE REAL NUMBERS. IAN KIMING 1. Motivation. It will not come as a big surprise to anyone when I say that we need the real numbers in mathematics. More to the point, we need to be able to
Products, Relations and Functions
Products, Relations and Functions For a variety of reasons, in this course it will be useful to modify a few of the set-theoretic preliminaries in the first chapter of Munkres. The discussion below explains
Computability Theoretic Properties of Injection Structures
Computability Theoretic Properties of Injection Structures Douglas Cenzer 1, Valentina Harizanov 2 and Jeffrey B. Remmel 3 Abstract We study computability theoretic properties of computable injection structures
First-Order Theorem Proving and Vampire
First-Order Theorem Proving and Vampire Laura Kovács 1,2 and Martin Suda 2 1 TU Wien 2 Chalmers Outline Introduction First-Order Logic and TPTP Inference Systems Saturation Algorithms Redundancy Elimination
Schema Refinement and Normal Forms Chapter 19
Schema Refinement and Normal Forms Chapter 19 Instructor: Vladimir Zadorozhny vladimir@sis.pitt.edu Information Science Program School of Information Sciences, University of Pittsburgh Database Management
Chapter 2 Background. 2.1 A Basic Description Logic
Chapter 2 Background Abstract Description Logics is a family of knowledge representation formalisms used to represent knowledge of a domain, usually called world. For that, it first defines the relevant
Chapter 2. Assertions. An Introduction to Separation Logic c 2011 John C. Reynolds February 3, 2011
Chapter 2 An Introduction to Separation Logic c 2011 John C. Reynolds February 3, 2011 Assertions In this chapter, we give a more detailed exposition of the assertions of separation logic: their meaning,
Probabilistic and Truth-Functional Many-Valued Logic Programming
Probabilistic and Truth-Functional Many-Valued Logic Programming Thomas Lukasiewicz Institut für Informatik, Universität Gießen Arndtstraße 2, D-35392 Gießen, Germany Abstract We introduce probabilistic
Nested Epistemic Logic Programs
Nested Epistemic Logic Programs Kewen Wang 1 and Yan Zhang 2 1 Griffith University, Australia k.wang@griffith.edu.au 2 University of Western Sydney yan@cit.uws.edu.au Abstract. Nested logic programs and
The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram
The Evils of Redundancy Schema Refinement and Normalization Chapter 1 Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus Redundancy is at the root of several problems
Functional Dependencies
Functional Dependencies Functional Dependencies Framework for systematic design and optimization of relational schemas Generalization over the notion of Keys Crucial in obtaining correct normalized schemas
Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products
Chapter 3 Cartesian Products and Relations The material in this chapter is the first real encounter with abstraction. Relations are a very general thing: they are a special type of subset. After introducing
Information Flow on Directed Acyclic Graphs
Information Flow on Directed Acyclic Graphs Michael Donders, Sara Miner More, and Pavel Naumov Department of Mathematics and Computer Science McDaniel College, Westminster, Maryland 21157, USA {msd002,smore,pnaumov}@mcdaniel.edu