Data Dependencies in the Presence of Difference

Size: px
Start display at page:

Download "Data Dependencies in the Presence of Difference"

Transcription

1 Data Dependencies in the Presence of Difference Tsinghua University

2 Outline Introduction Application Foundation Discovery Conclusion and Future Work

3 Data Dependencies in the Presence of Difference Introduction 1/31 Motivation Data Dependencies traditionally for quality of Schema: schema design, integrity constraints, query optimization, etc. Data Dependencies recently for quality of Data: data cleaning, data repairing, record matching, etc. Table: Example instance of Empolyee name institute title salary ssn t 1 John Depp Tech. Univ. Professor t 2 J. Depp Technical Univ. Professor t 3 J.C. Depp Tech. University Prof t 4 R. Depp Western Univ. Lecturer

4 Data Dependencies in the Presence of Difference Introduction 2/31 Motivation Identification function in schema-oriented issues, in conventional dependencies, e.g., fds title salary t 1 [title] : Professor = t 2 [title] : Professor t 1 [salary] : 60 = t 2 [salary] : 60 Difference semantics in data-oriented practice, on numerical values or text values, e.g., similar or dissimilar. title : Professor Prof. salary: 60k v.s. 3k

5 Data Dependencies in the Presence of Difference Introduction 3/31 Differential Dependencies: Syntax We propose a novel type of dependencies differential dependencies (dds) in the form of φ L [X] φ R [Y] φ L [X] and φ R [Y] are differential functions, which specify distance constraints on attributes X and Y of R, respectively. Constraints on difference for any two tuples (t 1,t 2 ) from an instance of R if their value differences (measured by certain distance metric) on attributes X agree with the differential function φ L [X], (t 1,t 2 ) φ L [X] then their value differences on Y should also agree with the differential function φ R [Y], (t 1,t 2 ) φ R [Y]

6 Data Dependencies in the Presence of Difference Introduction 4/31 Example A dd in a credit card transaction database can be dd 1 [cardno(= 0) position( 60)] [transtime( 20)] cardno(= 0) states that two transactions have the same credit card no (the difference on attribute cardno is 0) position( 60), transtime( 20) are differential functions specified on attribute position, transtime, respectively Constraints on difference If the distance of two transaction positions of a same cardno is 60 km (e.g., two different cities) they are probably two transactions happening at different time the difference between transtime should be 20 mins. If two card transactions do not satisfy dd 1, one of the transactions could be a fraud.

7 Data Dependencies in the Presence of Difference Introduction 5/31 Example A dd in a price database of a flight, in decision support systems dd 2 [date( 7)] [price( 100)] states that the price difference of any two days in a week length should be less than 100 $ Instead of a week length, another dd may specify dd 3 [date(> 7, 30)] [price(> 100, 900)] the price difference constraint of two days not in a week length but in a month length Both dd 2 and dd 3 specify on the same embedded attributes date price but with different constraint semantics, i.e., week and month.

8 Data Dependencies in the Presence of Difference Introduction 6/31 Related Work Conditional functional dependencies (cfds) (X A,t p ) make the fds, originally hold for the whole table, valid only for a set of tuples specified by the conditions ([country, zip] [street], < Finland, >) Metric functional dependencies (mfds) X δ A similarity metrics in the right-hand-side, for violation detection name 2 address Matching dependencies (mds) [X ] [A ] similar semantics in the left-hand-side, for record matching [name ] [addr ] [tel ]

9 Data Dependencies in the Presence of Difference Introduction 7/31 Comparison cfds introduce condition extension, which is still on identification semantics. mfds, mds consider the similar semantics, on either determinant attributes X or dependent attributes Y Our differential dependencies dds φ L [X] φ R [Y] address more general difference constraints with various semantics similar (e.g., price( 100) in dd 2 ) dissimilar/different (e.g., transtime( 20) in dd 1 ), or even more complicated ones (e.g., date(> 7, 30) in dd 3 ) allow setting difference constraints on both determinant attributes X and dependent attributes Y

10 Outline Introduction Application Foundation Discovery Conclusion and Future Work

11 Data Dependencies in the Presence of Difference Application 8/31 Example: Violation Detection To find the tuples that violate dependencies according to dd 2 [date( 7)] [price( 100)] t 3,t 4 are detected as violations to dd 2 fds cannot express such constraints on difference t 3,t 4 cannot be detected by a fd date price t 1,t 2 are detected as violations to fd by mistake Table: Example of a price database Tuple Date Price t ,000 t ,050 t ,000 t ,000

12 Data Dependencies in the Presence of Difference Application 9/31 Evaluation: Violation Detection dds compared with fds with identification functions differential functions in the right-hand-side Y detect violations more accurately the detection precision is higher than fds differential functions in the left-hand-side X address more tuples with violations the detection recall by using dds is higher than fds Cora FDs DDs Cora Precision Recall FDs DDs 0 I1 I2 I3 I4 I5 I6 Data instance I1 I2 I3 I4 I5 I6 Data instance

13 Data Dependencies in the Presence of Difference Application 10/31 Example: Data Partition To optimize data partition queries Integrity constraints (e.g., fds or candidate keys) can be utilized to optimize the evaluation of queries known as the semantic query optimization Consider a group-by query on distance conditions SELECT * FROM Employee GROUP BY institute( 5) title( 6) according to [institute( 5)] [institute( 5) title( 6)] rewrite the query by using institute( 5) only SELECT * FROM Employee GROUP BY institute( 5)

14 Data Dependencies in the Presence of Difference Application 11/31 Evaluation: Data Partition Using candidate differential key dependencies, cdk dependencies In x-axis, each element a/b corresponds to a pair of reduced/original differential functions for partitioning queries a denotes the cardinality of cdk b denotes the cardinality of original partition scheme the smaller the rate a/b is, the more the performance can be improved Time cost (s) CiteSeer original CDKs Time cost (s) Cora original CDKs 0 1/5 1/4 2/5 1/3 2/4 2/3 Partition scheme 0 2/5 3/4 1/3 2/3 3/4 2/3 Partition scheme

15 Data Dependencies in the Presence of Difference Application 12/31 Example: Record Linkage To identify duplicate record, a.k.a. record matching, merge-purge use dds as matching rules dd 1 [name( 5) institute( 7)] [ssn(= 0)] t 1,t 2, whose name distance is 5, and institute distance is 7, probably denote the same employee with identical ssn another valid matching rule on same attributes dd 2 [name( 3) institute( 15)] [ssn(= 0)] t 2,t 3 detected as duplicates by dd 2, not detected by dd 1 Table: Example instance of Empolyee name institute title salary ssn t 1 John Depp Tech. Univ. Professor t 2 J. Depp Technical Univ. Professor t 3 J.C. Depp Tech. University Prof t 4 R. Depp Western Univ. Lecturer

16 Data Dependencies in the Presence of Difference Application 13/31 Evaluation: Record Linkage dds compared mds, mds associate only one differential function on each attribute dds can specify various differential functions on one attribute dds address more matching rules recall of dds is significantly higher dds have comparable precision as mds, both are valid matching rules 1 Restaurant 1 Restaurant Precision Recall MDs DDs 0 I1 I2 I3 I4 I5 I6 Data instance 0.2 MDs DDs 0 I1 I2 I3 I4 I5 I6 Data instance

17 Outline Introduction Application Foundation Discovery Conclusion and Future Work

18 Data Dependencies in the Presence of Difference Foundation 14/31 Differential Function: Intersection The intersection of φ 1 [Z] and φ 2 [Z] on the same attributes Z is φ 3 [Z] = φ 1 [Z] φ 2 [Z] any (t 1,t 2 ) φ 1 [Z] and (t 1,t 2 ) φ 2 [Z], then (t 1,t 2 ) φ 3 [Z] any (t 1,t 2 ) φ 1 [Z] or (t 1,t 2 ) φ 2 [Z], then (t 1,t 2 ) φ 3 [Z] [name( 9)] [name( 7)] = [name( 7)] Apply intersection between φ 1 [X] and φ 2 [Y] on different attributes X and Y Let Z = X Y φ 1 [X] φ 2 [Y] =(φ 1 [X \Z] φ 1 [Z]) (φ 2 [Z] φ 2 [Y \Z]) =φ 1 [X \Z] (φ 1 [Z] φ 2 [Z]) φ 2 [Y \Z]. [name( 5) address( 12)] [address( 10)] = [name( 5) address( 10)]

19 Data Dependencies in the Presence of Difference Foundation 15/31 Differential Function: Subsumption Intuitively, the semantics of similar subsumes identification any two values that are identical (with distance = 0) can always be interpreted as similar (with distance 9) Definition Let φ 1 [Z] and φ 2 [Z] be two differential functions on attributes Z If any tuple pair (t 1,t 2 ) φ 2 [Z] always agree (t 1,t 2 ) φ 1 [Z] we say that φ 1 [Z] subsumes φ 2 [Z], written φ 1 [Z] φ 2 [Z] For example φ 1 [name] = [name( 9)] subsumes φ 2 [name] = [name( 7)] denoted by [name( 9)] [name( 7)] a distance value of name that agrees 7 will always agree 9 [date( 30)] [date(> 7, 30)]; [addr( 9)] [addr(= 0)]

20 Data Dependencies in the Presence of Difference Foundation 16/31 Differential Dependency Consider an instance I of relation R (t 1,t 2 ) φ L [X] denotes tuples (t 1,t 2 ) having distance agreeing φ L [X] I satisfies a dd, I φ L [X] φ R [Y], if any two tuples t 1 and t 2 in I having metric distances (t 1,t 2 ) φ L [X] must agree (t 1,t 2 ) φ R [Y] I satisfies a set Σ of dds, I Σ if I φ L [X] φ R [Y] for each φ L [X] φ R [Y] Σ. Proposition For two differential functions φ L [X] and φ R [Y], if Y X and φ R [Y] φ L [Y], then φ L [X] φ R [Y]. a trivial dd, always holds [name( 5) address( 10)] [address( 12)]

21 Data Dependencies in the Presence of Difference Foundation 17/31 Logical Implication Example Consider two dds, dd 4 [name( 7)] [address( 1)], dd 5 [address( 5)] [salary( 50)]. any two tuples t 1 and t 2 having name distance 7, according to dd 4, their distance on address should be 1, (t 1,t 2 ) agree address( 5) as well. the salary distance of t 1 and t 2 should be 50 according to dd 5 We can imply another dd, dd 6 [name( 7)] [salary( 50)].

22 Data Dependencies in the Presence of Difference Foundation 18/31 Implication Problem Let Σ 1 and Σ 2 be two sets of dds. Σ 1 logically implies Σ 2, Σ 1 Σ 2 if for all relation instance I, I Σ 1 implies I Σ 2 Σ 1 and Σ 2 are equivalent, Σ 1 Σ 2 if Σ 1 Σ 2 and Σ 2 Σ 1 The implication problem given a consistent set Σ of dds and another dd φ L [X] φ R [Y] to decide whether Σ can imply this dd, Σ φ L [X] φ R [Y] For example, {dd 4,dd 5 } dd 6

23 Data Dependencies in the Presence of Difference Foundation 19/31 Implication based-on Subsumption Given a dd φ L [X] φ R [Y] φ 1 [Z] φ R [Y] can be implied, if X Z,φ L [X] φ 1 [X] φ L [X] φ 1 [Z] can be implied, if Z Y,φ 1 [Z] φ R [Z] For example, consider a dd [name( 7)] [address( 1)], it implies [name( 5)] [address( 1)] [name( 7)] [address( 2)]

24 Data Dependencies in the Presence of Difference Foundation 20/31 Differential Key Key: t 1 [R] = t 2 [R] according to t 1 [K] = t 2 [K] on a key K R A differential key φ 2 [K] relative to φ 1 [R] is a differential function that can determine φ 1 [R] a differential key dependency φ 2 [K] φ 1 [R] with K R and φ 2 [K] φ 1 [K] For example, [position( 20)] is a differential key relative to [position( 20) area( 5)] according to the following differential key dependency, [position( 20)] [position( 20) area( 5)]

25 Data Dependencies in the Presence of Difference Foundation 21/31 Candidate Differential Key A naïve key relative to φ 1 [R] is φ 1 [R] itself A candidate differential key (cdk) φ c [K] is A cdk an irreducible differential key relative to φ 1 [R], there does not exist any φ 2 [L] such that L K, φ 2 [L] φ c [L] and φ 2 [L] φ 1 [R]. not only has a minimal cardinality as candidate keys on fds, but also should be the one not subsumed by others. cdks are useful in applications like data partition

26 Outline Introduction Application Foundation Discovery Conclusion and Future Work

27 Data Dependencies in the Presence of Difference Discovery 22/31 Discovery Problem Discovery from data given a relation instance I discover candidate differential keys and minimal cover of differential dependencies that hold in I The hardness a minimal cover of fds, that hold in a relation instance I, can be exponentially large in the number of attributes fds are considered as special cases of dds where all the differential constraints are set to = 0 dds subsume fds, could be exponentially large as well

28 Data Dependencies in the Presence of Difference Discovery 23/31 Negative Pruning Motivation: pruning candidates of dds, in order to avoid evaluating all possible φ L [X] φ R [Y] in I Lemma For any φ 1 [V],φ 2 [Z] having V Z,φ 1 [V] φ 2 [V], if I φ 2 [Z] φ R [Y], then I φ 1 [V] φ R [Y] Example: if [name( 5)] φ R [Y] not hold in I, then [name( 7)] φ R [Y] not hold either without evaluation in I Worst case: all the candidates hold in the given instance I

29 Data Dependencies in the Presence of Difference Discovery 24/31 Positive Pruning Lemma For any φ 1 [V],φ 2 [W] having W V,φ 2 [W] φ 1 [W], if I φ 2 [W] φ R [Y], then I φ 1 [V] φ R [Y] Example: if [name( 7)] φ R [Y] holds in I, then [name( 5)] φ R [Y] must hold without evaluation in I Worst case: all the candidates do not hold in the given instance I Hybrid approach with both positive and negative pruning, used by turns.

30 Data Dependencies in the Presence of Difference Discovery 25/31 Instance Exclusion Motivation: avoiding evaluating the entire I. one differential function subsumes another the set of tuples agreeing on the former one should be a super set of the latter one Considers all the pairs of tuples in I. D(I) = {(t i,t j ) t i,t j I}. Given any dd φ L [X] φ R [Y], we define D(I,φ L [X], φ R [Y]) = {(t i,t j ) D(I) (t i,t j ) φ L [X],(t i,t j ) φ R [Y]}, that is, the tuple pairs agreeing φ L [X] but not agreeing φ R [Y].

31 Data Dependencies in the Presence of Difference Discovery 26/31 Instance Exclusion Lemma An instance I satisfies a dd, I φ L [X] φ R [Y], iff D(I,φ L [X], φ R [Y]) =. During the discovery, for a candidate φ L [X] φ R [Y], have to evaluate whether D(I,φ L [X], φ R [Y]) =.

32 Data Dependencies in the Presence of Difference Discovery 27/31 Instance Exclusion Lemma For any φ 1 [V],φ 2 [W] having W V,φ 2 [W] φ 1 [W], we have D(I,φ 1 [V], φ R [Y]) D(I,φ 2 [W], φ R [Y]). Suppose that a current D(I,φ 2 [W], φ R [Y]) instead of considering the entire D(I) use D(I,φ 2 [W], φ R [Y]) to compute D(I,φ 1 [V], φ R [Y])

33 Data Dependencies in the Presence of Difference Discovery 28/31 Experiments Evaluate the time performance of discovery approaches scale well with the increase of tuples in an instance I O(n 2 ) with respect to the number of tuples n in the instance I instance exclusion performs well Time cost (s) BF BU-Ne TD-Po Hybird IE-TD-Po IE-Hybird CiteSeer 0 I1 I2 I3 I4 I5 I6 Data instance Time cost (s) BF BU-Ne TD-Po Hybird IE-TD-Po IE-Hybird Cora 0 I1 I2 I3 I4 I5 I6 Data instance Figure: DDs discovery performance on various instance I

34 Data Dependencies in the Presence of Difference Discovery 29/31 Experiments discovery cost increases exponentially in the number of attributes in a schema can achieve several orders of magnitude improvement compared with brute-force one Time cost (s) BF BU-Ne TD-Po Hybird IE-TD-Po IE-Hybird CiteSeer Time cost (s) BF BU-Ne TD-Po Hybird IE-TD-Po IE-Hybird Cora R1 R2 R3 R4 R5 R6 Relation schema 0.01 R1 R2 R3 R4 R5 R6 Relation schema Figure: DDs discovery performance on various schema R

35 Outline Introduction Application Foundation Discovery Conclusion and Future Work

36 Data Dependencies in the Presence of Difference Conclusion and Future Work 30/31 Conclusions We propose a novel class of dependencies, differential dependencies (dds), which specify constraints on distance. Theory Practice formal definitions of dds and differential keys subsumption order relation of differential functions reasoning about dds consistency of dds, np-complete implication of dds, co-np-complete closure of a differential function a sound and complete inference system, proof minimal cover for dds discovery of dds and differential keys from data. application of dds and differential keys.

37 Data Dependencies in the Presence of Difference Conclusion and Future Work 31/31 Future Work Approximate differential dependencies almost hold in a data instance evaluation measure, efficient computation Implication of approximate differential dependencies Hardness analysis of computing error measure Approximation algorithms computing error measure Experiments of approximation validation Further extensions data repairing with dds conditioning dds in a subset of tuples integrity rules in dataspaces

38 Data Dependencies in the Presence of Difference Tsinghua University

39 Data Dependencies in the Presence of Difference 31/31 Closure The closure of φ L [X] under Σ, (φ L [X]) + is also a differential function the intersection of the set of differential functions that can be determined by φ L [X] according to dds in Σ (φ L [X]) + = {φ R [Y] Σ φ L [X] φ R [Y]} the closure of [name( 7)] under {dd 4,dd 5 } is [name( 7) address( 1) salary( 50)] It is natural that φ L [X] (φ L [X]) +.

40 Data Dependencies in the Presence of Difference 31/31 Closure To imply a dd is essentially to compute the corresponding closure (φ L [X]) + of φ L [X] Lemma Let Σ be a set of dds and φ 1 [Z] = (φ L [X]) + be the closure of φ L [X] with respect to Σ. Consider a dd φ L [X] φ R [Y], Σ φ L [X] φ R [Y] iff Y Z and φ R [Y] φ 1 [Y]. For example, [salary( 50)] subsumes the projection on salary of the closure of [name( 7)] under {dd 4,dd 5 } it implies dd 6 [name( 7)] [salary( 50)]

41 Data Dependencies in the Presence of Difference 31/31 Inference System A1. If Y X and φ L [Y] = φ R [Y], then Σ I φ L [X] φ R [Y]. A2. If Σ I φ L [X] φ R [Y], then Σ I φ L [X] φ 1 [Z] φ R [Y] φ 1 [Z]. A3. If Σ I φ L [X] φ 1 [Z], φ 1 [Z] φ 2 [Z] and Σ I φ 2 [Z] φ R [Y], then Σ I φ L [X] φ R [Y]. A4. If Σ I φ L [X] φ i [B] φ R [Y],1 i k, and (Σ,φ 1 [B] φ k [B]) is inconsistent, then Σ I φ L [X] φ R [Y]. Theorem The set I of inference rules is (sound), if Σ I φ L [X] φ R [Y] then Σ φ L [X] φ R [Y], (complete), if Σ φ L [X] φ R [Y] then Σ I φ L [X] φ R [Y], for logical implication of dds.

42 Data Dependencies in the Presence of Difference 31/31 Example: Inference Example We consider a set Σ of dds as follows: dd 7 [d( 1, 7) p(< 10)] [a( 150)], dd 8 [p( 10)] [a( 100)]. Let dd 9 be another dd dd 9 [d( 1, 7)] [a( 150)]. We show that Σ I dd 9 can be proved by the following steps. 1. [d( 1, 7) p( 10)] [d( 1, 7) a( 100)] by A2, dd 8 2. [d( 1, 7) a( 150)] [a( 150)] by A1 3. [d( 1, 7) p( 10)] [a( 150)] by A3, [d( 1, 7)] [a( 150)] by A4, 3. dd 7

43 Data Dependencies in the Presence of Difference 31/31 Minimal Cover A minimal cover Σ c for Σ is a set of dds such that Σ c is logically equivalent to Σ, i.e., Σ c Σ is minimal according to the following properties: C1. (left-reduced), for any φ L [X] φ R [Y] Σ c, there does not exist any φ 1 [W] such that W X, φ 1 [W] φ L [W] and Σ c φ 1 [W] φ R [Y]. C2. (right-subsumed), for any φ L [X] φ R [Y] Σ c, there does not exist any φ 1 [W] such that Y W, φ 1 [Y] φ R [Y] and Σ c φ L [X] φ 1 [W]. C3. (non-redundant), there does not exist a cover Σ of Σ such that Σ Σ c.

44 Data Dependencies in the Presence of Difference 31/31 Example: Minimal Cover Consider Σ = {dd 4,dd 5,dd 6 } in Example 2 a minimal cover can be Σ c = {dd 4,dd 5 } Σ c can imply dd 6 by removing dd 4 or dd 5 from Σ c, it is no longer a cover of Σ Consider Σ = {dd 7,dd 8,dd 9 } in Example 9 a minimal cover can be Σ c = {dd 8,dd 9 } Σ = {dd 7,dd 8 } is not a minimal cover since dd 7 is not left-reduced and can be implied by dd 9 by augmentation rule A2.

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information Relational Database Design Database Design To generate a set of relation schemas that allows - to store information without unnecessary redundancy - to retrieve desired information easily Approach - design

More information

Functional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013

Functional. Dependencies. Functional Dependency. Definition. Motivation: Definition 11/12/2013 Functional Dependencies Functional Dependency Functional dependency describes the relationship between attributes in a relation. Eg. if A and B are attributes of relation R, B is functionally dependent

More information

Relational Database Design

Relational Database Design Relational Database Design Jan Chomicki University at Buffalo Jan Chomicki () Relational database design 1 / 16 Outline 1 Functional dependencies 2 Normal forms 3 Multivalued dependencies Jan Chomicki

More information

CS54100: Database Systems

CS54100: Database Systems CS54100: Database Systems Keys and Dependencies 18 January 2012 Prof. Chris Clifton Functional Dependencies X A = assertion about a relation R that whenever two tuples agree on all the attributes of X,

More information

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Reference: A First Course in Database Systems, 3 rd edition, Chapter 3 Important Notices CMPS 180 Final Exam

More information

Functional Dependencies. Chapter 3

Functional Dependencies. Chapter 3 Functional Dependencies Chapter 3 1 Halfway done! So far: Relational algebra (theory) SQL (practice) E/R modeling (theory) HTML/Flask/Python (practice) Midterm (super fun!) Next up: Database design theory

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 32 Schema Design When we get a relational

More information

Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms

Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms Data Bases Data Mining Foundations of databases: from functional dependencies to normal forms Database Group http://liris.cnrs.fr/ecoquery/dokuwiki/doku.php?id=enseignement: dbdm:start March 1, 2017 Exemple

More information

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms Chapter 19 Schema Refinement and Normal Forms Chapter 19 Instructor: Vladimir Zadorozhny vladimir@sis.pitt.edu Information Science Program School of Information Sciences, University of Pittsburgh Database Management

More information

Functional Dependencies and Normalization

Functional Dependencies and Normalization Functional Dependencies and Normalization There are many forms of constraints on relational database schemata other than key dependencies. Undoubtedly most important is the functional dependency. A functional

More information

LOGICAL DATABASE DESIGN Part #1/2

LOGICAL DATABASE DESIGN Part #1/2 LOGICAL DATABASE DESIGN Part #1/2 Functional Dependencies Informally, a FD appears when the values of a set of attributes uniquely determines the values of another set of attributes. Example: schedule

More information

Schema Refinement. Feb 4, 2010

Schema Refinement. Feb 4, 2010 Schema Refinement Feb 4, 2010 1 Relational Schema Design Conceptual Design name Product buys Person price name ssn ER Model Logical design Relational Schema plus Integrity Constraints Schema Refinement

More information

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Background We started with schema design ER model translation into a relational schema Then we studied relational

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Yanlei Diao UMass Amherst April 10 & 15, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Case Study: The Internet Shop DBDudes Inc.: a well-known database consulting

More information

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Functional Dependencies. Getting a good DB design Lisa Ball November 2012 Functional Dependencies Getting a good DB design Lisa Ball November 2012 Outline (2012) SEE NEXT SLIDE FOR ALL TOPICS (some for you to read) Normalization covered by Dr Sanchez Armstrong s Axioms other

More information

Design Theory for Relational Databases

Design Theory for Relational Databases Design Theory for Relational Databases FUNCTIONAL DEPENDENCIES DECOMPOSITIONS NORMAL FORMS 1 Functional Dependencies X ->Y is an assertion about a relation R that whenever two tuples of R agree on all

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Fall 2017 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 42 Schema Design When we get a relational

More information

Schema Refinement and Normal Forms. Why schema refinement?

Schema Refinement and Normal Forms. Why schema refinement? Schema Refinement and Normal Forms Why schema refinement? Consider relation obtained from Hourly_Emps: Hourly_Emps (sin,rating,hourly_wages,hourly_worked) Problems: Update Anomaly: Can we change the wages

More information

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007 Schema Refinement and Normal Forms Yanlei Diao UMass Amherst November 1 & 6, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Case Study: The Internet Shop DBDudes Inc.: a well-known database consulting

More information

Data Cleaning and Query Answering with Matching Dependencies and Matching Functions

Data Cleaning and Query Answering with Matching Dependencies and Matching Functions Data Cleaning and Query Answering with Matching Dependencies and Matching Functions Leopoldo Bertossi 1, Solmaz Kolahi 2, and Laks V. S. Lakshmanan 2 1 Carleton University, Ottawa, Canada. bertossi@scs.carleton.ca

More information

Schema Refinement & Normalization Theory

Schema Refinement & Normalization Theory Schema Refinement & Normalization Theory Functional Dependencies Week 13 1 What s the Problem Consider relation obtained (call it SNLRHW) Hourly_Emps(ssn, name, lot, rating, hrly_wage, hrs_worked) What

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms UMass Amherst Feb 14, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke, Dan Suciu 1 Relational Schema Design Conceptual Design name Product buys Person price name

More information

Discovering Conditional Functional Dependencies

Discovering Conditional Functional Dependencies 1 Discovering Conditional Functional Dependencies Wenfei Fan, Floris Geerts, Jianzhong Li, Ming Xiong Abstract This paper investigates the discovery of conditional functional dependencies (CFDs). CFDs

More information

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Introduction to Data Management. Lecture #6 (Relational DB Design Theory) Introduction to Data Management Lecture #6 (Relational DB Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework

More information

INF1383 -Bancos de Dados

INF1383 -Bancos de Dados INF1383 -Bancos de Dados Prof. Sérgio Lifschitz DI PUC-Rio Eng. Computação, Sistemas de Informação e Ciência da Computação Projeto de BD e Formas Normais Alguns slides são baseados ou modificados dos originais

More information

Reasoning about Record Matching Rules

Reasoning about Record Matching Rules Reasoning about Record Matching Rules Wenfei Fan 1,2,3 Xibei Jia 1 Jianzhong Li 3 Shuai Ma 1 1 University of Edinburgh 2 Bell Laboratories 3 Harbin Institute of Technology {wenfei@inf.,xibei.jia@,shuai.ma@}ed.ac.uk

More information

Chapter 8: Relational Database Design

Chapter 8: Relational Database Design Chapter 8: Relational Database Design Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 8: Relational Database Design Features of Good Relational Design Atomic Domains

More information

Database Design and Implementation

Database Design and Implementation Database Design and Implementation CS 645 Schema Refinement First Normal Form (1NF) A schema is in 1NF if all tables are flat Student Name GPA Course Student Name GPA Alice 3.8 Bob 3.7 Carol 3.9 Alice

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational

More information

SCHEMA NORMALIZATION. CS 564- Fall 2015

SCHEMA NORMALIZATION. CS 564- Fall 2015 SCHEMA NORMALIZATION CS 564- Fall 2015 HOW TO BUILD A DB APPLICATION Pick an application Figure out what to model (ER model) Output: ER diagram Transform the ER diagram to a relational schema Refine the

More information

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh Functional Dependencies and Normalization Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu 1 Goal Given a database schema, how do you judge whether or not the design is good? How do you ensure it does

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd. The Evils of Redundancy Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 Redundancy is at the root of several problems associated with relational

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram Schema Refinement and Normal Forms Chapter 19 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007 Schema Refinement and Normal Forms Yanlei Diao UMass Amherst April 10, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Chapter 19 Quiz #2 Next Thursday Comp 521 Files and Databases Fall 2012 1 The Evils of Redundancy v Redundancy is at the root of several problems associated with relational

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004 Schema Refinement and Normal Forms CIS 330, Spring 2004 Lecture 11 March 2, 2004 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage,

More information

Detecting Functional Dependencies Felix Naumann

Detecting Functional Dependencies Felix Naumann Detecting Functional Dependencies 21.5.2013 Felix Naumann Overview 2 Functional Dependencies TANE Candidate sets Pruning Algorithm Dependency checking Approximate FDs FD_Mine Conditional FDs Definition

More information

ECE473 Lecture 15: Propositional Logic

ECE473 Lecture 15: Propositional Logic ECE473 Lecture 15: Propositional Logic Jeffrey Mark Siskind School of Electrical and Computer Engineering Spring 2018 Siskind (Purdue ECE) ECE473 Lecture 15: Propositional Logic Spring 2018 1 / 23 What

More information

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram The Evils of Redundancy Schema Refinement and Normalization Chapter 1 Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus Redundancy is at the root of several problems

More information

Propagating Functional Dependencies with Conditions

Propagating Functional Dependencies with Conditions Propagating Functional Dependencies with Conditions Wenfei Fan 1,2,3 Shuai Ma 1 Yanli Hu 1,5 Jie Liu 4 Yinghui Wu 1 1 University of Edinburgh 2 Bell Laboratories 3 Harbin Institute of Technologies 4 Chinese

More information

Knowledge Compilation and Theory Approximation. Henry Kautz and Bart Selman Presented by Kelvin Ku

Knowledge Compilation and Theory Approximation. Henry Kautz and Bart Selman Presented by Kelvin Ku Knowledge Compilation and Theory Approximation Henry Kautz and Bart Selman Presented by Kelvin Ku kelvin@cs.toronto.edu 1 Overview Problem Approach Horn Theory and Approximation Computing Horn Approximations

More information

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due Information Systems for Engineers Exercise 8 ETH Zurich, Fall Semester 2017 Hand-out 24.11.2017 Due 01.12.2017 1. (Exercise 3.3.1 in [1]) For each of the following relation schemas and sets of FD s, i)

More information

A Goal-Oriented Algorithm for Unification in EL w.r.t. Cycle-Restricted TBoxes

A Goal-Oriented Algorithm for Unification in EL w.r.t. Cycle-Restricted TBoxes A Goal-Oriented Algorithm for Unification in EL w.r.t. Cycle-Restricted TBoxes Franz Baader, Stefan Borgwardt, and Barbara Morawska {baader,stefborg,morawska}@tcs.inf.tu-dresden.de Theoretical Computer

More information

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Introduction to Data Management. Lecture #6 (Relational Design Theory) Introduction to Data Management Lecture #6 (Relational Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v HW#2 is

More information

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 Announcement Read Chapter 14 and 15 You must self-study these chapters Too huge to cover in Lectures Project 2 Part 1 due tonight Agenda 1.

More information

Propositional Logic: Models and Proofs

Propositional Logic: Models and Proofs Propositional Logic: Models and Proofs C. R. Ramakrishnan CSE 505 1 Syntax 2 Model Theory 3 Proof Theory and Resolution Compiled at 11:51 on 2016/11/02 Computing with Logic Propositional Logic CSE 505

More information

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design Outline Introduction to Database Systems CSE 444 Design theory: 3.1-3.4 [Old edition: 3.4-3.6] Lectures 6-7: Database Design 1 2 Schema Refinements = Normal Forms 1st Normal Form = all tables are flat

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19 Schema Refinement and Normal Forms [R&G] Chapter 19 CS432 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update

More information

Edinburgh Research Explorer

Edinburgh Research Explorer Edinburgh Research Explorer Conditional functional dependencies for capturing data inconsistencies Citation for published version: Fan, W, Geerts, F, Jia, X & Kementsietsidis, A 2008, 'Conditional functional

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd. The Evils of Redundancy Schema Refinement and Normal Forms INFO 330, Fall 2006 1 Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update

More information

Guaranteeing No Interaction Between Functional Dependencies and Tree-Like Inclusion Dependencies

Guaranteeing No Interaction Between Functional Dependencies and Tree-Like Inclusion Dependencies Guaranteeing No Interaction Between Functional Dependencies and Tree-Like Inclusion Dependencies Mark Levene Department of Computer Science University College London Gower Street London WC1E 6BT, U.K.

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L05: Functional Dependencies Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR, China

More information

Neighborhood Semantics for Modal Logic Lecture 3

Neighborhood Semantics for Modal Logic Lecture 3 Neighborhood Semantics for Modal Logic Lecture 3 Eric Pacuit ILLC, Universiteit van Amsterdam staff.science.uva.nl/ epacuit August 15, 2007 Eric Pacuit: Neighborhood Semantics, Lecture 3 1 Plan for the

More information

Information Systems (Informationssysteme)

Information Systems (Informationssysteme) Information Systems (Informationssysteme) Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2015 c Jens Teubner Information Systems Summer 2015 1 Part VII Schema Normalization c Jens Teubner

More information

Translatable Updates of Selection Views under Constant Complement

Translatable Updates of Selection Views under Constant Complement Translatable Updates of Selection Views under Constant Complement Enrico Franconi and Paolo Guagliardo Free University of Bozen-Bolzano, Italy 4 th September 2014 DEXA 2014, Munich (Germany) KRDB Research

More information

Knowledge base (KB) = set of sentences in a formal language Declarative approach to building an agent (or other system):

Knowledge base (KB) = set of sentences in a formal language Declarative approach to building an agent (or other system): Logic Knowledge-based agents Inference engine Knowledge base Domain-independent algorithms Domain-specific content Knowledge base (KB) = set of sentences in a formal language Declarative approach to building

More information

Outline. Complexity Theory. Introduction. What is abduction? Motivation. Reference VU , SS Logic-Based Abduction

Outline. Complexity Theory. Introduction. What is abduction? Motivation. Reference VU , SS Logic-Based Abduction Complexity Theory Complexity Theory Outline Complexity Theory VU 181.142, SS 2018 7. Logic-Based Abduction Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien

More information

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms Database Design: Normal Forms as Quality Criteria Functional Dependencies Normal Forms Design and Normal forms Design Quality: Introduction Good conceptual model: - Many alternatives - Informal guidelines

More information

Relational-Database Design

Relational-Database Design C H A P T E R 7 Relational-Database Design Exercises 7.2 Answer: A decomposition {R 1, R 2 } is a lossless-join decomposition if R 1 R 2 R 1 or R 1 R 2 R 2. Let R 1 =(A, B, C), R 2 =(A, D, E), and R 1

More information

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design! First Normal Form! Pitfalls in Relational Database Design! Functional Dependencies! Decomposition! Boyce-Codd Normal Form! Third

More information

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 10. Logical consequence (implication) Implication problem for fds

Plan of the lecture. G53RDB: Theory of Relational Databases Lecture 10. Logical consequence (implication) Implication problem for fds Plan of the lecture G53RDB: Theory of Relational Databases Lecture 10 Natasha Alechina School of Computer Science & IT nza@cs.nott.ac.uk Logical implication for functional dependencies Armstrong closure.

More information

Inverting Proof Systems for Secrecy under OWA

Inverting Proof Systems for Secrecy under OWA Inverting Proof Systems for Secrecy under OWA Giora Slutzki Department of Computer Science Iowa State University Ames, Iowa 50010 slutzki@cs.iastate.edu May 9th, 2010 Jointly with Jia Tao and Vasant Honavar

More information

Functional Dependencies & Normalization. Dr. Bassam Hammo

Functional Dependencies & Normalization. Dr. Bassam Hammo Functional Dependencies & Normalization Dr. Bassam Hammo Redundancy and Normalisation Redundant Data Can be determined from other data in the database Leads to various problems INSERT anomalies UPDATE

More information

Data Cleaning and Query Answering with Matching Dependencies and Matching Functions

Data Cleaning and Query Answering with Matching Dependencies and Matching Functions Data Cleaning and Query Answering with Matching Dependencies and Matching Functions Leopoldo Bertossi Carleton University Ottawa, Canada bertossi@scs.carleton.ca Solmaz Kolahi University of British Columbia

More information

Functional Dependencies

Functional Dependencies Functional Dependencies CS 186, Fall 2002, Lecture 5 R&G Chapter 15 Science is the knowledge of consequences, and dependence of one fact upon another. Thomas Hobbes (1588-1679) Administrivia Most admissions

More information

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events

Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Pairing Transitive Closure and Reduction to Efficiently Reason about Partially Ordered Events Massimo Franceschet Angelo Montanari Dipartimento di Matematica e Informatica, Università di Udine Via delle

More information

Functional Dependencies and Normalization

Functional Dependencies and Normalization Functional Dependencies and Normalization 5DV119 Introduction to Database Management Umeå University Department of Computing Science Stephen J. Hegner hegner@cs.umu.se http://www.cs.umu.se/~hegner Functional

More information

Logic for Computer Science - Week 4 Natural Deduction

Logic for Computer Science - Week 4 Natural Deduction Logic for Computer Science - Week 4 Natural Deduction 1 Introduction In the previous lecture we have discussed some important notions about the semantics of propositional logic. 1. the truth value of a

More information

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw Applied Logic Lecture 1 - Propositional logic Marcin Szczuka Institute of Informatics, The University of Warsaw Monographic lecture, Spring semester 2017/2018 Marcin Szczuka (MIMUW) Applied Logic 2018

More information

On the Relative Trust between Inconsistent Data and Inaccurate Constraints

On the Relative Trust between Inconsistent Data and Inaccurate Constraints On the Relative Trust between Inconsistent Data and Inaccurate Constraints George Beskales 1 Ihab F. Ilyas 1 Lukasz Golab 2 Artur Galiullin 2 1 Qatar Computing Research Institute 2 University of Waterloo

More information

arxiv: v2 [cs.db] 23 Aug 2016

arxiv: v2 [cs.db] 23 Aug 2016 Effective and Complete Discovery of Order Dependencies via Set-based Axiomatization arxiv:1608.06169v2 [cs.db] 23 Aug 2016 Jaroslaw Szlichta #, Parke Godfrey, Lukasz Golab $, Mehdi Kargar, Divesh Srivastava

More information

On Terminological Default Reasoning about Spatial Information: Extended Abstract

On Terminological Default Reasoning about Spatial Information: Extended Abstract On Terminological Default Reasoning about Spatial Information: Extended Abstract V. Haarslev, R. Möller, A.-Y. Turhan, and M. Wessel University of Hamburg, Computer Science Department Vogt-Kölln-Str. 30,

More information

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design Applied Databases Handout 2a. Functional Dependencies and Normal Forms 20 Oct 2008 Functional Dependencies This is the most mathematical part of the course. Functional dependencies provide an alternative

More information

Lectures 6. Lecture 6: Design Theory

Lectures 6. Lecture 6: Design Theory Lectures 6 Lecture 6: Design Theory Lecture 6 Announcements Solutions to PS1 are posted online. Grades coming soon! Project part 1 is out. Check your groups and let us know if you have any issues. We have

More information

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design Functional Dependencies Decomposition Boyce-Codd Normal Form Third Normal

More information

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 4: DESIGN THEORIES (FUNCTIONAL DEPENDENCIES) Design theory E/R diagrams are high-level design Formal theory for

More information

Design Theory for Relational Databases

Design Theory for Relational Databases Design Theory for Relational Databases Keys: formal definition K is a superkey for relation R if K functionally determines all attributes of R K is a key for R if K is a superkey, but no proper subset

More information

Propositional Dynamic Logic

Propositional Dynamic Logic Propositional Dynamic Logic Contents 1 Introduction 1 2 Syntax and Semantics 2 2.1 Syntax................................. 2 2.2 Semantics............................... 2 3 Hilbert-style axiom system

More information

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.

P Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam. Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this

More information

Relational Design: Characteristics of Well-designed DB

Relational Design: Characteristics of Well-designed DB Relational Design: Characteristics of Well-designed DB 1. Minimal duplication Consider table newfaculty (Result of F aculty T each Course) Id Lname Off Bldg Phone Salary Numb Dept Lvl MaxSz 20000 Cotts

More information

Lecture 2: A Las Vegas Algorithm for finding the closest pair of points in the plane

Lecture 2: A Las Vegas Algorithm for finding the closest pair of points in the plane Randomized Algorithms Lecture 2: A Las Vegas Algorithm for finding the closest pair of points in the plane Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Randomized

More information

Schema Refinement and Normal Forms. Chapter 19

Schema Refinement and Normal Forms. Chapter 19 Schema Refinement and Normal Forms Chapter 19 1 Review: Database Design Requirements Analysis user needs; what must the database do? Conceptual Design high level descr. (often done w/er model) Logical

More information

Reasoning with Inconsistent and Uncertain Ontologies

Reasoning with Inconsistent and Uncertain Ontologies Reasoning with Inconsistent and Uncertain Ontologies Guilin Qi Southeast University China gqi@seu.edu.cn Reasoning Web 2012 September 05, 2012 Outline Probabilistic logic vs possibilistic logic Probabilistic

More information

Chapter 3 Design Theory for Relational Databases

Chapter 3 Design Theory for Relational Databases 1 Chapter 3 Design Theory for Relational Databases Contents Functional Dependencies Decompositions Normal Forms (BCNF, 3NF) Multivalued Dependencies (and 4NF) Reasoning About FD s + MVD s 2 Our example

More information

L13: Normalization. CS3200 Database design (sp18 s2) 2/26/2018

L13: Normalization. CS3200 Database design (sp18 s2)   2/26/2018 L13: Normalization CS3200 Database design (sp18 s2) https://course.ccs.neu.edu/cs3200sp18s2/ 2/26/2018 274 Announcements! Keep bringing your name plates J Page Numbers now bigger (may change slightly)

More information

Equivalence of SQL Queries In Presence of Embedded Dependencies

Equivalence of SQL Queries In Presence of Embedded Dependencies Equivalence of SQL Queries In Presence of Embedded Dependencies Rada Chirkova Department of Computer Science NC State University, Raleigh, NC 27695, USA chirkova@csc.ncsu.edu Michael R. Genesereth Department

More information

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design) Relational Schema Design Introduction to Management CSE 344 Lectures 16: Database Design Conceptual Model: Relational Model: plus FD s name Product buys Person price name ssn Normalization: Eliminates

More information

Journal of Computer and System Sciences

Journal of Computer and System Sciences Journal of Computer and System Sciences 78 (2012) 1026 1044 Contents lists available at SciVerse ScienceDirect Journal of Computer and System Sciences www.elsevier.com/locate/jcss Characterisations of

More information

CSE 132B Database Systems Applications

CSE 132B Database Systems Applications CSE 132B Database Systems Applications Alin Deutsch Database Design and Normal Forms Some slides are based or modified from originals by Sergio Lifschitz @ PUC Rio, Brazil and Victor Vianu @ CSE UCSD and

More information

Ordering, Indexing, and Searching Semantic Data: A Terminology Aware Index Structure

Ordering, Indexing, and Searching Semantic Data: A Terminology Aware Index Structure Ordering, Indexing, and Searching Semantic Data: A Terminology Aware Index Structure by Jeffrey Pound A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree

More information

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database CSE 303: Database Lecture 10 Chapter 3: Design Theory of Relational Database Outline 1st Normal Form = all tables attributes are atomic 2nd Normal Form = obsolete Boyce Codd Normal Form = will study 3rd

More information

Extremal problems in logic programming and stable model computation Pawe l Cholewinski and Miros law Truszczynski Computer Science Department Universi

Extremal problems in logic programming and stable model computation Pawe l Cholewinski and Miros law Truszczynski Computer Science Department Universi Extremal problems in logic programming and stable model computation Pawe l Cholewinski and Miros law Truszczynski Computer Science Department University of Kentucky Lexington, KY 40506-0046 fpaweljmirekg@cs.engr.uky.edu

More information

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties Review: Keys Superkey: set of attributes whose values are unique for each tuple Note: a superkey isn t necessarily minimal. For example, for any relation, the entire set of attributes is always a superkey.

More information

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016 Normal Forms Dr Paolo Guagliardo University of Edinburgh Fall 2016 Example of bad design BAD Title Director Theatre Address Time Price Inferno Ron Howard Vue Omni Centre 20:00 11.50 Inferno Ron Howard

More information

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

CS 186, Fall 2002, Lecture 6 R&G Chapter 15 Schema Refinement and Normalization CS 186, Fall 2002, Lecture 6 R&G Chapter 15 Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus Functional Dependencies (Review)

More information

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See for conditions on re-use "

Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See   for conditions on re-use Database System Concepts, 5th Ed.! Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use " Features of Good Relational Design! Atomic Domains and First Normal Form! Decomposition

More information

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 CSC 261/461 Database Systems Lecture 8 Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 Agenda 1. Database Design 2. Normal forms & functional dependencies 3. Finding functional dependencies

More information

Comp487/587 - Boolean Formulas

Comp487/587 - Boolean Formulas Comp487/587 - Boolean Formulas 1 Logic and SAT 1.1 What is a Boolean Formula Logic is a way through which we can analyze and reason about simple or complicated events. In particular, we are interested

More information

Disjunctive Databases for Representing Repairs

Disjunctive Databases for Representing Repairs Noname manuscript No (will be inserted by the editor) Disjunctive Databases for Representing Repairs Cristian Molinaro Jan Chomicki Jerzy Marcinkowski Received: date / Accepted: date Abstract This paper

More information

arxiv: v2 [cs.db] 24 Jul 2012

arxiv: v2 [cs.db] 24 Jul 2012 On the Relative Trust between Inconsistent Data and Inaccurate Constraints George Beskales 1 Ihab F. Ilyas 1 Lukasz Golab 2 Artur Galiullin 2 1 Qatar Computing Research Institute 2 University of Waterloo

More information