Topics in Probabilistic and Statistical Databases. Lecture 2: Representation of Probabilistic Databases. Dan Suciu University of Washington



2 Review: Definition
The set of all possible database instances: $Inst = \{I_1, I_2, I_3, \ldots, I_N\}$ (the sample space $\Omega$).
Definition. A probabilistic database $PDB = (Inst, \Pr)$ is a discrete probability distribution $\Pr : Inst \to [0,1]$ such that $\sum_{i=1}^{N} \Pr(I_i) = 1$.
Definition. A possible world is an instance $I$ such that $\Pr(I) > 0$. A possible tuple is a tuple $t \in I$ for some possible world $I$.
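The definitions on this slide can be made concrete with a small sketch. It assumes a world is a frozenset of tuples and a PDB is a dictionary from worlds to probabilities; the two tuples and all numbers are made up for illustration.

```python
from itertools import chain

# A possible world is a frozenset of tuples; a PDB maps worlds to probabilities.
# The two-tuple relation and its numbers are made up for illustration.
t1 = ("LaptopX77", "John")
t2 = ("LaptopX77", "Jim")

pdb = {
    frozenset(): 0.1,
    frozenset({t1}): 0.5,
    frozenset({t2}): 0.4,
    frozenset({t1, t2}): 0.0,   # an instance that is not a possible world
}

def is_distribution(pdb, tol=1e-9):
    """Check that probabilities lie in [0, 1] and sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pdb.values())
    return in_range and abs(sum(pdb.values()) - 1.0) < tol

def possible_worlds(pdb):
    """Instances with strictly positive probability."""
    return {world for world, p in pdb.items() if p > 0}

def possible_tuples(pdb):
    """Tuples that occur in at least one possible world."""
    return set(chain.from_iterable(possible_worlds(pdb)))
```

Listing every instance explicitly is exactly what a representation system is meant to avoid; the dictionary only serves to pin down the semantics.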

3 Representation System
Informally: a syntax plus a semantics that lets us represent a probabilistic database concisely.
Definition. A representation system for ProbDB is a pair (S, Rep), where S is a set of representations and $Rep : S \to \{\text{PDBs}\}$ assigns a probabilistic database to each representation.

4 Review: Disjoint-Independent Databases
Definition. A PDB is disjoint-independent if for any set T of possible tuples one of the following holds:
- T is an independent set, or
- T contains two disjoint tuples.
A disjoint-independent database can be fully specified by:
- all marginal tuple probabilities
- an indication of which tuples are disjoint or independent

5 Representation Systems for D/I Databases
- MystiQ's representation
- Trio's xor- and maybe-tuples
- Attribute-level probabilities

6 D/I Relations in MystiQ
At the schema level:
- Possible worlds key = set of attributes
- Probability = expression on attributes
Constraint at the instance level:
- For each key value, the sum of probabilities is $\le 1$

7 Possible worlds key = ? Attribute expression = ? Constraint = ?

S =
Object     Time  Person  P
LaptopX77  9:07  John    p1
LaptopX77  9:07  Jim     p2
Book302    9:18  Mary    p3
Book302    9:18  John    p4
Book302    9:18  Fred    p5

Rep(S) = the set of possible worlds over (Object, Time, Person), each with its probability. For example:
{(LaptopX77, 9:07, John)} has probability p1 (1 - p3 - p4 - p5)
{(LaptopX77, 9:07, John), (Book302, 9:18, Mary)} has probability p1 p3
{(LaptopX77, 9:07, John), (Book302, 9:18, John)} has probability p1 p4
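One way to read Rep(S) for this table is to enumerate the worlds explicitly. The sketch below assumes that the possible worlds key is (Object, Time), that tuples sharing a key value are disjoint, and that different key values are independent; the numeric probabilities stand in for p1..p5 and are not from the slide.

```python
from itertools import product

# Rows of S: (Object, Time, Person, probability). The numeric values stand in
# for p1..p5 on the slide and are assumptions made for this sketch.
S = [
    ("LaptopX77", "9:07", "John", 0.6),
    ("LaptopX77", "9:07", "Jim", 0.3),
    ("Book302", "9:18", "Mary", 0.2),
    ("Book302", "9:18", "John", 0.4),
    ("Book302", "9:18", "Fred", 0.1),
]

def worlds(rows, key_len=2):
    """Enumerate (world, probability) pairs for a disjoint-independent table."""
    blocks = {}
    for row in rows:
        blocks.setdefault(row[:key_len], []).append(row)
    # Per block: pick one alternative, or none with the leftover probability.
    choices = []
    for alts in blocks.values():
        leftover = 1.0 - sum(r[-1] for r in alts)
        options = [((r[:-1],), r[-1]) for r in alts]
        options.append(((), leftover))
        choices.append(options)
    for combo in product(*choices):
        world = frozenset(t for tuples, _ in combo for t in tuples)
        prob = 1.0
        for _, p in combo:
            prob *= p
        yield world, prob

rep = dict(worlds(S))
```

With three options for the laptop block and four for the book block, the sketch yields 12 worlds, and the world containing only (LaptopX77, 9:07, John) gets probability p1 (1 - p3 - p4 - p5).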

8 X-Relations in Trio
A maybe-tuple is written t?, meaning it may be missing.
An X-tuple is <t1, t2, ...>, meaning it may be any of t1, t2, ...
An X-relation is a collection of X-tuples and maybe-tuples.

9 Maybe- and X-tuples in Trio S = Rep(S) =? 9

10 Attribute-level Uncertainty
For some attribute A, give a probability distribution on its possible values.
Note: the sum of probabilities must be 1.
More generally, for a set of attributes A1, A2, ..., give a probability distribution on their possible values.

11 [Barbara 92] Attribute-Level Uncertainty S = Rep(S) =? 11

12 Review: Query Semantics
Semantics 1: Possible Sets of Answers. A probability distribution on sets of tuples:
$\forall A.\ \Pr(Q = A) = \sum_{I \in Inst,\ Q(I) = A} \Pr(I)$
Semantics 2: Possible Tuples. A probability function on tuples:
$\forall t.\ \Pr(t \in Q) = \sum_{I \in Inst,\ t \in Q(I)} \Pr(I)$
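Both semantics can be computed by brute force over the worlds. The following sketch assumes a PDB stored as a dictionary from worlds (frozensets of tuples) to probabilities and a query given as a Python function on a world; the PDB and the projection query are hypothetical.

```python
from collections import defaultdict

def answer_set_distribution(pdb, query):
    """Semantics 1: Pr(Q = A) = sum of Pr(I) over worlds I with Q(I) = A."""
    dist = defaultdict(float)
    for world, p in pdb.items():
        dist[frozenset(query(world))] += p
    return dict(dist)

def tuple_probabilities(pdb, query):
    """Semantics 2: Pr(t in Q) = sum of Pr(I) over worlds I with t in Q(I)."""
    probs = defaultdict(float)
    for world, p in pdb.items():
        for t in set(query(world)):
            probs[t] += p
    return dict(probs)

# Hypothetical PDB over tuples (Object, Person) and a projection query.
pdb = {
    frozenset({("Laptop", "John")}): 0.5,
    frozenset({("Laptop", "Jim")}): 0.3,
    frozenset(): 0.2,
}

def objects(world):
    """Q = projection on the Object attribute."""
    return {obj for obj, _ in world}

by_set = answer_set_distribution(pdb, objects)
by_tuple = tuple_probabilities(pdb, objects)
```

Semantics 1 returns a distribution whose values sum to 1, while the tuple probabilities of Semantics 2 are marginals and need not sum to 1.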

13 The Representation Problem How do we represent correlations between tuples in a probdb? How do we represent query answers (views)? 13

14 Main Techniques Incomplete databases Concerned with representing possible worlds, i.e. no probabilities Probabilistic Networks Concerned with representing correlations, i.e. no databases 14

15 Incomplete Databases 15

16 [Imieliński & Lipski 1984, Green & Tannen 2006] Incomplete Databases
Let Inst = the set of all possible instances over domain D.
Definition. An incomplete database is a set of possible worlds $I = \{I_1, I_2, I_3, \ldots\} \subseteq Inst$.
Definition. A representation system is a pair (S, Rep), where S is a set of representations described by some syntax, and $Rep : S \to 2^{Inst}$.

17 Discussion
We often denote an incomplete database $\{I_1, I_2, \ldots\}$ by $\langle I_1, I_2, \ldots \rangle$.
This is called an OR-set: the true state can be any $I \in \langle I_1, I_2, \ldots \rangle$.

18 Rule of Thumb #1 A ProbDB = an Incomplete DB + probabilities 18

19 Discussion Question: what does < > mean (the empty set of worlds)? 19

20 Discussion Question: what does < > mean (the empty set of worlds)? Answer: inconsistency! We do not allow < > 20

21 Examples
Every traditional database instance I is also an incomplete database: I = <I>.
The no-information, or zero-information, incomplete database is $N = \langle I \mid I \in Inst \rangle$ (in other words: N = Inst).

22 Four Representation Systems Codd-tables V-tables C-tables OR-sets 22

23 Codd-Tables T = NULL or Rep(T) =? 23

24 V-Tables T = Rep(T) =? marked nulls : 1, 2, 24

25 C-Tables T = Any Boolean condition Rep(T) =? 25

26 Boolean C-Tables: Vars only in Cond
T =
A   B   Cond
a1  b1  (x=1) (y=2) (z=1)
a1  b2  x=2
a2  b2  z=1

We often state the domain of each variable explicitly: Dom(x) = {1, 2}, Dom(y) = {1, 2, 3}, Dom(z) = {0, 1}, ...
Rep(T) = ?
In G&T's definition all variables are Boolean; this is a minor distinction.
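A brute-force way to answer "Rep(T) = ?" is to enumerate every valuation of the variables and keep the rows whose condition holds. The sketch below uses the domains stated on the slide; the condition of the first row is an assumption (x=1 and y=2), since the connectives of the original condition are not available here.

```python
from itertools import product

# Variable domains as stated on the slide.
dom = {"x": [1, 2], "y": [1, 2, 3], "z": [0, 1]}

# Rows of a Boolean C-table: (A, B, condition). The first condition is an
# assumption made for this sketch, not the slide's exact condition.
T = [
    ("a1", "b1", lambda v: v["x"] == 1 and v["y"] == 2),
    ("a1", "b2", lambda v: v["x"] == 2),
    ("a2", "b2", lambda v: v["z"] == 1),
]

def rep(table, dom):
    """Return the set of worlds obtained from all valuations of the variables."""
    names = list(dom)
    result = set()
    for values in product(*(dom[n] for n in names)):
        valuation = dict(zip(names, values))
        world = frozenset((a, b) for a, b, cond in table if cond(valuation))
        result.add(world)
    return result

worlds = rep(T, dom)
```

Twelve valuations collapse to six distinct worlds here, because several valuations select the same rows; no world contains both (a1, b1) and (a1, b2), since x cannot be 1 and 2 at once.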

27 OR-Sets
Review of Nested Relational Algebra (NRA):
Types: T ::= basetype | T × T | {T}
Operations (in class):

28 OR-Sets
Types: T ::= basetype | T × T | {T} | <T>
Operations (in class):

29 OR-Sets Examples
What are their types? What do they mean?
{[Gizmo, <99,110>], [Camera, <10,12,19>]}
{<3,5>, <3,7>, <5,8,12>}
{<{<1,2>,<3>}, {<4>}>, <{<1>}, {<2,3>,<4>}>}

30 Discussion
Which concepts from incomplete databases were borrowed by the three simple representation systems for ProbDB?
- MystiQ's disjoint/independent tables
- Trio's X-tuples
- Attribute-level probabilities

31 Two Important Properties Completeness: can a representation system represent all incomplete databases? Closure: can the result of a query also be represented in the same system? 31

32 Closure and Completeness Assume the domain D is finite Why? Definition A representation system is complete if for any incomplete database I there exists a representation S s.t. Rep(S) = I. 32

33 Which Are Complete? Why? Codd-Tables? V-Tables? C-Tables? OR-Sets? 33

34 Which Are Complete? Why? Codd-Tables? NO: constant cardinality V-Tables? NO: constant cardinality C-Tables? YES OR-Sets? YES 34

35 Closure
Definition. A representation system is closed w.r.t. a query language Q if for every representation S and every query q in Q there exists a representation S' such that $Rep(S') = q(Rep(S))$.
In other words, the diagram commutes: if $Rep(S) = \{I_1, I_2, \ldots\}$, then $Rep(S') = \{q(I_1), q(I_2), \ldots\}$.

36 Which Are Closed w.r.t. RA? Codd-Tables? V-Tables? C-Tables? OR-Sets? 36

37 Which Are Closed w.r.t. RA? Codd-Tables? NO in class V-Tables? NO in class C-Tables? YES in class OR-Sets? YES in class 37

38 Completeness ⇒ Closure
Fact: every complete system is closed w.r.t. the Relational Algebra.
Why?

39 Completeness Closure? Challenge: give an example of a system that is closed w.r.t. Relational Algebra but is not complete! 39

40 [Das Sarma (unpublished) 2008] Completeness Closure? Consider a representation system S s.t. S can represent any deterministic instance <I> S can represent <{0}, {1}> S is closed under {Π,,σ} Then S is complete PROOF: in class 40

41 Lineage Lineage = a Boolean expression annotating a tuple that explains why the tuple is there Technically: lineage = condition in a Boolean C-table A.k.a provenance 41

42 [Benjelloun, VLDBJ 2008] Example
Start from Trio's X-relations:

ID  Saw(Witness, Car)
X1  <[Amy, Mazda], [Amy, Toyota]> ?
X2  <[Betty, Honda]>

ID  Drives(Person, Car)
Y1  <[Jimmy, Mazda], [Jimmy, Toyota]>
Y2  <[Billy, Mazda], [Billy, Honda]>

43 [Benjelloun, VLDBJ 2008] Example
Convert to Boolean C-Tables:

Saw
Witness  Car     Cond
Amy      Mazda   X1=1
Amy      Toyota  X1=2
Betty    Honda   X2=1

Drives
Person  Car     Cond
Jimmy   Mazda   Y1=1
Jimmy   Toyota  Y1=2
Billy   Mazda   Y2=1
Billy   Honda   Y2=2

Q: How do we say that Amy is a maybe-tuple and Betty is a certain tuple?

44 Example Answer: Dom(X1) = {0,1,2} Dom(X2) = {1} Dom(Y1) = {1,2} Dom(Y2) = {1,2} 44

45 [Benjelloun, VLDBJ 2008] Example
Compute the query q(w,p) :- Saw(w,c), Drives(p,c)

Saw
Witness  Car     Cond
Amy      Mazda   X1=1
Amy      Toyota  X1=2
Betty    Honda   X2=1

Drives
Person  Car     Cond
Jimmy   Mazda   Y1=1
Jimmy   Toyota  Y1=2
Billy   Mazda   Y2=1
Billy   Honda   Y2=2

Result
Witness  Person  Cond
Amy      Jimmy   (X1=1 ∧ Y1=1) ∨ (X1=2 ∧ Y1=2)
Betty    Billy   X2=1 ∧ Y2=2
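The lineage of a join result can be computed mechanically: conjoin the conditions of the two joined rows, and collect the conjunctions per output tuple into a DNF. The sketch below is one possible encoding, assuming each condition is a set of (variable, value) literals; the function names are illustrative. Run on the slide's tables, the join on Car also pairs Amy with Billy through the Mazda rows (X1=1 ∧ Y2=1), a tuple that the result table on the slide does not list.

```python
# Boolean C-tables from the slide; a condition is a set of (variable, value)
# literals read as a conjunction.
saw = [
    ("Amy", "Mazda", {("X1", 1)}),
    ("Amy", "Toyota", {("X1", 2)}),
    ("Betty", "Honda", {("X2", 1)}),
]
drives = [
    ("Jimmy", "Mazda", {("Y1", 1)}),
    ("Jimmy", "Toyota", {("Y1", 2)}),
    ("Billy", "Mazda", {("Y2", 1)}),
    ("Billy", "Honda", {("Y2", 2)}),
]

def consistent(literals):
    """A conjunction is satisfiable if no variable is given two values."""
    seen = {}
    return all(seen.setdefault(var, val) == val for var, val in literals)

def join_on_car(saw, drives):
    """Return {(witness, person): DNF lineage as a list of conjunctions}."""
    lineage = {}
    for witness, car1, c1 in saw:
        for person, car2, c2 in drives:
            conj = frozenset(c1 | c2)
            if car1 == car2 and consistent(conj):
                lineage.setdefault((witness, person), []).append(conj)
    return lineage

result = join_on_car(saw, drives)
```

Each output tuple maps to a list of conjunctions, so (Amy, Jimmy) carries two conjuncts of two literals each, which is the uniform-lineage shape with k = 2.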

46 Uniform Lineage
Call a lineage expression uniform if:
- it is in DNF,
- all conjuncts have the same number k of literals,
- each literal is of the form X=v.
Representation of uniform lineage: add 2k columns!

47 Example
Compute the query q(w,p) :- Saw(w,c), Drives(p,c)

Saw
Witness  Car     X   V
Amy      Mazda   X1  1
Amy      Toyota  X1  2
Betty    Honda   X2  1

Drives
Person  Car     Y   W
Jimmy   Mazda   Y1  1
Jimmy   Toyota  Y1  2
Billy   Mazda   Y2  1
Billy   Honda   Y2  2

Result
Witness  Person  X   V  Y   W
Amy      Jimmy   X1  1  Y1  1
Amy      Jimmy   X1  2  Y1  2
Betty    Billy   X2  1  Y2  2

The two rows for (Amy, Jimmy) together encode the lineage (X1=1 ∧ Y1=1) ∨ (X1=2 ∧ Y1=2).

48 Summary of Incomplete DBs
- Main goal: specify a set of possible worlds. Very relevant to ProbDBs!
- Key concepts: closure and completeness. They turn out to be equivalent, except in trivial cases.
- Boolean C-tables are closed and complete; others are not.
- Lineage: an important tool in ProbDBs, derived from C-tables.
- Some open questions next.
Want to tell & show? By Wed.

49 Open Questions in Incomplete/Probabilistic Databases
- Variables
- OWA vs. CWA
- Certain tuples, possible tuples
- Partial information order
- Strong vs. weak representation systems
Homework: pick one and apply it to ProbDBs. Tell us your thoughts next time.

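50 Variables as Attribute Values
T is a table over (A, B) with two rows: one whose A-value is a variable and whose B-value is b1, and the row (a2, b1).
Rep(T) = a large, or infinite, set.
ProbDBs don't use variables as attribute values.
What if we allow ProbDBs to have variables?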

51 OWA vs. CWA
CWA: if R(a,b,c) is not mentioned in the knowledge/data base, then ¬R(a,b,c) is assumed.
OWA: ¬R(a,b,c) is inferred only if it is explicitly stated in the knowledge/data base.
Which one is the standard semantics in DB? And in KR?

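52 OWA vs. CWA in Incomplete DB
T is the table over (A, B) with one row whose A-value is a variable and whose B-value is b1, and the row (a2, b1).
CWA: Rep(T) contains exactly the instances obtained by substituting a value for the variable: {(a1, b1), (a2, b1)}, {(a2, b1)}, {(a3, b1), (a2, b1)}, {(a4, b1), (a2, b1)}, ...
OWA: Rep(T) contains every instance that includes one of these instances as a subset.
What would OWA mean for ProbDBs?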

53 Certain vs. Possible Tuples
Consider an incomplete database $I = \{I_1, I_2, I_3, \ldots\}$.
Definition. The certain tuples of I are $I_1 \cap I_2 \cap I_3 \cap \ldots$
Definition. The possible tuples of I are $I_1 \cup I_2 \cup I_3 \cup \ldots$
What do certain/possible tuples correspond to in ProbDB?

54 Certain vs. Possible Tuples
Two incomplete databases I, J are equivalent if I = J.
Two incomplete databases I, J are equivalent w.r.t. a query language Q if for all q in Q, q(I) = q(J).
These notions are used to define weak representation systems (see [AHV]).
Are there similar notions of equivalence between ProbDBs?

55 Partial Information Order
Let $(D, \le)$ be an ordered set. Consider two sets $A = \{a_1, \ldots, a_m\}$ and $B = \{b_1, \ldots, b_n\}$. When can we say $A \le B$?
Definition [Smyth, or upper]. $A \le_S B$ if $\forall b_j\ \exists a_i : a_i \le b_j$.
Definition [Hoare, or lower]. $A \le_H B$ if $\forall a_i\ \exists b_j : a_i \le b_j$.
Definition [Plotkin, or convex]. $A \le_P B$ if $A \le_S B$ and $A \le_H B$.
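The three set orders translate directly into nested quantifiers. The following sketch assumes finite sets and takes the base order as a parameter (defaulting to Python's `<=`); the function names are illustrative.

```python
def smyth_le(A, B, le=lambda a, b: a <= b):
    """Upper order: every b in B is above some a in A."""
    return all(any(le(a, b) for a in A) for b in B)

def hoare_le(A, B, le=lambda a, b: a <= b):
    """Lower order: every a in A is below some b in B."""
    return all(any(le(a, b) for b in B) for a in A)

def plotkin_le(A, B, le=lambda a, b: a <= b):
    """Convex order: both the Smyth and the Hoare conditions hold."""
    return smyth_le(A, B, le) and hoare_le(A, B, le)
```

For example, with integers, {1, 5} is below {2, 3} in the Smyth order (1 lies under both 2 and 3) but not in the Hoare order (nothing in {2, 3} lies above 5), so the Plotkin order fails as well.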

56 Partial Information Order
Let's order OR-sets.
Values of base type: $a \le a$ and $\bot \le a$
Records: [x,y] ≤ [u,v] when?
OR-sets: <a1,a2,a3> ≤ <b1,b2> when?
Sets: {a1,a2,a3} ≤ {b1,b2} when?

57 Partial Information Order
Let's order OR-sets.
Values of base type: $a \le a$ and $\bot \le a$
Records: [x,y] ≤ [u,v] when? When x ≤ u and y ≤ v.
OR-sets: <a1,a2,a3> ≤ <b1,b2> when? Smyth.
Sets: {a1,a2,a3} ≤ {b1,b2} when? Hoare (for OWA), or equality (for CWA).

58 Partial Information Order What is the partial information order on ProbDBs? 58

59 Probabilistic Networks 59

60 Probabilistic Models
Problem setting:
- Given m random variables $V_1, \ldots, V_m$
- Each with domain D (the same for all $V_i$)
- A probability space over $D^m$: $\Pr : D^m \to [0,1]$ such that $\sum_{v_1, \ldots, v_m \in D} \Pr(v_1, \ldots, v_m) = 1$
This is called the joint probability distribution.
Problem: give a compact representation of Pr.

61 Example V1 V2 P A disjoint probabilistic database Here m=2, but in general m is large, need compact rep. 61

62 Background
Marginal probability: $\Pr(V_i = a) = \sum_{v_1, \ldots, v_m \in D,\ v_i = a} \Pr(v_1, \ldots, v_m)$
What does this mean? $\Pr(V_i)$, $\Pr(V_i V_j)$
Conditional probability: $\Pr(E \mid F) = \Pr(EF) / \Pr(F)$
Independence: $\Pr(EF) = \Pr(E) \cdot \Pr(F)$. Equivalently: $\Pr(E \mid F) = \Pr(E)$
Conditional independence: $\Pr(E, F \mid G) = \Pr(E \mid G) \cdot \Pr(F \mid G)$
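These notions are easy to check on a small joint table. The sketch below assumes a joint distribution over two variables stored as a dictionary from outcomes to probabilities; the numbers are hypothetical and chosen so that the two variables come out independent.

```python
from collections import defaultdict
from itertools import product

# Hypothetical joint distribution over (V1, V2); the values are illustrative.
joint = {(0, 0): 0.12, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.42}

def marginal(joint, i):
    """Pr(V_i = a): sum the joint over all outcomes whose i-th component is a."""
    m = defaultdict(float)
    for outcome, p in joint.items():
        m[outcome[i]] += p
    return dict(m)

def conditional(joint, event, given):
    """Pr(E | F) = Pr(E and F) / Pr(F), with events given as predicates."""
    pf = sum(p for o, p in joint.items() if given(o))
    pef = sum(p for o, p in joint.items() if event(o) and given(o))
    return pef / pf

def independent(joint, tol=1e-9):
    """Check Pr(V1=a, V2=b) = Pr(V1=a) * Pr(V2=b) for every pair (a, b)."""
    m1, m2 = marginal(joint, 0), marginal(joint, 1)
    return all(abs(joint.get((a, b), 0.0) - m1[a] * m2[b]) < tol
               for a, b in product(m1, m2))
```

The independence check compares one equality per pair of values, which for two binary variables is the four equalities mentioned on the next slide.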

63 Independence V1 V2 P V1 P V2 P Marginal probabilities Are V1, V2 independent, i.e. Pr(V1,V2) = Pr(V1)*Pr(V2)? Note: need to check 4 equalities (why?) 63

64 Factored Representation
More generally: if $P(V_1 = a, V_2 = b) = p(a) \cdot q(b)$ for all a, b, then V1 and V2 are independent.

V1  V2  P
a1  b1  p1 q1
a1  b2  p1 q2
a2  b1  p2 q1
a2  b2  p2 q2
...

equals the join of the factors

V1  P        V2  P
a1  p1       b1  q1
a2  p2       b2  q2
...          ...

65 1st Connection to ProbDBs: Variable Attribute
Every joint distribution over variables $V_1, \ldots, V_m$ corresponds to a trivial probabilistic table with attributes $V_1, \ldots, V_m$.
What's trivial about it?

66 1st Connection to ProbDBs: Variable Attribute
Conversely: each block in a disjoint/independent relation defines a joint distribution on the values of its attribute.

Object     Time  Person  P
LaptopX77  9:07  John    0.5
                 Jim     0.5
Book302    9:18  Mary    0.2
                 John    0.4
                 Fred

67 Independence 67

68 Independence

Employee    Department  Quality  Bonus  Sales    P
John Smith  Toy         Great    Yes    30k-34k  0.4*0.3
John Smith  Toy         Good     Yes    30k-34k  0.5*0.3
John Smith  Toy         Fair     No     30k-34k  0.1*0.3
John Smith  Toy         Great    Yes    35k-39k  0.4*0.7
John Smith  Toy         Good     Yes    35k-39k  0.5*0.7
John Smith  Toy         Fair     No     35k-39k  0.1*0.7
Fre Jones   Houseware   Good     Yes    20k-24k  0.5
Fre Jones   Houseware   Good     Yes    24k-29k

69 Independence
Factors are disjoint/independent tables!

Employee    Department  Quality  Bonus  P
John Smith  Toy         Great    Yes    0.4
John Smith  Toy         Good     Yes    0.5
John Smith  Toy         Fair     No     0.1
Fre Jones   Houseware   Good     Yes    1

Employee    Department  Sales    P
John Smith  Toy         30k-34k  0.3
John Smith  Toy         35k-39k  0.7
Fre Jones   Houseware   20k-24k  0.5
Fre Jones   Houseware   24k-29k

70 Rule of Thumb #2
Correlated table = independent tables + join
Same principle as in traditional schema normalization:
- Decompose the big table into small tables with independent attributes
- Big table = a view (a single join!) over the small tables
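Rule of Thumb #2 can be illustrated with the John Smith rows of the example: joining the (Quality, Bonus) factor with the Sales factor and multiplying probabilities recovers the correlated table. This is a minimal sketch; the dictionary encoding and the function name `join_factors` are assumptions made for illustration.

```python
import math

# Factor tables for John Smith from the slides: (Quality, Bonus) and Sales.
quality_bonus = {("Great", "Yes"): 0.4, ("Good", "Yes"): 0.5, ("Fair", "No"): 0.1}
sales = {"30k-34k": 0.3, "35k-39k": 0.7}

def join_factors(f1, f2):
    """Rebuild the correlated table: cross the factors, multiply probabilities."""
    return {(k1, k2): p1 * p2 for k1, p1 in f1.items() for k2, p2 in f2.items()}

big = join_factors(quality_bonus, sales)

# Sanity check: the joined table is again a probability distribution.
assert math.isclose(sum(big.values()), 1.0)
```

The factors hold 3 + 2 probabilities while the joined table holds 3 * 2; the saving grows quickly as more independent attribute groups are added.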

71 Database 101
Assume that (Quality, Bonus) is independent from (Sales).
Drop the P column to obtain a traditional table (no more probabilities).
Question: what functional dependency holds in this table?

72 Database 101
Assume that (Quality, Bonus) is independent from (Sales).
Drop the P column to obtain a traditional table (no more probabilities).
Question: what functional dependency holds in this table?
Answer: a multivalued dependency (MVD)! Emp, Dept ↠ Quality, Bonus | Sales
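The multivalued dependency can be tested directly on the table with the P column dropped: within each (Employee, Department) group, the (Quality, Bonus) values and the Sales values must appear in every combination. The sketch below is one way to check this; `satisfies_mvd` and the column-index encoding are illustrative choices.

```python
from collections import defaultdict
from itertools import product

def satisfies_mvd(rows, x, y, z):
    """Check X ->> Y | Z: within each X-group, the (Y, Z) pairs form a full
    cross product of the Y-values and Z-values seen in that group."""
    groups = defaultdict(set)
    for row in rows:
        xv = tuple(row[c] for c in x)
        yv = tuple(row[c] for c in y)
        zv = tuple(row[c] for c in z)
        groups[xv].add((yv, zv))
    for pairs in groups.values():
        ys = {yv for yv, _ in pairs}
        zs = {zv for _, zv in pairs}
        if pairs != set(product(ys, zs)):
            return False
    return True

# John Smith's rows with the P column dropped:
# (Employee, Department, Quality, Bonus, Sales).
rows = [
    ("John Smith", "Toy", q, b, s)
    for (q, b) in [("Great", "Yes"), ("Good", "Yes"), ("Fair", "No")]
    for s in ["30k-34k", "35k-39k"]
]
```

Calling `satisfies_mvd(rows, x=(0, 1), y=(2, 3), z=(4,))` holds for the six rows; deleting any one of them breaks the cross-product property and the check fails.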

73 Rule of Thumb #3 Factor decomposition = MVD decomposition + probability identities 73

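74 Conditional Independence
Table over (V1, V2, W, P) with eight rows: four with W = a and four with W = b.
Are V1 and V2 independent given W, i.e. does $\Pr(V_1, V_2 \mid W) = \Pr(V_1 \mid W) \cdot \Pr(V_2 \mid W)$ hold?

75 Conditional Independence: Factors
Table over (V1, V2, W, P) with eight rows: four with W = a and four with W = b.
Answer: they are conditionally independent.
P(W=a) = 0.5; P(-- | W=a) is a table over (V1, V2, P).
P(W=b) = 0.5; P(-- | W=b) is a table over (V1, V2, P).

76 Conditional Independence
The table over (V1, V2, W, P), one of whose P entries is 0.28, equals the join of three factors: a table over (W, P) with P(W=a) = 0.5 and P(W=b) = 0.5, a table over (V1, W, P), and a table over (V2, W, P).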

77 Representing Conditionally Independent Vars
More generally: if $P(V_1 = a, V_2 = b, W = c) = p(a,c) \cdot q(b,c)$ for all a, b, c, then V1 and V2 are independent given W.

V1  V2  W   P
a1  b1  c1  p11 q11
a1  b2  c1  p11 q21
a2  b1  c1  p21 q11
a2  b2  c1  p21 q21
...     c2  p12 q12

equals the join of the factors

W   P
c1  r1
c2  r2

V1  W   P            V2  W   P
a1  c1  p11/P1       b1  c1  q11/Q1
a2  c1  p21/P1       b2  c1  q21/Q1
..  c2  p12/P2       ..  c2  q12/Q2
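The normalisation constants in the factor tables can be checked numerically. The sketch below assumes that P_c is the sum of p(a, c) over a, that Q_c is the sum of q(b, c) over b, and that r(c) = P_c * Q_c; these readings of P1, Q1 and r1 are assumptions, and the numeric tables are hypothetical.

```python
import math
from itertools import product

# Hypothetical nonnegative tables p(a, c) and q(b, c); values are illustrative.
p = {("a1", "c1"): 1.0, ("a2", "c1"): 3.0, ("a1", "c2"): 2.0, ("a2", "c2"): 2.0}
q = {("b1", "c1"): 0.05, ("b2", "c1"): 0.05, ("b1", "c2"): 0.03, ("b2", "c2"): 0.12}

A, B, C = ["a1", "a2"], ["b1", "b2"], ["c1", "c2"]
joint = {(a, b, c): p[a, c] * q[b, c] for a, b, c in product(A, B, C)}

# Column sums P_c and Q_c normalise the factors; r(c) = P_c * Q_c is Pr(W=c).
P = {c: sum(p[a, c] for a in A) for c in C}
Q = {c: sum(q[b, c] for b in B) for c in C}
r = {c: P[c] * Q[c] for c in C}
v1_given_w = {(a, c): p[a, c] / P[c] for a in A for c in C}
v2_given_w = {(b, c): q[b, c] / Q[c] for b in B for c in C}

def rebuilt(a, b, c):
    """Pr(W=c) * Pr(V1=a | W=c) * Pr(V2=b | W=c)."""
    return r[c] * v1_given_w[a, c] * v2_given_w[b, c]

# Sanity check: the marginal of W is a probability distribution.
assert math.isclose(sum(r.values()), 1.0)
```

Multiplying the three factors gives back p(a,c) * q(b,c) because P_c and Q_c cancel, which is why the joint table equals the join of the three factor tables.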

78 Summary of Prob Networks
- (Conditional) independence = MVDs + probability identities
- Factored decomposition = base tables + joins
Next time:
- Discuss the networks in probabilistic networks; WS-decomposition, U-relations
- Partial/approximate representations
- Begin query evaluation
Send by Wed. if you want to show & tell.

Instructor: Sudeepa Roy

Instructor: Sudeepa Roy CompSci 590.6 Understanding Data: Theory and Applications Lecture 13 Incomplete Databases Instructor: Sudeepa Roy Email: sudeepa@cs.duke.edu 1 Today s Reading Alice Book : Foundations of Databases Abiteboul-

More information

Database design and implementation CMPSCI 645

Database design and implementation CMPSCI 645 Database design and implementation CMPSCI 645 Lectures 20: Probabilistic Databases *based on a tutorial by Dan Suciu Have we seen uncertainty in DB yet? } NULL values Age Height Weight 20 NULL 200 NULL

More information

Query Evaluation on Probabilistic Databases. CSE 544: Wednesday, May 24, 2006

Query Evaluation on Probabilistic Databases. CSE 544: Wednesday, May 24, 2006 Query Evaluation on Probabilistic Databases CSE 544: Wednesday, May 24, 2006 Problem Setting Queries: Tables: Review A(x,y) :- Review(x,y), Movie(x,z), z > 1991 name rating p Movie Monkey Love good.5 title

More information

Relational Database Design

Relational Database Design Relational Database Design Jan Chomicki University at Buffalo Jan Chomicki () Relational database design 1 / 16 Outline 1 Functional dependencies 2 Normal forms 3 Multivalued dependencies Jan Chomicki

More information

Probabilistic Databases

Probabilistic Databases Probabilistic Databases Amol Deshpande University of Maryland Goal Introduction to probabilistic databases Focus on an overview of: Different possible representations Challenges in using them Probabilistic

More information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information Relational Database Design Database Design To generate a set of relation schemas that allows - to store information without unnecessary redundancy - to retrieve desired information easily Approach - design

More information

CMPT 354: Database System I. Lecture 9. Design Theory

CMPT 354: Database System I. Lecture 9. Design Theory CMPT 354: Database System I Lecture 9. Design Theory 1 Design Theory Design theory is about how to represent your data to avoid anomalies. Design 1 Design 2 Student Course Room Mike 354 AQ3149 Mary 354

More information

Provenance Semirings. Todd Green Grigoris Karvounarakis Val Tannen. presented by Clemens Ley

Provenance Semirings. Todd Green Grigoris Karvounarakis Val Tannen. presented by Clemens Ley Provenance Semirings Todd Green Grigoris Karvounarakis Val Tannen presented by Clemens Ley place of origin Provenance Semirings Todd Green Grigoris Karvounarakis Val Tannen presented by Clemens Ley place

More information

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design Outline Introduction to Database Systems CSE 444 Design theory: 3.1-3.4 [Old edition: 3.4-3.6] Lectures 6-7: Database Design 1 2 Schema Refinements = Normal Forms 1st Normal Form = all tables are flat

More information

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 Announcement Read Chapter 14 and 15 You must self-study these chapters Too huge to cover in Lectures Project 2 Part 1 due tonight Agenda 1.

More information

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Background We started with schema design ER model translation into a relational schema Then we studied relational

More information

Schema Refinement. Feb 4, 2010

Schema Refinement. Feb 4, 2010 Schema Refinement Feb 4, 2010 1 Relational Schema Design Conceptual Design name Product buys Person price name ssn ER Model Logical design Relational Schema plus Integrity Constraints Schema Refinement

More information

Database Design and Implementation

Database Design and Implementation Database Design and Implementation CS 645 Schema Refinement First Normal Form (1NF) A schema is in 1NF if all tables are flat Student Name GPA Course Student Name GPA Alice 3.8 Bob 3.7 Carol 3.9 Alice

More information

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design) Relational Schema Design Introduction to Management CSE 344 Lectures 16: Database Design Conceptual Model: Relational Model: plus FD s name Product buys Person price name ssn Normalization: Eliminates

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms UMass Amherst Feb 14, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke, Dan Suciu 1 Relational Schema Design Conceptual Design name Product buys Person price name

More information

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design) Relational Schema Design Introduction to Management CSE 344 Lectures 16: Database Design Conceptual Model: Relational Model: plus FD s name Product buys Person price name ssn Normalization: Eliminates

More information

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago

COSC 430 Advanced Database Topics. Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago COSC 430 Advanced Database Topics Lecture 2: Relational Theory Haibo Zhang Computer Science, University of Otago Learning objectives and references You should be able to: define the elements of the relational

More information

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 CSC 261/461 Database Systems Lecture 8 Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 Agenda 1. Database Design 2. Normal forms & functional dependencies 3. Finding functional dependencies

More information

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009

Databases Lecture 8. Timothy G. Griffin. Computer Laboratory University of Cambridge, UK. Databases, Lent 2009 Databases Lecture 8 Timothy G. Griffin Computer Laboratory University of Cambridge, UK Databases, Lent 2009 T. Griffin (cl.cam.ac.uk) Databases Lecture 8 DB 2009 1 / 15 Lecture 08: Multivalued Dependencies

More information

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies Design Theory BBM471 Database Management Systems Dr. Fuat Akal akal@hacettepe.edu.tr Design Theory I 2 Today s Lecture 1. Normal forms & functional dependencies 2. Finding functional dependencies 3. Closures,

More information

Introduction to Data Management CSE 344

Introduction to Data Management CSE 344 Introduction to Data Management CSE 344 Lectures 18: BCNF 1 What makes good schemas? 2 Review: Relation Decomposition Break the relation into two: Name SSN PhoneNumber City Fred 123-45-6789 206-555-1234

More information

Introduction to Management CSE 344

Introduction to Management CSE 344 Introduction to Management CSE 344 Lectures 17: Design Theory 1 Announcements No class/office hour on Monday Midterm on Wednesday (Feb 19) in class HW5 due next Thursday (Feb 20) No WQ next week (WQ6 due

More information

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms Chapter 19 Schema Refinement and Normal Forms Chapter 19 Instructor: Vladimir Zadorozhny vladimir@sis.pitt.edu Information Science Program School of Information Sciences, University of Pittsburgh Database Management

More information

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due Information Systems for Engineers Exercise 8 ETH Zurich, Fall Semester 2017 Hand-out 24.11.2017 Due 01.12.2017 1. (Exercise 3.3.1 in [1]) For each of the following relation schemas and sets of FD s, i)

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Fall 2017 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 42 Schema Design When we get a relational

More information

Lectures 6. Lecture 6: Design Theory

Lectures 6. Lecture 6: Design Theory Lectures 6 Lecture 6: Design Theory Lecture 6 Announcements Solutions to PS1 are posted online. Grades coming soon! Project part 1 is out. Check your groups and let us know if you have any issues. We have

More information

Answering Queries from Statistics and Probabilistic Views. Nilesh Dalvi and Dan Suciu, University of Washington.

Answering Queries from Statistics and Probabilistic Views. Nilesh Dalvi and Dan Suciu, University of Washington. Answering Queries from Statistics and Probabilistic Views Nilesh Dalvi and Dan Suciu, University of Washington. Background Query answering using Views problem: find answers to a query q over a database

More information

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Functional Dependencies. Getting a good DB design Lisa Ball November 2012 Functional Dependencies Getting a good DB design Lisa Ball November 2012 Outline (2012) SEE NEXT SLIDE FOR ALL TOPICS (some for you to read) Normalization covered by Dr Sanchez Armstrong s Axioms other

More information

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database CSE 303: Database Lecture 10 Chapter 3: Design Theory of Relational Database Outline 1st Normal Form = all tables attributes are atomic 2nd Normal Form = obsolete Boyce Codd Normal Form = will study 3rd

More information

CSC384: Intro to Artificial Intelligence Reasoning under Uncertainty-II

CSC384: Intro to Artificial Intelligence Reasoning under Uncertainty-II CSC384: Intro to Artificial Intelligence Reasoning under Uncertainty-II 1 Bayes Rule Example Disease {malaria, cold, flu}; Symptom = fever Must compute Pr(D fever) to prescribe treatment Why not assess

More information

Database Design and Implementation

Database Design and Implementation Database Design and Implementation CS 645 Data provenance Provenance provenance, n. The fact of coming from some particular source or quarter; origin, derivation [Oxford English Dictionary] Data provenance

More information

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Introduction to Data Management. Lecture #6 (Relational DB Design Theory) Introduction to Data Management Lecture #6 (Relational DB Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework

More information

Queries and Materialized Views on Probabilistic Databases

Queries and Materialized Views on Probabilistic Databases Queries and Materialized Views on Probabilistic Databases Nilesh Dalvi Christopher Ré Dan Suciu September 11, 2008 Abstract We review in this paper some recent yet fundamental results on evaluating queries

More information

On Factorisation of Provenance Polynomials

On Factorisation of Provenance Polynomials On Factorisation of Provenance Polynomials Dan Olteanu and Jakub Závodný Oxford University Computing Laboratory Wolfson Building, Parks Road, OX1 3QD, Oxford, UK 1 Introduction Tracking and managing provenance

More information

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation

A Dichotomy. in in Probabilistic Databases. Joint work with Robert Fink. for Non-Repeating Queries with Negation Queries with Negation Dichotomy for Non-Repeating Queries with Negation Queries with Negation in in Probabilistic Databases Robert Dan Olteanu Fink and Dan Olteanu Joint work with Robert Fink Uncertainty in Computation Simons

More information

Schema Refinement: Other Dependencies and Higher Normal Forms

Schema Refinement: Other Dependencies and Higher Normal Forms Schema Refinement: Other Dependencies and Higher Normal Forms Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Higher Normal Forms 1 / 14 Outline 1

More information

Lecture 8. Probabilistic Reasoning CS 486/686 May 25, 2006

Lecture 8. Probabilistic Reasoning CS 486/686 May 25, 2006 Lecture 8 Probabilistic Reasoning CS 486/686 May 25, 2006 Outline Review probabilistic inference, independence and conditional independence Bayesian networks What are they What do they mean How do we create

More information

CSE 544 Principles of Database Management Systems

CSE 544 Principles of Database Management Systems CSE 544 Principles of Database Management Systems Lecture 3 Schema Normalization CSE 544 - Winter 2018 1 Announcements Project groups due on Friday First review due on Tuesday (makeup lecture) Run git

More information

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

Introduction to Database Systems CSE 414. Lecture 20: Design Theory Introduction to Database Systems CSE 414 Lecture 20: Design Theory CSE 414 - Spring 2018 1 Class Overview Unit 1: Intro Unit 2: Relational Data Models and Query Languages Unit 3: Non-relational data Unit

More information
