MIS Database Systems Schema Refinement and Normal Forms

Size: px
Start display at page:

Download "MIS Database Systems Schema Refinement and Normal Forms"

Transcription

1 MIS Database Systems Schema Refinement and Normal Forms Ahmet Onur Durahim

2 Learning Objectives Anomalies Functional Dependencies Normal Forms 1 st NF, 2 nd NF, 3 rd NF, Boyce-Codd (BCNF) Decompositions Lossless-Join Decompositions Dependency Preserving Decompositions credit: Yücel Saygın

3 Schema Refinement Redundant (useless) storage of information is the root cause of problems Storing the same information redundantly => in more than one place within a database Refinement approach based on decompositions A relation with redundancy is refined by decomposition => replacing it with smaller relations that contain the same information without redundancy

4 Anomalies Modification (Update) anomalies If one copy of the repeated data is updated, an inconsistency is created unless all copies are updated similarly Insertion anomalies It may not be possible to store certain information unless some other, unrelated, information is stored Deletion anomalies It may not be possible to delete certain information without losing some other, unrelated, information as well

5 Anomalies Insertion anomalies Cannot record filmtype without starname Deletion anomalies If we delete the last starname, we also lose the movie info. title year length filmtype studioname starname Star Wars Action Fox Carrie Fisher Star Wars Action Fox Mark Haill Star Wars Action Fox Harrison Ford Mighty Ducks Animation Disney Emilo Estevez Wayne s World Comedy Paramount Dana Carvey Wayne s World Comedy Paramount Mike Meyers

6 Decomposition title year length filmtype studioname starname Star Wars Action Fox Carrie Fisher Star Wars Action Fox Mark Haill Star Wars Action Fox Harrison Ford Mighty Ducks Animation Disney Emilo Estevez Wayne s World Comedy Paramount Dana Carvey Wayne s World Comedy Paramount Mike Meyers title year starname Star Wars 1977 Carrie Fisher Star Wars 1977 Mark Haill Star Wars 1977 Harrison Ford Mighty Ducks 1991 Emilo Estevez title year length filmtype studioname Star Wars Action Fox Mighty Ducks Animation Disney Wayne s World Comedy Paramount Wayne s World 1992 Dana Carvey Wayne s World 1992 Mike Meyers

7 Anomalies Modification (Update) anomalies To update address of a student who occurs twice or more in a table, we will have to update S_Address column in all rows. Otherwise, data will become inconsistent Insertion anomalies Suppose for a new admission, we have a S_id, name, and address of a student, but if student has not opted for any subjects yet, then we have to insert NULL there Deletion anomalies If S_id 401 has only one subject and temporarily he drop it, when we delete that row, entire student record will be deleted along with it

8 Schema Refinement Functional Dependencies, can be used to identify schemas with problems and to suggest refinements Functional dependency is a kind of IC (Integrity Constraint) that generalizes the concept of a key An instance r of a relational schema R satisfies FD: X Y where X, Y are non-empty sets of attributes in R If t1.x = t2.x, then t1.y = t2.y t1 and t2 are two different tuples of r t1.x: projection of tuple t1 onto the attributes in X Decomposition is used for schema refinement

9 Functional Dependency Example Database of beverage drinkers NAME ADDR BEVERAGE LIKED MANUF FAV BEVERAGE John Doe NY, Soho CocaCola Light CocaCola CocaCola John Doe NY, Soho CocaCola CocaCola CocaCola Elisa Day DC, Dupont Pepsi Light Pepsi Pepsi Max Elisa Day DC, Dupont Fanta CocaCola Pepsi Max

10 Functional Dependency Example title - year length title - year filmtype title - year studioname title - year length - filmtype - studioname TITLE YEAR LENGTH FILMTYPE studioname starname Star Wars Action Fox Carrie Fisher Star Wars Action Fox Mark Hamill Star Wars Action Fox Harrison Ford Mighty Ducks Animation Disney Emilio Estevez Wayne s World Comedy Paramount Dana Carvey Wayne s World Comedy Paramount Mike Meyers

11 Functional Dependencies (FDs) A functional dependency X Y holds over relation R, if for every allowable instance r of R: t1 r, t2 r, π X (t1) = π X (t2) implies π Y (t1) = π Y (t2) i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes) t1 t2 X Y Z 1 a p 2 b q 1 a r 2 b p

12 Functional Dependencies (FDs) Does the following relation instance satisfy X Y? X Y Z 1 a p 2 b q 1 a r 2 c p

13 Functional Dependencies (FDs) A functional dependency X Y holds over relation R if, for every allowable instance r of R: t1 r, t2 r, π X (t1) = π X (t2) implies π Y (t1) = π Y (t2) i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes) An FD is a statement about all allowable relations Must be identified based on semantics of application Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R K is a candidate key for R means that K R However, K R does not require K to be minimal K being a primary key is a special case of an FD

14 Functional Dependencies (FDs) Does the following relation instance satisfy X Y? X Y Z 1 a p 2 b q 1 a r 3 b p

15 Functional Dependencies (FDs) If X is a candidate key, then X YZ? X Y Z 1 a p 2 b q 1 a r 3 b p

16 Functional Dependencies (FDs) If YZ X, can we say that YZ is a candidate key? X Y Z 1 a p 2 b q 1 a r 3 b p

17 Constraints on Entity Sets Consider relation obtained from Hourly_Emps: Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) Notation: We will denote this relation schema by listing the attributes: SNLRWH This is really the set of attributes {S,N,L,R,W,H}. Sometimes, we will refer to all attributes of a relation by using the relation name e.g., Hourly_Emps for SNLRWH Hourly_Emps S N L R W H

18 Constraints on Entity Sets Some FDs on Hourly_Emps: ssn is the key: S SNLRWH rating determines hrly_wages: R W Did you notice anything wrong with the following instance? Hourly_Emps S N L R W H

19 Constraints on Entity Sets Some FDs on Hourly_Emps: ssn is the key: S SNLRWH rating determines hrly_wages: R W Salary should be the same for a given rating! Hourly_Emps S N L R W H

20 Example S N L R W H Attishoo Smiley Smethurst Guldu Madayan Problems due to R W: Redundant Storage: The rating value 8 corresponds to the hourly wage 10 (This association is repeated 3 times) Update anomaly: Can we change W in just the 1 st tuple of SNLRWH? Insertion anomaly: What if we want to insert an employee and don t know the hourly wage for his rating? Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5

21 Hourly_Emps S N L R W H Attishoo Smiley Smethurst Guldu Madayan Hourly_Emps2 S N L R H Attishoo Smiley Smethurst Guldu Madayan Wages R W

22 Reasoning About FDs Given some FDs, we can usually infer additional FDs: ssn did, did lot implies ssn lot An FD f is implied by a given set F of FDs if f holds whenever all FDs in F hold F + = closure of F is the set of all FDs that are implied by F Armstrong s Axioms (X, Y, Z are sets of attributes): Reflexivity: If X Y, then Y X (a trivial FD) Augmentation: If X Y, then XZ YZ for any Z Transitivity: If X Y and Y Z, then X Z These are sound and complete inference rules for FDs S: generate only FDs in F +, C: generate all FDs in F +

23 Reasoning About FDs In the following schema SN S is a trivial FD (by reflexivity) since {S,N} is a superset of {S} Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) S N L R W H

24 Reasoning About FDs In the following schema If SN RW, then SNL RWL (by augmentation) Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) S N L R W H

25 Reasoning About FDs Couple of additional rules (that follow from AA): Union: If X Y and X Z, then X YZ Proof: From X Y, we have XX XY (by augmentation) Note that XX is X, therefore X XY From X Z, we have XY YZ (by augmentation) From X XY and XY YZ, we have X YZ (by transitivity)

26 Reasoning About FDs Couple of additional rules (that follow from AA): Decomposition: If X YZ, then X Y and X Z Try to prove it at home/dorm/ics

27 Reasoning About FDs Example: Contracts(Cid,Sid,prJid,Did,Pid,Qty,Value), where we denote schema as CSJDPQV and: C is the key: C CSJDPQV Project purchases each part using single contract: JP C Department purchases at most one part from a supplier: SD P JP C and C CSJDPQV imply JP CSJDPQV SD P implies SDJ JP SDJ JP and JP CSJDPQV imply SDJ CSJDPQV We cannot conclude that SD CSDPQV by cancelling J from both sides of SDJ CSJDPQV!!!

28 Example Suppose that we are given; a relation scheme R = (A,B,C,G,H,I) the set of functional dependencies F: F = {A B, A C, CG H, CG I, B H} Is A H logically implied by F? Is AG I logically implied by F?

29 Reasoning About FDs Computing the closure of a set of FDs (F + ) can be expensive Size of closure is exponential in # attrs Example: A database with 4 attributes (A,B,C,D) F = {A B, B C} Find the closure of F denoted by F + A A, A B, A C, B B, B C, C C, D D, AB A, AB B, AB C, AC A, AC B, AC C, AD A, AD B, AD C, AD D, BC B, BC C, BD B, BD C, BD D, CD C, CD D, ABC A, ABC B, ABC C, ABD A, ABD B, ABD C, ABD D, BCD B, BCD C, BCD D, ABCD A, ABCD B, ABCD C, ABCD D

30 Reasoning About FDs Computing the closure of a set of FDs (F + ) can be expensive Size of closure is exponential in # attrs Typically, we just want to check if a given FD, X Y, is in the closure of a set of FDs F An efficient check: Compute attribute closure of X (denoted X + ) wrt F: Set of all attributes A such that X A is in F + There is a linear time algorithm to compute this; For each FD Y Z in F, if X + is a superset of Y then add Z to X +

31 Reasoning About FDs Does F = {A B, B C, CD E} imply A E? i.e., Is A E in the closure F +? Equivalently, is E in A +? Lets compute A + Initialize A + to {A} : A + = {A} From A B, we can add B to A + : A + = {A, B} From B C, we can add C to A + : A + = {A, B, C} We can not add any more attributes, and A + does not contain E Therefore A E does not hold

32 DB Design Guidelines Design a relation schema with a clearly defined semantics Design the relation schemas so that there are no insertion, deletion, or modification anomalies If there may be anomalies, state them clearly Avoid attributes which may frequently have null values as much as possible Make sure that relations can be combined by key-foreign key links

33 Normal Forms Normal forms are standards for a good DB schema (introduced by Codd in 1972) If a relation is in a certain normal form (such as BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. Normal forms help us decide if decomposing a relation helps

34 Normal Forms: 1NF First Normal Form: Relation in 1NF if every field containes only atomic values No set valued attributes (no lists or sets) sid name phones 1 ali { , } 2 veli 3 ayse 4 fatma

35 First Normal Form Student Not in 1NF Student in 1NF

36 Normal Forms Role of FDs in detecting redundancy; Consider a relation R with 3 attributes, ABC No FDs hold: There is no redundancy here Given a FD, A B: Several tuples could have the same A value, and if so, they ll all have the same B value This potential redundancy can be predicted using this FD information

37 Normal Forms: 2NF Second Normal Form: Every non-prime (non-key) attribute should be fully functionally dependent on every key (no partial dependency) i.e., candidate keys In other words: No non-prime attribute in the table is functionally dependent on a proper subset of any candidate key Prime attribute: any attribute that is part of a key Non-prime attributes: rest of the attributes Ex: If AB is a key, and C is a non-prime attribute, then if A C holds then A partially determines C there is a partial functional dependency to a key

38 Student Not in 2NF 2 nd Normal Form Student Age Subject Adam 15 Biology Adam 15 Math Alex 14 Math Stuart 17 Math candidate key is {Student, Subject} Student in 2NF Student Age Adam 15 Alex 14 Stuart 17 Student Subject in 2NF Student Subject Adam Biology Adam Math Alex Math Stuart Math

39 2 nd Normal Form Composite primary key is [Customer ID, Store ID] The non-key attribute is [Purchase Location] PURCHASE_DETAIL Not in 2 nd normal form [Purchase Location] only depends on [Store ID] [Store ID] is only part of the primary key PURCHASE STORE

40 2 nd Normal Form Composite primary key is [Employee, Skill] The non-key attribute is [Current Work Location] Not in 2 nd normal form [Current Work Location] only depends on [Employee] [Employee] is only part of the primary key

41 Normal Forms: 3NF Relation R with FDs F is in Third Normal Form if, for all X A in F + (Zaniolo s def.) A X (called a trivial FD), (=> X contains A) or X contains a key for R, (=> X is a superkey) or A is part of some key for R (=> Every element of A-X is a prime attribute (contained in some candidate key)) R is in 2NF & there is no transitive functional dependency (Codd s def.) B is functionally dependent on A, and C is functionally dependent on B. Therefore, C is transitively dependent on A via B If R is in 3NF, some redundancy is possible

42 What Does 3NF Achieve? If 3NF violated by X A, one of the following holds: X is a subset of some key K (partial dependency) We store (X, A) pairs redundantly X is not a proper subset of any key (transitive dependency) There is a chain of FDs K X A, which means that we cannot associate an X value with a K value unless we also associate an A value with an X value But: even if relation is in 3NF, these problems could arise e.g., Reserves SBDC (C: Credit Card), S C, C S is in 3NF (SBD & CBD are keys), S C: Sailor uses a unique CreditCard to pay for reservations (Only key is SBD) (S is not a key and C is not part of a key) Hence not in 3NF (redundantly stored SC pairs) If also C S: Credit cards uniquely identify the owner (which means CBD is also a key) Hence in 3NF but for each reservation of sailor S, same (S, C) pair is stored There is a stricter normal form (BCNF)

43 Partial/Transitive Dependencies Partial Dependency Key Attributes X Attribute A Case 1: A not in KEY Transitive Dependencies Violates 3NF Key Attributes X Attribute A Case 1: A not in KEY Key Attribute A Attributes X Case 2: A is in KEY Not violate 3NF

44 3 rd Normal Form [Book ID] (key) determines [Genre ID] [Genre ID] determines [Genre Type] Not in 3 rd normal form BOOK DETAIL [Book ID] determines [Genre Type] via [Genre ID] There is transitive functional dependency BOOK GENRE => non-key attribute

45 3 rd Normal Form Tournament Winners [Tournament, Year] is a minimal set of attributes guaranteed to uniquely identify a row candidate key for the table Not in 3 rd normal form => non-key attribute [Tournament, Year] determines [Date of Birth] via [Winner] Non-prime attribute [Date of Birth] is transitively dependent on the candidate key Tournament Winners Winner Dates of Birth

46 Boyce-Codd Normal Form (BCNF) Relation R with FDs F is in BCNF if, for all X A in F + A X (called a trivial FD), or X contains a key for R. (i.e., X is a superkey) In other words, R is in BCNF if the only non-trivial FDs that hold over R are key constraints Key Nonkey Attr1 Nonkey Attr2 Nonkey AttrK FDs in a BCNF Relation

47 Boyce-Codd Normal Form (BCNF) BCNF ensures that No Redundancy in R can be predicted using FDs alone if a relation is in BCNF, every field of every tuple records a piece of information that cannot be inferred (using only FDs) from the values in all other fields in relation instance If we are shown two tuples that agree upon the X value, we cannot infer the A value in one tuple from the A value in the other If example relation is in BCNF (where X A), the 2 tuples must be identical (X is a key since R in BCNF) this situation cannot arise in relational DBs X Y A x y1 a x y2?

48 BCNF FDs: GPA rank cid cname, cinstructor sid sname, address, GPA Keys: {sid, cid} Student Course sid sname address GPA cid cname cinstructor rank 111 Onur 123 st DB sys Durahim Ahmet 999 st DB sys Durahim Onur 123 st Info sys Jackman 1 Not even in 2 nd NF 2 nd and 3 rd FDs lead to partial dependencies Not in BCNF Not every LHS of FDs contain a key None of the FDs contain a key

49 BCNF FDs: GPA rank cid cname, cinstructor sid sname, address, GPA Keys: {sid, cid} Student Course sid sname address GPA cid cname cinstructor rank 111 Onur 123 st DB sys Durahim Ahmet 999 st DB sys Durahim Onur 123 st Info sys Jackman 1 Not even in 2 nd NF 2 nd and 3 rd FDs lead to partial dependencies Not in BCNF Not every LHS of FDs contain a key None of the FDs contain a key cid cname cinstructor 335 DB sys Durahim 413 Info sys Jackman sid sname address GPA cid rank 111 Onur 123 st Ahmet 999 st Onur 123 st

50 BCNF FDs: GPA rank cid cname, cinstructor sid sname, address, GPA Keys: {sid, cid} Student Course sid sname address GPA cid cname cinstructor rank 111 Onur 123 st DB sys Durahim Ahmet 999 st DB sys Durahim Onur 123 st Info sys Jackman 1 Not even in 2 nd NF 2 nd and 3 rd FDs lead to partial dependencies Not in BCNF Not every LHS of FDs contain a key None of the FDs contain a key cid cname cinstructor 335 DB sys Durahim 413 Info sys Jackman sid sname address GPA cid 111 Onur 123 st Ahmet 999 st Onur 123 st GPA rank

51 BCNF FDs: GPA rank cid cname, cinstructor sid sname, address, GPA Keys: {sid, cid} cid cname cinstructor 335 DB sys Durahim 413 Info sys Jackman Student Course sid sname address GPA cid cname cinstructor rank 111 Onur 123 st DB sys Durahim Ahmet 999 st DB sys Durahim Onur 123 st Info sys Jackman 1 Not in BCNF Not every LHS of FDs contain a key None of the FDs contain a key sid sname address GPA 111 Onur 123 st Ahmet 999 st. 2.9 sid cid GPA All of these four tables are now in BCNF rank

52 BCNF sid instrid CourseCode OffHourAppnt MIS MIS MIS MIS FDs: sid, instrid CourseCode, OffHourAppnt coursecode instrid In 3NF BUT NOT in BCNF

53 BCNF sid instrid CourseCode OffHourAppnt MIS MIS MIS MIS FDs: sid, instrid CourseCode, OffHourAppnt coursecode instrid In 3NF BUT NOT in BCNF No partial key or transitive key dependencies

54 BCNF sid instrid CourseCode OffHourAppnt MIS MIS MIS MIS FDs: sid, instrid CourseCode, OffHourAppnt coursecode instrid In 3NF BUT NOT in BCNF No partial key or transitive key dependencies coursecode is not a superkey sid CourseCode OffHourAppnt 111 MIS MIS MIS MIS instrid CourseCode 123 MIS MIS413

55 Normal Form Shortcuts All attributes are prime At least in 3NF Singleton keys At least in 2NF

56 Decomposition of a Relation Scheme Suppose that relation R contains attributes A1,..., An A decomposition of R consists of replacing R by two or more relations such that: Each new relation scheme contains a subset of the attributes of R (and no attributes that do not appear in R), and Every attribute of R appears as an attribute of one of the new relations Intuitively, decomposing R means we will store instances of the relation schemes produced by the decomposition, instead of instances of R e.g., Can decompose SNLRWH into SNLRH and RW

57 Decomposition of a Relation Scheme We can decompose SNLRWH into SNL and RWH S N L R W H S N L R W H

58 Example Decomposition SNLRWH has FDs {S SNLRWH, R W} Is this in 3NF? R W violates 3NF W values repeatedly associated with R values In order to fix the problem, we need to create a relation RW to store the R W associations, and to remove W from the main schema: i.e., we decompose SNLRWH into SNLRH and RW S N L R H R W

59 S N L R H R W Problems with Decompositions There are three potential problems to consider: Some queries become more expensive (Performance loss due to required joins) e.g., How much did sailor Joe earn? (salary = W*H) Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation Fortunately, not in the SNLRWH example. Checking some dependencies may require joining the instances of the decomposed relations Fortunately, not in the SNLRWH example. Tradeoff: Must consider these issues vs. redundancy

60 Problems with Decompositions What problems does a given decomposition cause, if any? Lossless-join property Enables to recover any instance of the decomposed relation from corresponding instances of the smaller relations Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation! Dependency-preservation property Enables us to enforce any constraint on the original relation by simply enforcing some constraints on each of the smaller relations Checking some dependencies may require joining the instances of the decomposed relations

61 Lossless Join Decompositions Decomposition of R into X and Y is lossless-join w.r.t. a set of FDs F if, for every instance r that satisfies F: π X (r) π Y (r) = r It is always true that r π X (r) π Y (r) In general, the other direction does not hold! If it does, the decomposition is lossless-join. Definition extended to decomposition into 3 or more relations in a straightforward way It is essential that all decompositions used to deal with redundancy be lossless

62 Lossless Join The decomposition of R into X and Y is lossless-join wrt FDs F if and only if the closure of F (F + ) contains: X Y X, or X Y Y The attributes common to X and Y must contain a key for either X or Y If a FD U V holds over R and U V is empty, the decomposition of R into R V and UV is lossless A B C A B C A B B C

63 Person(SSN, Name, Address, Hobby) F = {SSN, Hobby Name, Address; SSN Name, Address} Person SSN Name Address Hobby Celalettin Sabanci D. Stamps Celalettin Sabanci D. Coins Elif Mutlukent Skating Elif Mutlukent Surfing Sercan Esentepe Math Person1 SSN Name Address Celalettin Sabanci D Elif Mutlukent Sercan Esentepe SSN Hobby Stamps Coins Skating Surfing Math Hobby

64 Person(SSN, Name, Address, Hobby) F = {SSN, Hobby Name, Address; SSN Name, Address} SSN Name Address Celalettin Sabanci D Elif Mutlukent Sercan Esentepe SSN Hobby Stamps Coins Skating Surfing Math SSN Name Address Hobby Celalettin Sabanci D. Stamps Celalettin Sabanci D. Coins Elif Mutlukent Skating Elif Mutlukent Surfing Sercan Esentepe Math

65 Problems with Decompositions (Contd.) Checking some dependencies may require joining the instances of the decomposed relations

66 Dependency Preserving Decomposition Consider CSJDPQV, C is key, JP C and SD P SD does not contain a key, thus SD P causes a violation of BCNF BCNF decomposition: CSJDQV and SDP Problem: Checking JP C for each insertion requires a join (expensive!) => decomposition is not dependency-preserving Dependency preserving decomposition: A dependency X Y that appear in F should either appear in one of the sub relations or should be inferred from the dependencies in one of the sub relations Projection of set of FDs F: If R is decomposed into X,... projection of F onto X (denoted F X ) is the set of FDs U V in F + (closure of F) such that U, V are in X Ex: R = ABC, F = {A B, B C, C A} F + includes FDs, {A B, B C, C A, B A, A C, C B} F AB = {A B, B A}, F AC = {C A, A C}

67 Dependency Preserving Decomposition Decomposition of R into schemas with attribute sets X and Y is dependency preserving if (F X F Y ) + = F + take the dependencies in F X and F Y and compute the closure of their union we get back all dependencies in the closure of F therefore, we need to enforce only the dependencies in F X and F Y, then all FDs in F + are sure to be satisfied Important to consider F +, not F, in this definition: ABC, {A B, B C, C A}, decomposed into AB and BC Is this dependency preserving? Is C A preserved??? F + includes FDs, {A B, B C, C A, B A, A C, C B} F AB = {A B, B A}, F BC = {B C, C B}, F AB U F BC = {A B, B A, B C, C B} Does the closure of F AB U F BC imply C A?

68 Dependency Preserving Decomposition Dependency preserving does not imply lossless join: Ex: ABC, A B, decomposed into AB and BC, is a lossy decomposition And vice-versa! Ex: CSJDPQV, {C is key, JP C and SD P}, decomposed into CSJDQV and SDP, is lossless but not dependency preserving

69 Normalization Converting relations to BCNF Possible to obtain a lossless-join decomposition into a collection of BCNF relation schemas But, there may be no dependency-preserving decomposition into a collection of BCNF relation schemas Converting relations to 3NF There is always a dependency-preserving, losslessjoin decomposition into a collection of 3NF relation schemas

70 Decomposition into BCNF Consider relation R (ABCD) with FDs F (AB is key) If X Y (A CD) violates BCNF, decompose R into R-Y (ABCD - CD) and XY (A CD) Y is a single attribute and not in X Repeated application of this idea will give us a collection of relations that are in BCNF, lossless join decomposition, and guaranteed to terminate In general, several dependencies may cause violation of BCNF The order in which we deal with them could lead to very different sets of relations!

71 BCNF decomposition Given a relation R and FDs F for R Compute keys for R (using FDs) Repeat until all relations are in BCNF Pick any R with A B that violates BCNF Decompose R into R 1 (A,B) and R 2 (A, rest) Compute FDs for R 1 and R 2 Compute keys for R 1 and R 2

72 Example: Decomposition into BCNF R = ABCDEFGH with FDs ABH C : A DE : BGH F F ADH : BH GE Is R in BCNF? Which FD violates the BCNF? ABH C? No, since ABH is a superkey A DE violates BCNF Since attribute closure of A is ADE and therefore A is not a superkey Decompose R = ABCDEFGH into R1 = ADE and R2 = ABCFGH

73 Example: Decomposition into BCNF R = ABCDEFGH with FDs ABH C : A DE : BGH F F ADH : BH GE R1 = ADE, F1 = {A DE} R2 = ABCFGH, F2 = {ABH C, BGH F, F AH, BH G} New FDs are obtained by projecting the original FDs on the attributes in the new relations For example: BH GE is decomposed into {BH G, BH E} and BH E is not included in F1 or F2, BH G is included into R2 Is the decomposition of R into R1 and R2 dependency preserving? R1 is in BCNF, but we need to apply the algorithm on R2 since it is not in BCNF

74 BCNF and Dependency Preservation In general, there may not be a dependency preserving decomposition into BCNF e.g., SBD, SB D, D B Can t decompose while preserving 1 st FD; not in BCNF Similarly, decomposition of CSJDPQV into SDP, JS and CJDQV is not dependency preserving (w.r.t. the FDs JP C, SD P and J S) However, it is a lossless join decomposition In this case, adding JPC to the collection of relations gives us a dependency preserving decomposition JPC tuples stored only for checking FD! (Redundancy!) there is no such redundancy within a single BCNF relation This example shows that redundancy can still occur across relations, even though there is no redundancy within a relation

75 Decomposition into 3NF The algorithm for lossless-join decomposition into BCNF can be used to obtain a lossless join decomposition into 3NF (typically, can stop earlier) But this approach does not ensure dependencypreservation To ensure dependency preservation, one idea: If X Y is not preserved, add relation XY Problem is that XY may violate 3NF! e.g., consider the addition of CJP to preserve JP C. What if we also have J C? Refinement: Instead of the given set of FDs F, use a minimal cover for F

76 Minimal Cover for a Set of FDs Minimal cover FD set of G for a set of FDs F s.t.: The closure of F (F + ) = The closure of G (G + ) Right hand side of each FD in G is a single attribute If we modify G by deleting an FD or by deleting attributes from an FD in G, the closure changes Intuitively, every FD in G is needed, and as small as possible in order to get the same closure as F every dependency in it is required for the closure to be equal to F + each attribute on the left side is necessary the right side is a single attribute e.g., A B, ABCD E, EF GH, ACDF EG has the following minimal cover: A B, ACD E, EF G and EF H

77 Minimal Cover for a Set of FDs A B, ABCD E, EF GH, ACDF EG has the following minimal cover by: ACDF EG => ACDF E and ACDF G EF GH => EF G and EF H ABCD E can be replaced by ACD E since A B holds ACDF G is implied by A B, ABCD E, EF GH A B => A AB => AC ABC => ACD ABCD => ACD E => ACDF EF => ACDF GH => ACDF G (can be deleted) A B => A AB => AC ABC => ACD ABCD => ACD E => ACDF EF => ACDF E (can be deleted) A B, ACD E, EF G and EF H

78 Obtaining the Minimal Cover Algorithm Steps: 1. Put the FDs in standard form single attribute on the right hand side 2. Minimize the left hand side of each FD For each FD check if an attribute in LHS can be deleted while preserving equivalence to closure of F 3. Delete the redundant FDs It is necessary to minimize the left sides of FDs before checking for redundant FDs

79 Obtaining the Minimal Cover Example: F = {ABCD E, E D, A B, AC D} Notice that the right hand sides have a single attribute if not we had to decompose the right hand sides first Can we remove B from the left hand side of ABCD E? Check if ACD E is implied by F In order to do this, find the attribute closure ACD wrt F If B is in the attribute closure, then ACD E is implied by F, and therefore we can replace ABCD E with ACD E note that given ACD E, we have ABCD E A B => A AB => ACD ABCD => ACD E Can we remove D from ACD E Check if AC E is implied by F obtained by replacing ABCD E in F with ACD E F = {AC E, E D, A B, AC D} Can we drop any FDs in F? Could we drop any FDs in F before minimizing the left hand sides?

80 Dependency Preserving Decomposition into 3 rd NF Let R be the relation to be decomposed into 3 rd NF and F be the FDs that is a minimal cover Algorithm Steps Perform lossless-join decomposition of R into R1, R2,, Rn Project the FDs in F into F1, F2,, Fn that correspond to R1, R2,, Rn Identify the set of FDs that are not preserved i.e., that are not in the closure of the union of F1, F2,, Fn For each FD X A that is not preserved, create a relation schema XA and add it to the decomposition

81 Example Consider the relation R, Contracts(CSJDPQV) with FDs (C is a key): JP C, SD P, J S Decomposed into R1(SDP), R2(CSJDQV) R1 in BCNF, R2 not in 3NF Decompose R2(CSJDQV) into R3(JS), R4(CJDQV) Both R3 and R4 in 3NF (in BCNF also) Decomposition of R into R1, R3, R4 is lossless-join But not dependency-preserving (JP C is not preserved) Add R5(CJP) into relation Resulting decomposition is CSJDPQV into SDP, JS, CJDQV, CJP

82 Synthesis Approach Consider the relation R, Contracts(CSJDPQV) with FDs (C is a key): JP C, SD P, J S Find minimal cover C CSJDPQV into C S, C J, C D, C P, C Q, C V C S is implied by C J and J S C P is implied by C S, C D and SD P Final set is (C J, C D, C Q, C V, JP C, SD P, J S) So add corresponding schemas for all the FDs in minimal cover CJ, CD, CQ, CV, CJP, SDP, JS Improve this set by combining relations for which C is the key into CDJPQV CDJPQV, SDP, JS

MIS Database Systems Schema Refinement and Normal Forms

MIS Database Systems Schema Refinement and Normal Forms MIS 335 - Database Systems Schema Refinement and Normal Forms http://www.mis.boun.edu.tr/durahim/ Ahmet Onur Durahim Learning Objectives Anomalies Functional Dependencies Normal Forms 1 st NF, 2 nd NF,

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007

Schema Refinement and Normal Forms. The Evils of Redundancy. Schema Refinement. Yanlei Diao UMass Amherst April 10, 2007 Schema Refinement and Normal Forms Yanlei Diao UMass Amherst April 10, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Functional Dependencies (FDs) Example: Constraints on Entity Set. Example (Contd. The Evils of Redundancy Schema Refinement and Normal Forms INFO 330, Fall 2006 1 Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) [R&G] Chapter 19 Schema Refinement and Normal Forms [R&G] Chapter 19 CS432 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage, insert/delete/update

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd.

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Example (Contd. The Evils of Redundancy Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 Redundancy is at the root of several problems associated with relational

More information

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normal Forms. Example: Constraints on Entity Set. Functional Dependencies (FDs) Refining an ER Diagram Schema Refinement and Normal Forms Chapter 19 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational

More information

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004

Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) CIS 330, Spring 2004 Lecture 11 March 2, 2004 Schema Refinement and Normal Forms CIS 330, Spring 2004 Lecture 11 March 2, 2004 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational schemas: redundant storage,

More information

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007

Schema Refinement and Normal Forms. Case Study: The Internet Shop. Redundant Storage! Yanlei Diao UMass Amherst November 1 & 6, 2007 Schema Refinement and Normal Forms Yanlei Diao UMass Amherst November 1 & 6, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Case Study: The Internet Shop DBDudes Inc.: a well-known database consulting

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Chapter 19 Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke 1 The Evils of Redundancy Redundancy is at the root of several problems associated with relational

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms UMass Amherst Feb 14, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke, Dan Suciu 1 Relational Schema Design Conceptual Design name Product buys Person price name

More information

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram

The Evils of Redundancy. Schema Refinement and Normalization. Functional Dependencies (FDs) Example: Constraints on Entity Set. Refining an ER Diagram The Evils of Redundancy Schema Refinement and Normalization Chapter 1 Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus Redundancy is at the root of several problems

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Chapter 19 Quiz #2 Next Thursday Comp 521 Files and Databases Fall 2012 1 The Evils of Redundancy v Redundancy is at the root of several problems associated with relational

More information

Schema Refinement and Normalization

Schema Refinement and Normalization Schema Refinement and Normalization Schema Refinements and FDs Redundancy is at the root of several problems associated with relational schemas. redundant storage, I/D/U anomalies Integrity constraints,

More information

Schema Refinement and Normal Forms

Schema Refinement and Normal Forms Schema Refinement and Normal Forms Yanlei Diao UMass Amherst April 10 & 15, 2007 Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Case Study: The Internet Shop DBDudes Inc.: a well-known database consulting

More information

Schema Refinement and Normal Forms Chapter 19

Schema Refinement and Normal Forms Chapter 19 Schema Refinement and Normal Forms Chapter 19 Instructor: Vladimir Zadorozhny vladimir@sis.pitt.edu Information Science Program School of Information Sciences, University of Pittsburgh Database Management

More information

Schema Refinement & Normalization Theory

Schema Refinement & Normalization Theory Schema Refinement & Normalization Theory Functional Dependencies Week 13 1 What s the Problem Consider relation obtained (call it SNLRHW) Hourly_Emps(ssn, name, lot, rating, hrly_wage, hrs_worked) What

More information

CS 186, Fall 2002, Lecture 6 R&G Chapter 15

CS 186, Fall 2002, Lecture 6 R&G Chapter 15 Schema Refinement and Normalization CS 186, Fall 2002, Lecture 6 R&G Chapter 15 Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus Functional Dependencies (Review)

More information

Schema Refinement and Normal Forms. Chapter 19

Schema Refinement and Normal Forms. Chapter 19 Schema Refinement and Normal Forms Chapter 19 1 Review: Database Design Requirements Analysis user needs; what must the database do? Conceptual Design high level descr. (often done w/er model) Logical

More information

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Schema Refinement. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke Schema Refinement Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke 1 Revisit a Previous Example ssn name Lot Employees rating hourly_wages hours_worked ISA contractid Hourly_Emps

More information

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1

CAS CS 460/660 Introduction to Database Systems. Functional Dependencies and Normal Forms 1.1 CAS CS 460/660 Introduction to Database Systems Functional Dependencies and Normal Forms 1.1 Review: Database Design Requirements Analysis user needs; what must database do? Conceptual Design high level

More information

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1

Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Schema Refinement & Normalization Theory: Functional Dependencies INFS-614 INFS614, GMU 1 Background We started with schema design ER model translation into a relational schema Then we studied relational

More information

Introduction to Data Management. Lecture #6 (Relational DB Design Theory)

Introduction to Data Management. Lecture #6 (Relational DB Design Theory) Introduction to Data Management Lecture #6 (Relational DB Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework

More information

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li

CS122A: Introduction to Data Management. Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li CS122A: Introduction to Data Management Lecture #13: Relational DB Design Theory (II) Instructor: Chen Li 1 Third Normal Form (3NF) v Relation R is in 3NF if it is in 2NF and it has no transitive dependencies

More information

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000

12/3/2010 REVIEW ALGEBRA. Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000 REVIEW Exam Su 3:30PM - 6:30PM 2010/12/12 Room C9000 2 ALGEBRA 1 RELATIONAL ALGEBRA OPERATIONS Basic operations Selection ( ) Selects a subset of rows from relation. Projection ( ) Deletes unwanted columns

More information

Introduction to Data Management. Lecture #6 (Relational Design Theory)

Introduction to Data Management. Lecture #6 (Relational Design Theory) Introduction to Data Management Lecture #6 (Relational Design Theory) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v HW#2 is

More information

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II)

Introduction to Data Management. Lecture #7 (Relational DB Design Theory II) Introduction to Data Management Lecture #7 (Relational DB Design Theory II) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements v Homework

More information

Lecture #7 (Relational Design Theory, cont d.)

Lecture #7 (Relational Design Theory, cont d.) Introduction to Data Management Lecture #7 (Relational Design Theory, cont d.) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements

More information

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Normal Forms (ii) ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa ICS 321 Fall 2012 Normal Forms (ii) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 9/12/2012 Lipyeow Lim -- University of Hawaii at Manoa 1 Hourly_Emps

More information

Schema Refinement and Normal Forms. Why schema refinement?

Schema Refinement and Normal Forms. Why schema refinement? Schema Refinement and Normal Forms Why schema refinement? Consider relation obtained from Hourly_Emps: Hourly_Emps (sin,rating,hourly_wages,hourly_worked) Problems: Update Anomaly: Can we change the wages

More information

Functional Dependencies

Functional Dependencies Functional Dependencies CS 186, Fall 2002, Lecture 5 R&G Chapter 15 Science is the knowledge of consequences, and dependence of one fact upon another. Thomas Hobbes (1588-1679) Administrivia Most admissions

More information

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi

Design Theory for Relational Databases. Spring 2011 Instructor: Hassan Khosravi Design Theory for Relational Databases Spring 2011 Instructor: Hassan Khosravi Chapter 3: Design Theory for Relational Database 3.1 Functional Dependencies 3.2 Rules About Functional Dependencies 3.3 Design

More information

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies?

Introduction. Normalization. Example. Redundancy. What problems are caused by redundancy? What are functional dependencies? Normalization Introduction What problems are caused by redundancy? UVic C SC 370 Dr. Daniel M. German Department of Computer Science What are functional dependencies? What are normal forms? What are the

More information

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Normal Forms 1. ICS 321 Fall Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa ICS 321 Fall 2013 Normal Forms 1 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 9/16/2013 Lipyeow Lim -- University of Hawaii at Manoa 1 The Problem with

More information

Relational Design Theory

Relational Design Theory Relational Design Theory CSE462 Database Concepts Demian Lessa/Jan Chomicki Department of Computer Science and Engineering State University of New York, Buffalo Fall 2013 Overview How does one design a

More information

Introduction to Data Management. Lecture #9 (Relational Design Theory, cont.)

Introduction to Data Management. Lecture #9 (Relational Design Theory, cont.) Introduction to Data Management Lecture #9 (Relational Design Theory, cont.) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 It s time again

More information

Schema Refinement. Feb 4, 2010

Schema Refinement. Feb 4, 2010 Schema Refinement Feb 4, 2010 1 Relational Schema Design Conceptual Design name Product buys Person price name ssn ER Model Logical design Relational Schema plus Integrity Constraints Schema Refinement

More information

Relational Database Design

Relational Database Design Relational Database Design Jan Chomicki University at Buffalo Jan Chomicki () Relational database design 1 / 16 Outline 1 Functional dependencies 2 Normal forms 3 Multivalued dependencies Jan Chomicki

More information

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database

CSE 303: Database. Outline. Lecture 10. First Normal Form (1NF) First Normal Form (1NF) 10/1/2016. Chapter 3: Design Theory of Relational Database CSE 303: Database Lecture 10 Chapter 3: Design Theory of Relational Database Outline 1st Normal Form = all tables attributes are atomic 2nd Normal Form = obsolete Boyce Codd Normal Form = will study 3rd

More information

Functional Dependencies, Schema Refinement, and Normalization for Relational Databases. CSC 375, Fall Chapter 19

Functional Dependencies, Schema Refinement, and Normalization for Relational Databases. CSC 375, Fall Chapter 19 Functional Dependencies, Schema Refinement, and Normalization for Relational Databases CSC 375, Fall 2017 Chapter 19 Science is the knowledge of consequences, and dependence of one fact upon another. Thomas

More information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information

UVA UVA UVA UVA. Database Design. Relational Database Design. Functional Dependency. Loss of Information Relational Database Design Database Design To generate a set of relation schemas that allows - to store information without unnecessary redundancy - to retrieve desired information easily Approach - design

More information

Functional Dependencies & Normalization. Dr. Bassam Hammo

Functional Dependencies & Normalization. Dr. Bassam Hammo Functional Dependencies & Normalization Dr. Bassam Hammo Redundancy and Normalisation Redundant Data Can be determined from other data in the database Leads to various problems INSERT anomalies UPDATE

More information

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17

Normalization. October 5, Chapter 19. CS445 Pacific University 1 10/05/17 Normalization October 5, 2017 Chapter 19 Pacific University 1 Description A Real Estate agent wants to track offers made on properties. Each customer has a first and last name. Each property has a size,

More information

Functional Dependencies and Normalization

Functional Dependencies and Normalization Functional Dependencies and Normalization There are many forms of constraints on relational database schemata other than key dependencies. Undoubtedly most important is the functional dependency. A functional

More information

SCHEMA NORMALIZATION. CS 564- Fall 2015

SCHEMA NORMALIZATION. CS 564- Fall 2015 SCHEMA NORMALIZATION CS 564- Fall 2015 HOW TO BUILD A DB APPLICATION Pick an application Figure out what to model (ER model) Output: ER diagram Transform the ER diagram to a relational schema Refine the

More information

Information Systems (Informationssysteme)

Information Systems (Informationssysteme) Information Systems (Informationssysteme) Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2015 c Jens Teubner Information Systems Summer 2015 1 Part VII Schema Normalization c Jens Teubner

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 32 Schema Design When we get a relational

More information

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018

CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 CSC 261/461 Database Systems Lecture 10 (part 2) Spring 2018 Announcement Read Chapter 14 and 15 You must self-study these chapters Too huge to cover in Lectures Project 2 Part 1 due tonight Agenda 1.

More information

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata

Database Normaliza/on. Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata Database Normaliza/on Debapriyo Majumdar DBMS Fall 2016 Indian Statistical Institute Kolkata Problems with redundancy Data (attributes) being present in multiple tables Potential problems Increase of storage

More information

CS54100: Database Systems

CS54100: Database Systems CS54100: Database Systems Keys and Dependencies 18 January 2012 Prof. Chris Clifton Functional Dependencies X A = assertion about a relation R that whenever two tuples agree on all the attributes of X,

More information

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein

Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Design Theory: Functional Dependencies and Normal Forms, Part I Instructor: Shel Finkelstein Reference: A First Course in Database Systems, 3 rd edition, Chapter 3 Important Notices CMPS 180 Final Exam

More information

Database Design and Normalization

Database Design and Normalization Database Design and Normalization Chapter 11 (Week 12) EE562 Slides and Modified Slides from Database Management Systems, R. Ramakrishnan 1 1NF FIRST S# Status City P# Qty S1 20 London P1 300 S1 20 London

More information

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh

Functional Dependencies and Normalization. Instructor: Mohamed Eltabakh Functional Dependencies and Normalization Instructor: Mohamed Eltabakh meltabakh@cs.wpi.edu 1 Goal Given a database schema, how do you judge whether or not the design is good? How do you ensure it does

More information

INF1383 -Bancos de Dados

INF1383 -Bancos de Dados INF1383 -Bancos de Dados Prof. Sérgio Lifschitz DI PUC-Rio Eng. Computação, Sistemas de Informação e Ciência da Computação Projeto de BD e Formas Normais Alguns slides são baseados ou modificados dos originais

More information

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design

10/12/10. Outline. Schema Refinements = Normal Forms. First Normal Form (1NF) Data Anomalies. Relational Schema Design Outline Introduction to Database Systems CSE 444 Design theory: 3.1-3.4 [Old edition: 3.4-3.6] Lectures 6-7: Database Design 1 2 Schema Refinements = Normal Forms 1st Normal Form = all tables are flat

More information

Chapter 8: Relational Database Design

Chapter 8: Relational Database Design Chapter 8: Relational Database Design Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 8: Relational Database Design Features of Good Relational Design Atomic Domains

More information

Lectures 6. Lecture 6: Design Theory

Lectures 6. Lecture 6: Design Theory Lectures 6 Lecture 6: Design Theory Lecture 6 Announcements Solutions to PS1 are posted online. Grades coming soon! Project part 1 is out. Check your groups and let us know if you have any issues. We have

More information

Design Theory for Relational Databases

Design Theory for Relational Databases Design Theory for Relational Databases FUNCTIONAL DEPENDENCIES DECOMPOSITIONS NORMAL FORMS 1 Functional Dependencies X ->Y is an assertion about a relation R that whenever two tuples of R agree on all

More information

Database Design and Implementation

Database Design and Implementation Database Design and Implementation CS 645 Schema Refinement First Normal Form (1NF) A schema is in 1NF if all tables are flat Student Name GPA Course Student Name GPA Alice 3.8 Bob 3.7 Carol 3.9 Alice

More information

Problem about anomalies

Problem about anomalies Problem about anomalies Title Year Genre StarName Star Wars 1977 SciFi Carrie Fisher Star Wars 1977 SciFi Harrison Ford Raiders... 1981 Action Harrison Ford Raiders... 1981 Adventure Harrison Ford When

More information

Chapter 3 Design Theory for Relational Databases

Chapter 3 Design Theory for Relational Databases 1 Chapter 3 Design Theory for Relational Databases Contents Functional Dependencies Decompositions Normal Forms (BCNF, 3NF) Multivalued Dependencies (and 4NF) Reasoning About FD s + MVD s 2 Remember our

More information

Constraints: Functional Dependencies

Constraints: Functional Dependencies Constraints: Functional Dependencies Fall 2017 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Functional Dependencies 1 / 42 Schema Design When we get a relational

More information

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design

Functional Dependencies. Applied Databases. Not all designs are equally good! An example of the bad design Applied Databases Handout 2a. Functional Dependencies and Normal Forms 20 Oct 2008 Functional Dependencies This is the most mathematical part of the course. Functional dependencies provide an alternative

More information

Functional Dependencies

Functional Dependencies Functional Dependencies Functional Dependencies Framework for systematic design and optimization of relational schemas Generalization over the notion of Keys Crucial in obtaining correct normalized schemas

More information

Design theory for relational databases

Design theory for relational databases Design theory for relational databases 1. Consider a relation with schema R(A,B,C,D) and FD s AB C, C D and D A. a. What are all the nontrivial FD s that follow from the given FD s? You should restrict

More information

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

CSC 261/461 Database Systems Lecture 8. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 CSC 261/461 Database Systems Lecture 8 Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101 Agenda 1. Database Design 2. Normal forms & functional dependencies 3. Finding functional dependencies

More information

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016

Normal Forms. Dr Paolo Guagliardo. University of Edinburgh. Fall 2016 Normal Forms Dr Paolo Guagliardo University of Edinburgh Fall 2016 Example of bad design BAD Title Director Theatre Address Time Price Inferno Ron Howard Vue Omni Centre 20:00 11.50 Inferno Ron Howard

More information

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018

DESIGN THEORY FOR RELATIONAL DATABASES. csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018 DESIGN THEORY FOR RELATIONAL DATABASES csc343, Introduction to Databases Renée J. Miller and Fatemeh Nargesian and Sina Meraji Winter 2018 1 Introduction There are always many different schemas for a given

More information

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 4: DESIGN THEORIES (FUNCTIONAL DEPENDENCIES) Design theory E/R diagrams are high-level design Formal theory for

More information

DECOMPOSITION & SCHEMA NORMALIZATION

DECOMPOSITION & SCHEMA NORMALIZATION DECOMPOSITION & SCHEMA NORMALIZATION CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? Bad schemas lead to redundancy To correct bad schemas: decompose relations

More information

Functional Dependency and Algorithmic Decomposition

Functional Dependency and Algorithmic Decomposition Functional Dependency and Algorithmic Decomposition In this section we introduce some new mathematical concepts relating to functional dependency and, along the way, show their practical use in relational

More information

CSC 261/461 Database Systems Lecture 13. Spring 2018

CSC 261/461 Database Systems Lecture 13. Spring 2018 CSC 261/461 Database Systems Lecture 13 Spring 2018 BCNF Decomposition Algorithm BCNFDecomp(R): Find X s.t.: X + X and X + [all attributes] if (not found) then Return R let Y = X + - X, Z = (X + ) C decompose

More information

CSE 132B Database Systems Applications

CSE 132B Database Systems Applications CSE 132B Database Systems Applications Alin Deutsch Database Design and Normal Forms Some slides are based or modified from originals by Sergio Lifschitz @ PUC Rio, Brazil and Victor Vianu @ CSE UCSD and

More information

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design! First Normal Form! Pitfalls in Relational Database Design! Functional Dependencies! Decomposition! Boyce-Codd Normal Form! Third

More information

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies

Design Theory. Design Theory I. 1. Normal forms & functional dependencies. Today s Lecture. 1. Normal forms & functional dependencies Design Theory BBM471 Database Management Systems Dr. Fuat Akal akal@hacettepe.edu.tr Design Theory I 2 Today s Lecture 1. Normal forms & functional dependencies 2. Finding functional dependencies 3. Closures,

More information

Introduction to Data Management. Lecture #10 (DB Design Wrap-up)

Introduction to Data Management. Lecture #10 (DB Design Wrap-up) Introduction to Data Management Lecture #10 (DB Design Wrap-up) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Announcements HW and exams: HW

More information

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19

FUNCTIONAL DEPENDENCY THEORY. CS121: Relational Databases Fall 2017 Lecture 19 FUNCTIONAL DEPENDENCY THEORY CS121: Relational Databases Fall 2017 Lecture 19 Last Lecture 2 Normal forms specify good schema patterns First normal form (1NF): All attributes must be atomic Easy in relational

More information

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design

Chapter 7: Relational Database Design. Chapter 7: Relational Database Design Chapter 7: Relational Database Design Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design Functional Dependencies Decomposition Boyce-Codd Normal Form Third Normal

More information

CMPT 354: Database System I. Lecture 9. Design Theory

CMPT 354: Database System I. Lecture 9. Design Theory CMPT 354: Database System I Lecture 9. Design Theory 1 Design Theory Design theory is about how to represent your data to avoid anomalies. Design 1 Design 2 Student Course Room Mike 354 AQ3149 Mary 354

More information

CSIT5300: Advanced Database Systems

CSIT5300: Advanced Database Systems CSIT5300: Advanced Database Systems L05: Functional Dependencies Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR, China

More information

Lecture 15 10/02/15. CMPSC431W: Database Management Systems. Instructor: Yu- San Lin

Lecture 15 10/02/15. CMPSC431W: Database Management Systems. Instructor: Yu- San Lin CMPSC431W: Database Management Systems Lecture 15 10/02/15 Instructor: Yu- San Lin yusan@psu.edu Course Website: hcp://www.cse.psu.edu/~yul189/cmpsc431w Slides based on McGraw- Hill & Dr. Wang- Chien Lee

More information

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status)

But RECAP. Why is losslessness important? An Instance of Relation NEWS. Suppose we decompose NEWS into: R1(S#, Sname) R2(City, Status) So far we have seen: RECAP How to use functional dependencies to guide the design of relations How to modify/decompose relations to achieve 1NF, 2NF and 3NF relations But How do we make sure the decompositions

More information

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties

Review: Keys. What is a Functional Dependency? Why use Functional Dependencies? Functional Dependency Properties Review: Keys Superkey: set of attributes whose values are unique for each tuple Note: a superkey isn t necessarily minimal. For example, for any relation, the entire set of attributes is always a superkey.

More information

Design Theory for Relational Databases

Design Theory for Relational Databases Design Theory for Relational Databases Keys: formal definition K is a superkey for relation R if K functionally determines all attributes of R K is a key for R if K is a superkey, but no proper subset

More information

Normaliza)on and Func)onal Dependencies

Normaliza)on and Func)onal Dependencies Normaliza)on and Func)onal Dependencies 1NF and 2NF Redundancy and Anomalies Func)onal Dependencies A9ribute Closure Keys and Super keys 3NF BCNF Minimal Cover Algorithm 3NF Synthesis Algorithm Decomposi)on

More information

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due

Information Systems for Engineers. Exercise 8. ETH Zurich, Fall Semester Hand-out Due Information Systems for Engineers Exercise 8 ETH Zurich, Fall Semester 2017 Hand-out 24.11.2017 Due 01.12.2017 1. (Exercise 3.3.1 in [1]) For each of the following relation schemas and sets of FD s, i)

More information

Relational Design: Characteristics of Well-designed DB

Relational Design: Characteristics of Well-designed DB Relational Design: Characteristics of Well-designed DB 1. Minimal duplication Consider table newfaculty (Result of F aculty T each Course) Id Lname Off Bldg Phone Salary Numb Dept Lvl MaxSz 20000 Cotts

More information

Functional Dependencies. Getting a good DB design Lisa Ball November 2012

Functional Dependencies. Getting a good DB design Lisa Ball November 2012 Functional Dependencies Getting a good DB design Lisa Ball November 2012 Outline (2012) SEE NEXT SLIDE FOR ALL TOPICS (some for you to read) Normalization covered by Dr Sanchez Armstrong s Axioms other

More information

Functional Dependency Theory II. Winter Lecture 21

Functional Dependency Theory II. Winter Lecture 21 Functional Dependency Theory II Winter 2006-2007 Lecture 21 Last Time Introduced Third Normal Form A weakened version of BCNF that preserves more functional dependencies Allows non-trivial dependencies

More information

CS322: Database Systems Normalization

CS322: Database Systems Normalization CS322: Database Systems Normalization Dr. Manas Khatua Assistant Professor Dept. of CSE IIT Jodhpur E-mail: manaskhatua@iitj.ac.in Introduction The normalization process takes a relation schema through

More information

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

11/6/11. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design) Relational Schema Design Introduction to Management CSE 344 Lectures 16: Database Design Conceptual Model: Relational Model: plus FD s name Product buys Person price name ssn Normalization: Eliminates

More information

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms

Database Design: Normal Forms as Quality Criteria. Functional Dependencies Normal Forms Design and Normal forms Database Design: Normal Forms as Quality Criteria Functional Dependencies Normal Forms Design and Normal forms Design Quality: Introduction Good conceptual model: - Many alternatives - Informal guidelines

More information

CSE 344 AUGUST 3 RD NORMALIZATION

CSE 344 AUGUST 3 RD NORMALIZATION CSE 344 AUGUST 3 RD NORMALIZATION ADMINISTRIVIA WQ6 due Monday DB design HW7 due next Wednesday DB design normalization DATABASE DESIGN PROCESS Conceptual Model: name product makes company price name address

More information

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies

Relational Design Theory I. Functional Dependencies: why? Redundancy and Anomalies I. Functional Dependencies Relational Design Theory I Functional Dependencies Functional Dependencies: why? Design methodologies: Bottom up (e.g. binary relational model) Top-down (e.g. ER leads to this) Needed: tools for analysis

More information

Func8onal Dependencies

Func8onal Dependencies ICS 321 Data Storage & Retrieval Func8onal Dependencies Prof. Lipyeow Lim Informa8on & Computer Science Department University of Hawaii at Manoa Lipyeow Lim - - University of Hawaii at Manoa 1 Example:

More information

Lossless Joins, Third Normal Form

Lossless Joins, Third Normal Form Lossless Joins, Third Normal Form FCDB 3.4 3.5 Dr. Chris Mayfield Department of Computer Science James Madison University Mar 19, 2018 Decomposition wish list 1. Eliminate redundancy and anomalies 2. Recover

More information

Introduction to Database Systems CSE 414. Lecture 20: Design Theory

Introduction to Database Systems CSE 414. Lecture 20: Design Theory Introduction to Database Systems CSE 414 Lecture 20: Design Theory CSE 414 - Spring 2018 1 Class Overview Unit 1: Intro Unit 2: Relational Data Models and Query Languages Unit 3: Non-relational data Unit

More information

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design)

11/1/12. Relational Schema Design. Relational Schema Design. Relational Schema Design. Relational Schema Design (or Logical Design) Relational Schema Design Introduction to Management CSE 344 Lectures 16: Database Design Conceptual Model: Relational Model: plus FD s name Product buys Person price name ssn Normalization: Eliminates

More information

Normal Forms Lossless Join.

Normal Forms Lossless Join. Normal Forms Lossless Join http://users.encs.concordia.ca/~m_oran/ 1 Types of Normal Forms A relation schema R is in the first normal form (1NF) if the domain of its each attribute has only atomic values

More information

Schema Refinement: Other Dependencies and Higher Normal Forms

Schema Refinement: Other Dependencies and Higher Normal Forms Schema Refinement: Other Dependencies and Higher Normal Forms Spring 2018 School of Computer Science University of Waterloo Databases CS348 (University of Waterloo) Higher Normal Forms 1 / 14 Outline 1

More information

CSE 344 AUGUST 6 TH LOSS AND VIEWS

CSE 344 AUGUST 6 TH LOSS AND VIEWS CSE 344 AUGUST 6 TH LOSS AND VIEWS ADMINISTRIVIA WQ6 due tonight HW7 due Wednesday DATABASE DESIGN PROCESS Conceptual Model: name product makes company price name address Relational Model: Tables + constraints

More information