Practice and Applications of Data Management CMPSCI 345 Lecture 15: Functional Dependencies
First Normal Form (1NF) } A database schema is in First Normal Form if all tables are flat Student Student Name GPA Course Alice 3.8 Bob 3.7 Carol 3.9 Math DB OS DB OS Math OS May need to add keys Name GPA Alice 3.8 Bob 3.7 Carol 3.9 Takes Student Alice Carol Alice Bob Alice Carol Course Math Math DB DB OS OS Course Course Math DB OS 2
Conceptual Schema Design name Conceptual Model: PaEent paeent_of Doctor zip name dno RelaEonal Model: plus FD s (FD = FuncEonal Dependency) NormalizaEon: Eliminates anomalies 3
Data Anomalies } When a database is poorly designed we get anomalies: } Redundancy: data is repeated } Update anomalies: need to change in several places } Delete anomalies: may lose data when we don t want 4
Relational Schema Design Recall set atributes (persons with several phones): Name SSN PhoneNumber City Fred 123-45-6789 413-555-1234 Amherst Fred 123-45-6789 413-555-6543 Amherst Joe 987-65-4321 908-555-2121 Wes]ield One person may have muleple phones, but lives in only one city Primary key is thus (SSN, PhoneNumber) The above is in 1NF, but what is the problem with this schema? 5
Relational Schema Design Recall set atributes (persons with several phones): Name SSN PhoneNumber City Fred 123-45-6789 413-555-1234 Amherst Fred 123-45-6789 413-555-6543 Amherst Joe 987-65-4321 908-555-2121 Wes]ield Anomalies: Redundancy = repeat data Update anomalies = what if Fred moves to Boston? DeleEon anomalies = what if Joe deletes his phone number? (what if Joe had only one phone #) 6
Relation Decomposition Break the rela2on into two: Name SSN PhoneNumber City Fred 123-45-6789 413-555-1234 Amherst Fred 123-45-6789 413-555-6543 Amherst Joe 987-65-4321 908-555-2121 Wes]ield Name SSN City Fred 123-45-6789 Amherst Joe 987-65-4321 Wes]ield SSN Anomalies have gone: No more repeated data Easy to move Fred to Boston (how?) Easy to delete all Joe s phone numbers (how?) PhoneNumber 123-45-6789 413-555-1234 123-45-6789 413-555-6543 987-65-4321 908-555-2121 7
Relational Schema Design (Logical Design) } Main idea: } Start with some relaeonal schema } Find out its funceonal dependencies (discussed next!) } Use them to design a beter relaeonal schema 8
Functional Dependencies } A form of constraint } Hence, part of the schema } Finding them is part of the database design } Use them to normalize the relaeons 9
Functional Dependencies (FDs) DefiniEon: If two tuples agree on the atributes A 1, A 2,, A n then they must also agree on the atributes B 1, B 2,, B m Formally: A 1, A 2,, A n à B 1, B 2,, B m 10
When Does a FD Hold DefiniEon: A 1,..., A m à B 1,..., B n holds in R if: t, t R, (t.a 1 = t.a 1... t.a m = t.a m t.b 1 = t.b 1... t.b n = t.b n ) R A 1... A m B 1... B n t t if t, t agree here then t, t agree here 11
Example An FD holds, or does not hold on an instance: EmpID Name Phone Posi2on E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer EmpID à Name, Phone, PosiEon PosiEon à Phone but not: Phone à PosiEon 12
Example EmpID Name Phone Posi2on E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer Position à Phone 13
Example EmpID Name Phone Posi2on E0045 Smith 1234 Clerk E3542 Mike 9876 Salesrep E1111 Smith 9876 Salesrep E9999 Mary 1234 Lawyer But not: Phone à Position 14
Example FD s are constraints: On some instances they hold On others they don t name à color category à department color, category à price name category color department price Gizmo Gadget Green Toys 49 Tweaker Gadget Green Toys 99 Does this instance saesfy all the FDs? 15
Example FD s are constraints: On some instances they hold On others they don t name à color category à department color, category à price name category color department price Gizmo Gadget Green Toys 49 Tweaker Gadget Green Toys 99 Gizmo StaEonary Blue Supplies 59 What about this one? 16 x
An Interesting Observation If all these FDs are true: name à color category à department color, category à price Then this FD also holds: name, category à price Why?? 17
Find ALL Functional Dependencies } Anomalies occur when certain bad FDs hold } We know some of the FDs } Need to find all FDs } Then look for the bad ones 18
Armstrong s Rules (1/3) A 1, A 2,, A n à B 1, B 2,, B m Is equivalent to Spli=ng rule and Combing rule A 1, A 2,, A n à B 1 A 1, A 2,, A n à B 2..... A 1, A 2,, A n à B m A1... Am B1... Bm 19 x
Armstrong s Rules (2/3) A 1, A 2,, A n à A i where i = 1, 2,..., n Trivial Rule Why? A 1 A m 20 x
Armstrong s Rules (3/3) Transi2ve Rule If A 1, A 2,, A n à B 1, B 2,, B m and B 1, B 2,, B m à C 1, C 2,, C p then A 1, A 2,, A n à C 1, C 2,, C p Why? 21 x
Armstrong s Rules (3/3) IllustraEon for TransiEvity A 1 A m B 1 B m C 1... C p 22
Example (continued) Start from the following FDs: Infer the following FDs: 1. name à color 2. category à department 3. color, category à price Inferred FD 4. name, category à name 5. name, category à color 6. name, category à category 7. name, category à color, category 8. name, category à price Which Rule did we apply? 23
Example (continued) Answers: 1. name à color 2. category à department 3. color, category à price Inferred FD 4. name, category à name Trivial Which Rule did we apply? 5. name, category à color TransiEvity on 4, 1 6. name, category à category Trivial 7. name, category à color, category Split/combine on 5, 6 8. name, category à price TransiEvity on 3, 7 THIS IS TOO HARD! Let s see an easier way. 24
Closure of a set of Attributes Given a set of atributes A 1,, A n The closure, {A 1,, A n } + = the set of atributes B s.t. A 1,, A n à B Example: name à color category à department color, category à price Closures: name + = {name, color} {name, category} + = {name, category, color, department, price} color + = {color} 25
Closure Algorithm X={A 1,, A n }. Repeat un2l X doesn t change do: if B 1,, B n à C is a FD and B 1,, B n are all in X then add C to X. Example: name à color category à department color, category à price {name, category} + = { name, category, color, department, price } Hence: name, category à color, department, price 26 x
Example In class: R(A,B,C,D,E,F) A, B à C A, D à E B à D A, F à B Compute {A,B} + X = {A, B, } Compute {A, F} + X = {A, F, } 27
Example In class: R(A,B,C,D,E,F) A, B à C A, D à E B à D A, F à B Compute {A,B} + X = {A, B, C, D, E } Compute {A, F} + X = {A, F, } 28
Example In class: R(A,B,C,D,E,F) A, B à C A, D à E B à D A, F à B Compute {A,B} + X = {A, B, C, D, E } Compute {A, F} + X = {A, F, B, C, D, E } 29
Why Do We Need Closure } With closure we can find all FD s easily } To check if X A } Compute X + } Check if A X + 30
Using Closure to Infer ALL FDs Example: A, B à C A, D à B B à D Step 1: Compute X +, for every X: A + = A, B + = BD, C + = C, D + = D AB + =ABCD, AC + =AC, AD + =ABCD, BC + =BCD, BD + =BD, CD + =CD ABC + = ABD + = ACD + = ABCD (no need to compute why?) BCD + = BCD, ABCD + = ABCD Step 2: Enumerate all FD s X à Y, s.t. Y X + and X Y = : B à D, AB à CD, ADàBC, BCàD, ABC à D, ABD à C, ACD à B 31
Another Example Enrollment(student, major, course, room, Eme) student à major major, course à room course à time What else can we infer? 32