Download Database Normalization: Functional Dependencies and Normal Forms - Lecture Notes and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!
Normalization
Oct 2017
- What problems are caused by redundancy?
- What are functional dependencies?
11โ2 Normalization (1.1.0)
- What are normal forms?
- What are the benefits of 3NF and BCNF?
- What is the process to decompose a relation into its normal forms?
ssn name lot rating hourWages hoursWorked 123-45- 234-45- 451-78- 457-89-
Smiley Smethurst Guldu Madayan
48 22 35 35
8 8 5 5
10 10 7 7
40 30 30 32
- The hourly rate is determined by the rating (this is an example of a functional dependency)
- Problems: - Redundant Storage : (8, 10) and (5,7) are repeated - Update Anomalies : The hourWages in the first tuple could be updated without making a similar update to the second tuple
ssn name lot rating hourWages hoursWorked 123-45- 234-45-789451-78- 457-89-
Smiley SmethurstGuldu Madayan
48 (^2235) 35
8 (^85) 5
10 (^107) 7
40 (^3030) 32
- Problems (cont): - Insertion Anomalies : We need to know the hourWage in order to insert a tuple (this could be fixed with a NULL value) - Delete Anomalies : If we delete all the tuples with a given (rating, hourWages) we might lose that association
Problems related to Decomposition
- Do we need to decompose? - We use normal forms to answer this question
- What problems arise with a given decomposition? We are interested in the following properties of decompositions: - Loss-less join : Can we rebuild the original relation? - Dependency preservation : Do we preserve the ICs??
Functional Dependencies
- A functional dependency (FD) is a kind of IC that generalizes the concept of a key. Let R be a relation schema, with X and Y be nonempty sets of attributes in R. For an instance r of R, we say that the FD X โ Y (X functionally determines Y) is satisfied if:
โt 1 , t 2 โ r, t 1 .X = t 2 .X =โ t 1 .Y = t 2 .Y
11โ8 Normalization (1.1.0)
- Given a set of FDs we can usually find additional FDs that also
hold.
- Example: Given a key we can always find a superkey
- We say that an FD f is implied^ by a set^ F^ of FDs for relation schema R, if
โr โ R, โg โ F, g holds =โ f holds
- The set of all FDs implied by a given set F of FDs is called the
closure of F , denoted by F +
- Armstrongโs Axioms for FDs. X, Y, Z are sets of attributes over a relation schema R. - Reflexivity : Y โ X =โ X โ Y - Augmentation : X โ Y =โ โZ, X Z โ Y Z - Transitivity : X โ Y โง Y โ Z =โ X โ Z
- What FDs can be inferred from: A โ C and B โ C?
- We are given a relation C ontracts(contractid, supplierid, projectid, deptid, partid, qty, v - contractid is the key - A project purchases a given part using a single contract - A department purchases at most one part from a supplier
- The attribute closure X + with respect to a set F of FDs is the set of attributes A s.t. X โ A can be inferred using Armstrongโs Axioms.
- Algorithm to compute the closure:
closure = X ; repeat until there is no change: { if โU โ V โ F s.t.U โ closure, then set closure = closure U V
}
- This algorithm can be used to determine if X โ Y is in F +
- A relation is in first normal form if every field contains only
atomic values (no lists nor sets)
Boyce-Codd Normal Form
- Let R be a schema
- F be a set of FDs that hold over R
- X be a subset of the attributes of R
- A be an attribute of R
- R is in Boyce-Codd Normal Form , if for every FD X โ A, one of the following is true: - A โ X (it is a trivial FD) or - X is a superkey
- In order to test if a relation is in 3NF, we need to find all the keys
of the relation
- Finding all keys in a relation is NP complete!
- So is finding out if a relation is in 3NF
- Why do we want to use 3NF? Because any^ relation can be decomposed into a set of 3NF relations
- Let R be a schema
- F be a set of FDs that hold over R
- a decomposition of R into two schemas with attributes X and Y is a lossless-join decomposition with respect to F if โr โ R that satisfies F :
ฯ x (r) t><l ฯ y (r) = r
11โ26 Normalization (1.1.0)