





























Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
An introduction to normal forms in database systems, explaining functional dependencies, keys, superkeys, and the decomposition rule. It covers topics such as the third normal form, lossless decomposition, and dependency preservation. The document also discusses the importance of normalization in ensuring data integrity and improving query performance.
Typology: Study notes
1 / 37
This page cannot be seen from the preview
Don't miss anything!






























CSC343 – Introduction to Databases Normal Forms — 1
Database Design Database Redundancies and Anomalies Functional Dependencies Entailment, Closure and Equivalence Lossless Decompositions The Third Normal Form (3NF) The Boyce-Codd Normal Form (BCNF) Normal Forms and Database Design
CSC343 – Introduction to Databases Normal Forms — 2
We have seen how to design a relational schema by first designing an ER schema and then transforming it into a relational one. Now we focus on how to transform the generated relational schema into a "better" one. Goodness of relational schemas is defined in terms of the notion of normal form.
CSC343 – Introduction to Databases Normal Forms — 3
A normal form is a property of a database schema. When a database schema is un-normalized (that is, does not satisfy the normal form), it allows redundancies of various types which can lead to anomalies and inconsistencies. Normal forms can serve as basis for evaluating the quality of a database schema and constitutes a useful tool for database design. Normalization is a procedure that transforms an un-normalized schema into a normalized one.
CSC343 – Introduction to Databases Normal Forms — 4
Employee Salary Project Budget Function Brown 20 Mars 2 technician Green 35 Jupiter 15 designer Green 35 Venus 15 designer Hoskins 55 Venus 15 manager Hoskins 55 Jupiter 15 consultant Hoskins 55 Mars 2 consultant Moore 48 Mars 2 manager Moore 48 Venus 15 designer Kemp 48 Venus 15 designer Kemp 48 Jupiter 15 manager
CSC343 – Introduction to Databases Normal Forms — 7
Each employee has a unique salary. We represent this dependency as
and say " Salary functionally depends on Employee ". This means that everywhere we have the same Employee attribute value, we also get the same Salary attribute value. Likewise,
i.e., each project has a unique budget.
CSC343 – Introduction to Databases Normal Forms — 8
Given schema R(X) and non-empty subsets Y and Z of the attributes X , we say that there is a
iff for every relation instance r of R(X) and every pair of tuples t 1 , t 2 of r , if t 1 .Y = t 2 .Y , then t 1 .Z = t 2 .Z.
A functional dependency is a statement about all allowable relations for a given schema. Functional dependencies have to be identified by understanding the semantics of the application. Given a particular relation r 0 of R(X), we can tell if a dependency holds or not; but just because it holds for r 0 , doesn’t mean that it also holds for R(X)!
CSC343 – Introduction to Databases Normal Forms — 9
Employee Salary Project Budget Function Brown 20 Mars 2 technician Green 35 Jupiter 15 designer Green 35 Venus 15 designer Hoskins 55 Venus 15 manager Hoskins 55 Jupiter 15 consultant Hoskins 55 Mars 2 consultant Moore 48 Mars 2 manager Moore 48 Venus 15 designer Kemp 48 Venus 15 designer Kemp 48 Jupiter 15 manager
CSC343 – Introduction to Databases Normal Forms — 10
no attribute in Z appears among attributes of Y , e.g.,
Anomalies arise precisely for the attributes which are involved in (non-trivial) functional dependencies:
Moreover, note that our example includes another functional dependency:
CSC343 – Introduction to Databases Normal Forms — 13
Decomposition: Use two relations to store Person information:Person 99 Person1Person1 ( SI#, Name, Address ) 99 HobbiesHobbies ( SI#, Hobby ) The decomposition is more general: people with hobbies can now be described independently of their name and address. No update anomalies: 9 Name and address stored once; 9 A hobby can be separately supplied or deleted; 9 We can represent persons with no hobbies.
CSC343 – Introduction to Databases Normal Forms — 14
A superkey constraint is a special functional dependency: Let K be a set of attributes of R, and U the set of all attributes of R. Then K is a superkey iff the functional dependency K → U is satisfied in R. 9 E.g., SI# → SI#,Name,Address (for a Person relation) A key is a minimal superkey, I.e., for each X ⊂ K, X is not a superkey 9 SI#, Hobby → SI#, Name, Address, Hobby but 9 SI# → SI#, Name, Address, Hobby 9 Hobby → SI#, Name, Address, Hobby A key attribute is an attribute that is part of a key.
CSC343 – Introduction to Databases Normal Forms — 15
Address → PostalCode
9 DCS’s postal code is M5S 3H
Author, Title, Edition → PublicationDate
9 Ramakrishnan, et al., Database Management Systems, 3rd^ publication date is 2003
CourseID → ExamDate, ExamTime
9 CSC343’s exam date is December 18, starting at 7pm
CSC343 – Introduction to Databases Normal Forms — 16
Sometimes functional dependencies (FDs) seem to be saying the same thing, e.g., Addr → PostalCode,Str# vs Addr → PostalCode, Addr → Str#
Another example
Addr → PostalCode, PostalCode → Province vs Addr → PostalCode, PostalCode → Province Addr → Province When are two sets of FDs equivalent? How do we "infer" new FDs from given FDs?
CSC343 – Introduction to Databases Normal Forms — 19
This is the syntactic way of computing/testing semantic properties of FDs 9 Reflexivity : Y ⊆ X |- X → Y (trivial FD) e.g., | - Name, Address → Name 9 Augmentation : X → Y | - XZ → YZ e.g., Address → ZipCode | - Address,Name → ZipCode, Name 9 Transitivity : X → Y, Y → Z | - X → Z e.g., Phone# → Address, Address → ZipCode | - Phone# → ZipCode
CSC343 – Introduction to Databases Normal Forms — 20
Theorem: F |- f implies F |= f In words: If FD f: X → Y can be derived from a set of FDs F using the axioms, then f holds in every relation that satisfies every FD in F. Example: Given X → Y and X → Z then
Thus, X → YZ is satisfied in every relation where both X → Y and X → Z are satisfied. We have derived the union ruleunion rule for FDs.
X → XY Augmentation by X YX → YZ Augmentation by Y X → YZ Transitivity
CSC343 – Introduction to Databases Normal Forms — 21
Theorem: F |= f implies F |- f In words: If F entails f , then f can be derived from F using Armstrong's axioms. A consequence of completeness is the following (naïve) algorithm to determining if F entails f : AlgorithmAlgorithm : Use the axioms in all possible ways to generate F +^ (the set of possible FD’s is finite so this can be done) and see if f is in F +
CSC343 – Introduction to Databases Normal Forms — 22
The notions of soundness and completeness link the syntax (Armstrong’s axioms) with semantics, I.e., entailment defined in terms of relational instances. This is a precise way of saying that the algorithm for entailment based on the axioms is ``correct’’ with respect to the definitions.
CSC343 – Introduction to Databases Normal Forms — 25
Calculating attribute closure leads to a more efficient way of checking entailment. The (^) attribute closureattribute closure of a set of attributes X with respect to a set of FDs F , denoted X +F , is the set of all attributes A such that X → A 9 X +F is not necessarily same as X +G if F ≠ G Attribute closure and entailment: AlgorithmAlgorithm: Given a set of FDs,^ F , then^ X^ → Y if and only if Y ⊆ X +F
CSC343 – Introduction to Databases Normal Forms — 26
F : AB → C A → D D → E AC → B
X X F +
A {A, D, E} AB {A, B, C, D, E} AC {A, C, B, D, E} B {B} D {D, E} Is AB → E entailed by F? Yes Is D → C entailed by F? No Result : X F +^ allows us to determine all FDs of the form X → Y entailed by F
CSC343 – Introduction to Databases Normal Forms — 27
closure := X; // since X ⊆ X+ F repeat old := closure; if there is an FD Z → V in F such that Z ⊆ closure and V ⊆ closure then closure := closure ∪ V until old = closure
CSC343 – Introduction to Databases Normal Forms — 28
Each normal form is a set of conditions on a schema that together guarantee certain properties (relating to redundancy and update anomalies).
First normal form (1NF) is the same as the definition of relational model (relations = sets of tuples; each tuple = sequence of atomic values).
Second normal form (2NF) – has no practical or theoretical value – won’t discuss.
The two most used are (^) third normal formthird normal form (3NF) and Boyce-Boyce-CoddCodd normal formnormal form (BCNF).
We will discuss in detail the 3NF.
CSC343 – Introduction to Databases Normal Forms — 31
Projection: 9 R1(AD), A --> D 9 R2(ABC)
CSC343 – Introduction to Databases Normal Forms — 32
A relation that is not in 3NF, can be replaced with one or more normalized relations using normalization.
We can eliminate redundancies and anomalies for the example relation Emp(Employee,Salary,Project,Budget,Function)
if we replace it with the three relations obtained by projections on the sets of attributes corresponding to the three functional dependencies:
CSC343 – Introduction to Databases Normal Forms — 33
Employee Salary Project Budget Function Brown 20 Mars 2 technician Green 35 Jupiter 15 designer Green 35 Venus 15 designer Hoskins 55 Venus 15 manager Hoskins 55 Jupiter 15 consultant Hoskins 55 Mars 2 consultant Moore 48 Mars 2 manager Moore 48 Venus 15 designer Kemp 48 Venus 15 designer Kemp 48 Jupiter 15 manager
CSC343 – Introduction to Databases Normal Forms — 34
Employee Salary Brown 20 Green 35 Hoskins 55 Moore 48 Kemp 48 Employee Project Function Brown Mars technician Green Jupiter designer Green Venus designer Hoskins Venus manager Hoskins Jupiter consultant Hoskins Mars consultant Moore Mars manager Moore Venus designer Kemp Venus designer Kemp Jupiter manager
Project Budget Mars 2 Jupiter 15 Venus 15
The keys of new relations are lefthand sides of functional dependencies; satisfaction of 3NF is therefore guaranteed for the new relations.
CSC343 – Introduction to Databases Normal Forms — 37
Employee Project Branch Brown Mars Chicago Green Jupiter Birmingham Green Venus Birmingham Hoskins Saturn Birmingham Hoskins Venus Birmingham Green Saturn Birmingham Hoskins Jupiter Birmingham
The result of the join is different from the original relation. We lost some information during the decomposition!
CSC343 – Introduction to Databases Normal Forms — 38
The decomposition of a relation R(X) on X (^1) and X 2 is lossless if for every instance r of R(X) the join of the projections of R on X (^1) and X 2 is equal to r itself (that is, does not contain spurious tuples). Of course, it is clearly desirable to allow only lossless decompositions during normalization.
CSC343 – Introduction to Databases Normal Forms — 39
Let R(X) be a relation schema and let X 1 and X 2 be two subsets of X such that X 1 ∪ X 2 = X. Also, let X 0 = X 1 ∩ X 2. If R(X) satisfies the functional dependency
of R(X) on X 1 and X 2 is lossless. In other words, R(X) has a lossless decomposition on two relations if the set of attributes common to the relations is a superkey for at least one of the decomposed relations.
CSC343 – Introduction to Databases Normal Forms — 40
Suppose R 1 ∩ R 2 → R 2. Then a row of r 1 can combine with exactly one row of r 2 in the natural join (since in r 2 a particular set of values for the attributes in R 1 ∩ R (^2) defines a unique row) R 1 ∩ R 2 R 1 ∩ R 2 …………. a a ………... ………… a b …………. ………… b c …………. ………… c r 1 r 2