Download Schema Refinement-Databases-Lecture 07 Slides-Computer Science and more Slides Database Management Systems (DBMS) in PDF only on Docsity!
1
Lecture 7:
Schema refinement:
Normalisation
www.cl.cam.ac.uk/Teaching/current/Databases/
Decomposing relations
- In previous lecture, we saw that we could ‘decompose’ the bad relation schema
Data(sid,sname,address,cid,cname,grad e)
to a ‘better’ set of relation schema
Student(sid,sname,address) Course(cid,cname) Enrolled(sid,cid,grade)
Decomposition
- A decomposition of a relation R=R(A 1 : 1 , …, An: (^) n) is a collection of relations {R 1 , …, Rk} and a set of queries
R Q 0 ( R 1 ,, Rk )
if Ri^ Qi ( R )
then
{ Q 0 , Q 1 ,, Qk }
such that
This is Tim’s somewhat non-standard definition….
Special Case: Lossless- join decomposition
- {R 1 ,…,Rk} is a lossless-join
decomposition of R with respect to an FD set F, if for every relation instance r of R that satisfies F,
R 1 (r)^ V^ …^ V^ Rk(r) = r
(this means project on the attributes of the relation’s schema)
Lossless-join: Example
sid sname addres s
cid cname grade
124 Julia USA 206 Database A++ 204 Kim Essex 202 Semantics C 124 Julia USA 201 S/Eng I A+ 206 Tim London 206 Database B- 124 Julia USA 202 Semantics B+
What happens if we decompose on (sid,sname,address) and (cid,cname,grade)?
Dependency preservation
- Intuition: If R is decomposed into R 1 , R 2 and R 3 , say, and we enforce the FDs that hold individually on R 1 , on R 2 and on R 3 , then all FDs that were given to hold on R must also hold
- Reason: Otherwise checking updates for violation of FDs may require computing joins
Dependency preservation: example
- Take R=R(city, street&no, zipcode) with FDs: - city,street&no zipcode - zipcode city
- Decompose to
- R1(street&no,zipcode)
- R2(city,zipcode)
- Claim: This is a lossless-join decomposition
- Is it dependency preserving?
Boyce-Codd normal form “Represent Every Fact Only ONCE”
- A relation R with FDs F is said to be in Boyce-Codd normal form (BCNF) if for all X A in F+^ then - Either A X (‘trivial dependency’), or - X is a superkey for R
- Intuition: A relation R is in BCNF if the left side of every non-trivial FD contains a key
BCNF: Example
BankerSchema(brname,cname,bname)
- With FDs
- bname brname
- brname,cname bname
- Not in BCNF ( Why? )
- We might decompose to
- BBSchema(bname,brname)
- CBrSchema(cname,bname)
- This is in BCNF
- BUT this is not dependency-preserving
Third normal form
- A relation R with FDs F is said to be in third normal form (3NF) if for all X A in F+^ then - Either A X (‘trivial dependency’), or - X is a superkey for R, or - A is a member of some candidate key for R
- Notice that 3NF is strictly weaker than BCNF
- (A prime attribute is one which appears in a candidate key)
- It is always possible to find a dependency-preserving lossless-join decomposition that is in 3NF.
Prehistory: First normal form
- First normal form (1NF) is now considered part of the formal definition of the relational model
- It states that the domain of all attributes must be atomic (indivisible), and that the value of any attribute in a tuple must be a single value from the domain
- NOTE: Modern databases have moved away from this restriction
Prehistory: Second normal form
- A partial functional dependency X Y is an FD where for some attribute A X, (X- {A}) Y
- A relation schema R is in second normal form ( 2NF ) if every non-prime attribute A in R is not partially dependent on any key of R
19
Not the end of problems…
- ONLY TRIVIAL FDs!! (see Date)
- Is in BCNF!
- Obvious insertion anomalies…
Course Teacher Book Databases gmb Date Databases gmb Elmasri Databases jkmm Date Databases jkmm Elmasri OSF gmb Silberschatz OSF tlh Slberschatz
Decomposition
- Even though its in BCNF, we’d prefer to decompose it to the schema - Teaches(Course,Teacher) - Books(Course,Title)
- We need to extend our underlying theory to capture this form of redundancy