Schema Refinement-Databases-Lecture 07 Slides-Computer Science, Slides of Database Management Systems (DBMS)

Schema Refinement: Normalisation, Decomposing Relations, Decomposition, Lossless-join Decomposition, Lossless-join, Dependency Preservation, Boyce-codd Normal Form, Third Normal Form, First Normal Form, Second Normal Form, Partial Functional Dependency, Normal Forms

Typology: Slides

2011/2012

Uploaded on 02/12/2012

tiuw
tiuw 🇺🇸

4.7

(18)

286 documents

1 / 23

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Lecture 7:
Schema refinement:
Normalisation
www.cl.cam.ac.uk/Teaching/current/Databases/
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17

Partial preview of the text

Download Schema Refinement-Databases-Lecture 07 Slides-Computer Science and more Slides Database Management Systems (DBMS) in PDF only on Docsity!

1

Lecture 7:

Schema refinement:

Normalisation

www.cl.cam.ac.uk/Teaching/current/Databases/

Decomposing relations

  • In previous lecture, we saw that we could ‘decompose’ the bad relation schema

Data(sid,sname,address,cid,cname,grad e)

to a ‘better’ set of relation schema

Student(sid,sname,address) Course(cid,cname) Enrolled(sid,cid,grade)

Decomposition

  • A decomposition of a relation R=R(A 1 : 1 , …, An: (^) n) is a collection of relations {R 1 , …, Rk} and a set of queries

R Q 0 ( R 1 ,, Rk )

if Ri^ Qi ( R )

then

{ Q 0 , Q 1 ,, Qk }

such that

This is Tim’s somewhat non-standard definition….

Special Case: Lossless- join decomposition

  • {R 1 ,…,Rk} is a lossless-join

decomposition of R with respect to an FD set F, if for every relation instance r of R that satisfies F,

R 1 (r)^ V^ …^ V^ Rk(r) = r

(this means project on the attributes of the relation’s schema)

Lossless-join: Example

sid sname addres s

cid cname grade

124 Julia USA 206 Database A++ 204 Kim Essex 202 Semantics C 124 Julia USA 201 S/Eng I A+ 206 Tim London 206 Database B- 124 Julia USA 202 Semantics B+

What happens if we decompose on (sid,sname,address) and (cid,cname,grade)?

Dependency preservation

  • Intuition: If R is decomposed into R 1 , R 2 and R 3 , say, and we enforce the FDs that hold individually on R 1 , on R 2 and on R 3 , then all FDs that were given to hold on R must also hold
  • Reason: Otherwise checking updates for violation of FDs may require computing joins 

Dependency preservation: example

  • Take R=R(city, street&no, zipcode) with FDs: - city,street&no zipcode - zipcode city
  • Decompose to
    • R1(street&no,zipcode)
    • R2(city,zipcode)
  • Claim: This is a lossless-join decomposition
  • Is it dependency preserving?

Boyce-Codd normal form “Represent Every Fact Only ONCE”

  • A relation R with FDs F is said to be in Boyce-Codd normal form (BCNF) if for all X A in F+^ then - Either A X (‘trivial dependency’), or - X is a superkey for R
  • Intuition: A relation R is in BCNF if the left side of every non-trivial FD contains a key

BCNF: Example

BankerSchema(brname,cname,bname)

  • With FDs
    • bname brname
    • brname,cname bname
  • Not in BCNF ( Why? )
  • We might decompose to
    • BBSchema(bname,brname)
    • CBrSchema(cname,bname)
  • This is in BCNF 
  • BUT this is not dependency-preserving 

Third normal form

  • A relation R with FDs F is said to be in third normal form (3NF) if for all X A in F+^ then - Either A X (‘trivial dependency’), or - X is a superkey for R, or - A is a member of some candidate key for R
  • Notice that 3NF is strictly weaker than BCNF
  • (A prime attribute is one which appears in a candidate key)
  • It is always possible to find a dependency-preserving lossless-join decomposition that is in 3NF.

Prehistory: First normal form

  • First normal form (1NF) is now considered part of the formal definition of the relational model
  • It states that the domain of all attributes must be atomic (indivisible), and that the value of any attribute in a tuple must be a single value from the domain
  • NOTE: Modern databases have moved away from this restriction

Prehistory: Second normal form

  • A partial functional dependency X Y is an FD where for some attribute A X, (X- {A}) Y
  • A relation schema R is in second normal form ( 2NF ) if every non-prime attribute A in R is not partially dependent on any key of R

19

Not the end of problems…

  • ONLY TRIVIAL FDs!! (see Date)
  • Is in BCNF!
  • Obvious insertion anomalies…

Course Teacher Book Databases gmb Date Databases gmb Elmasri Databases jkmm Date Databases jkmm Elmasri OSF gmb Silberschatz OSF tlh Slberschatz

Decomposition

  • Even though its in BCNF, we’d prefer to decompose it to the schema - Teaches(Course,Teacher) - Books(Course,Title)
  • We need to extend our underlying theory to capture this form of redundancy