Download Normalisation-Introduction to Database System-Lecture 13 Slides-Computer Science and more Slides Introduction to Database Management Systems in PDF only on Docsity!
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation so Far
- First normal form
- All data values are atomic
- Second normal form
- As 1NF plus no non- primary-key attribute is partially dependant on the primary key - Third normal form - As 2NF plus no non- primary-key attribute depends transitively on the primary key - These are all based
on a relation with a
single (primary) key
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation Example
representing orders
in an online store
table represents an
item on a particular
order
- Columns
- Order
- Product
- Customer
- Address
- Quantity
- UnitPrice
- Primary key is
{Order, Product}
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Functional Dependencies
single customer
single address
single price
transitivity
{Customer}
{Address}
{UnitPrice}
{Address}
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Example – FD Diagram
Order Product Customer Address Quantity UnitPrice
R
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 2NF
means no partial
dependencies on the
primary key
- {Order} → {Customer, Address}
- {Product} → {UnitPrice} - To remove the first
FD we project over
- {Order, Customer, Address}
- And
- {Order, Product, Quantity and UnitPrice}
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 2NF
Order Product Customer Address Quantity UnitPrice
R
Order Product Quantity UnitPrice
Order Customer Address
R
R
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 2NF
but there is still a
partial FD in R
- {Product} → {UnitPrice} - To remove this we
project over
- {Product, UnitPrice}
- And
- {Order, Product, Quantity}
Order Product Quantity UnitPrice
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 2NF
Order Product Quantity UnitPrice
R
Product UnitPrice
Order Product Quantity
R
R
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 3NF
into 3 relations - R1,
R3, and R
- R3 and R4 are 3NF
- R1 has a transitive FD on its key - To remove {Order} → {Customer} → {Address} - we project R
over
- {Order, Customer}
- {Customer, Address}
Order Customer Address
R
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation to 3NF
Order Customer Address
R
Order Customer
Customer Address
R
R
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation
- 1NF:
- {Order, Product, Customer, Address, Quantity, UnitPrice}
- 2NF:
- {Order, Customer, Address}, {Product, UnitPrice}, and {Order, Product, Quantity}
- 3NF:
- {Product, UnitPrice}, {Order, Product, Quantity}, {Order, Customer}, and {Customer, Address}
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
2NF Revisited
relations with several
candidate keys
relation is called
prime if it is part of a
candidate key, and
non-prime otherwise
- Second normal form
- A relation is in second normal form if it is in first normal form and no non-prime attribute is partially dependent on any candidate key
- If there is only one key, this is the same as before
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Anomalies in Stream
- INSERT anomolies
- You can’t add an empty stream
- UPDATE anomolies
- Moving the 12: class to 9:00 means changing two rows
- DELETE anomolies
- Deleting Rebecca removes a stream
Student Course Time John Databases 12: Mary Databases 12: Richard Databases 15: Richard Programming 10: Mary Programming 10: Rebecca Programming 13:
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Boyce-Codd Normal Form
Boyce-Codd normal
form (BCNF) if for
every FD A → B
either
- B is contained in A (the FD is trivial), or
- A contains a candidate key of the relation - The same as 3NF
except we don’t
allow B to be prime
candidate key then
3NF and BCNF are
the same
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Stream and BCNF
BCNF as the FD
{Time} → {Course}
is non-trivial and
{Time} does not
contain a candidate
key
Student Course Time
Student Course Time John Databases 12: Mary Databases 12: Richard Databases 15: Richard Programming 10: Mary Programming 10: Rebecca Programming 13:
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Conversion to BCNF
Student Course Time
Student Course Course Time
Stream has been put into BCNF but we have lost the FD
{Student, Course} → {Time}
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Decomposition Properties
not be lost or created
when splitting
relations up
preservation: It is
desirable that FDs are
preserved when
splitting relations up
is always lossless and
dependency
preserving
BCNF is lossless, but
may not preserve all
dependencies
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Higher Normal Forms
can go with FDs
- Higher normal forms are based on other sorts of dependency
- Fourth normal form removes multi-valued dependencies
- Fifth normal form removes join dependencies
1NF Relations
2NF Relations
3NF Relations
BCNF Relations
4NF Relations
5NF Relations
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Denormalisation
- Normalisation
- Removes data redundancy
- Solves INSERT, UPDATE, and DELETE anomalies
- This makes it easier to maintain the information in the database in a consistent state - However - It leads to more tables in the database - Often these need to be joined back together, which is expensive to do - So sometimes (not often) it is worth ‘denormalising’
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Denormalisation
denormalise if
- Database speeds are unacceptable (not just a bit slow)
- There are going to be very few INSERTs, UPDATEs, or DELETEs
- There are going to be lots of SELECTs that involve the joining of tables
Number Street City Postcode
Address
Not normalised since
{Postcode} → {City}
Number Street Postcode
City
Address
Postcode
Address
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation in Exams
Given a relation with scheme {ID, Name, Address, Postcode, CardType, CardNumber}, the candidate key {ID}, and the following functional dependencies:
- {ID} → {Name, Address, Postcode, CardType, CardNumber}
- {Address} → {Postcode}
- {CardNumber} → {CardType}
(i) Explain why this relation is in second normal form, but not in third normal form.
(3 marks)
G52DBS – Database Systems www.cs.nott.ac.uk/~smx/DBS
Normalisation in Exams
(ii) Show how this relation can be converted to third normal form. You should show what functional dependencies are being removed, explain why they need to be removed, and give the relation(s) that result. (4 marks)
(iii) Give an example of a relation that is in third normal form, but that is not in Boyce-Codd normal form, and explain why it is in third, but not Boyce-Codd, normal form. (4 marks)