Understanding Normalization and Denormalization in Database Design, Slides of Database Management Systems (DBMS)

An overview of normalization and denormalization in database design. It explains the concept of normal forms (1nf, 2nf, 3nf, and boyce-codd normal form), the normalization process, and the benefits and drawbacks of normalization. It also covers denormalization and its use cases.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

arunima
arunima 🇮🇳

3

(2)

99 documents

1 / 19

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
A Comprehensive Approach to
Normalization and Denormalization
Docsity.com
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13

Partial preview of the text

Download Understanding Normalization and Denormalization in Database Design and more Slides Database Management Systems (DBMS) in PDF only on Docsity!

A Comprehensive Approach to

Normalization and Denormalization

Normalization overview

  • E.F. Codd proposed three normal forms:

The first, second, and third normal forms

  • 1NF, 2NF and 3NF are based on the functional dependencies among the attributes of a relation
  • Boyce-Codd normal form was proposed by Boyce and Codd and is known as a stronger definition of 3NF

Normalization of data

 based on set of functional dependencies (FD) and Prime Key(PK) you analyze the relation schemas to achieve two desirable properties:

  1. Minimizing data redundancy
  2. Minimizing insertion, deletion and update anomalies

 Therefore the ideal goal of DB design will be developing a fully normalized relational schema, preferably in BCNF.

Using normal forms

  • Decomposed the relation when a NF test fails

starting from 1NF.

  • Disclaimers to use of normal forms:
  1. Do not guarantee good database design.

  2. Sometimes highest normal form may not be the best design for performance reasons.

Continued…

  • Problem solved?

Make attributes for the locations if you know

the possible number of locations.

  • Better solution?

Make a new table with dname and dlocation

  • This is the preferred solution

Normalizing the relation into 2NF

  • Based on the concept of a full functional dependency.
  • A FD X -> Y is a full functional dependency if removal of any attribute from X means that the dependency does not hold any more.
  • A partial dependency can occur only if the determinate is composite.
  • Partial dependency, if there is some attribute A in X that can be removed from X and the dependency still holds.
  • Normalize the EMP_PROJ table
  • Steps:
  1. Figure out the dependencies, noting the determinates that is any attribute(s) on which some other attribute(s) are dependent)
  2. Put each determinate in a table by itself.
  3. Include in each table the attributes that are dependent on that determinate

SSN PNUMBER hours ename pname plocation

EMP_PROJ

Normalizing the relation int o 3NF

EMP_DEPT

 Analyze the EMP_DEPT relation:

  1. In 1NF?
  2. In 2NF?
  3. Still a problem? 4)Identify the problem.  The problem with the EMP_DEPT table  DNAME depends on DNUM, not the prime key.(transitive dependency)  X-> Y is a transitive dependency if there is a set of attributes Z of the relation and both X -> Z and Z ->Y hold.

 A relation is in 3NF if every non-key attribute is:  In 2NF).  Non transitively dependent on the prime key.

ename SSN bdate address dnumber dname dmgrssn

Generalizing the definition of 2NF and

3NF

 To assure we do not have update anomalies we need to extend this definition to include all candidate key rather than just PK.

 Disallowed partial dependencies on any key in 2NF.  Disallowed transitive dependencies on any key in 3NF

 Simpler definition for 3NF:

 Every nonprime attribute must be: Fully functionally dependent on every key Non transitively dependent on every key

Boyce-Codd Normal Form

  • Simpler, yet stricter form of 3NF.
  • A relation is in BCNF if and only if every determinate is a candidate key.
  • Decomposed into an equivalent set of BCNF if relation is not in BCNF.
  • If relations have a composite candidate key, one of whose members are determined by a non prime attribute are in 3NF but not in BCNF.

Summary

  • In general, it is best to have a relational

schema in BCNF.

  • If that is not possible, 3NF will do.
  • 2NF and 1NF are not considered good

relation schema designs.

  • They allow too much data redundancy which

leads to update anomalies.

Denormalization

  • Jovial Remark, “Normalize until it hurts, and denormalize until it works”.
  • Easier to query but reintroduce data redundancies.
  • Always improves data retrieval performance.
  • Example: R: EMPLOYEE

Employee of a larger corporation have access to a handful of mutual fund companies for their retirement investment

Emp# Name Age Salary M_fund# Fund_nm Fund_mgr

Continued…

  • However, if there are thousands of employees and just a handful of these retirement mutual funds.
  • Query requiring only mutual fund data will execute rather inefficiently in denormalized design.
  • In contrary, denormalization might not improve data retrieval performance in certain situation.