Normalization in Database Management Systems: Avoiding Redundancy and Anomalies - Prof. Kr, Exams of Database Management Systems (DBMS)

A chapter from the 'database management systems' textbook by kristen lefevre, focusing on normalization. Normalization is a process used to eliminate redundancy and prevent anomalies in database schemas. Topics such as redundancy problems, functional dependencies, and the importance of lossless join decompositions and normal forms like bcnf and 3nf.

Typology: Exams

Pre 2010

Uploaded on 09/02/2009

koofers-user-cb3-1
koofers-user-cb3-1 🇺🇸

10 documents

1 / 7

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
4/12/09 EECS 484: Database Man agement Systems, Kristen L eFevre 1
Normalization
Chapter 19
4/12/09 EECS 484: Database Man agement Systems, Kristen L eFevre 2
Database Design: The Story so Far
Requirements Analysis
Data stored, operations, apps, …
Conceptual Database Design
Model high-level description of the data, constraints, ER model
Logical Database Design
Choose a DBMS and design a database schema
Schema Refinement
Normalize relations, avoid redundancy, anomalies …
Physical Database Design
Examine physical database structures like indices, restructure …
Security Design
4/12/09 EECS 484: Database Man agement Systems, Kristen L eFevre 3
Redundancy
What is a good relational schema?
Suppose we did the following…
Supplier ItemSupplies
price desc
name addr item
$3A2pinkFlowersAcme
$10A2bluePaintAcme
$8A2boomDynamiteAcme
PriceAddrDescItemName Redundant
storage problem:
Supplier
information
stored once per
item it supplies!
pf3
pf4
pf5

Partial preview of the text

Download Normalization in Database Management Systems: Avoiding Redundancy and Anomalies - Prof. Kr and more Exams Database Management Systems (DBMS) in PDF only on Docsity!

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 1

Normalization

Chapter 19

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 2

Database Design: The Story so Far

 Requirements Analysis

 Data stored, operations, apps, …

 Conceptual Database Design

 Model high-level description of the data, constraints, ER model

 Logical Database Design

 Choose a DBMS and design a database schema

 Schema Refinement

 Normalize relations, avoid redundancy, anomalies …

 Physical Database Design

 Examine physical database structures like indices, restructure …

 Security Design

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 3

Redundancy

 What is a good relational schema?

 Suppose we did the following…

Supplier Supplies Item name addr (^) price item desc Acme Flowers pink A2 $ Acme Paint blue A2 $ Acme Dynamite boom A2 $ Name Item Desc Addr Price Redundant storage problem: Supplier information stored once per item it supplies!

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 4

Redundancy

 Redundancy Problems:

 Redundant storage

 A supplier supplies multiple items

 Update anomalies

 Change address of a supplier

 Insertion anomalies

 Insert a supplier (nulls?)  What if the supplier doesn’t supply any items?

 Deletion anomalies

 What if we want to delete the last item tuple? Acme Flowers pink A2 $ Acme Paint blue A2 $ Acme Dynamite boom A2 $ Name Item Desc Addr Price

Better Schema:

S(name,addr)

SP(name,item,price)

I(item,desc)

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 5

Dealing with Redundancy

 Redundancy arises when schema forces an unnatural

association among attributes

 Resolve with functional dependencies

 Main refinement technique: decomposition

 replacing larger relation with smaller ones

 Decomposition should be used judiciously:

 Normal forms: guarantees against (some) redundancy 4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 6

Functional Dependencies (FDs)

 A form of IC  D: X → Y X and Y subsets of relation R’s attributes t1 ∈ r, t2 ∈ r, ∏X (t1) = ∏X (t2) ⇒ ∏y (t1) = ∏y (t2)  An FD is a statement about all allowable relations.  Based only on application semantics, can’t deduce from instances  Can simply check if an instance violates FD (and other ICs)  Consider, (X,Y) → Z. Does this imply (X,Y) is a key?

X Y Z K Primary Key IC is a special case of FD

1 1 11 A

1 2 12 A

2 2 22 A

2 2 22 B

 Role of FDs in detecting redundancy:  Relation R with 3 attributes, ABC.  No FDs hold ⇒ no redundancy.  A → B ⇒ 2 or more tuples with the same A value, redundantly have the same B value!

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 10

Decomposition

 Replace a relation with two or more relations

 Problems with decomposition

 Some queries become more expensive. (more joins)  Lossless Join: Can we reconstruct the original relation from instances of the decomposed relations?  Dependency Preservation: Checking some dependencies may require joining the instances of the decomposed relations. 4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 11

Lossless Join Decompositions

 Relation R, FDs F: Decomposed to X, Y

 Lossless-Join decomposition if:

∏X(r) ⋈ ∏Y(r) = r for every instance r of R

 Note, r ⊆ ∏X(r) ⋈ ∏Y(r) is always true,

not vice versa, unless the join is lossless

 Can generalize to three or more relations

A B C

A B

B C

A B C

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 12

Lossless Join (cont.)

 Relation R, FDs F: Decomposed to X, Y

 Test: lossless-join w.r.t. F if and only if F+^ contains:

 X ∩ Y → X, or

 X ∩ Y → Y

i.e. attributes common to X and Y contain a key for

either X or Y

Lossless join decomposition is always required!

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 13

Dependency Preserving Decomposition

 R (sailor, boat, date) { D → S , D → B}

decomp. to X (sailor, boat)

Y (boat, date) {D → B}

 To check D → S need to join R1 and R2 (expensive)

 Dependency preserving:

 R decomp. to X, Y F+^ = (Fx ⋃ Fy)+

 Note: F not necessarily = Fx ⋃ Fy

Dependency preserving decomposition is desired!

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 14

Normal Forms

 Certain kind of decomposition

 Guarantees that certain problems won’t occur

 1 NF : No set-valued attrs

 2 NF : Historical

 3 NF : …

 BCNF : Boyce-Codd Normal Form

More Restrictive 4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 15

Boyce-Codd Normal Form (BCNF)

 Rel. R with FDs F is in BCNF if, for all X → A in F+  A X (trivial FD), or  X is a super key i.e. all non-trivial FDs over R are keys.  No redundancy in R (at least none that FDs detect)  Most desirable normal form

X Y A

x y 1 a

x y 2?

 Consider a relation in BCNF and FD: X → A,

two tuples have the same X value

 Can the A values be the same (i.e. redundant)?

 NO! X is a key, ⇒ y1 = y2.

4/12/09 EECS 484: Database Management Systems, Kristen LeFevre 19

Announcements

 Optional Exercises: 19.1, 19.3, 19.5, 19.7,