Database Design: Normalization and Data Models, Slides of Data Structures and Algorithms

An overview of database normalization and data models, including primary and composite keys, foreign keys, composite foreign keys, many-to-many relationships, and normal forms. It explains the importance of ensuring data consistency and avoiding redundancy in database design.

Typology: Slides

2020/2021

Uploaded on 06/11/2021

salim
salim 🇺🇸

4.4

(24)

242 documents

1 / 12

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
1.204 Lecture 2
Data models, concluded
NlNormaliizatition
Keys
PrimaryPrimary keykey:: oneone oror moremore attributesattributes thatthat uniquelyuniquely
identify a record
Name or identifying number, often system generated
Composite keys are made up of two fields
E.g., aircraft manufacturer and model number
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download Database Design: Normalization and Data Models and more Slides Data Structures and Algorithms in PDF only on Docsity!

1.204 Lecture 2

Data models, concluded

NNormali lizatition

Keys

• PrimaryPrimary keykey:: oneone oror moremore attributesattributes thatthat uniquelyuniquely

identify a record

  • Name or identifying number, often system generated
  • Composite keys are made up of two fields
    • E.g., aircraft manufacturer and model number

Foreign keys

• PrimaryPrimary keykey ofof thethe independentindependent or parentor parent

entity type is maintained as a non-key

attribute in the dependent or child entity type

Foreign keys

DeptID DeptName 930 Receiving 378 Assembly 372 Finance 923 Pl i

EmpID DeptID EmpFirstName EmpLastName 4436 483 Brown John 4574 483 Jones Helen 5678 372 Smith Jane 5674 372 Crane Sally (^) 923 Planning 483 Construction

5674 372 Crane Sally 9987 923 Black Joe 5123 923 Green Bill 5325 483 Clinton Bob

Database requires a valid department number when employee is added Employee ID is the unique identifier of employees; department number is not needed as part of the employee primary key

Foreign keys (many-many

relationships)

  • Primaryy key of py parent is used in pprimaryy keyy of child

Independent Dependent Independent

Vehicle can be driven by many drivers; driver can drive many vehicles

Many-to-many relationships with

foreign keys

Vehicle VehicleID VehicleMake VehicleModel

Vehicle Driver VehicleID DriverID

Driver VehicleID VehicleMake VehicleModel DriverID DriverName DriverLicenseNbr 35 Volvo Wagon 33 Ford Sedan 89 GMC Truck

VehicleID DriverID 35 900 35 253 89 900

DriverID DriverName DriverLicenseNbr 253 Ken A 900 Jen B

Never create an entity with vehicle1, vehicle2,…!

Five normal forms: preventing errors

• 1:1: AllAll ooccurrencesccurrences ofof anan entityentity mustmust containcontain tthehe

same number of attributes.

  • No lists, no repeated attributes.

• 2: All non- primary key fields must be a function

of the primary key.

• 3: All non- primary key fields must not be a

function of other non- primary key fields.

• 4: A row must not contain two or more

independent multi-valued facts about an entity.

• 5: A record cannot be reconstructed from several

smaller record types. (Informal)

Examples based on William Kent, "A Simple Guide to Five Normal Forms in Relational Database Theory", Communications of the ACM 26(2), Feb. 1983

First normal form

• • All rows must beAll rows must be fixed lengthfixed length

– Does not allow variable length lists.

– Does not allow repeated fields, e.g., vehicle1,

vehicle2, vehicle3…

  • However many columns you allow, you will always need one more…
  • UseUse a many-many relationship instead, always Seea many many relationship instead always. See vehicle-driver example

Third normal form

Employee Department DepartmentLocation 234 Finance Boston 223 Finance Boston 399 Operations Washington

• NonNon-keykey fieldsfields cannotcannot bebe aa ffunctionunction ofof otherother nonnon-

key fields

  • Example that violates third normal form
    • Key is employee
    • Someone found it convenient to add department location for a report
    • Department location is a function of department, which is not a key
  • Problems:
    • Department location is repeated in every employee record
    • If department location changes, every record with it must be changed
    • Data might become inconsistent
    • If a department has no employees, there may be nowhere to store its location

Third normal form

• • SolutionSolution

– Two entity types: Employee and department

Employee Department 234 Finance 223 Finance 399 Operations

Department DepartmentLocation Finance Boston Operations Washington

TV: “The truth, the whole truth, and nothing but the truth”

DB: “The key, the whole key, and nothing but the key”

Fourth normal form

Employee Skill Language Brown cook English Smith typeyp^ German

• A row should not contain two or more

independent multi-valued facts about an

entity.

– Example that violates fourth normal form:

  • AnAn employemployeeee maymay havhavee sevseveraleral skillsskills andand llanguagesanguages

– Problems

  • Uncertainty in how to maintain the rows. Several approaches are possible and different consultants, analysts or programmers may (will) take different approaches, as shown on next slide

Employee Skill Language Brown cook Brown type BrownBrown FrenchFrench Brown German Brown Greek

Fourth normal form problems

  • Blank fields ambiguous. Blank skill could mean:
    • Person has no skill
    • Attribute doesn’t apply to this employee
    • Data is unknown
    • Data may be found in another record (as in this case)
  • Programmers will use all these assumptions over time, as will data entry staff, analysts, consultants and users - Disjoint format is used on this slide. Effectively same as 2 entity types.

• A record cannot be reconstructed from several

smaller record types.

• Example:

  • Agents represent companies
  • Companies make products
  • Agents sell products

• Most general case (allows any combination):

Fifth normal form

Agent Company Product

Smith Ford car

Smith GM truck

  • Smith does not sell Ford trucks nor GM cars
  • If these are the business rules, a single entity is fine
  • But…

Fifth normal form

• In most real cases a problem occurs

– If an agent sells a certain product and she

representts th the company, ththen shhe selllls tthhatt

product for that company.

Agent Company Product Smith Ford car Smith Ford truck Smith GM car Smith GM truck Jones Ford car

– We can reconstruct all true facts from 3 tables

instead of the single table:

Agent Company Smith Ford Smith GM Jones Ford

(Repetition of facts)

Agent Product Smith car Smith truck Jones car

Company Product Ford car Ford truck GM car GM truck

(No repetition of facts)

Fifth normal form

• ProblemsProblems wiwithth thethe singlesingle tabletable formform

  • Facts are recorded multiple times. E.g., the fact that Smith sells cars is recorded twice. If Smith stops selling cars, there are 2 rows to update and one will be missed.
  • Size of this table increases multiplicatively, while the normalized tables increase additively. With big operations, this is a big difference. - 100,000 x 100,000 is a lot bigger than 100,000 + 100,

• It’s much easier to write the business rules fIt’ h i t it th b i l from

5 th^ normal

  • Rules are more explicit
  • Supply chains usually have all sorts of 5th^ normal issues

Fifth normal form, concluded

• An example with a subtle set of conditions

Non-normal

Agent Company Product Smith Ford car Smith Ford truck SmithSmith GMGM carcar Smith GM truck Jones Ford car Jones Ford truck Brown Ford car Brown GM car Brown Toyota car Brown Toyota bus

Can yyou quickly q y deduce the business rules from this table?

Fifth normal

Agent Company SmithSmith FordFord Smith GM Jones Ford Brown Ford Brown GM Brown Toyota

Company Product FordFord carcar Ford truck GM car GM truck Toyota car Toyota bus

Agent Product SmithSmith carcar Smith truck Jones car Jones truck Brown car Brown bus

  • Jones sells cars and GM makes cars, but Jones does not represent GM
  • Brown represents Ford and Ford makes trucks, but Brown does not sell trucks
  • Brown represents Ford and Brown sells buses, but Ford does not make buses