Database Modeling and Design Lecture Notes, Exams of Database Management Systems (DBMS)

database management system (DBMS) -- a generalized software system for manipulating databases. Includes logical view (schema, sub-schema), physical view. ( ...

Typology: Exams

2022/2023

Uploaded on 05/11/2023

ekaram
ekaram 🇺🇸

4.6

(30)

264 documents

1 / 95

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Database Modeling and Design
3rd Edition
Toby J. Teorey
University of Michigan
Lecture Notes
Contents
I. Database Systems and the Life Cycle (Chapter 1)……………………2
Introductory concepts; objectives of database management 2
Relational database life cycle 3
Characteristics of a good database design process 7
II. Requirements Analysis (Chapter 3)………………………………….8
III. Entity-Relationship (ER) Modeling (Chapters 2-4)……………… 11
Basic ER modeling concepts 11
Schema integration methods 22
Entity-relationship 26
Transformations from ER diagrams to SQL Tables 29
IV. Normalization and normal forms (Chapter 5)………………………35
First normal form (1NF) to third normal form (3NF) and BCNF 35
3NF synthesis algorithm (Bernstein) 42
Fourth normal form (4NF) 47
V. Access Methods (Chapter 6)…………………………..………………50
Sequential access methods 50
Random access methods 52
Secondary Indexes 58
Denormalization 62
Join strategies 64
VI. Database Distribution Strategies (Chapter 8)……………………….66
Requirements of a generalized DDBMS: Date’s 12 Rules 68
Distributed database requirements 72
The non-redundant “ best fit” method 74
The redundant “all beneficial sites” method 77
VII. Data Warehousing, OLAP, and Data Mining (Chapter 9)…….....79
Data warehousing 79
On-line analytical processing (OLAP) 86
Data mining 93
Revised 11/18/98 – modify Section V
Revised 11/21/98 – insertions into Section VII
Revised 1/14/99 – modify Section VI
Revised 2/11/99 – modify Section IV, 4NF (p.47 FD, MVD mix)
Revised 6/13/00 – modify Section V (secondary indexes)
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4b
pf4c
pf4d
pf4e
pf4f
pf50
pf51
pf52
pf53
pf54
pf55
pf56
pf57
pf58
pf59
pf5a
pf5b
pf5c
pf5d
pf5e
pf5f

Partial preview of the text

Download Database Modeling and Design Lecture Notes and more Exams Database Management Systems (DBMS) in PDF only on Docsity!

Database Modeling and Design

rd Edition

Toby J. Teorey University of Michigan

Lecture Notes

Contents

I. Database Systems and the Life Cycle (Chapter 1)…………………… Introductory concepts; objectives of database management 2 Relational database life cycle 3 Characteristics of a good database design process 7

II. Requirements Analysis (Chapter 3)………………………………….

III. Entity-Relationship (ER) Modeling (Chapters 2-4)……………… 11 Basic ER modeling concepts 11 Schema integration methods 22 Entity-relationship 26 Transformations from ER diagrams to SQL Tables 29

IV. Normalization and normal forms (Chapter 5)……………………… First normal form (1NF) to third normal form (3NF) and BCNF 35 3NF synthesis algorithm (Bernstein) 42 Fourth normal form (4NF) 47

V. Access Methods (Chapter 6)…………………………..……………… Sequential access methods 50 Random access methods 52 Secondary Indexes 58 Denormalization 62 Join strategies 64

VI. Database Distribution Strategies (Chapter 8)………………………. Requirements of a generalized DDBMS: Date’s 12 Rules 68 Distributed database requirements 72 The non-redundant “ best fit” method 74 The redundant “all beneficial sites” method 77

VII. Data Warehousing, OLAP, and Data Mining (Chapter 9)……..... Data warehousing 79 On-line analytical processing (OLAP) 86 Data mining 93

Revised 11/18/98 – modify Section V Revised 11/21/98 – insertions into Section VII Revised 1/14/99 – modify Section VI Revised 2/11/99 – modify Section IV, 4NF (p.47 FD, MVD mix) Revised 6/13/00 – modify Section V (secondary indexes)

I. Database Systems and the Life Cycle

Introductory Concepts

data —a fact, something upon which an inference is based (information or knowledge has value, data has cost)

data item —smallest named unit of data that has meaning in the real world (examples: last name, address, ssn, political party)

data aggregate (or group ) -- a collection of related data items that form a whole concept; a simple group is a fixed collection, e.g. date (month, day, year); a repeating group is a variable length collection, e.g. a set of aliases.

record —group of related data items treated as a unit by an application program (examples: presidents, elections, congresses)

file —collection of records of a single type (examples: president, election)

database —computerized collection of interrelated stored data that serves the needs of multiple users within one or more organizations, i.e. interrelated collections of records of potentially many types. Motivation for databases over files: integration for easy access and update, non-redundancy, multi-access.

database management system (DBMS) -- a generalized software system for manipulating databases. Includes logical view (schema, sub-schema), physical view (access methods, clustering), data manipulation language, data definition language, utilities

  • security, recovery, integrity, etc.

database administrator (DBA) -- person or group responsible for the effective use of database technology in an organization or enterprise. Motivation: control over all phases of the lifecycle.

Objectives of Database Management

  1. Data availability—make an integrated collection of data available to a wide variety of users
    • at reasonable cost—performance in query update, eliminate or control data redundancy
    • in meaningful format—data definition language, data dictionary
    • easy access—query language (4GL, SQL, forms, windows, menus); embedded SQL, etc.; utilities for editing, report generation, sorting
  2. Data integrity—insure correctness and validity
    • checkpoint/restart/recovery
    • concurrency control and multi-user updates
    • accounting, audit trail (financial, legal)
  3. Privacy (the goal) and security (the means)
    • schema/sub-schema, passwords
  4. Management control—DBA: lifecycle control, training, maintenance

Characteristics of a Good Database Design Process

  • iterative requirements analysis
  • interview top-down
  • use simple models for data flow and data relationships
  • verify model
  • stepwise refinement and iterative re-design

  • well-defined design review process to reduce development costs review team -database designers -DBMS software group -end users in the application areas when to review

  • after requirements analysis & conceptual design
  • after physical design
  • after implementation (tuning) meeting format
  • short documentation in advance
  • formal presentation
  • criticize product, not person
  • goal is to locate problems, do solutions off line
  • time limit is 1-2 hours

II. Requirements Analysis

Purpose - identify the real-world situation in enough detail to be able to define database components. Collect two types of data: natural data (input to the database) and processing data (output from the database).

Natural data requirements (what goes into the database)

  1. Organizational objectives
    • sell more cars this year
  • move into to recreational vehicle market
  1. Information system objectives
    • keep track of competitors’ products and prices
    • improve quality and timing of data to management regarding production schedule delays, etc.
    • keep track of vital resources needed to produce and market a product
  2. Organizational structure/chart
  3. Administrative and operational policies
    • annual review of employees
    • weekly progress reports
    • monthly inventory check
    • trip expense submission
  4. Data elements, relationships, constraints, computing environment

Processing requirements (what comes out of the database)

  1. Existing applications - manual, computerized
  2. Perceived new applications
  • quantifies how data is used by applications

  • should be a subset of data identified in the natural relationships (but may not be due to unforeseen applications)

  • problem - many future applications may be unknown

Interviews at different levels

Top management - business definition, plan/objectives, future plans

Middle management - functions in operational areas, technical areas, job-titles, job functions

Employees - individual tasks, data needed, data out

Specific end-users of a DBMS - applications and data of interest

Basic rules in interviewing

  1. Investigate the business first
  2. Agree with the interviewee on format for documentation (ERD, DFD, etc.)
  3. Define human tasks and known computer applications
  4. Develop and verify the flow diagram(s) and ER diagram(s)
  5. Relate applications to data (this helps your programmers)

Example: order entry clerk

Function: Take customer orders and either fill them or make adjustments. Frequency: daily

Task Def Volume Data Elements

  1. Create order 2000 A, B, E, H
  2. Validate order 2000 A, B, G, H, J
  3. Fill out error form 25 A, C
  4. Reserve item/price 6000 A, D, H
  5. Request alternate items 75 A, E, I, K,M
  6. Enter unit price 5925 A, F, J, N

III. Entity-Relationship (ER) Modeling

Basic ER Modeling Concepts

Entity - a class of real world objects having common characteristics and properties about which we wish to record information.

Relationship - an association among two or more entities

  • occurrence - instance of a relationship is the collective instances of the related entities

  • degree - number of entities associated in the relationship (binary, ternary, other n-ary)

  • connectivity - one-to-one, one-to-many, many-to-many

  • existence dependency (constraint) - optional/mandatory

Attribute - a characteristic of an entity or relationship

  • Identifier - uniquely determines an instance of an entity

  • Identity dependence - when a portion of an identifier is inherited from another entity

  • Multi-valued - same attribute having many values for one entity

  • Surrogate - system created and controlled unique key (e.g. Oracle’s “create sequence”)

Constraints

Constraints in ER modeling

  • role - the function an entity plays in a relationship

  • existence constraint (existence dependency) - weak entity

  • exclusion constraint - restricts an entity to be related to only of several other

  • entities at a given point in time

  • mandatory/optional
  • specifies lower bound of connectivity of entity instances
  • participating in a relationship as 1 or 0
  • uniqueness constraint – one-to-one functional dependency among key attributes in a relationship: binary, ternary, or higher n-ary