Database Management Systems, Lecture notes of Database Management Systems (DBMS)

The course is to be taken by ICT Students

Typology: Lecture notes

2018/2019

Uploaded on 03/13/2019

sipiwe
sipiwe 🇿🇲

1 document

1 / 1098

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
pf3
pf4
pf5
pf9
pfa
pfd
pff
pf13
pf14
pf15
pf17
pf18
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf30
pf32
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4c
pf4e
pf4f
pf50
pf52
pf53
pf54
pf56
pf58
pf59
pf5a
pf5b
pf5c
pf5d
pf60
pf61
pf63
pf64

Partial preview of the text

Download Database Management Systems and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!

It's your^ choice!

New Modular Organization!

3 Relational Model SQLDDL

27 Infonnation Retrieval and XML Data Management

2 ER Model Conceptual Design

Appncatirms emphasis: A course that covers the principles of database systems and emphasizes

how they are used in developing data-intensive applications..

f,;~tY'W';Yl~t';;:;,~7' A course that has a strong systems emphasis and assumes that students have

good programming skills in C and C++.

Hybrid course: Modular organization allows you to teach the course with the emphasis you want.

......- := Dependencies

 I v I II IV VIr III DATABASE MANAGEMENT SYSTEMS **_McGraw-Hill Higher Education_** _tz A Lhvision of The McGraw-Hill Companies_ #### DATABASE MANAGEMENT SYSTEMS, THIRD EDITION International Edition 2003 Exclusive rights by McGraw-Hill Education (Asia), for manufacture and export. This book cannot be re-exported from the country to which it is sold by McGraw-Hill. The International Edition is not available in North America. Published by McGraw-Hili, a business unit of The McGraw-Hili Companies, Inc., 1221 Avenue of the Americas, New York, NY 10020. Copyright © 2003, 2000, 1998 by The McGraw-Hill Companies, Inc. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning. Some ancillaries, including electronic and print components, may not be available to customers outside the United States. #### CTF BJE Library of Congress Cataloging-in-Publication Data Ramakrishnan, Raghu Database management systems / Raghu Ramakrishnan, Johannes Gehrke.~3rd ed. p. cm. Includes index. ISBN 0-07-246563-8-ISBN 0-07-115110-9 (ISE) 1. Database management. 1. Gehrke, Johannes. II. Title. QA76.9.D3 R237 2003 005.74--Dc21 2002075205 CIP When ordering this title, use ISBN 0-07-123151-X Printed in Singapore www.mhhe.com _To Apu, Ketan, and Vivek with love_ _To Keiko and Elisa_ PREFACE Part I FOUNDATIONS **CONTENTS** XXIV 1 2 OVERVIEW OF DATABASE SYSTEMS 1.1 Managing Data 1.2 A Historical Perspective 1.3 File Systems versus a DBMS 1.4 Advantages of a DBMS 1.5 Describing and Storing Data in a DBMS 1.5.1 The Relational Model 1.5.2 Levels of Abstraction in a DBMS 1.5.3 Data Independence 1.6 Queries in a DBMS 1. 7 Transaction Management 1.7.1 Concurrent Execution of Transactions 1.7.2 Incomplete Transactions and System Crashes 1.7.3 Points to Note 1.8 Structure of a DBMS 1.9 People Who Work with Databases 1.10 Review Questions INTRODUCTION TO DATABASE DESIGN 2.1 Database Design and ER Diagrams 2.1.1 Beyond ER Design 2.2 Entities, Attributes, and Entity Sets 2.3 Relationships and Relationship Sets 2.4 Additional Features of the ER Model 2.4.1 Key Constraints 2.4.2 Participation Constraints 2.4.3 Weak Entities 2.4.4 Class Hierarchies 2.4.5 Aggregation ### vii 4 6 8 9 10 11 12 15 16 17 17 18 19 19 21 22 25 26 27 28 29 32 32 34 35 37 39 Vlll DATABASE "NIANAGEMENT SYSTEivlS Preliminaries Relational Algebra 4.2.1 Selection and Projection 4.2.2 Set Operations 3 4 2.5 Conceptual Design With the ER Model 2 ..5.1 Entity versus Attribute 2.5.2 Entity versus Relationship 2.5.3 Binary versus Ternary Relationships 2 ..5.4 Aggregation versus Ternary Relationships 2.6 Conceptual Design for Large Enterprises 2.7 The Unified Modeling Language 2.8 Case Study: The Internet Shop 2.8.1 Requirements Analysis 2.8.2 Conceptual Design 2.9 Review Questions **THE RELATIONAL MODEL** 3.1 Introduction to the Relational Model 3.1.1 Creating and Modifying Relations Using SQL 3.2 Integrity Constraints over Relations 3.2.1 Key Constraints :3.2.2 Foreign Key Constraints 3.2.3 General Constraints 3.3 Enforcing Integrity Constraints 3.3.1 Transactions and Constraints 3.4 Querying Relational Data 3.5 Logical Database Design: ER to Relational 3.5.1 Entity Sets to Tables 3.5.2 Relationship Sets (without Constraints) to Tables 3.5.3 Translating Relationship Sets with Key Constraints 3.5.4 Translating Relationship Sets with Participation Constraints 3.5.5 Translating Weak Entity Sets 3.5.6 cn'anslating Class Hierarchies 3.5.7 Translating ER Diagrams with Aggregation 3.5.8 ER to Relational: Additional Examples :3.6 Introduction to Views 3.6.1 Views, Data Independence, Security 3.6.2 Updates on Views :3.7 Destroying/ Altering Tables and Views :3.8 Case Study: The Internet Store :3.9 Review Questions **RELATIONAL ALGEBRA AND CALCULUS** 4. 4. 40 41 42 43 46 72 73 76 82 83 88 91 92 94 _Contents_ (^) :»:i 7.7 The Middle Tier 7.7.1 CGI: The Common Gateway Interface 7.7.2 Application Servers 7.7.3 Servlets 7.7.4 JavaServer Pages 7.7.5 Maintaining State 7.8 Case Study: The Internet Book Shop 7.9 Review Questions **Part III STORAGE AND INDEXING 271** Data on External Storage File Organizations and Indexing 8.2.1 Clustered Indexes 8.2.2 Primary and Secondary Indexes Index Data Structures 8.3.1 Hash-Based Indexing 8.3.2 Tree-Based Indexing Comparison of File Organizations 8.4.1 Cost Model 8.4.2 Heap Files 8.4.3 Sorted Files 8.4.4 Clustered Files 8.4.5 Heap File with Unclustered Tree Index 8.4.6 Heap File With Unclustered Hash Index 8.4.7 Comparison of I/O Costs Indexes and Performance Tuning 8..5.1 Impact of the Workload 8.5.2 Clustered Index Organization 8.5.3 Composite Search Keys 8.5.4 Index Specification in SQL: 8.6 Review Questions **OVERVIEW OF STORAGE AND INDEXING** 8. 8. 8 9 **STORING DATA: DISKS AND FILES** 9.1 The Memory Hierarchy 9.1.1 Magnetic Disks 9.1.2 Performance Implications of Disk Structure 9.2 Redundant Arrays of Independent Disks 9.2.1 Data Striping 9.2.2 Redundancy 9.2.3 Levels of Redundancy 9.2.4 Choice of RAID Levels _Contents_ **12 OVERVIEW OF QUERY EVALUATION** 12.1 The System Catalog 12.1.1 Information in the Catalog 12.2 Introduction to Operator Evaluation 12.2.1 Three Common Techniques 12.2.2 Access Paths 12.3 Algorithms for Relational Operations 12.3.1 Selection 12.3.2 Projection 12.3.3 Join 12.3.4 Other Operations 12.4 Introduction to Query Optimization 12.4.1 Query Evaluation Plans 12.4.2 Multi-operator Queries: Pipelined Evaluation 12.4.3 The Iterator Interface 12.5 Alternative Plans: A Motivating Example 12.5.1 Pushing Selections 12.5.2 Using Indexes 12.6 What a Typical Optimizer Does 12.6.1 Alternative Plans Considered 12.6.2 Estimating the Cost of a Plan 12.7 Review Questions **13 EXTERNAL SORTING** 13.1 When Does a DBMS Sort Data? 13.2 A Simple Two- Way Merge Sort 13.3 External Merge Sort 13.3.1 Minimizing the Number of Runs 13.4 Minimizing I/O Cost versus Number of I/Os 13.4.1 Blocked I/O 13.4.2 Double Buffering 13.5 Using B+ Trees for Sorting 13.5.1 Clustered Index 1:3.5.2 Unclustered Index 13.6 Review Questions **14 EVALUATING RELATIONAL OPERATORS** 14.1 The' Selection Operation 14.1.1 No Index, U nsorted Data 14.1.2 No Index, Sorted Data 14.1.:3 B+ Tree Index 14.1.4 Hash Index, Equality Selection 14.2 General Selection Conditions 4: 433 434 444 #### XIV DATABASE^ ~11ANAGEMENT^ SYSTEMS 14.2.1 CNF and Index Matching 14.2.2 Evaluating Selections without Disjunction 14.2.3 Selections with Disjunction 14.3 The Projection Operation 14.3.1 Projection Based on Sorting 14.3.2 Projection Based on Hashing 14.3.3 Sorting Versus Hashing for Projections 14.3.4 Use of Indexes for Projections 14.4 The Join Operation 14.4.1 Nested Loops Join 14.4.2 Sort-Merge Join 14.4.3 Hash Join 14.4.4 General Join Conditions 14.5 The Set Operations 14.5.1 Sorting for Union and Difference 14.5.2 Hashing for Union and Difference 14.6 Aggregate Operations 14.6.1 Implementing Aggregation by Using an Index 14.7 The Impact of Buffering 14.8 Review Questions **15 A TYPICAL RELATIONAL QUERY OPTIMIZER 478** 15.1 Translating SQL Queries into Algebra 479 15.1.1 Decomposition of a Query into Blocks 479 15.1.2 A Query Block as a Relational Algebra Expression 481 15.2 Estimating the Cost of a Plan 482 15.2.1 Estimating Result Sizes 483 15.3 Relational Algebra Equivalences 488 15.3.1 Selections 488 15.3.2 Projections 488 15.3.3 Cross-Products and Joins 489 15.3.4 Selects, Projects, and Joins 490 15.3.5 Other Equivalences 491 15.4 Enumeration of Alternative Plans 492 15.4.1 Single-Relation Queries 492 15.4.2 Multiple-Relation Queries 496 IS.5 Nested Sub queries 504 15.6 The System R Optimizer 506 15.7 Other Approaches to Query Optimization S 15.8 Review Questions 507 **Part V TRANSACTION MANAGEMENT 517** XVI (^) DATABASE rvlANAGEMENT SYSTEMS **18 CRASH RECOVERY** 18.1 Introduction to ARIES 18.2 The Log 18.3 Other Recovery-Related Structures 18.4 The Write-Ahead Log Protocol 18.5 Checkpointing 18.6 Recovering from a System Crash 18.6.1 Analysis Phase 18.6.2 Redo Phase 18.6.3 Undo Phase 18.7 Media Recovery 18.8 Other Approaches and Interaction with Concurrency Control 18.9 Review Questions **Part VI DATABASE DESIGN AND TUNING** **579** 580 582 585 586 587 587 588 590 592 595 596 597 **19 SCHEMA REFINEMENT AND NORMAL FORMS 605** 19.1 Introduction to Schema Refinement 606 19.1.1 Problems Caused by Redundancy 606 19.1.2 Decompositions 608 19.1.3 Problems Related to Decomposition 609 19.2 Functional Dependencies 611 19.3 Reasoning about FDs 612 19.3.1 Closure of a Set of FDs 612 19.3.2 Attribute Closure 614 19.4 Normal Forms 615 19.4.1 Boyce-Codd Normal Form 615 19.4.2 Third Normal Form 617 19.5 Properties of Decompositions 619 19.5.1 Lossless-Join Decomposition 619 19.5.2 Dependency-Preserving Decomposition 621 19.6 Normalization 622 19.6.1 Decomposition into BCNF 622 19.6.2 Decomposition into 3NF 625 19.7 Schema Refinement in Database Design 629 19.7.1 Constraints on an Entity Set 630 19.7.2 Constraints on a Relationship Set 630 19.7.3 Identifying Attributes of Entities 631 19.7.4 Identifying Entity Sets 6: 19.8 Other Kinds of Dependencies 6: 19.8.1 Multivalued Dependencies 6: 19.8.2 Fourth Normal Form 6: 19.8.:3 Join Dependencies (1: ## Contents XVll - 4.2.3 Renaming - 4.2.4 Joins - 4.2.5 Division - 4.2.6 1\'lore Examples of Algebra Queries - 4.3 Relational Calculus - 4.3.1 Tuple Relational Calculus - 4.3.2 Domain Relational Calculus - 4.4 Expressive Power of Algebra and Calculus - 4.5 Review Questions - 5 SQL: QUERIES, CONSTRAINTS, TRIGGERS - 5.1 Overview - 5.1.1 Chapter Organization - .5.2 The Form of a Basic SQL Query - 5.2.1 Examples of Basic SQL Queries - 5.2.2 Expressions and Strings in the SELECT Command - 5.3 UNION, INTERSECT, and EXCEPT - 5.4 Nested Queries - 5.4.1 Introduction to Nested Queries - 5.4.2 Correlated Nested Queries - 5.4.3 Set-Comparison Operators - 5.4.4 More Examples of Nested Queries - 5.5 Aggregate Operators - 5.5.1 The GROUP BY and HAVING Clauses - 5.5.2 More Examples of Aggregate Queries - 5.6 Null Values - 5.6.1 Comparisons Using Null Values - 5.6.2 Logical Connectives AND, OR, and NOT - 5.6.3 Impact 011 SQL Constructs - 5.6.4 Outer Joins - 5.6.5 Disallowing Null Values - 5.7 Complex Integrity Constraints in SQL - 5.7.1 Constraints over a Single Table - 5.7.2 Domain Constraints and Distinct Types - 5.7.3 Assertions: ICs over Several Tables - 5.8 Triggers and Active Databases - 5.8.1 Examples of Triggers in SQL - 5.9 Designing Active Databases - 5.9.1 Why Triggers Can Be Hard to Understand - 5.9.2 Constraints versus Triggers - 5.9.:3 Other Uses of Triggers - 5.10 Review Questions 17: - Part II APPLICATION DEVELOPMENT x DATABASE J\;1ANAGEMENT SYSTEMS - 6 DATABASE APPLICATION DEVELOPMENT - 6.1 Accessing Databases from Applications - 6.1.1 Embedded SQL - 6.1.2 Cursors - 6.1.3 Dynamic SQL - 6.2 An Introduction to JDBC - 6.2.1 Architecture - 6.3 JDBC Classes and Interfaces - 6.3.1 JDBC Driver Management - 6.3.2 Connections - 6.3.3 Executing SQL Statements - 6.3.4 ResultSets - 6.3.5 Exceptions and Warnings - 6.3.6 Examining Database Metadata - 6.4 SQLJ - 6.4.1 Writing SQLJ Code - 6.5 Stored Procedures - 6.5.1 Creating a Simple Stored Procedure - 6.5.2 Calling Stored Procedures - 6.5.3 SQL/PSM - 6.6 Case Study: The Internet Book Shop - 6.7 Review Questions - 7 INTERNET APPLICATIONS - 7.1 Introduction - 7.2 Internet Concepts - 7.2.1 Uniform Resource Identifiers - 7.2.2 The Hypertext Transfer Protocol (HTTP) - 7.3 HTML Documents - 7.4 XML Documents - 7.4.1 Introduction to XML - 7.4.2 XML DTDs - 7.4.3 Domain-Specific DTDs - 7.5 The Three-Tier Application Architecture - 7.5.1 Single- Tier and Client-Server Architectures - 7.5.2 Three- Tier Architectures - 7.5.3 Advantages of the Three-Tier Architecture - 7.6 The Presentation Layer - 7.6.1 HTrvlL Forms - 7.6.2 JavaScript - 7.6.3 Style Sheets - 16 OVERVIEW OF TRANSACTION MANAGEMENT - 16.1 The ACID Properties - 16.1.1 Consistency and Isolation - 16.1.2 Atomicity and Durability - 16.2 Transactions and Schedules - 16.3 Concurrent Execution of Transactions - 16.3.1 rvlotivation for Concurrent Execution - 16.3.2 Serializability - 16.3.3 Anomalies Due to Interleaved Execution - 16.3.4 Schedules Involving Aborted Transactions - 16.4 Lock-Based Concurrency Control - 16.4.1 Strict Two-Phase Locking (Strict 2PL) - 16.4.2 Deadlocks - 16.5 Performance of Locking - 16.6 Transaction Support in SQL - 16.6.1 Creating and Terminating Transactions - 16.6.2 What Should We Lock? - 16.6.3 Transaction Characteristics in SQL - 16.7 Introduction to Crash Recovery - 16.7.1 Stealing Frames and Forcing Pages - 16.7.2 Recovery-Related Steps during Normal Execution - 16.7.3 Overview of ARIES - 16.7.4 Atomicity: Implementing Rollback - 16.8 Review Questions - 17 CONCURRENCY CONTROL - 17.1 2PL, Serializability, and Recoverability - 17.1.1 View Serializability - 17.2 Introduction to Lock Management - 17.2.1 Implementing Lock and Unlock Requests - 17.3 Lock Conversions - 17.4 Dealing With Deadlocks - 17.4.1 Deadlock Prevention - 17.5 Specialized Locking Techniques - 17.5.1 Dynamic Databases and the Phantom Problem - 17.5.2 Concurrency Control in B+ Trees - 17.5.3 Multiple-Granularity Locking - 17.6 ConClurency Control without Locking - 17.6.1 Optimistic Concurrency Control - 17.6.2 Timestamp-Based Concurrency Control - 17.6.3 Multiversion Concurrency Control - 17.7 Reviev Questions 57: - 19.8.4 Fifth Normal Form 6: - 19.8.5 Inclusion Dependencies - 19.9 Case Study: The Internet Shop - 19.10 Review Questions - 20 PHYSICAL DATABASE DESIGN AND TUNING - 20.1 Introduction to Physical Database Design - 20.1.1 Database Workloads - 20.1.2 Physical Design and Tuning Decisions - 20.1.3 Need for Database Tuning - 20.2 Guidelines for Index Selection - 20.3 Basic Examples of Index Selection - 20.4 Clustering and Indexing - 20.4.1 Co-clustering Two Relations - 20.5 Indexes that Enable Index-Only Plans - 20.6 Tools to Assist in Index Selection - 20.6.1 Automatic Index Selection - 20.6.2 How Do Index Tuning Wizards Work? - 20.7 Overview of Database Tuning - 20.7.1 Tuning Indexes - 20.7.2 Tuning the Conceptual Schema - 20.7.3 Tuning Queries and Views - 20.8 Choices in Tuning the Conceptual Schema - 20.8.1 Settling for a Weaker Normal Form - 20.8.2 Denormalization - 20.8.3 Choice of Decomposition - 20.8.4 Vertical Partitioning of BCNF Relations - 20.8.5 Horizontal Decomposition - 20.9 Choices in Tuning Queries and Views - 20.10 Impact of Concurrency - 20.10.1 Reducing Lock Durations - 20.10.2 Reducing Hot Spots - 20.11 Case Study: The Internet Shop - 20.11.11\ming the Datab~'ie - 20.12 DBMS Benchmarking - 20.12.1 Well-Known DBMS Benchmarks - 20.12.2 Using a Benchmark - 20.13 Review Questions - 21 SECURITY AND AUTHORIZATION - 21.1 Introduction to Datab~"e Security - 21.2 Access Control - 21.3 Discretionary Access Control