Download DBMS Notes and Study Materials and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
Subject: DATABASE MANAGEMENT SYSTEM
Subject Code: 22AM 502
UNIT- 1
Databases and Database Users: Introduction with an example, Characteristics of Database approach, Actors on the scene, Workers behind the scene, Advantages of using DBMS approach, when not to use a DBMS. Database System Concepts and Architecture : Data models, schemas and instances, Three schema architecture and data independence, Database languages and interfaces, The database system environment, Centralized and client-server architectures. (Text 1: 1.1 to 1.8) (Text 1: 2.1 to 2.5)
1.1 Introduction
- A database is a collection of related data.
- A database has the following implicit properties:
- A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse (UoD). Changes to the miniworld are reflected in the database.
- A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database.
- A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.
- In other words, a database has some source from which data is derived, some degree of interaction with events in the real world, and an audience that is actively interested in its contents.
- A database can be of any size and complexity. For example, social media company such as Facebook,
- An example of a large commercial database is Amazon.com
- A database may be generated and maintained manually or it may be computerized. For example, a library card catalog is a database that may be created and maintained manually.
- A database management system (DBMS) is a computerized system that enables users to create and maintain a database. The DBMS is a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications.
- The database definition or descriptive information is stored by the DBMS in the form of a database catalog or dictionary; it is called meta-data.
- Constructing the database is the process of storing the data on some storage medium that is controlled by the DBMS.
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
- Manipulating a database includes functions such as querying the database to retrieve specific data, updating the data base to reflect changes in the miniworld, and generating reports from the data.
- Sharing a database allows multiple users and programs to access the database simultaneously.
- An application program accesses the database by sending queries or requests for data to the DBMS.
- A query typically causes some data to be retrieved;
- A transaction may cause some data to be read and some data to be written into the database
- functions provided by the DBMS
- protecting the database and maintaining it over a long period of time. Protection includes system protection against hardware or software malfunction (or crashes) and security protection against unauthorized or malicious access.
- A typical large database may have a life cycle of many years, so the DBMS must be able to maintain the database system by allowing the system to evolve as requirements change over time.
- To complete our initial definitions, we will call the database and DBMS software together a database system To define this database , we must specify the structure of the records of each file by specifying the different types of data elements to be stored in each record
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
- Design of a new application for an existing database or design of a brand-new database starts off with a phase called requirements specification and analysis.
- These requirements are documented in detail and transformed into a conceptual design that can be represented and manipulated using some computerized tools so that it can be easily maintained, modified, and transformed into a database implementation. ( We will introduce a model called the Entity-Relationship model in Chapter 3 that is used for this purpose.)
- The final stage is physical design, during which further specifications are provided for storing and accessing the database.
1.3 Characteristics of the Database Approach
- Writing customized programs to access data stored in files. In traditional file processing
- In the database approach , a single repository maintains data that is defined once and then accessed by various users repeatedly through queries, transactions, and application programs.
- The main characteristics of the database approach versus the file-processing approach are the following: ✓ Self-describing nature of a database system ✓ Insulation between programs and data, and data abstraction ✓ Support of multiple views of the data ✓ Sharing of data and multiuser transaction processing 1.3.1 Self-Describing Nature of a Database System The information stored in the catalog is called meta-data
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017, 1.3.2 Insulation between Programs and Data, and Data Abstraction The structure of data files is stored in the DBMS cata log separately from the access programs. We call this property program-data independence. The characteristic that allows program-data independence and program-operation independence is called data abstraction data model is a type of data abstraction that is used to provide this conceptual representation. The data model uses logical concepts, such as objects, their properties, and their interrelationships, that may be easier for most users to understand than computer storage concepts. Hence, the data model hides storage and implementation details that are not of interest to most database users. 1.3.3 Support of Multiple Views of the Data
- The DBMS must include concurrency control software to ensure that several users trying to update the same data For example, when several reservation agents try to assign a seat on an airline flight , the DBMS should ensure that each seat can be accessed by only one agent at a time for assignment to a passenger. These types of applications are generally called online transaction processing (OLTP) applications.
- The isolation property ensures that each transaction appears to execute in isolation from other transactions, even though hundreds of transactions may be executing concurrently.
- The atomicity property ensures that either all the database operations in a transaction are executed or none are.
1.4 Actors on the Scene
1.4.1 Database Administrators
- Administering the resources is the responsibility of the database administrator (DBA ).
- The DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed.
1.4.2 Database Designers
- Database designers are responsible for identifying the data to be stored in the data base and for choosing appropriate structures to represent and store this data.
- Database designers typically interact with each potential group of users and develop views of the database that meet the data and processing requirements of the groups.
1.4.3 End Users
End users are the people whose jobs require access to the database for querying, updating, and generating reports; the database primarily exists for their use. There are several categories of end users:
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017, 1.6.2 Restricting Unauthorized Access: 1.6.3 Providing Persistent Storage for Program Objects : 1.6.4 Providing Storage Structures and Search Techniques for Efficient Query Processing:
- Auxiliary files called indexes are often used
- DBMS often has a buffering or caching module that maintains parts of the database in main memory 1.6.5 Providing Backup and Recovery: The backup and recovery subsystem of the DBMS is responsible for recovery. 1.6. 6 Providing Multiple User Interfaces: DBMS should provide a variety of user interfaces. These include apps for mobile users, query languages for casual users, programming language interfaces for application programmers, forms and command codes for parametric users, and menu-driven interfaces and natural language interfaces for standalone users. Both forms-style interfaces and menu-driven interfaces are commonly known as graphical user interfaces (GUIs). 1.6. 7 Representing Complex Relationships among Data : A database may include numerous varieties of data that are interrelated in many ways: 1.6.8 Enforcing Integrity Constraints : 1.6.9 Permitting Inferencing and Actions Using Rules and Triggers : 1.6.10 Additional Implications of Using the Database Approach : using the database approach that can benefit most organizations
- Potential for Enforcing Standards.
- Reduced Application Development Time.
- Flexibility
- Availability of Up-to-Date Information
- Economies of Scale
1.7 A Brief History of Database Applications
1.7.1 Early Database Applications Using Hierarchical and Network Systems 1.7.2 Providing Data Abstraction and Application Flexibility with Relational Databases 1.7.3 Object-Oriented Applications and the Need for More Complex Databases 1.7.4 Interchanging Data on the Web for E-Commerce Using XML 1.7. 5 Emergence of Big Data Storage Systems and NOSQL Databases: 1.7. 6 Extending Database Capabilities for New Applications: a) Scientific applications b) Storage and retrieval of images c) videos, such as movies, and video clips from news or personal digital cameras d) Data mining applications e) Spatial applications f) Time series applications
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
1.8 When Not to Use a DBMS
The overhead costs of using a DBMS are due to the following: ■ High initial investment in hardware, software, and training ■ The generality that a DBMS provides for defining and processing data ■ Overhead for providing security, concurrency control, recovery, and integrity functions Therefore, it may be more desirable to develop customized database applications under the following circumstances: ■ Simple, well-defined database applications that are not expected to change at all ■ Stringent, real-time requirements for some application programs that may not be met because of DBMS overhead ■ Embedded systems with limited storage capacity, where a general-purpose DBMS would not fit ■ No multiple-user access to data
Chapter 2 Database System Concepts and Architecture
2.1 Data Models, Schemas, and Instances Data abstraction generally refers to the suppression of details of data organization and storage, and the highlighting of the essential features for an improved understanding of data. ➢ One of the main characteristics of the database approach is to support data abstraction so that different users can perceive data at their preferred level of detail. A data model —a collection of concepts that can be used to describe the structure of a database 2.1.1 Categories of Data Models
- High-level or conceptual data models provide concepts that are close to the way many users perceive data.
- Representational (or implementation) data models : which provide concepts that may be easily understood by end users but that are not too far removed from the way data is organized in computer storage. - Representational data models hide many details of data storage on disk but can be implemented on a computer system directly.
- low-level or physical data models provide concepts that describe the details of how data is stored on the computer storage media, typically magnetic disks. - Concepts provided by physical data models are generally meant for computer specialists, not for end users.
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
- The data in the database at a particular moment in time is called a database state or snapshot. It is also called the current set of occurrences or instances in the database.
- The schema is sometimes called the intension , and a database state is called an extension of the schema.
2.2 Three-Schema Architecture and Data Independence
2.2.1 The Three-Schema Architecture
The goal of the three-schema architecture, illustrated in Figure 2.2, is to separate the user applications from the physical database.
In this architecture, schemas can be defined at the following three levels:
- The internal level has an internal schema , which describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database
- The conceptual level has a conceptual schema
- Which describes the structure of the whole database for a community of users.
- The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints.
- Usually, a rep-resentational data model is used to describe the conceptual schema when a database system is implemented.
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
- This implementation conceptual schema is often based on a conceptual schema design in a high-level data model.
- The external or view level-
- Includes a few external schemas or user views.
- Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group.
- As in the previous level, each external schema is typically implemented using a representational data model, possibly based on an external schema design in a high- level conceptual data model. ➢ The processes of transforming requests and results between levels are called mappings.
2.2.2 Data Independence
We can define two types of data independence:
- Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs.
- Physical data independence is the capacity to change the internal schema without having to change the conceptual schema.
2.3 Database Languages and Interfaces
2.3.1 DBMS Languages
- Data definition language (DDL)-
- Storage definition language (SDL),
- View definition language (VDL)
- data manipulation language (DML) o A high-level or nonprocedural DML/ set-at-a-time or set-oriented DMLs- A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it; therefore, such languages are also called declarative. high-level DML used in a standalone interactive manner is called a query language. o A low level or procedural DML/ record-at-a-time DMLs- 2.3.2 DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following:
- Menu-based Interfaces for Web Clients or Browsing:
- Apps for Mobile Devices.
- Forms-based Interfaces
- Graphical User Interfaces.
- Natural Language Interfaces.
- Keyword-based Database Search
- Speech Input and Output.
- Interfaces for Parametric Users
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
- The precompiler extracts DML commands from an application program written in a host programming language. In the lower part of Figure 2.3,
- The runtime database processor executes---
- the privileged commands, (2) the executable query plans, and (3) the canned transactions with runtime parameters.
- It works with the system catalog and may update it with statistics.
- It also works with the stored data manager, which in turn uses basic operating system services for carrying out low-level input/output (read/write) operations between the disk and main memory.
- concurrency control and backup and recovery systems separately as a module in this figure. They are integrated into the working of the runtime database processor for purposes of transaction management.
- It is common to have the client program that accesses the DBMS running on a separate computer or device from the computer on which the database resides. The former is called the client computer running DBMS client software and the latter is called the database server. In many cases, the client accesses a middle computer, called the application server , which in turn accesses the database server.
2.4.2 Database System Utilities
- Database utilities that help the DBA manage the database system.
- Common utilities have the following types of functions:
- Loading
- Backup.
- Database storage reorganization.
- Performance monitoring.
2.4.3 Tools, Application Environments, and Communications Facilities
- data dictionary stores other information, such as design decisions, usage stan dards, application program descriptions, and user information. Such a system is also called an information repository.
- A data dictionary utility is similar to the DBMS catalog , but it includes a wider variety of information and is accessed mainly by users rather than by the DBMS software.
2.5 Centralized and Client/Server Architectures for DBMSs
2.5.1 Centralized DBMSs Architecture
- Centralized DBMS in which all the DBMS functionality, application program execution, and user interface processing were carried out on one machine.
- Figure 2.4 illustrates the physical components in a centralized architecture.
- Gradually, DBMS systems started to exploit the available processing power at the user side, which led to client/server DBMS architectures.
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
2.5.2 Basic Client/Server Architectures
Asst. Prof. KAVYA R AI&ML Department SSIT-TUMKUR 1 Fundamentals of Database Systems Elmasri and Navathe 7th Edition, Pearson Education, 2017,
2. The second criterion used to classify DBMSs is the number of users
supported by the system.
- Single-user systems support only one user at a time and are mostly used with PCs.
- Multiuser systems , which include most DBMSs, support concurrent multiple users.
3. The third criterion is the number of sites over which the database is
distributed:
- DBMS is centralized if the data is stored at a single computer site.
- A distributed DBMS (DDBMS) can have the actual database and DBMS software distributed over many sites connected by a computer network.
- Homogeneous DDBMSs use the same DBMS software at all the sites, whereas heterogeneous DDBMSs can use different DBMS software at each site.
4. The fourth criterion is cost.
- It is difficult to propose a classification of DBMSs based on cost.
- Today we have open source (free) DBMS products like MySQL and PostgreSQL that are supported by third-party vendors with additional services. SEVERAL DATA MODELS
- The relational data model represents a database as a collection of tables, where each table can be stored as a separate file.
- The object data model defines a database in terms of objects, their properties, and their operations. Objects with the same structure and behaviour belong to a class, and classes are organized into hierarchies (or acyclic graphs).
- The key-value data model associates a unique key with each value (which can be a record or object) and provides very fast access to a value given its key.
- The document data model is based on JSON (Java Script Object Notation) and stores the data as documents, which somewhat resemble complex objects.
- The graph data model stores objects as graph nodes and relationships among objects as directed graph edges.
- Finally, the column-based data models store the columns of rows clustered on disk pages for fast access and allow multiple versions of the data.
- The XML model has emerged as a standard for exchanging data over the Web and has been used as a basis for implementing several prototype native XML systems. Two older, historically important data models, now known as legacy data models, are
- the network and hierarchical models. The network model represents data as record types and also represents a limited type of 1: N relationship, called a set type.
- The hierarchical model represents data as hierarchical tree structures. Each hierar chy represents several related records.