DBMS lecture notes and tutorial, Lecture notes of Database Programming

DBMS and database lectures notes during interview

Typology: Lecture notes

2020/2021

Uploaded on 05/02/2021

mutethia-isaiah
mutethia-isaiah 🇰🇪

4.8

(5)

6 documents

1 / 38

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1.0 OVERVIEW OF DATABASE SYSTEMS
Data management, which focuses on data collection, storage and retrieval, constitutes
a core activity for any organization.
To generate relevant information efficiently you need quick access to data (raw facts)
from which the required information is produced.
Efficient data management requires the use of a computer database. A database is a
shared, integrated computer structure that houses a collection of: -
i. End -user data i.e. raw facts of interest to the user.
ii. Meta -data i.e. raw facts of interest to the user
iii. Meta data i.e. data about data through which the data is integrated. The Meta
data provides a description of the data characteristics and the set of
relationships that link the data found within the database. The database
resembles a very well organized electronic filing cabinet in which powerful
software referred to as DBMS helps manage the cabinet’s contents.
1.1 Review Of Traditional Processing And It’s Limitations
Consider a saving bank enterprise that keeps information about all customers and
savings accounts in permanent system files at the bank.
The bank will need a number of applications e.g.
i. Program to debit or credit an account
ii. A program to add a new account
iii. A program to find the balance of an account
iv. A program to generate monthly statements
v. Any new program would be added as per the banks requirements
Such a typical filing /processing system has the limitation of more and more files and
application programs being added to the system at any time. Such a scheme has a
number of major disadvantages:
i. Data redundancy and inconsistency - Since the files and application
programs are created by different programmers over a long period of type, the
files are likely to have different formats and the programs may be written in
several programming languages. Moreover, the same piece of information
may be duplicated in several files. This redundancy leads to higher storage
and access costs. It may also lead to inconsistency i.e. the various copies of
the same data may no longer agree.
1
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26

Partial preview of the text

Download DBMS lecture notes and tutorial and more Lecture notes Database Programming in PDF only on Docsity!

1.0 OVERVIEW OF DATABASE SYSTEMS

 Data management, which focuses on data collection, storage and retrieval, constitutes a core activity for any organization.  To generate relevant information efficiently you need quick access to data (raw facts) from which the required information is produced.  Efficient data management requires the use of a computer database. A database is a shared, integrated computer structure that houses a collection of: - i. End -user data i.e. raw facts of interest to the user. ii. Meta -data i.e. raw facts of interest to the user iii. Meta data i.e. data about data through which the data is integrated. The Meta data provides a description of the data characteristics and the set of relationships that link the data found within the database. The database resembles a very well organized electronic filing cabinet in which powerful software referred to as DBMS helps manage the cabinet’s contents.

1.1 Review Of Traditional Processing And It’s Limitations

 Consider a saving bank enterprise that keeps information about all customers and savings accounts in permanent system files at the bank.  The bank will need a number of applications e.g. i. Program to debit or credit an account ii. A program to add a new account iii. A program to find the balance of an account iv. A program to generate monthly statements v. Any new program would be added as per the banks requirements  Such a typical filing /processing system has the limitation of more and more files and application programs being added to the system at any time. Such a scheme has a number of major disadvantages: i. Data redundancy and inconsistency - Since the files and application programs are created by different programmers over a long period of type, the files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same piece of information may be duplicated in several files. This redundancy leads to higher storage and access costs. It may also lead to inconsistency i.e. the various copies of the same data may no longer agree. 1

ii. Difficulty in accessing - Suppose that one of the bank officers needs to find out the names of all customers who live within the city's 78-phone code. The officer would ask the data processing department to generate such a list. Such a request may not have been anticipated while designing the system originally and the only options available are:-  Extract the data manually  Write the necessary application, therefore do not allow the data to be accessed conveniently and efficiently iii. Data isolation - Since data is scattered in various files and files may be in different formats, it may be difficult to write new applications programs to retrieve the appropriate data. iv. Concurrent access anomalies - Interaction of concurrent updates may result in inconsistent data e.g. if 2 customers withdraw funds say 50/= and 100/= from an account at about the same time the result of the concurrent execution may leave the account in an incorrect state. v. Security problems - Not every user of the database system should be able to access all the data. Since application programs are added to the system in an ad-hoc manner, it is difficult to enforce security constraints. vi. Integrity - The data value stored in the database must satisfy certain types of consistency constraints e.g. a balance of a bank account may never fall below a prescribed value e.g. 5,000/=. These constraints are enforced in a system by adding appropriate code in the various application programs. However, when new constraints are added there is need to change the other programs to enforce. Conclusion. These difficulties among others have prompted the development of DBMS.

1.2 Evaluation of the DBMS

Unlike the file system with may separate and unrelated files, the Database consists of logically related data store in a single data repository. The problems inherent in file systems make using the database system very desirable and therefore, the database represents a change in the way the end user data are stored accessed and arranged.

1.3 Types Of Database Systems

i. Single User database systems This is a database system that supports one user at a time such that if user A is using the database, users B & C must wait until user A complete his or her database work. If a single user database runs on a personal computer it’s called a desktop database. 2

  1. Security - Only authorized people must access confidential data. The DBA ensures that proper access procedures are followed including proper authentication schemes process that the DBMS and additional checks before permitting access to sensitive data. Different levels of security can be implemented for various types of data or operations.
  2. Conflict Resolution - The DBA is in a position to resolve conflicting resolve conflicting requirements of various users and applications. It is by choosing the best file structure and access method to get optimum performance for the response. This could be by classifying applications into critical and less critical applications.
  3. Data Independence - It involves both logical and physical independence logical data independence indicates that the conceptual schemes can be changed without affecting the existing external schemes. Physical data independence indicates that the physical storage structures/devices used for storing the data would be changed without necessitating a change in the conceptual view or any of the external use.

1.5 Disadvantages Of Database Systems

  1. Cost - in terms of:  The DBMS - software  Purchasing or developing S/W  H/W  Workspace (disks for storage)  Migration (movement from tradition separate systems to an integrated one)
  2. Centralization Problems You would require adequate backup in case of failure You would require increased severity of security breaches and disruption of operation of the organization because of downtimes and failures.
  3. Complexity of Backup and recovery 4

File System Environment

________________________________________________________________________

______

Integrated System

Database System Environment

The database eliminates most of the file systems' data inconsistencies, anomalies and structural dependency problems. The current generation of DBMS software stores not only the data structures in a central location but also stores the relationships between the database components The DBMS also takes care of defining all the required access paths of the required component.

1.6 The Database System Environment

The term database system refers to an organization of components that define and regulate the collection storage, management and use of data within a database environment. The database system is composed of 5 major parts i.e. Accounts Department Sales Department Personnel Department employees customer sales inventory accountsaccounts DBMS

DATABASE

Employees Customers Sales Inventory Accounts Accounting Department Sales Department Personnel Department 5

  1. End users - These are the people who use the application programs to run the organizations daily operations. They fall in the following classes: i. Sophisticated users - These interact with the system without writing programs. They form their requests in a database query language. ii. Specialized database applications that do not fit in the traditional data processing framework e.g. CAD Systems, knowledge based & expect systems. iii. Application programmers: These interact with the system through the DML & applications. iv. Naive – Unsophisticated user who interact with the systems by invoking one of the permanent application programs that have been written previously. Procedures  These are instructions and rules that govern the design and use of the database system.  They enforce standards by which business is conducted within the organisation an with customers.  They also ensure that there is an organized way to monitor and audit both the data that enter the database and the information that is generated through the use of such data. DATA This covers the collection for facts stored in the database and since data is the raw material from which information is generated the determination of what data is to be stored into the database and how the data is to be organized is a vital part of the database designer jobs.

2.0 DATABASE ARCHITECTURE AND ENVIRONMENT

2.1 Abstraction and Data Integration

Abstraction is the simplification mechanism to hide superfluous(extra) details of a set of objects. It allows one to concentrate on the properties that are of interest to the application e.g. a car is an abstraction of personal transportation vehicle but does not reveal details about model, year, colour etc. Vehicle itself is an abstraction that includes the types; car, truck, bus and lorry. Consider a non- database environment of a number of application programs as shown below: Application 1 will contain values for the attributes employee Name and Employee. Address and this record can be described in pseudo-code as Type Employee = record Employee.name:string Employee.address:string End Application 2 will have: Type Employee = record Employee.name: String Employee.soc_sec_No: Integer Employee.Adress: String Employee. Annual_salary:integer End In a non-database environment each application is responsible for maintaining the currency of data and a change in data item. In a database environment, data can be stored in this application and their requirement be integrated by whoever is responsible for centralized control (DBA). The integrated version would appear as recorded containing attributes required by both applications. The record will appear as: Type Employee = record Employee.Name:string Employee.soc-sec.no: Integer Employee.Address:string Employee.Annual_Salary: double End The views supported are derived from the conceptual record by using appropriate mapping.

This is at the highest level of database abstraction where only those portions of the database of concern to the user or application programs are included. Any number of user views may be possible, some of which may be identical. Each external view is described by means of a scheme called external scheme, which consists of a definition of the logical records and the relationships in the external view. It also contains the method of devising the objects in the external view from the objects in the conceptual view (entities, attributes and relationships). Conceptual Or Global View Contains all database entities and the relationships among them are included and one conceptual view represents the entire database. It is defined by the conceptual scheme. Also contains the methods of deriving the objects in the conceptual view from the object s in the internal view. It is independent of the physical presentation. Internal View This is the lowest level of abstraction closest to the physical storage method used. It indicated how data would be stored and describe the data structures and access methods to be used by the database. The internal schema implements it. External Level User/application view Mapping supplied by the DBMS Conceptual Level Internal Level The 3 levels of architecture of a DBMS 2 View A View B View C Conceptual View Internal View

Mapping between views Two mappings are required, one between external and conceptual views and another between the conceptual records to internal ones.

Data Independence

This is the immunity of users/application programs from changes in storage structure and access mechanism. The 3 levels of abstractions along with the mappings from internal to conceptual and from conceptual to external provide 2 distinct levels of data independence i.e.:  Logical Data Independence  Physical Data Independence (i) Logical Data Independence This indicates that the conceptual schema can be changed without affecting the existing external schema. The mapping between the external and conceptual levels would absorb the change. It also insulates application programs from operations such as combining two records into one or splitting an existing record into or more records. The LDI is achieved by providing the external level or user view database. The application programs or users see the database as described by the respective external view. DBMS provided a mapping from this view to the conceptual view. NB: The view at conceptual level of the database is the sum total of the current and anticipated views of the database. (ii) Physical Data Independence This indicates that the physical storage structures or devices used for storing the data can be changed without necessitating a change in the conceptual view or any of the external view. Any change is absorbed by the mapping between the conceptual and internal views.

  1. Business Graphics - Some DBMS may provide means of generating graphical output e.g. bar charts, pie charts scatter graphics line plots etc. others will allow users to export data into graphics software.
  2. Application Generators - This is a type of 4th generation language used to create complete application programs. The user describes what need to be done, the data and files that are to be used and the application generator then translates the description into a program. They are also referred to as rapid application tools.
  3. Data Dictionary (DD) - This provides the following facilities:  Documentation of data items  Provision of Standard definition an names for data items.  Data item description.  Removal of redundancy in documentation of data item.  Documentation of relationships between data items;
  4. Fort Generation Languages (4GLS'S) - A 4GL'S is a non-procedural language in which the programs flows and not designed by the programmer but by the 4G software itself.. The user requests for the result rather than a detailed procedure to obtain these results.

2.4 Typical DBMS Functions

A DBMS performs several functions that guarantee the integrity and consistency of the data in the database. Most of these functions are transparent to end-users and can be achieved only through the use of a DBMS. They include: i. Data Dictionary Management - The DBMS enquires that definitions of the data element and their relationships (metadata) be stored in a data dictionary. The DBMS uses the DD to look up the required data component, structures and relationships thus relieving us from having to code such complex relationships in each program. Any changes made in the database structure are automatically recorded in the DD thereby freeing us from having to modify all the programs that access the changed structure. So, the DBMS provides data obstruction and removes structural or data dependency of the system. ii. Data Storage Management - Creation of complex structure required for data storage is done by DBMS thus relieving us from the difficult task of defining and programming the physical data characteristics. A modern DBMS system provides storage for data and related data entry forms or screen definitions, report definition, data validation rules, procedural code structures to handle video and picture formats etc. iii. Data Transformation and Presentation - Transformation of entered data to conform the data structures that are required to store the data is done by the DBMS relieving us the core issue of making a distinction between the data logical formats and data physical format. By maintaining data independence the DBMS translates logical requests it no commands that physically locate and retrieve the requested data. That

is the DBMS transform the physically retrieved data to conform to the users logical expectations. This is by providing application programs with software independence and data abstraction. iv. Security Management - The DBMS creates the systems security that enforces users security and data privacy within the database. Security rules determine which users can access database which data item each user can access and which data operations (read, add, delete, modify) the user may perform. This is important in multi user database system where many users can access the database simultaneously. v. Multi User Access Control - The database creates complex structures that allow multi-user access to the structure. In order to provide data integrity and consistency the DBMS users sophisticated algorithms to ensure that multiple users can access the database con-currently and still guarantee integrity of the database. vi. Back-up and recovery management - To ensure data safety and integrity current DBMS systems provide special utilities that allow the DBA to perform routing and special backup and restore procedures. Recovery management deals with recovery of the database after a failure such as a bad sector in the disk, a power failure etc. Such capability is critical to the preservation of the database integrity. vii. Data integrity Management - The DBMS promotes and enforces integrity rules to eliminate data integrity problems thus minimizing data redundancy and maximizing data consistency. The relationships stored in the Data Dictionary are used to enforce data integrity. Data integrity is especially important in transaction oriented database systems. viii. Data base Access Language and Application Programming Interfaces - The DBMS provides data access via a query language. It contains 2 components, DDL and DML. The DDL defines the structures in which the data are housed and the DML allows end users to extract the data from the database. It also allows data access to programmers via procedural languages such as Cobol, C, Pascal, and Visual Basic etc. It also provides utilities used by the DBA and the Database Designer to create, implement, monitor and maintain the database. ix. Database Communication interfaces - Current generation of DBMS's provide special communication routines designed to allow the database to accept end-use r requests within a computer network environment. The DBMS may provide communication functions to access the database through the internet using internet browsers e.g. Netscape or Explorer as the front-ends

Major Components Of Dbms

Programmers Users DBA Database & System catalog

Database Life Cycle (DBLC)

  1. The Database Initial Study  Examine the current system operation.  Try to establish how and why the current system fails.  Define the problems and constraints  Define the objectives  Define scope and boundaries Application programs Queries Database scheme DML Pre-processor Program object code Query processor Database manager DDL compiler Dictionary manager Access methods File manager System buffers
  1. Database Design  This involves the conceptual design, selection of database, management system software.  Creation of the logical design  Creation of the physical design
  2. Implementation  This involves installation of the DBMS  Creation of the database  Loading or conversion of data
  3. Testing and evaluation The activities involve:  Testing the database  Tune the database  Evaluate the database application programs  Provide the required information flow
  4. Operation Once the database has passed the evaluation stage it is considered to be operational, the database, its management, its users and its application programs constitute a complete I.S. The beginning of the operational phase starts the process of system evaluation.
  5. Maintenance and Evaluation It involves the following:  Preventive Maintenance  Corrective maintenance  Adaptive maintenance  Assignment and maintenance of access permission to new and old user  Generation of database access statistics to improve the efficiency and usefulness of audits and to monitor system persons.  Periodic security based on the system generated statistics  Periodic (monthly, quarterly or yearly) system using summaries for internal billing or budgeting purposes.
  1. Entity sets An entity is a thing or object in the real world that is distinguishable from all other objects. It may be concrete e.g. a person or a book or it may be abstract e.g. a loan, holiday a concept etc. An entity set is a set of entities of the same type that share the same properties or attitudes e.g. a set of all persons who are customers of a bank.
  2. Relationship sets An association between two or more entities is called a relationship.
  3. Attributes They are descriptive properties or characteristics possessed by each member of an entity set.

3.2.2 Characteristics Of Attributes

  1. Simple and Composite attributes - e.g. a customer name or first name, middle name, last name. Composite attributes are necessary if a user wishes to refer to entire attribute on some occasions and to only a component of the attributes on other occasions.
  2. Single valued and Multi valued Attribute - The social security number or ID number can only have a single value at any instance and therefore its said to be single valued. An attribute like dependant name can take several values ranging from o-n thus it is said to be multi valued.
  3. Null Attributes - A null value is used when an entity does not have a value for an attribute e.g. dependent name.
  4. Calculated attribute - The value for this type of attribute can be derived from the values of other related attributes or entities e.g. i. Employment length value can be derived from the value for the start date and the current date. ii. Loans held can be a count of the number of loans a customer has.

3.2.3 Relationship Sets

A relationship is an association amongst several entities while a relationship set is a set of relationships of the same tuple. It is a mathematical relation on n>2 possible non-distinct entity sets e.g. consider 2 entity sets, loan and branch. A relationship set loan, branch can be defined to denote association between a bank loan and the branch in which that loan is obtained. Example Consider 2 entity sets Customer and loan. 002qaAA relationship set - A borrower can be defined to denote the association between customers and the bank loans that the customers have.

Types Of Relationships

i. One to one relationship ( 1:1 ) - An entity in A is associated with utmost one entity in B is associated with at utmost one entity in A. ii. One to Many relationship ( 1:M ) - An entity in A is associated with any number of entities in B while an entity in B can be associated with at most one entity in A. Many to one relationship ( M:1 ) - An entity in A is associated with at most one entity in B and an entity in B can be associated with a number of entities in A. a1 b a2 b a3 b a4 b a5 b iii. Many to many ( M:N ) - An entity in A is associated with at least one entity in B and an entity in B can be associated with a number of entities in A. a 1 a 2 b 1 b 2 b b b b b a a a a a b 1 b 2 b a 1 a 2 a