Download Database Management Systems: Concepts and Design and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!
DATABASE.
Introduction Database is a collection of related data database management system (DBMS) is the software that manages and controls access to the database. E.g. MYSQL, XAMPP. Database application is simply a program that interacts with the database. Usually used by the end users. E.g. uniplus interface. database system is the collection of application programs that interact with the database along with the DBMS and database itself. Examples of database.
- Purchases from the supermarket.
- Purchases using credit card.
- Booking an airline ticket.
- Using library. Definitions
- Data admin. There main work is to deal with data i.e. how data is to be saved, the security measures, recovery of data and the best software and DBMS to handle data of a given company.
- Database admin. There main work is to deal with the physical aspects of the database. i.e. choosing hardware, making sure the hardware is working by replacing and updating.
- Application developer. These is the person who writers the database system programs after the planning is done.
- Logical database designer. this is the person who do the planning of the database. Logical designs.
- Physical database designer. This is the person who decide how the hardware and other components of database are placed on the firm. How logical designs can be made physical.
- End users. This are people who access the end product of database. TOPIC ONE File-based systems and database.
File-based system - a collection of application programs that perform services for the end-users such as the production of reports. Each program defines and manages its own data. A good example is the Microsoft access it makes its own filing system that is not related to another software even if they are the same software and make. E.g. access 2007 and 2016. Limitations of the file-based approach.
- Separation and isolation of data Since each and every application stores data on its own file it will be hard to manage logically related data.
- Duplication of data. Data is saved on its own file this means one has to repeat the data every time especially for the logically related data.
- Data dependence For an update in the file containing the data to happen a new file is created and then deleted to create the new updated file.
- Incompatible file format. For the firm that created a database using Microsoft access for windows xp it means they can’t access the data in access 2016 and vice versa.
- Fixed queries/proliferation of application programs. For one to add a new feature to the file system one has to write a new program. Database approach The database- a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization. Also, can be defined as self-describing collection of integrated records. This means that data is shared among all departments of the firm with very few data redundancy and can be accessed by more than one person at the same time. Description of the data is known as the system catalog or data dictionary or metadata system catalog or data dictionary or metadata they are data about the data also known as objects that creates the data independence from the application. Data abstraction this is the process by which internal data(objects) can be changed without affecting the external objects. This is done by changing internal definitions and making sure that external objects remain the same. Example. When one pays school fees internal objects like finance, registry and school accounts are changed to show the addition will the object student portal does not change. An entity
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of a failure TOPIC TWO: Database environment DATA ARCHITECTURE
1. ANSI /SAPRC ARCHITECTURES
There are three levels as proposed by American national standards institute(ANSI) and standards planning and requirements committee(SPARC). Objectives of levels
- Each user should be able to access the same data, but have a different customized view of the data.
- Users should not know the storage considerations.
- Database can be changed without affecting the user’s views.
- Internal structure should not be affected by the physical aspects.
- Change of Conceptual structure of database should not affect all users.
LEVELS ARE:
- External The user’ view of the database. This level describes that part of the database that is relevant to each user. This part of database is represented in form that normal person can understand and view that is used to it. i.e. a GUI.
- Conceptual The community view of the database. This level describes what data is stored in the database and the relationships among the data. This level contains the logical structure of the entire database as seen by the DBA.
- Internal The physical representation of the database on the computer. This level describes how the data is stored in the database.
Physical level- that is managed by the operating system under direction of the DBMS. Schemas It is the overall description of the database. Types;
- External schemas(subschemas). That correspond to different views of the data.
- Conceptual schema. Describe all the entities, attributes, and relationships together with integrity constraints.
- Internal schema. Complete description of the internal model, containing the definitions of stored records, the methods of representation, the data fields, and the indexes and storage structures used. Mappings The way the DBMS communicates between conceptual internal and external levels. Data independence It means that when the changes that are made in internal level are not affected by external levels. They are two types;
- Logical independence immunity of the external schemas to changes in the conceptual schema. Change in new entities, attributes, or relationships, should be possible without having to change existing external schemas or having to rewrite application programs.
- Physical independence immunity of conceptual schema to changes in the internal schema. Using different file organization or storage structures, using different storage devices, modifying indexes, or hashing algorithms, should be possible without having to change the conceptual or external schemas.
- File system model. E.g. VSAM
- Hierarchical and network data model. E.g. ADABAS
- Relational data model. E.g. oracle, MySQL.
- Object-oriented. E.g. oracle 10g
- Xml. E.g. oracle 10g, MS SQL server. OBJECT-BASED DATA MODELS It uses the concepts such as entities, attributes, and relationships. Record-based data models
- Communication costs are reduced.
- Increased consistency.
- It maps on to open systems architecture quite naturally. Transaction processing monitors A program that controls data transfer between clients and servers in order to provide a consistent environment, particularly for online transaction processing (OLTP)
TOPIC THREE: DATABASE PLANNING
it is closely related to lifecycle of the information system. Database system development lifecycle are not strictly sequential, but involve some amount of repetition of previous stages through feedback loops.
- Database planning The management activities that allow the stages of the database system development lifecycle to be realized as efficiently and effectively as possible. It must integrate the information system;
- Identification of enterprise plans and goals.
- Evaluation of current information system.
A DATA MODEL: it is a simple representation, usually graphical for more complex real-world data structures. In simple terms, it is an abstraction of complex real-world data or objects The building blocks of data model
- Entity. It is a representation of real life objects about which data is collected and stored. This makes them distinguishable (each occurrence is unique and distinct)
- Attributes. Characteristic of an entity. i.e. the it explains some part of entity property.
- Relationships. Describes association among entities. Types: a. One to many: e.g. a painter can paint many rooms or buildings. b. Many to many: an employee with many job skills which means that many jobs can be done by many employees. c. One to one: a single branch managed by single manager.
- Constraints. A restriction placed on the data. They are important because they ensure integrity. They are expressed in form of rules. Business rules It is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization. Properly written business rules are used to define entities, attributes, relationships, and constraints. Criteria for data models Structural validity - Consistency with the way the enterprise defines and organizes information. Simplicity - Ease of understanding by IS professionals and non-technical users. Express ability- Ability to distinguish between different data, relationships between data, and constraints. No redundancy - Exclusion of extraneous information; in particular, the representation of any one piece of information exactly once. Share ability - Not specific to any particular application or technology and thereby usable by many. Extensibility - Ability to evolve to support new requirements with minimal effect on existing users. Integrity - Consistency with the way the enterprise uses and manages information.
Diagrammatic representation - Ability to represent a model using an easily understood diagrammatic notation. PHASES OF DATABASE DESIGN
made up of three main phases;
- Conceptual database design The process of constructing a model of data used in an enterprise, independent of all physical considerations. The data model is built using the information documented in the user’s requirements specification. Conceptual database design is entirely independent of implementation details such as the target DBMS software, application programs, programming languages, hardware platform, or any other physical considerations
- Logical database design The process of constructing a model of the data used in an enterprise based on a specific data model, but independent of a particular DBMS and other physical considerations The conceptual design is mapped into logical design
- Physical database design the process of producing a description of the implementation of the database on secondary storage; it describes the base relations, file organizations, and indexes used to achieve efficient access to the data and any associated integrity constraints and security measures. Created to target a specific DBMS.
DBMS SELECTION
It is the next step in database planning it involves the selection of an appropriate DBMS to support the database system. This is done to expand or replace the existing system. It is important to select it in the conceptual and logical database design phase. Selecting the DBMS
- Define terms of reference of study It states the objectives and scope of the study, and the tasks that need to be undertaken. It is a document that contain the user requirements specification.
- Shortlist two or three products Using the reference of study one shortlist three products that can be taken to evaluation. This is done through pricing, support and skills needed to operate it.
- Evaluate products
Visually appealing layout of the form/report The color arrangement of the gui should be appalling to the user. Familiar field labels Consistent terminology and abbreviations Consistent use of color Visible space and boundaries for data-entry fields Convenient cursor movement Error correction for individual characters and entire fields Error messages for unacceptable values Optional fields marked clearly Explanatory messages for field Completion signal PROTOTYPING Building a working model of a database system. It is a working model that does not normally have all the required features or provide all the functionality of the final system. IMPLEMENTATION The physical realization of the database and application designs. DATA CONVERSION AND LOADING Transferring any existing data into the new database and converting any existing applications to run on the new database. TESTING The process of running the database system with the intent of finding errors. The criteria used are; Learnability – How long does it take a new user to become productive with the system? Performance – How well does the system response match the user’s work practice? Robustness – How tolerant is the system of user error? Recoverability – How good is the system at recovering from user errors? Adaptability – How closely is the system tied to a single model of work? OPERATIONAL MAINTENANCE The process of monitoring and maintaining the database system following installation.
- Monitoring the performance of the system. If the performance falls below an acceptable level, tuning or reorganization of the database may be required.
- Maintaining and upgrading the database system (when required). New requirements are incorporated into the database system through the preceding stages of the lifecycle TOPIC FOUR RELATIONAL MODEL Each relational data model is logically structured within relations (tables). Each relation has a name and is made up of named attributes (columns) of data. Each tuple (row) contains one value per attributes. These makes relational model very strong. Proposed in 1970 by Codd. Logically seeing data is relating data without considering the physical storage problems. Terminology It is based on mathematical concept of a relation , which is physically represented as a table. By relating entities attributes and tuples in that order. Characteristics of a relational table
- It must be perceived as a two-dimensional structure composed of rows and columns.
- Each tuple represents a single entity occurrence within the entity set.
- Each table column represents an attribute, and each column has a distinct name.
- Each row/column intersection represents a single data value.
- All values in a column must conform to the same data format. (datatype).
- Each column has a specific range of values known as the attribute domain.
- The order of the rows and columns is immaterial in DBMS.
- Each table must have an attribute or a combinations of attributes that uniquely identifies each row. Relational data structure Relation- a relation is a table with columns and rows. RDBMS requires only that the database be perceived by the user as tables. This only applies in external and conceptual levels. Attribute- it is a named column of a relation. Domain- Is a set of allowable values for one or more attributes. It is a powerful feature of the relational model. Every attribute in a relation is defined on a domain. Domains may be distinct for each attribute, or two or more attributes may be defined on the same domain. The domain concept is important because it allows the user to define in a central place the meaning and source of values that attributes can hold. It is also important when making relationships cause all related attributes must have the same domain. Tuple- is a row of relation. The elements of a relation are the rows or t uples in the table. Degree- the degree of a relation is the number of attributes it contains. The number of attributes gives the degree.
Super key An attribute, or set of attributes, that uniquely identifies a tuple within a relation. It can contain several additional attributes that may not necessary be for unique identification. Candidate key A super key such that no proper subset is a super key within the relation. Identifies one unique column in a database without referring to another. A candidate key, K , for a relation R has two properties: ▲ uniqueness – in each tuple of R , the values of K uniquely identify that tuple; ▲ irreducibility – no proper subset of K has the uniqueness property. A key consists of more than one attribute is called composite key. Primary key The candidate key that is selected to identify tuples uniquely within the relation. Alternate keys The candidate keys that are not selected to be the primary key. Foreign key An attribute, or set of attributes, within one relation that matches the candidate key of some (possibly the same) relation. Secondary key: a key that is used strictly for data retrieval purposes.
Integrity constraints there are two types of the constraints;
- entity integrity. It applies to the primary keys of base relations. Base relation is a relation that corresponds to an entity I n the conceptual schema. Thus, no attribute of a primary key can be null. It must be unique to ensure that each row is uniquely identified by the primary key.
- referential integrity. if a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null. It is an attribute whose value match the primary key value in a related table. Relational schema: it is a representation of the entities and it attribute placed outside and attributes in ().
INTEGRITY RULES
ENTITY INTEGRITY DESCRIPTION
Requirement = All primary key entries are unique, and no part of a primary key may be null. Purpose = Each row will have a unique identity, and foreign key values can properly reference primary key values. Example No invoice can have a duplicate number, nor can it be null. In short, all invoices are uniquely identified by their invoice number. REFERENTIAL INTEGRITY DESCRIPTION Requirement A foreign key may have either a null entry, as long as it is not a part of its table’s primary key, or an entry that matches the primary key value in a table to which it is related. (Every non-null foreign key value must reference an existing primary key value.) Purpose It is possible for an attribute NOT to have a corresponding value, but it will be impossible to have an invalid entry. The enforcement of the referential integrity rule makes it impossible to delete a row in one table whose primary key has mandatory matching foreign key values in another table. Example A customer might not yet have an assigned sales representative (number), but it will be impossible to have an invalid sales representative (number).
- Multiplicity. The number (or range) of possible occurrences of an entity type that may relate to a single occurrence entity type through a particular relationship. It is the representation of business rules. It constrains the way entities are related.
- General constraints. Additional rules specified by the users or database administrators of a database the define or constrain some aspect of the enterprise.
- Not null and unique constraints are also applied. Null Represents a value for an attribute that is currently unknown or is not applicable for this tuple. Remember zeros and spaces are values. To avoid nulls special codes known as flags are used. Relational set operators They are 8 of them;
- Union. Combines all rows from two tables, excluding duplicate rows. The tables must have the same attribute characteristics (the columns and domains must be identical).
VIEWS
Views in relational model means that view is virtual or derived relation: meaning a relation that does not necessarily exist in its own right, but may be dynamically derived from one or more base relations. Terminology Base relation A named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database. Views are then defined from base relations: View the dynamic results of one or more relational operations operating on the base relations to produce another relation. A view is a virtual relation that does not necessarily exist in the database but can be produced upon request by particular user, at the time of request. If any changes are made in the base relations the change is immediately reflected in the view. Also changes made in the view is made in the base relation underlying relations. Purpose of views
- It provides a powerful and flexible security mechanism by hiding parts of the database from certain users. Users are not aware of the existence of any attributes or tuples that are missing from the view.
- It permits users to access data in a way that is customized to their needs, so that the same data can be seen by different users in different ways, at the same time.
- It can simplify complex operations on the base relations. Updating views Changes made either in view or base relation must reflect immediately. But there are restrictions to how much change can be done through view. The conditions are;
- Updates are allowed through a view defined using a simple query involving a single base relation and containing either the primary key or a candidate key of the base relation.
- Updates are not allowed through views involving multiple base relations.
- Updates are not allowed through views involving aggregation or grouping operations. Classes of views have been defined that are theoretically not updatable, theoretically updatable and partially updatable. Advantages and disadvantages of views
- Data independence The view only shows the required rows and columns thus any change in the rows and columns in source file won’t affect the view. Thus, any change in the underlining relation cannot affect the data view.
- Currency The changes (query) done in the base table are immediately reflected on view.
- Improved security Users view part of base table that is relevant to them. This makes sure that the user doesn’t make changes to the table that is not relevant to them.
- Reduced complexity View reduces the number of queries that the end user need to know so as to access the database.
- Convenience It is easy to see the data that the user needs unlike going through the whole database looking for what the user needs.
- Customization Provides way to represent the same base tables in different appearance to the users.
- Data integrity It is ease to control data addition to the base table is done through the view.
- Update restriction If the view doesn’t allow the update the user can’t update the base tables.
- Structure restriction View structure is determined during the creation time thus any change in the base table will need the view to be defined again.
- Performance If the view requires to read a lot of base tables this may lead to slowness in the view.
TOPIC FIVE ENTITY- RELATIONAL MODEL Entity type A group of objects with the same properties, which are identified by the enterprise as having and independent existence. The real-world names like students, client and finance as defined by the business rules. The first name of and entity is in caps. Entity occurrence A uniquely identifiable object of an entity type. Created using rectangles. Relationship types