





















Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Data modelling material from pdms course
Typology: Study notes
1 / 29
This page cannot be seen from the preview
Don't miss anything!






















✓ The entity-relationship (ER) model is a popular high-level conceptual data model. ✓ Conceptual modeling is a very important phase in designing a successful database application. ✓ Generally, the term database application refers to a particular database and the associated programs that implement the database queries and updates. ✓ ER model and its variations are frequently used for the conceptual design of database applications, and many database design tools employ its concepts. ✓ The diagrammatic notation associated with the ER model, known as ER diagrams
✓ COMPANY database keeps track of a company’s employees, departments, and projects. ✓ The company is organized into departments. Each department has a unique name, a unique number, and a particular employee who manages the department. ✓ A department controls a number of projects, each of which has a unique name, a unique number, and a single location. ✓ The database will store each employee’s name, Social Security number,2 address, salary, sex (gender), and birth date. ✓ An employee is assigned to one department, but may work on several projects, which are not necessarily controlled by the same department.
✓ Most attributes have a single value for a particular entity; such attributes are called single-valued. ○ Age is a single-valued attribute of a person. ✓ When there are possibilities of multiple values for a particular entity; such attributes are called multi-valued. ○ A Colors attribute for a car, or A College_degrees attribute for a person. ✓ A multivalued attribute may have lower and upper bounds to constrain the number of values allowed for each individual entity. ○ The Colors attribute of a car may be restricted to have between one and two values
✓ In some cases, two (or more) attribute values are related—for example, the Age and Birth_date attributes of a person. ✓ The Age attribute is called a derived attribute and is said to be derivable from the Birth_date attribute, which is called a stored attribute.
✓ Group of components of a composite attribute between parentheses ( ) and separating the components with commas, and by displaying multivalued attributes between braces { }. {Address_phone({Phone(Area_code,Phone_number)},Address(Street_addr ess (Number,Street,Apartment_number),City,State,Zip) )}
✓ In some cases, a particular entity may not have an applicable value for an attribute. ✓ For such situations, a special value called NULL is created. ○ A College_degrees attribute applies only to people with college degrees. A person with no college degree would have NULL for College_degrees ✓ NULL can also be used if we do not know the value of an attribute for a particular entity: Unknown ○ When it is known that the attribute value exists but is missing ○ When it is not known whether the attribute value exists Entity Types, Entity Sets, Keys, and Value Sets ✓ An entity type defines a collection (or set) of entities that have the same attributes. ✓ Each entity type in the database is described by its name and attributes. ✓ The collection of all entities of a particular entity type in the database at any point in time is called an entity set or entity collection
✓ An entity type usually has one or more attributes whose values are distinct for each individual entity in the entity set: key attribute ✓ Specifying that an attribute is a key of an entity type means that the preceding uniqueness property must hold for every entity set of the entity type. ✓ It is not the property of a particular entity set; rather, it is a constraint on any entity set of the entity type at any point in time. ✓ An entity type may also have no key, in which case it is called a weak entity type. ✓ Each simple attribute of an entity type is associated with a value set (or domain of values), which specifies the set of values that may be assigned to that attribute for each individual entity. Notations ✓ An entity type is represented in ER diagrams as a rectangular box enclosing the entity type name. ✓ Attribute names are enclosed in ovals and are attached to their entity type by straight lines. ✓ Composite attributes are attached to their component attributes by straight lines. ✓ Multivalued attributes are displayed in double ovals. ✓ Key attribute has its name underlined inside the oval ✓ The Registration attribute is an example of a composite key formed from two simple component attributes, State and Number, neither of which is a key on its own.
✓ The degree of a relationship type is the number of participating entity types. ✓ A relationship type of degree two is called binary, and one of degree three is called ternary.
Binary Relationship Constraints ✓ Relationship types usually have certain constraints that limit the possible combinations of entities that may participate in the corresponding relationship set. Two main types of binary relationship constraints: ✓ Cardinality ratio : maximum number of relationship instances that an entity can participate in. The possible cardinality ratios for binary relationship types are 1: (Employee [Manages] Department), 1:N (Department:Employee), N:1 (Student: Project), and M:N (Employee [Works on] Project). Cardinality ratios for binary relationships are represented on ER diagrams by displaying 1, M, and N on the diamonds. ✓ Participation : This constraint specifies the minimum number of relationship instances that each entity can participate in: minimum cardinality constraint. Two types of participation constraints ✓ Total : If a company policy states that every employee must work for a department, then an employee entity can exist only if it participates in at least one WORKS_FOR relationship instance: existence dependency ✓ Partial : we do not expect every employee to manage a department, so the participation of EMPLOYEE in the MANAGES relationship type is partial ✓ In ER diagrams, total participation (or existence dependency) is displayed as a double line connecting the participating entity type to the ✓ relationship, whereas partial participation is represented by a single line
✓ The relationship type that relates a weak entity type to its owner the identifying relationship of the weak entity type ✓ A weak entity type always has a total participation constraint (existence dependency) with respect to its identifying relationship because a weak entity cannot be identified without an owner entity. ✓ Not every existence dependency results in a weak entity type. ✓ For example, a DRIVER_LICENSE entity cannot exist unless it is related to a PERSON entity, even though it has its own key (License_number) and hence is not a
✓ A weak entity type normally has a partial key, which is the attribute that can uniquely identify weak entities that are related to the same owner entity. ✓ For example, if we assume that no two dependents of the same employee ever have the same first name, the attribute Name of DEPENDENT is the partial key. ✓ In ER diagrams, both a weak entity type and its identifying relationship are distinguished by surrounding their boxes and diamonds with double lines. The partial key attribute is underlined with a dashed or dotted line. Summary of Notation for ER Diagrams ✓ Regular (strong) entity types such as EMPLOYEE, DEPARTMENT, and PROJECT are shown in rectangular boxes. ✓ Relationship types such as WORKS_FOR, MANAGES, CONTROLS, and WORKS_ON are shown in diamond-shaped boxes attached to the participating entity types with straight lines. ✓ Attributes are shown in ovals, and each attribute is attached by a straight line to its entity type or relationship type. ✓ Component attributes of a composite attribute are attached to the oval representing the composite attribute, as illustrated by the Name attribute of EMPLOYEE. ✓ Multivalued attributes are shown in double ovals, as illustrated by the Locations attribute of DEPARTMENT. ✓ Key attributes have their names underlined. Derived attributes are shown in dotted ovals, as illustrated by the Number_of_employees attribute of DEPARTMENT.
✓ Weak entity types are distinguished by being placed in double rectangles and by having their identifying relationship placed in double diamonds, as illustrated by the DEPENDENT entity type and the DEPENDENTS_OF identifying relationship type. ✓ The partial key of the weak entity type is underlined with a dotted line. ✓ The cardinality ratio of each binary relationship type is specified by attaching a 1, M, or N on each participating edge. ✓ The cardinality ratio of DEPARTMENT: EMPLOYEE in MANAGES is 1:1, whereas it is 1:N for DEPARTMENT: EMPLOYEE in WORKS_FOR, and M:N for WORKS_ON. ✓ The participation constraint is specified by a single line for partial participation and by double lines for total participation (existence dependency).
Ex: ✓ The domain of Shift has the set of all possible days: {Mon, Tue, Wed…}. ✓ The domain of Salary is the set of all floating-point numbers greater than 0 and less than 200,000. ✓ The domain of First Name is the set of character strings that represents names of people. ✓ Employee_ages: Possible ages of employees in a company; each must be an integer value between 15 and 80. ✓ Usa_phone_numbers. The set of ten-digit phone numbers valid in the United States ✓ A data type or format is also specified for each domain. ✓ A relation schema R, denoted by R(A1, A2, … , An), is made up of a relation name R and a list of attributes, A1, A2, … , An. ✓ D is called the domain of Ai and is denoted by dom(Ai). ✓ A relation schema is used to describe a relation; R is called the name of this relation. ✓ The degree (or arity) of a relation is the number of attributes n of its relation schema. ✓ A relation of degree seven, which stores information about university students, would contain seven attributes describing each student as follows: ○ STUDENT(Name, Ssn, Home_phone, Address, Office_phone, Age, Gpa) ○ STUDENT(Name: string, Ssn: string, Home_phone: string, Address: string, Office_phone: string, Age: integer, Gpa: real) ○ STUDENT relation: dom(Name) = Names; dom(Ssn) = Social_security_numbers; dom(HomePhone) = USA_phone_numbers , dom(Office_phone) = USA_phone_numbers, and dom(Gpa) = Grade_point_averages. ✓ A relation state at a given time—the current relation state—reflects only the valid tuples that represent a particular state of the real world. ✓ It is possible for several attributes to have the same domain. Ordering of Tuples in a Relation ✓ Tuples in a relation do not have any particular order.
✓ Tuple ordering is not part of a relation definition because a relation attempts to represent facts at a logical or abstract level. ✓ Ex: (Two identical tuples when the order of attributes and values is not part of relation definition) ○ t = < (Name, Dick Davidson),(Ssn, 422-11-2320),(Home_phone, NULL),(Address, 3452 Elgin Road), (Office_phone, (817)749-1253),(Age, 25),(Gpa, 3.53)> ○ t = < (Address, 3452 Elgin Road),(Name, Dick Davidson),(Ssn, 422-11- 2320),(Age, 25), (Office_phone, (817)749-1253),(Gpa, 3.53),(Home_phone, NULL)> Values and NULLs in the Tuples ✓ Each value in a tuple is an atomic value; that is, it is not divisible into components within the framework of the basic relational model. ✓ Hence, composite and multivalued attributes are not allowed. ✓ This model is sometimes called the flat relational model. ✓ Several meanings for NULL values, ✓ Value unknown, ✓ Value exists but is not available, ✓ Attribute does not apply to this tuple (also known as value undefined). Interpretation (Meaning) of a Relation ✓ The relation schema can be interpreted as a declaration or a type of assertion. ✓ For example, the schema of the STUDENT relation asserts that, in general, a student entity has a Name, Ssn, Home_phone, Address, Office_phone, Age, and Gpa. ✓ Each tuple in the relation can then be interpreted as a fact or a particular instance of the assertion. ✓ For example, the first tuple asserts the fact that there is a STUDENT whose Name is Dick Davidson, Ssn is 422-11-2320, Age is 25, and so on. ✓ An alternative interpretation of a relation schema is as a predicate; in this case, the values in each tuple are interpreted as values that satisfy the predicate. ✓ For example, the predicate STUDENT (Name, Ssn, …) is true for the five tuples in relation STUDENT of Figure. ✓ These tuples represent five different propositions or facts in the real world.
✓ Application-based or semantic constraints or business rules: Expressed and enforced by the application programs or in some other way. Example: No employee may have a salary greater than that of her supervisor.
Domain Constraints: ✓ Specify that within each tuple, the value of each attribute A must be an atomic value from the domain dom(A). Key Constraints: ✓ A superkey SK specifies a uniqueness constraint that no two distinct tuples in any state r of R can have the same value for SK. ✓ A key k of a relation schema R is a superkey of R with the additional property that removing any attribute A from K leaves a set of attributes K′ that is not a superkey of R anymore. ✓ A superkey from which we cannot remove any attributes and still have the uniqueness constraint hold - minimal superkey. This minimality property is required for a key but is optional for a superkey. ✓ A key is a superkey but not vice versa. ✓ A superkey may be a key (if it is minimal) or may not be a key (if it is not minimal). ✓ Ex: The attribute set {Ssn} is a key of STUDENT because no two student tuples can have the same value for Ssn. Any set of attributes that includes Ssn—for example, {Ssn, Name, Age}—is a superkey. ✓ Any superkey formed from a single attribute is also a key ✓ A key with multiple attributes must require all its attributes together to have the uniqueness property.
Primary Key: ✓ Used to identify one and only one instance of an entity uniquely. ✓ The attributes that form the primary key of a relation schema are underlined Superkey ✓ A set of an attribute which can uniquely identify a tuple. ✓ For example: In the EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME) the name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this combination can also be a key. Candidate Key ✓ An attribute or set of an attribute which can uniquely identify a tuple. ✓ The remaining attributes except for primary key are considered as a candidate key. ✓ The candidate keys are as strong as the primary key. ✓ It is usually better to choose a primary key with a single attribute or a small number of attributes. The other candidate keys are designated as unique keys and are not underlined.
Foreign Key ✓ Foreign keys are the column of the table which is used to point to the primary key of another table. ✓ Eg: We add the primary key of the DEPARTMENT table, Department_Id as a new attribute in the EMPLOYEE table. Now in the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related. Relational Model Constraints: ✓ Constraints on NULL Values: ✓ Ex: If every STUDENT tuple must have a valid, non-NULL value for the Name attribute, then Name of STUDENT is constrained to be NOT NULL. ✓ Relational Databases and Relational Database Schemas: ✓ A relational database schema S is a set of relation schemas S = {R₁, R₂, … , Rm} and a set of integrity constraints IC. ✓ A relational database state DB of S is a set of relation states DB = {r₁, r₂, … , rm} such that each ri is a state of Ri and such that the ri relation states satisfy the integrity constraints specified in IC. ✓ Relational database schema example: ✓ COMPANY ={EMPLOYEE, DEPARTMENT, DEPT_LOCATIONS, PROJECT, WORKS_ON, DEPENDENT} ✓ In each relation schema, the underlined attribute represents the primary key. ✓ One possible database state for the COMPANY relational database schema
✓ The entity integrity constraint states that no primary key value can be NULL. ✓ This is because the primary key value is used to identify individual tuples in a relation. ✓ Having NULL values for the primary key implies that we cannot identify some tuples. ✓ For example, if two or more tuples had NULL for their primary keys, we may not be able to distinguish them if we try to reference them from other relations. ✓ Key constraints and entity integrity constraints are specified on individual relations
✓ The referential integrity constraint is specified between two relations and is used to maintain the consistency among tuples in the two relations. ✓ For example, in the following Figure, the attribute Dno of EMPLOYEE gives the department number for which each employee works; hence, its value in every EMPLOYEE tuple must match the Dnumber value of some tuple in the DEPARTMENT relation. ✓ The attributes in FK have the same domain(s) as the primary key attributes PK of R₂; the attributes FK are said to reference or refer to the relation R₂. ✓ t1[FK] = t2[PK], the tuple t1 references or refers to the tuple t2. ✓ In this definition, R₁ is called the referencing relation and R₂ is the referenced relation. ✓ If these two conditions hold, a referential integrity constraint from R₁ to R₂ is said to hold. ✓ Foreign keys join tables and establish dependencies between tables. tables can form a hierarchy of dependencies in such a way that if you change or delete a row in one table, you destroy the meaning of rows in other tables. ✓ Referential integrity is the logical dependency of a foreign key on a primary key. ✓ Referential integrity constraints displayed on the COMPANY relational database schema.