Relational Model & Algebra for Database Systems (CPS 116), Slides of Introduction to Database Management Systems

An introduction to the relational model and algebra for database systems, including concepts such as relations, attributes, domains, tuples, schemas, instances, selection, projection, cross product, union, difference, and renaming. It also covers derived operators like join and natural join.

Typology: Slides

2011/2012

Uploaded on 01/29/2012

arold
arold 🇺🇸

4.7

(24)

372 documents

1 / 7

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Relational Model & Algebra
CPS 116
Introduction to Database Systems
2
Announcements
Please read news: duke.cs.cps116
Please sign up for DB2 accounts and indicate your
availability to attend discussion sessions (sign-up sheet is
circulating)
Lectures slides on Web
“Notes” version available in hardcopy and online an hour before
lecture
Complete version available online after the lecture
Homework #1 will be assigned next Tuesday
Office hours: see course Web page
Book update: Prentice-Hall sent something out yesterday
3
Relational data model
A database is a collection of relations (or tables)
Each relation has a list of attributes (or columns)
Set-valued attributes not allowed
Each attribute has a domain (or type)
Each relation contains a set of tuples (or rows)
Duplicate tuples are not allowed
Simplicity is a virtue!
4
Example
Student Course
Enroll
Ordering of rows doesn’t matter
(even though the output is
always in some order)
SID name age GPA
142 Bart 10 2.3
123 Milhouse 10 3.1
857 Lisa 8 4.3
456 Ralph 8 2.3
... ... ... ...
CID title
CPS116 Intro. to Database Systems
CPS130 Analysis of Algorithms
CPS114 Computer Networks
... ...
SID CID
142 CPS116
142 CPS114
123 CPS116
857 CPS116
857 CPS130
456 CPS114
... ...
5
Schema versus instance
Schema (metadata)
Specification of how data is to be structured logically
Defined at set-up
Rarely changes
Instance
Content
Changes rapidly, but always conforms to the schema
Compare to type and objects of type in a
programming language
6
Example
Schema
Student (SID integer, name string, age integer, GPA float)
Course (CID string, title string)
Enroll (SID integer, CID integer)
Instance
{ h142, Bart, 10, 2.3i, h123, Milhouse, 10, 3.1i, ...}
{ hCPS116, Intro. to Database Systemsi, ...}
{ h142, CPS116i, h142, CPS114i, ...}
pf3
pf4
pf5

Partial preview of the text

Download Relational Model & Algebra for Database Systems (CPS 116) and more Slides Introduction to Database Management Systems in PDF only on Docsity!

Relational Model & Algebra

CPS 116

Introduction to Database Systems

Announcements

™ Please read news: duke.cs.cps

™ Please sign up for DB2 accounts and indicate your

availability to attend discussion sessions (sign-up sheet is

circulating)

™ Lectures slides on Web

ƒ “Notes” version available in hardcopy and online an hour before

lecture

ƒ Complete version available online after the lecture

™ Homework #1 will be assigned next Tuesday

™ Office hours: see course Web page

™ Book update: Prentice-Hall sent something out yesterday

3

Relational data model

™ A database is a collection of relations (or tables)

™ Each relation has a list of attributes (or columns)

ƒ Set-valued attributes not allowed

™ Each attribute has a domain (or type)

™ Each relation contains a set of tuples (or rows)

ƒ Duplicate tuples are not allowed

) Simplicity is a virtue!

4

Example

Student Course

Enroll

Ordering of rows doesn’t matter

(even though the output is

always in some order)

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. 857 Lisa 8 4. 456 Ralph 8 2. ... ... ... ...

CID title CPS116 Intro. to Database Systems CPS130 Analysis of Algorithms CPS114 Computer Networks ... ...

SID CID

142 CPS

142 CPS

123 CPS

857 CPS

857 CPS

456 CPS

5

Schema versus instance

™ Schema (metadata)

ƒ Specification of how data is to be structured logically

ƒ Defined at set-up

ƒ Rarely changes

™ Instance

ƒ Content

ƒ Changes rapidly, but always conforms to the schema

) Compare to type and objects of type in a

programming language

6

Example

™ Schema

ƒ Student ( SID integer, name string, age integer, GPA float)

ƒ Course ( CID string, title string)

ƒ Enroll ( SID integer, CID integer)

™ Instance

ƒ { h142, Bart, 10, 2.3i, h123, Milhouse, 10, 3.1i, ...}

ƒ { hCPS116, Intro. to Database Systemsi, ...}

ƒ { h142, CPS116i, h142, CPS114i, ...}

Relational algebra operators

™ Core set of operators:

ƒ Selection, projection, cross product, union, difference, and

renaming

™ Additional, derived operators:

ƒ Join, natural join, intersection, etc.

RelOp

RelOp

Selection

™ Input: a table R

™ Notation: σ p ( R )

ƒ p is called a selection condition/predicate

™ Purpose: filter rows according to some criteria

™ Output: same columns as R , but only rows of R that

satisfy p

9

Selection example

™ Students with GPA higher than 3.

σ GPA > 3.0 ( Student )

σ GPA > 3.

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. 857 Lisa 8 4. 456 Ralph 8 2.

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. 857 Lisa 8 4. 456 Ralph 8 2.

10

More on selection

™ Selection predicate in general can include any

column of R , constants, comparisons such as =, ·,

etc., and Boolean connectives ∧, ∨, and ¬

ƒ Example: straight A students under 18 or over 21

σ GPA ≥ 4.0 ∧ ( age < 18 ∨ age > 21) ( Student )

™ But you must be able to evaluate the predicate over

a single row of the input table

ƒ Example: student with the highest GPA

σ GPA ≥ all GPA in Student table ( Student )

11

Projection

™ Input: a table R

™ Notation: π L ( R )

ƒ L is a list of columns in R

™ Purpose: select columns to output

™ Output: same rows, but only the columns in L

12

Projection example

™ ID’s and names of all students

π SID , name ( Student )

π SID , name

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. 857 Lisa 8 4. 456 Ralph 8 2.

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. 857 Lisa 8 4. 456 Ralph 8 2.

Derived operator: natural join

™ Input: two tables R and S

™ Notation: R  S

™ Purpose: relate rows from two tables, and

ƒ Enforce equality on all common attributes

ƒ Eliminate one copy of common attributes

™ Shorthand for π L ( R  p S )

ƒ L is the union of all attributes from R and S , with

duplicate attributes removed

ƒ p equates all attributes common to R and S

SID name age GPA SID CID 142 Bart 10 2.3 142 CPS 142 Bart 10 2.3 142 CPS 142 Bart 10 2.3 123 CPS 123 Milhouse 10 3.1 142 CPS 123 Milhouse 10 3.1 142 CPS 123 Milhouse 10 3.1 123 CPS ... ... ... ... ... ...

Student. SID = Enroll. SID

SID name age GPA 142 Bart 10 2. 123 Milhouse 10 3. ... ... ... ...

SID CID

142 CPS

142 CPS

123 CPS

Natural join example

™ Student  Enroll = π? ( Student ? Enroll )

= π SID , name , age , GPA , CID ( Student  Student. SID = Enroll. SID Enroll )

21

Union

™ Input: two tables R and S

™ Notation: R ∪ S

ƒ R and S must have identical schema

™ Output:

ƒ Has the same schema as R and S

ƒ Contains all rows in R and all rows in S , with duplicates

eliminated

22

Difference

™ Input: two tables R and S

™ Notation: R − S

ƒ R and S must have identical schema

™ Output:

ƒ Has the same schema as R and S

ƒ Contains all rows in R that are not found in S

23

Derived operator: intersection

™ Input: two tables R and S

™ Notation: R ∩ S

ƒ R and S must have identical schema

™ Output:

ƒ Has the same schema as R and S

ƒ Contains all rows that are in both R and S

™ Shorthand for R − ( R − S )

™ Also equivalent to S − ( S − R )

™ And to R  S

24

Renaming

™ Input: a table R

™ Notation: ρ S ( R ), or ρ S ( A

1 ,^ A 2 , …)^

( R )

™ Purpose: rename a table and/or its columns

™ Output: a renamed table with the same rows as R

™ Used to

ƒ Avoid confusion caused by identical column names

ƒ Create identical columns names for natural joins

Renaming example

™ SID’s of students who take at least two courses

Enroll ? Enroll

π SID ( Enroll  Enroll. SID = Enroll. SID ∧ Enroll. CID ≠ Enroll. CID Enroll )

ρ Enroll 1( SID 1, CID 1) ρ Enroll 2( SID 2, CID 2)

Enroll Enroll

 SID 1 = SID 2 ∧ CID 1 ≠ CID 2

π SID 1

Summary of core operators

™ Selection: σ p ( R )

™ Projection: π L ( R )

™ Cross product: R × S

™ Union: R ∪ S

™ Difference: R − S

™ Renaming: ρ S ( A

1 ,^ A 2 , …)^

( R )

ƒ Does not really add to processing power

27

Summary of derived operators

™ Join: R  p S

™ Natural join: R  S

™ Intersection: R ∩ S

™ Many more

ƒ Semijoin, anti-semijoin, quotient, …

28

An exercise

™ CID’s of the courses that Lisa is NOT taking

CID’s of the courses

that Lisa IS taking

All CID’s

π CID

Course

Enroll

Student

σ name = “ Lisa”

π CID

29

Another exercise

™ Names of students in Lisa’s classes

Students in

Lisa’s classes Student

Their names π name

Enroll

π SID

Enroll

Lisa’s classes π CID

Student

σ name = “ Lisa”

Who’s Lisa?

30

A trickier exercise

™ Who has the highest GPA?

ƒ Who does NOT have the highest GPA?

ƒ Whose GPA is lower than somebody else’s?

π SID

Student

Student Student

ρ Student 1 ρ Student 2

 Student 1. GPA < Student 2. GPA

π Student 1. SID

A deeper question:

When (and why) is “−” needed?

Turing machine?

™ Relational algebra has no recursion

ƒ Example of something not expressible in relational

algebra: Given relation Parent ( parent , child ), who are

Bart’s ancestors?

™ Why not recursion?

ƒ Optimization becomes undecidable

ƒ You can always implement it at the application level

ƒ Recursion is added to SQL nevertheless