Winter 2003 Database Lecture Notes: Views, SQL Queries, and Relational Algebra, Study Guides, Projects, Research of Database Management Systems (DBMS)

Lecture notes from a winter 2003 database course. The notes cover views, sql queries, and relational algebra. Topics include the difference between physical, logical, and external schemas, creating views using the create view statement, using views for security, and sql query examples using selection, projection, set union, set difference, and cross product.

Typology: Study Guides, Projects, Research

Pre 2010

Uploaded on 08/19/2009

koofers-user-pwy
koofers-user-pwy 🇺🇸

10 documents

1 / 19

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Winter 2003 1
Today’s Lecture
A little bit on views
Basic SQL queries
Relational Algebra
Winter 2003 2
R ecom m en ded R eadi n g s
Chapter 3
Section 3.6.1, 3.7
Chapter 5
Section 5.2.1
Chapter 4
Section 4.1, 4.2
SQL for Web nerds – Simple queries
The link is on the class web page
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13

Partial preview of the text

Download Winter 2003 Database Lecture Notes: Views, SQL Queries, and Relational Algebra and more Study Guides, Projects, Research Database Management Systems (DBMS) in PDF only on Docsity!

Winter 2003 1

Today’s Lecture

  • A little bit on views
  • Basic SQL queries
  • Relational Algebra

Winter 2003 2

R ecom m en ded R eadi n g s

  • Chapter 3
    • Section 3.6.1, 3.
  • Chapter 5
    • Section 5.2.
  • Chapter 4
    • Section 4.1, 4.
  • SQL for Web nerds – Simple queries
    • The link is on the class web page

Winter 2003 3

V iew s

  • Physical schema
  • Logical/Conceptual schema
  • External schema
    • What applications can see
  • Views mechanism supports logical data independence

Disk

Students, Courses, Enrollments, Faculties

YoungActiveStudents, Faculty-info

Physical

Logical

External

End users Web apps

Winter 2003 4

V iew s

  • External schema is created by defining views over the

logical schema

  • Of course, sometimes the logical schema serves well as

the external schema but sometimes we need to define new

“relations” over existing ones

  • A view is a relation, defined in terms of other relations, but

its tuples are not stored.

  • Virtual views versus Materialized views

Winter 2003 7

R el ati on al M odel : Sum m ary

  • A tabular representation of data.
  • Simple and intuitive, currently the most widely used.
  • Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations. - Two important ICs: primary and foreign keys - In addition, we always have domain constraints.
  • Rules to translate ER to relational model
  • How do we query relational data?

Winter 2003 8

Q ueryi n g w ith SQ L

  • SQL is the most popular querying language for relational

DBMS

  • Basic form:

SELECT [DISTINCT] c 1 , c 2 , …, cm

FROM R 1 , R 2 , …, Rn

[WHERE condition ]

  • What is the semantics (or meaning) of such a query?

Attribute names

Relation names

Winter 2003 9

E x am p le 1 – a si m p ly q uery

SELECT age FROM Students

  • For every tuple in Students, emit only the age component.

2346 Jane 19

8992 John 21

2187 Ann 20

1124 John 21

Sid Name Age

Age

Students

equivalent to:

SELECT age FROM Students WHERE TRUE

Note the multiset semantics

Winter 2003 10

E x am p le 2 - D I STI N C T

SELECT DISTINCT age FROM Students

  • For every tuple in Students, emit only the age component. Take the set of resulting values (i.e., remove duplicates)

2346 Jane 19

8992 John 21

2187 Ann 20

1124 John 21

Sid Name Age

Age Students

Winter 2003

Example 5 – M u lt i ple R elat i o n s in F R O M

c lau s e

  • What does the following query compute?

SELECT S.name, E.cid

FROM Students S, Enrolled E

WHERE S.sid=E.sid AND E.grade=“A”

S.name E.cid

Smith Topology

sid cid grade

53831 Carnatic101 C

53831 Reggae203 B

53650 Topology112 A

53666 History105 B

53831 John 21

21870 Ann 20

53630 Smith 21

Sid Name Age

Students

Enrolled

S E

S

E

Winter 2003 14

C on di ti on

  • The WHERE clause consists of a boolean combination

of conditions using logical connectives AND, OR, NOT.

  • Each condition is of the form

expression op expression

  • expression is a column name, a constant, or an

arithmetic or string expression

  • op is a comparsion operator ( <, , =, <>, ≥, > )
  • Examples:
    • age > shoesize,
    • name <> “Jane” AND salary*0.1 > 100

Winter 2003 15

M ean i n g of an SQ L q uery

SELECT [DISTINCT] c 1 , c 2 , …, cm FROM R 1 , R 2 , …, Rn [WHERE condition ]

  • For every n -ary tuple t (one from R 1 , one from R 2 , …, one from Rn),
    • if t satisfies condition ( i.e., condition evaluates to true), then emit the c 1 , c 2 , …, cm components of t.
  • Let Result denote the collection of all emitted results
  • If DISTINCT is stated in the SELECT clause, remove duplicates in Result (i.e., Result is a set of tuples. Otherwise, Result is a bag of tuples)
  • Note that the order of emitted results is not important
  • Try out on your own on PostgreSQL!

UNION, DIFFERENCE

Winter 2003 16

M ore on SQ L

  • Many more features in SQL! (look at the size of the manual)
  • Before we leap …
    • what is the foundation/properties of SQL?
  • Recall that a data model consists of two parts
    • Formalism for describing data
    • Set of operations used to manipulate data
    • Relational algebra, relational calculus are two other different ways of describing operations in the relational model
  • relational algebra and relational calculus are terse and formal.
    • Without the verboseness of SQL and good for reasoning
    • Form the basis of SQL

Winter 2003 19

Sel ecti on - σcondition( R )

  • Unary operation
    • Input: R(A 1 , …, An)
    • Output relation has attributes A 1 , …, A n
      • Note:
        • named field notation: we have assumed that the attributes of input relation are known and “passed along” to the output. Easier to understand
        • Strictly speaking, the attributes are not known and we use position numbers instead to refer to the fields. (both are used in SQL)
    • Meaning: Takes a relation R and extracts only rows from R that satisfy the condition.
  • Output is always a set (no duplicates)

Winter 2003 20

E x am p le - Sel ect

  • σrating > 6 (Hotels)
  • Positional notation: σ$3>6 (Hotel)

ELodge 2nd E st 6.0 40

ELogde 39 W st 5.6 45

BestInn 45 th^ st 6.7 28

Astoria 5 th^ ave 8.0 231

Windsor 54 th^ ave 6.0 135

name address rating capacity

BestInn 45 th^ st 6.7 28

Astoria 5 th^ ave 8.0 231

name address rating capacity

Hotels

Winter 2003 21

E x am p le - Sel ect

  • σrating > 6 AND capacity > 50 (Hotel)

ELodge 2nd E st 6.0 40

ELogde 39 W st 5.6 45

BestInn 45 th^ st 6.7 28

Astoria 5 th^ ave 8.0 231

Windsor 54 th^ ave 6.0 135

name address rating capacity

Astoria 5 th^ ave 8.0 231

name address rating capacity

  • I s σ  (σ  (R )) =

σ   (R )? P r o v e

o r g iv e a c o u n t e r -

e x a m p le

  • I n c la s s

Winter 2003 22

P roj ecti on - πa ttr ib u te lis t ( R )

  • Unary operation
    • Input : R(A 1 , … , An)
    • Output relation has attributes according to attribute

list

  • Meaning: Emits only the attributes stated in attribute

list of every tuple in relation R.

  • Eliminate duplicates.

Winter 2003 25

Set U n ion – R ∪ S

  • Binary operation
    • Input: R and S must be union-compatible
      • They have the same set of arity, same number of columns
      • The ith column of R has the same type as the ith column of S for every column i.
      • Note that field names are not used in defining union- compatibility though we can also think that R and S is union- compatible if they having the same type (a set of record type).
    • Output has the same type as R
    • Meaning: the output consists of the set of all tuples in R and S

Winter 2003 26

E x am p le - Set U n i on

  • Dell_Desktops IBM_Desktops

20G 750Mhz Linux

30G 1.0Ghz Windows

20G 500Mhz Windows

Harddisk Speed OS

30G 1.2Ghz Windows 20G 500Mhz Windows

Harddisk Speed OS

30G 1.2Ghz Windows

20G 500Mhz Windows 30G 1.0Ghz Windows 20G 750Mhz Linux

Harddisk Speed OS IBM_desktops

Dell_desktops

All tuples in R occurs in R S

All tuples in S occurs in R S

R S contains tuples that either

occur in R or S (or both).

Winter 2003 27

Example - Set U n i o n

  • Dell_Desktops IBM_Desktops

20G 750Mhz Linux

30G 1.0Ghz Windows

20G 500Mhz Windows

Harddisk Speed OS

30G 1.2Ghz Windows 20G 500Mhz Windows

Harddisk Speed OS 30G 1.2Ghz Windows

20G 500Mhz Windows 30G 1.0Ghz Windows 20G 750Mhz Linux

Harddisk Speed OS

IBM_desktops

Dell_desktops

R S = S R (communtativity)

R (S T) = (R S) T (associativity)

Winter 2003 28

Set D if f er en c e – R - S

  • Binary operation
    • Input: R and S must be union-compatible
    • Output has the same type as R
    • Meaning: output consists of all tuples in R and not in S

Winter 2003 31

J o in o r C r o s s P r o d u c t – R x S

  • Join is a generic term for a variety of operations that

connect two relations that may not be union-compatible.

  • basic operation is the product, R x S
  • Binary operation
  • Input: R(A 1 , …, Am), S(B 1 , …, Bn)
  • Output is a relation with columns A 1 , …, Am, B 1 , …,

Bn

  • What happens if R and S contain common

attributes?

  • Meaning: concatenates every tuple in R with every

tuple in S.

Winter 2003 32

Example - C r o s s P r o d u c t

a 1 b 1 c 1 a 2 b 2 c 2

A B C

d 2 e 2 d 3 e 3

d 1 e 1

D E

R S

a 2 b 2 c 2 d 3 e 3

a 2 b 2 c 2 d 2 e 2

a 2 b 2 c 2 d 1 e 1

a 1 b 1 c 1 d 2 e 2

a 1 b 1 c 1 d 1 e 1

a 1 b 1 c 1 d 3 e 3

A B C D E

R x S

Is it commutative? Is it associative? Is it distributive? I.e,

Is Rx(S T) = (RxS) (RxT)

Winter 2003 33

Example - C r o s s P r o d u c t

a 2 b 2 c 2 d 3 e 3

a 2 b 2 c 2 d 2 e 2

a 2 b 2 c 2 d 1 e 1

a 1 b 1 c 1 d 2 e 2

a 1 b 1 c 1 d 1 e 1

a 1 b 1 c 1 d 3 e 3

A.1 B C A.2 E

  • What happens if R and S contain common attributes?

e.g., R(A,B,C) and S(A,E)

  • Answer varies. We may assume positional suffixes

exists to distinguish among conflicts

Winter 2003 34

M o r e c o mplex q u er i es

  • Relational queries can be composed to form more

complex queries

  • Enrollments(sid, cid, score), Courses(cid, cname,

instructor-name)

  • σscore>80 ( πsid, grade, instructor-name (

σEnrollments.cid = Courses.cid (Enrollments x Courses) ))

  • We can now find out the student and instructor pairs where the student scored well (more than 80 pts) in a course taught by the instructor

Winter 2003 37

A n alt er n at i v e plan

  • Find the student and course names where the student scored well (more than 80 pts) in the course

Enrollments (^) Courses

x

πsname, cname

Students

σEnrollments.sid = Students.sid AND Enrollments.cid = Courses.cid AND grade>

x

Which is a better execution plan?

Winter 2003 38

R en ami n g - ρx ( E)

  • Renaming is used to give a name to the result of a relational algebra expression.
  • Suppose the schema is R(A, B, C), we write ρB D(R) to change the schema to R(A, D, C)
  • Suppose R(A,B,C) and S(D,A) in the previous example. We have naming conflict on attribute A in RxS. We can write ρ 1 G (RxS) to correct the ambiguity. Therefore the result has schema RxS(G,B,C,D,A)