Download Winter 2003 Database Lecture Notes: Views, SQL Queries, and Relational Algebra and more Study Guides, Projects, Research Database Management Systems (DBMS) in PDF only on Docsity!
Winter 2003 1
Today’s Lecture
- A little bit on views
- Basic SQL queries
- Relational Algebra
Winter 2003 2
R ecom m en ded R eadi n g s
- Chapter 3
- Chapter 5
- Chapter 4
- SQL for Web nerds – Simple queries
- The link is on the class web page
Winter 2003 3
V iew s
- Physical schema
- Logical/Conceptual schema
- External schema
- What applications can see
- Views mechanism supports logical data independence
Disk
Students, Courses, Enrollments, Faculties
YoungActiveStudents, Faculty-info
Physical
Logical
External
End users … Web apps
Winter 2003 4
V iew s
- External schema is created by defining views over the
logical schema
- Of course, sometimes the logical schema serves well as
the external schema but sometimes we need to define new
“relations” over existing ones
- A view is a relation, defined in terms of other relations, but
its tuples are not stored.
- Virtual views versus Materialized views
Winter 2003 7
R el ati on al M odel : Sum m ary
- A tabular representation of data.
- Simple and intuitive, currently the most widely used.
- Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations. - Two important ICs: primary and foreign keys - In addition, we always have domain constraints.
- Rules to translate ER to relational model
- How do we query relational data?
Winter 2003 8
Q ueryi n g w ith SQ L
- SQL is the most popular querying language for relational
DBMS
SELECT [DISTINCT] c 1 , c 2 , …, cm
FROM R 1 , R 2 , …, Rn
[WHERE condition ]
- What is the semantics (or meaning) of such a query?
Attribute names
Relation names
Winter 2003 9
E x am p le 1 – a si m p ly q uery
SELECT age FROM Students
- For every tuple in Students, emit only the age component.
2346 Jane 19
8992 John 21
2187 Ann 20
1124 John 21
Sid Name Age
Age
Students
equivalent to:
SELECT age FROM Students WHERE TRUE
Note the multiset semantics
Winter 2003 10
E x am p le 2 - D I STI N C T
SELECT DISTINCT age FROM Students
- For every tuple in Students, emit only the age component. Take the set of resulting values (i.e., remove duplicates)
2346 Jane 19
8992 John 21
2187 Ann 20
1124 John 21
Sid Name Age
Age Students
Winter 2003
Example 5 – M u lt i ple R elat i o n s in F R O M
c lau s e
- What does the following query compute?
SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid=E.sid AND E.grade=“A”
S.name E.cid
Smith Topology
sid cid grade
53831 Carnatic101 C
53831 Reggae203 B
53650 Topology112 A
53666 History105 B
53831 John 21
21870 Ann 20
53630 Smith 21
Sid Name Age
Students
Enrolled
S E
S
E
Winter 2003 14
C on di ti on
- The WHERE clause consists of a boolean combination
of conditions using logical connectives AND, OR, NOT.
- Each condition is of the form
expression op expression
- expression is a column name, a constant, or an
arithmetic or string expression
- op is a comparsion operator ( <, , =, <>, ≥, > )
- Examples:
- age > shoesize,
- name <> “Jane” AND salary*0.1 > 100
Winter 2003 15
M ean i n g of an SQ L q uery
SELECT [DISTINCT] c 1 , c 2 , …, cm FROM R 1 , R 2 , …, Rn [WHERE condition ]
- For every n -ary tuple t (one from R 1 , one from R 2 , …, one from Rn),
- if t satisfies condition ( i.e., condition evaluates to true), then emit the c 1 , c 2 , …, cm components of t.
- Let Result denote the collection of all emitted results
- If DISTINCT is stated in the SELECT clause, remove duplicates in Result (i.e., Result is a set of tuples. Otherwise, Result is a bag of tuples)
- Note that the order of emitted results is not important
- Try out on your own on PostgreSQL!
UNION, DIFFERENCE
Winter 2003 16
M ore on SQ L
- Many more features in SQL! (look at the size of the manual)
- Before we leap …
- what is the foundation/properties of SQL?
- Recall that a data model consists of two parts
- Formalism for describing data
- Set of operations used to manipulate data
- Relational algebra, relational calculus are two other different ways of describing operations in the relational model
- relational algebra and relational calculus are terse and formal.
- Without the verboseness of SQL and good for reasoning
- Form the basis of SQL
Winter 2003 19
Sel ecti on - σcondition( R )
- Unary operation
- Input: R(A 1 , …, An)
- Output relation has attributes A 1 , …, A n
- Note:
- named field notation: we have assumed that the attributes of input relation are known and “passed along” to the output. Easier to understand
- Strictly speaking, the attributes are not known and we use position numbers instead to refer to the fields. (both are used in SQL)
- Meaning: Takes a relation R and extracts only rows from R that satisfy the condition.
- Output is always a set (no duplicates)
Winter 2003 20
E x am p le - Sel ect
- σrating > 6 (Hotels)
- Positional notation: σ$3>6 (Hotel)
ELodge 2nd E st 6.0 40
ELogde 39 W st 5.6 45
BestInn 45 th^ st 6.7 28
Astoria 5 th^ ave 8.0 231
Windsor 54 th^ ave 6.0 135
name address rating capacity
BestInn 45 th^ st 6.7 28
Astoria 5 th^ ave 8.0 231
name address rating capacity
Hotels
Winter 2003 21
E x am p le - Sel ect
- σrating > 6 AND capacity > 50 (Hotel)
ELodge 2nd E st 6.0 40
ELogde 39 W st 5.6 45
BestInn 45 th^ st 6.7 28
Astoria 5 th^ ave 8.0 231
Windsor 54 th^ ave 6.0 135
name address rating capacity
Astoria 5 th^ ave 8.0 231
name address rating capacity
σ (R )? P r o v e
o r g iv e a c o u n t e r -
e x a m p le
Winter 2003 22
P roj ecti on - πa ttr ib u te lis t ( R )
- Unary operation
- Input : R(A 1 , … , An)
- Output relation has attributes according to attribute
list
- Meaning: Emits only the attributes stated in attribute
list of every tuple in relation R.
Winter 2003 25
Set U n ion – R ∪ S
- Binary operation
- Input: R and S must be union-compatible
- They have the same set of arity, same number of columns
- The ith column of R has the same type as the ith column of S for every column i.
- Note that field names are not used in defining union- compatibility though we can also think that R and S is union- compatible if they having the same type (a set of record type).
- Output has the same type as R
- Meaning: the output consists of the set of all tuples in R and S
Winter 2003 26
E x am p le - Set U n i on
- Dell_Desktops IBM_Desktops
20G 750Mhz Linux
30G 1.0Ghz Windows
20G 500Mhz Windows
Harddisk Speed OS
30G 1.2Ghz Windows 20G 500Mhz Windows
Harddisk Speed OS
30G 1.2Ghz Windows
20G 500Mhz Windows 30G 1.0Ghz Windows 20G 750Mhz Linux
Harddisk Speed OS IBM_desktops
Dell_desktops
All tuples in R occurs in R S
All tuples in S occurs in R S
R S contains tuples that either
occur in R or S (or both).
Winter 2003 27
Example - Set U n i o n
- Dell_Desktops IBM_Desktops
20G 750Mhz Linux
30G 1.0Ghz Windows
20G 500Mhz Windows
Harddisk Speed OS
30G 1.2Ghz Windows 20G 500Mhz Windows
Harddisk Speed OS 30G 1.2Ghz Windows
20G 500Mhz Windows 30G 1.0Ghz Windows 20G 750Mhz Linux
Harddisk Speed OS
IBM_desktops
Dell_desktops
R S = S R (communtativity)
R (S T) = (R S) T (associativity)
Winter 2003 28
Set D if f er en c e – R - S
- Binary operation
- Input: R and S must be union-compatible
- Output has the same type as R
- Meaning: output consists of all tuples in R and not in S
Winter 2003 31
J o in o r C r o s s P r o d u c t – R x S
- Join is a generic term for a variety of operations that
connect two relations that may not be union-compatible.
- basic operation is the product, R x S
- Binary operation
- Input: R(A 1 , …, Am), S(B 1 , …, Bn)
- Output is a relation with columns A 1 , …, Am, B 1 , …,
Bn
- What happens if R and S contain common
attributes?
- Meaning: concatenates every tuple in R with every
tuple in S.
Winter 2003 32
Example - C r o s s P r o d u c t
a 1 b 1 c 1 a 2 b 2 c 2
A B C
d 2 e 2 d 3 e 3
d 1 e 1
D E
R S
a 2 b 2 c 2 d 3 e 3
a 2 b 2 c 2 d 2 e 2
a 2 b 2 c 2 d 1 e 1
a 1 b 1 c 1 d 2 e 2
a 1 b 1 c 1 d 1 e 1
a 1 b 1 c 1 d 3 e 3
A B C D E
R x S
Is it commutative? Is it associative? Is it distributive? I.e,
Is Rx(S T) = (RxS) (RxT)
Winter 2003 33
Example - C r o s s P r o d u c t
a 2 b 2 c 2 d 3 e 3
a 2 b 2 c 2 d 2 e 2
a 2 b 2 c 2 d 1 e 1
a 1 b 1 c 1 d 2 e 2
a 1 b 1 c 1 d 1 e 1
a 1 b 1 c 1 d 3 e 3
A.1 B C A.2 E
- What happens if R and S contain common attributes?
e.g., R(A,B,C) and S(A,E)
- Answer varies. We may assume positional suffixes
exists to distinguish among conflicts
Winter 2003 34
M o r e c o mplex q u er i es
- Relational queries can be composed to form more
complex queries
- Enrollments(sid, cid, score), Courses(cid, cname,
instructor-name)
- σscore>80 ( πsid, grade, instructor-name (
σEnrollments.cid = Courses.cid (Enrollments x Courses) ))
- We can now find out the student and instructor pairs where the student scored well (more than 80 pts) in a course taught by the instructor
Winter 2003 37
A n alt er n at i v e plan
- Find the student and course names where the student scored well (more than 80 pts) in the course
Enrollments (^) Courses
x
πsname, cname
Students
σEnrollments.sid = Students.sid AND Enrollments.cid = Courses.cid AND grade>
x
Which is a better execution plan?
Winter 2003 38
R en ami n g - ρx ( E)
- Renaming is used to give a name to the result of a relational algebra expression.
- Suppose the schema is R(A, B, C), we write ρB D(R) to change the schema to R(A, D, C)
- Suppose R(A,B,C) and S(D,A) in the previous example. We have naming conflict on attribute A in RxS. We can write ρ 1 G (RxS) to correct the ambiguity. Therefore the result has schema RxS(G,B,C,D,A)