lec7-sql-wrapup-and-RA-small.pdf, Summaries of Algebra

A subquery is a SQL query nested inside a larger query. • Such inner-outer queries are called nested queries. • A subquery may occur in: – A SELECT clause.

Typology: Summaries

2022/2023

Uploaded on 03/01/2023

nicoline
nicoline 🇺🇸

4.6

(12)

271 documents

1 / 11

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Introduction to Database Systems
CSE 414
Lecture 7: SQL Wrap-up and
Relational Algebra
1CSE 414 -Autumn 2018
Announcements
Additional Office Hours and room
changes
Website calendar is up-to-date
Check email for Microsoft Azure invite
“Action required: Accept your Azure lab assignment”
CSE 414 -Autumn 2018 2
Subqueries
A subquery is a SQL query nested inside a larger query
Such inner -outer queries are cal led nested queries
A subquery may oc cur in:
A SELECT clause
Must return single value
A FROM clause
Can return multi-valued relation
A WHERE clause
Rule of thumb: avoid nest ed queries when possible
But sometimes it’s impossible, as we will see
CSE 414 -Autumn 2018 3
FWGHOS
Subqueries in FROM
Sometimes we need to compute an
intermediate table only to use it later in a
SELECT-FROM-WHERE
Option 1: use a subquery in the FROM
clause
Option 2: use the WITH clause
See textbook for details
CSE 414 -Autumn 2018 4
2. Subqueries in FROM
CSE 414 -Autumn 2018 5
SELECT X.pname
FROM (SELECT *
FROM Product AS Y
WHERE price > 20) as X
WHERE X.price < 500
Product (pname, price, cid)
Company (cid, cname, city)
=
WITH myTable AS (SELECT * FROM Product AS Y WHERE price > 20)
SELECT X.pname
FROM myTable as X
WHERE X.price < 500
A subquery whose
result we called myTable
Subqueries in WHERE
SELECT ……….. WHERE EXISTS (sub);
SELECT ……….. WHERE NOT EXISTS (sub);
SELECT ……….. WHERE attribute IN (sub);
SELECT ……….. WHERE attribute NOT IN (sub);
SELECT ……….. WHERE attribute >ANY (sub);
SELECT ……….. WHERE attribute >ALL (sub);
pf3
pf4
pf5
pf8
pf9
pfa

Partial preview of the text

Download lec7-sql-wrapup-and-RA-small.pdf and more Summaries Algebra in PDF only on Docsity!

Introduction to Database Systems

CSE 414

Lecture 7: SQL Wrap-up and

Relational Algebra

CSE 414 - Autumn 2018 1

Announcements

  • Additional Office Hours and room

changes

  • Website calendar is up-to-date
  • Check email for Microsoft Azure invite “Action required: Accept your Azure lab assignment” CSE 414 - Autumn 2018 2

Subqueries

  • A subquery is a SQL query nested inside a larger query
  • Such inner-outer queries are called nested queries
  • A subquery may occur in:
    • A SELECT clause
      • Must return single value
    • A FROM clause
      • Can return multi-valued relation
    • A WHERE clause
  • Rule of thumb: avoid nested queries when possible
    • But sometimes it’s impossible, as we will see CSE 414 - Autumn 2018 3

FWGHOS

Subqueries in FROM

Sometimes we need to compute an

intermediate table only to use it later in a

SELECT-FROM-WHERE

  • Option 1: use a subquery in the FROM

clause

  • Option 2: use the WITH clause
    • See textbook for details CSE 414 - Autumn 2018 4

2. Subqueries in FROM

CSE 414 - Autumn 2018 5 SELECT X.pname FROM (SELECT * FROM Product AS Y WHERE price > 20) as X WHERE X.price < 500 Product (pname, price, cid) Company (cid, cname, city)

WITH myTable AS (SELECT * FROM Product AS Y WHERE price > 20) SELECT X.pname FROM myTable as X WHERE X.price < 500 A subquery whose result we called myTable

Subqueries in WHERE

  • SELECT ……….. WHERE EXISTS (sub);
  • SELECT ……….. WHERE NOT EXISTS (sub);
  • SELECT ……….. WHERE attribute IN (sub);
  • SELECT ……….. WHERE attribute NOT IN (sub);
  • SELECT ……….. WHERE attribute > ANY (sub);
  • SELECT ……….. WHERE attribute > ALL (sub);

Monotone Queries

  • Definition A query Q is monotone if:
    • Whenever we add tuples to one or more input tables, the answer to the query will not lose any of the tuples CSE 414 - Autumn 2018 7 Product (pname, price, cid) Company (cid, cname, city) Monotone Queries
  • Theorem: If Q is a SELECT-FROM-WHERE query that does not have subqueries, and no aggregates, then it is monotone. CSE 414 - Autumn 2018 8 Monotone Queries
  • Theorem: If Q is a SELECT-FROM-WHERE query that does not have subqueries, and no aggregates, then it is monotone.
  • Proof. We use the nested loop semantics: if we insert a tuple in a relation Ri, this will not remove any tuples from the answer CSE 414 - Autumn 2018 9 SELECT a 1 , a 2 , …, ak FROM WHERE RConditions 1 AS x 1 , R 2 AS x 2 , …, Rn AS xn for x 1 in R 1 do for … x 2 in R 2 do for if x nConditions in Rn do output (a 1 ,…,ak) Monotone Queries
  • The query: is not monotone 10 Find all companies s.t. all their products have price < 200 Product (pname, price, cid) Company (cid, cname, city) CSE 414 - Autumn 2018 Monotone Queries
  • The query: is not monotone 11 Find all companies s.t. all their products have price < 200 pname price cid Gizmo 19.99 c cid cname city c001 Sunworks Bonn cname Sunworks Product (pname, price, cid) Company (cid, cname, city) CSE 414 - Autumn 2018 Monotone Queries
  • The query: is not monotone
  • Consequence: If a query is not monotonic, then we cannot write it as a SELECT-FROM-WHERE query without nested subqueries Find all companies s.t. all their products have price < 200 pname price cid Gizmo 19.99 c cid cname city c001 Sunworks Bonn cname Sunworks pname price cid Gizmo 19.99 c Gadget 999.99 c cid cname city c001 Sunworks Bonn cname Product (pname, price, cid) Company (cid, cname, city) CSE 414 - Autumn 2018 12

Finding Witnesses CSE 414 - Autumn 2018 19 To find the witnesses, compute the maximum price in a subquery (in FROM or in WITH) SELECT DISTINCT u.city, v.pname, v.price FROM Company u, Product v, (SELECT x.city, max(y.price) as maxprice FROM Company x, Product y WHERE x.cid = y.cid GROUP BY x.city) w WHERE u.cid = v.cid and u.city = w.city and v.price = w.maxprice; Product (pname, price, cid) Company (cid, cname, city) Finding Witnesses CSE 414 - Autumn 2018 20 There is a more concise solution here: SELECT u.city, v.pname, v.price FROM Company u, Product v, Company x, Product y WHERE u.cid = v.cid and u.city = x.city and x.cid = y.cid GROUP BY u.city, v.pname, v.price HAVING v.price = max(y.price) Product (pname, price, cid) Company (cid, cname, city) SQL: Our first language for the relational model

  • Projections
  • Selections
  • Joins (inner and outer)
  • Inserts, updates, and deletes
  • Aggregates
  • Grouping
  • Ordering
  • Nested queries CSE 414 - Autumn 2018 21 Relational Algebra CSE 414 - Autumn 2018 22 Relational Algebra
  • Set-at-a-time algebra, which

manipulates relations

  • In SQL we say what we want
  • In RA we can express how to get it
  • Every DBMS implementation converts a

SQL query to RA in order to execute it

  • An RA expression is called a query plan CSE 414 - Autumn 2018 23 Why study another relational query language?
  • RA is how SQL is implemented in

DBMS

  • We will see more of this in a few weeks
  • RA opens up opportunities for query

optimization

CSE 414 - Autumn 2018 24

Basics CSE 414 - Autumn 2018 25

  • Relations and attributes
  • Functions that are applied to relations
    • Return relations R2 = σ (R1)
    • Can be composed together R3 = p (σ (R1))
    • Often displayed using a tree rather than linearly
    • Use Greek symbols: σ, p, δ, etc Sets v.s. Bags
  • Sets: {a,b,c}, {a,d,e,f}, { },...
  • Bags: {a, a, b, c}, {b, b, b, b, b},...

Relational Algebra has two flavors:

  • Set semantics = standard Relational Algebra
  • Bag semantics = extended Relational Algebra

DB systems implement bag semantics (Why?)

CSE 414 - Autumn 2018 26 Relational Algebra Operators

  • Union ∪, intersection ∩, difference -
  • Selection σ
  • Projection π
  • Cartesian product X, join ⨝
  • (Rename ρ)
  • Duplicate elimination δ
  • Grouping and aggregation ɣ
  • Sorting # CSE 414 - Autumn 2018 27
RA

Extended RA

All operators take in 1 or more relations as inputs

and return another relation

Union and Difference CSE 414 - Autumn 2018 28

What do they mean over bags?

R1 ∪ R

R1 – R

Only make sense if R1, R2 have the same schema What about Intersection?

  • Derived operator using minus
  • Derived using join CSE 414 - Autumn 2018 29

R1 ∩ R2 = R1 – (R1 – R2)

R1 ∩ R2 = R1 ⨝ R

Selection

  • Returns all tuples which satisfy a condition
  • Examples
    • σSalary > 40000 (Employee)
    • σname = “Smith” (Employee)
  • The condition c can be =, <, <=, >, >=, <>

combined with AND, OR, NOT

CSE 414 - Autumn 2018 30 σc(R)

Renaming

  • Changes the schema, not the instance
  • Example:
    • Given Employee(Name, SSN)
    • ρN, S(Employee) à Answer(N, S) CSE 414 - Autumn 2018 37 ρB1,…,Bn (R) Natural Join
  • Meaning: R1⨝ R2 = PA(sq (R1  R2))
  • Where:
  • Selection sq checks equality of all common

attributes (i.e., attributes with same names)

  • Projection PA eliminates duplicate common

attributes CSE 414 - Autumn 2018 38

R1 ⨝ R Natural Join Example CSE 414 - Autumn 2018 39 A B X Y X Z Y Z Z V B C Z U V W Z V A B C X Z U X Z V Y Z U Y Z V Z V W

R S
R ⨝ S =

PABC(sR.B=S.B(R  S)) Natural Join Example 2 CSE 414 - Autumn 2018 40 age zip disease 54 98125 heart 20 98120 flu AnonPatient P Voters V P V name age zip Alice 54 98125 Bob 20 98120 age zip disease name 54 98125 heart Alice 20 98120 flu Bob Natural Join

  • Given schemas R(A, B, C, D), S(A, C, E),

what is the schema of R ⨝ S?

  • Given R(A, B, C), S(D, E), what is R ⨝ S?
  • Given R(A, B), S(A, B), what is R ⨝ S? CSE 414 - Autumn 2018 41 Theta Join
  • A join that involves a predicate
  • Here q can be any condition
  • No projection in this case!
  • For our voters/patients example: 42 R1 ⨝q R2 = sq (R1 X R2)

P ⨝ P.zip = V.zip and P.age >= V.age - 1 and P.age <= V.age +1 V

AnonPatient (age, zip, disease) Voters (name, age, zip) CSE 414 - Autumn 2018

Equijoin

  • A theta join where q is an equality predicate
  • By far the most used variant of join in practice
  • What is the relationship with natural join? CSE 414 - Autumn 2018 43 R1 ⨝q R2 = sq (R1  R2) Equijoin Example CSE 414 - Autumn 2018 44 age zip disease 54 98125 heart 20 98120 flu AnonPatient P Voters V P (^) P.age=V.age V name age zip p1 54 98125 p2 20 98120 P.age P.zip P.disease V.name V.age V.zip 54 98125 heart p1 54 98125 20 98120 flu p2 20 98120 Join Summary
  • Theta-join : R ⨝q S = σq (R × S)
  • Join of R and S with a join condition θ
  • Cross-product followed by selection θ
  • No projection
  • Equijoin : R ⨝θ S = σθ (R × S)
  • Join condition θ consists only of equalities
  • No projection
  • Natural join : R ⨝ S = πA (σθ (R × S))
  • Equality on all fields with same name in R and in S
  • Projection πA drops all redundant attributes CSE 414 - Autumn 2018 45 So Which Join Is It?

When we write R ⨝ S we usually mean an

equijoin, but we often omit the equality

predicate when it is clear from the context

CSE 414 - Autumn 2018 46 More Joins

  • Outer join
    • Include tuples with no matches in the output
    • Use NULL values for missing attributes
    • Does not eliminate duplicate columns
  • Variants
    • Left outer join
    • Right outer join
    • Full outer join CSE 414 - Autumn 2018 47 Outer Join Example CSE 414 - Autumn 2018 48 age zip disease 54 98125 heart 20 98120 flu 33 98120 lung AnonPatient P P ⋊ J P.age P.zip P.diseas e J.job J.age J.zip 54 98125 heart lawyer 54 98125 20 98120 flu cashier 20 98120 33 98120 lung null null null AnnonJob J job age zip lawyer 54 98125 cashier 20 98120

Using Extended RA Operators CSE 414 - Autumn 2018 55 SELECT city, sum(quantity) FROM Sales GROUP BY city HAVING count() > 100 Sales(product, city, quantity) g city, sum(quantity)→q, count() → c s (^) c > 100 P (^) city, q Answer Typical Plan for a Query (1/2) CSE 414 - Autumn 2018 56

R S

join condition σselection condition πfields join condition …

SELECT-PROJECT-JOIN

Query Answer SELECT fields FROM R, S, … WHERE condition Typical Plan for a Query (1/2) 57 πfields ɣfields, sum/count/min/max(fields) σhaving condition σwhere condition join condition … … SELECT fields FROM R, S, … WHERE condition GROUP BY fields HAVING condition CSE 414 - Autumn 2018 How about Subqueries? CSE 414 - Autumn 2018 58 Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists (SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100) How about Subqueries? CSE 414 - Autumn 2018 59 Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists (SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100) σ sstate=‘WA’ Supplier π sno Option 1: create nested plans not exists σ price> Supplier SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists (SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100) How about Subqueries? CSE 414 - Autumn 2018 60 Correlation! Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price)

SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and not exists (SELECT * FROM Supply P WHERE P.sno = Q.sno and P.price > 100) How about Subqueries? CSE 414 - Autumn 2018 61 De-Correlation SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and Q.sno not in (SELECT P.sno FROM Supply P WHERE P.price > 100) Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’ and Q.sno not in (SELECT P.sno FROM Supply P WHERE P.price > 100) How about Subqueries? CSE 414 - Autumn 2018 62 (SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’) EXCEPT (SELECT P.sno FROM Supply P WHERE P.price > 100) EXCEPT = set difference Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) Un-nesting (SELECT Q.sno FROM Supplier Q WHERE Q.sstate = ‘WA’) EXCEPT (SELECT P.sno FROM Supply P WHERE P.price > 100) How about Subqueries? CSE 414 - Autumn 2018 63 Supply σ sstate=‘WA’ Supplier σ Price > 100

Finally… π sno^ π sno Supplier(sno,sname,scity,sstate) Part(pno,pname,psize,pcolor) Supply(sno,pno,price) Summary of RA and SQL

  • SQL = a declarative language where we

say what data we want to retrieve

  • RA = an algebra where we say how we

want to retrieve the data

  • Theorem : SQL and RA can express

exactly the same class of queries

RDBMS translate SQLCSE 414 - Autumn 2018à RA, then optimize RA 64

Summary of RA and SQL

  • SQL (and RA) cannot express ALL queries

that we could write in, say, Java

  • Example:
    • Parent(p,c): find all descendants of ‘Alice’
    • No RA query can compute this!
    • This is called a recursive query
  • Next lecture: Datalog is an extension that can

compute recursive queries

CSE 414 - Autumn 2018 65