Relational Algebra-Databases-Lecture 12 Slides-Computer Science, Slides of Database Management Systems (DBMS)

Relational Algebra, SQL, Duplicate Rows, Multiset Semantics, Bag Operations, Bag Union, Bag Intersection, Bag Difference, Laws for Bags, Extended Relational Algebra, Duplicate-elimination Operator, Sorting, Extended Projection, Arithmetic, Aggregation, Grouping, Group By in SQL, Semantics, Null Values, Entity Integrity Constraint, Full Outer Join, Outer Joins, Views, Querying Views

Typology: Slides

2011/2012

Uploaded on 02/12/2012

tiuw
tiuw 🇺🇸

4.7

(18)

286 documents

1 / 28

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Lecture 12:
Further relational algebra,
further SQL
www.cl.cam.ac.uk/Teaching/current/Databases/
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c

Partial preview of the text

Download Relational Algebra-Databases-Lecture 12 Slides-Computer Science and more Slides Database Management Systems (DBMS) in PDF only on Docsity!

Lecture 12: Further relational algebra, further SQL

www.cl.cam.ac.uk/Teaching/current/Databases/

Today’s lecture

  • Where does SQL differ from relational model?
  • What are some other features of SQL?
  • How can we extend the relational algebra to match more closely SQL?

Multiset semantics

  • A relation in SQL is really a multiset or bag , rather than a set as in the relational model - A multiset has no order (unlike a list), but allows duplicates - E.g. {1,2,1,3} is a bag - select, project and join work for bags as well as sets - Just work on a tuple-by-tuple basis

Bag operations

  • Bag union :
    • Sum the number of times that an element appears in the two bags, e.g. - {1,2,1} {1,2,3} = {1,1,1,2,2,3}
  • Bag intersection :
    • Take the minimum of the number of occurrences in each bag, e.g. - {1,2,1} {1,2,3,3} = {1,2}
  • Bag difference :
    • Proper-subtract the number of occurrences in the two bags, e.g. - {1,2,1}-{1,2,3,3} = {1}

Extended relational

algebra

Add features needed for SQL

  1. Bag semantics
  2. Duplicate elimination operator,
  3. Sorting operator,
  4. Grouping and aggregation operator,
  5. Outerjoin operators, oV, Vo, oVo

Duplicate-elimination

operator

  • (R) = relation R with any duplicated tuples removed
  • R= (R)=
  • This is used to model the DISTINCT feature of SQL

A B 1 2 3 4 1 2

A B 1 2 3 4

Extended projection

  • SQL allows us to use arithmetic operators SELECT age* FROM Sailors;
  • We extend the projection operator to allow the columns in the projection to be functions of one or more columns in the argument relation, e.g.
  • R= (^) A+B,A,A(R)=

A B 1 2 3 4

A+B A.1 A. 3 1 1 7 3 3

Arithmetic

  • Arithmetic (and other expressions) can not be used at the top level - i.e. 2+2 is not a valid SQL query
  • How would you get SQL to compute 2+2?

Grouping

  • These aggregation operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples, e.g. - For each rating, find the average age of the sailors - For each rating, find the age of the youngest sailor

GROUP BY in SQL

SELECT [DISTINCT] target-list FROM relation-list WHERE qualification GROUP BY grouping-list ;

  • The target-list contains
    1. List of column names
    2. Aggregate terms
    • NOTE: The variables in target-list must be contained in grouping-list

Grouping and aggregation

  • (^) L(R) where L is a list of elements that are either - Individual column names (“Grouping attributes”), or - Of the form (A), where is an aggregation operator (MIN, SUM, …) and A is the column it is applied to
  • For example,

rating,AVG(age)(Sailors)

Semantics

  • Group R according to the grouping attributes
  • Within each group, compute (A)
  • Result is the relation consisting of one tuple for each group. The components of that tuple are the values associated with each element of L for that group

19

Example cont.

  1. Group according to the grouping attribute, beer:
  2. Compute average of price within groups:

bar beer price Anchor 6X 2. Mill 6X 2. Anchor Adnam‟s 2. Mill Fosters 2. Eagle Fosters 2. beer price 6X 2. Adnam‟s 2. Fosters 2.

NULL values

  • Sometimes field values are unknown (e.g. rating not known yet), or inapplicable (e.g. no spouse name)
  • SQL provides a special value, NULL , for both these situations
  • This complicates several issues
    • Special operators needed to check for NULL
    • Is NULL>8? Is (NULL OR TRUE)=TRUE?
    • We need a three-valued logic
    • Need to carefully re-define semantics