Homework 2 with 4 Problems - Advanced Data Management | CS 511, Assignments of Deductive Database Systems

Material Type: Assignment; Class: Advanced Data Management; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2007;

Typology: Assignments

Pre 2010

Uploaded on 03/11/2009

koofers-user-wpq-1
koofers-user-wpq-1 🇺🇸

4

(1)

10 documents

1 / 2

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
CS 511 Design of Database Management System
Homework 2
Due: 2:00 PM CST on September 26, 2007
NOTE:Please submit a hard copy of your homework. Bring it to the lecture table at the
beginning of the lecture on 26th September 2007. The hard copy should be as clearly
readable as possible. You may be subtracted points for unreadability and ugly
presentation.
I2CS Students: You should email your solutions to the TA([email protected]) in the
pdf or the word format. Please send your email with subject CS511-HW2” and solution
file attached by 2PM UIUC time (CST).
Problem 1: GiST [25 Pts]
This problem asks you to build a GiST search tree for the date data type. Each date object
can be either a one-day or a period. For simplicity, we assume all dates are in the year of
2007. The following are some example date object.
d1 = 01/02 // a one-day example
d2 = (01/20, 02/25) // a period example
d3= (01/01, 01/10)
d4= (01/01, 02/10)
We want to use GiST to build a search tree for supporting the following query predicate,
in which x and y are two date objects. That is, given a predicate below, we want to search
the index tree, for some given y, to find all x that satisfy the predicate.
Overlap(x,y): True if x and y overlap, and False otherwise; e.g., Overlap(d1,d2) =
True, Overlap(d1,d3) = False
Before(x,y): True if x ends before y starts, and False otherwise; e.g.,
Before(d3,d2) = True, Before(d1,d2) = False.
Describe how Consistent can be implemented for each of the above query predicates.
Problem 2: Query Optimization [25 Pts]
Consider the following relational schema and SQL query:
Suppliers(sid,sname,address,category)
Supply(sid,pid)
Parts(pid,pname,brand)
SELECT S.sname, P.pname
pf2

Partial preview of the text

Download Homework 2 with 4 Problems - Advanced Data Management | CS 511 and more Assignments Deductive Database Systems in PDF only on Docsity!

CS 511 Design of Database Management System

Homework 2

Due: 2:00 PM CST on September 26, 2007

NOTE : Please submit a hard copy of your homework. Bring it to the lecture table at the beginning of the lecture on 26th September 2007. The hard copy should be as clearly readable as possible. You may be subtracted points for unreadability and ugly presentation.

I2CS Students : You should email your solutions to the TA([email protected]) in the pdf or the word format. Please send your email with subject “ CS511-HW2 ” and solution file attached by 2PM UIUC time (CST).

Problem 1: GiST [25 Pts]

This problem asks you to build a GiST search tree for the date data type. Each date object can be either a one-day or a period. For simplicity, we assume all dates are in the year of

  1. The following are some example date object.

d1 = 01/02 // a one-day example d2 = (01/20, 02/25) // a period example d3= (01/01, 01/10) d4= (01/01, 02/10)

We want to use GiST to build a search tree for supporting the following query predicate, in which x and y are two date objects. That is, given a predicate below, we want to search the index tree, for some given y , to find all x that satisfy the predicate.

Overlap(x,y) : True if x and y overlap, and False otherwise; e.g., Overlap(d1,d2) = True, Overlap(d1,d3) = FalseBefore(x,y): True if x ends before y starts, and False otherwise; e.g., Before(d3,d2) = True, Before(d1,d2) = False.

Describe how Consistent can be implemented for each of the above query predicates.

Problem 2: Query Optimization [25 Pts]

Consider the following relational schema and SQL query: Suppliers(sid,sname,address,category) Supply(sid,pid) Parts(pid,pname,brand)

SELECT S.sname, P.pname

FROM Suppliers S, Parts P, Supply Y WHERE sS.sid = Y.sid AND Y.pid = P.pid

(a) Enumerate all joins orderings that System R considers. Assume that the optimizer follows the heuristic of never considering orderings that require the computation of cross-products. (b) Suppose relation Supply has a hash index on attribute sid. What plan would you choose? Explain why and how this index might be helpful. (c) Give two examples orderings that System R will not consider. How can you change the optimizer to consider such ordering? If not possible, explain why not.

Problem 3: Query Optimization [15 Pts]

Consider relations r(A,B) and s(B,C). Assume that r contains 2,000 tuples, and that s contains 5,000 tuples. We want to compute v=r (^)  (^) r.B=s.B s

(a) Without any further assumptions, what is the maximum number of tuples that v may contain? (b) Now assume that we know that V(B,r)=500 (That is, in r the attribute B takes on 500 different values). What is now a reasonable estimate on the size of v? (c) Finally, assume we know that s satisfies the functional dependency B C. What is now a reasonable estimate on the size of v?

Problem 4: Query Execution [35 Pts]

There are other types of join in addition to equi-join where the predicate is < (smaller than) or > (greater than). For instance, consider a query like “ For each manager, show employees whose salaries are more than the manager’s”. Database management system has to join table Manager with table Employee on Salary attribute where join predicate is < (less than). (a) Explain how the hybrid-hash join algorithm could be extended to implement join with above predicates. (b) Compare the memory usage in the extended algorithm of part (a) with the case of equi-join (c) Compare the extended algorithm of part (a) with sort-merge algorithm for new join operators in terms of speed and memory usage.