

Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Assignment; Class: Advanced Data Management; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2007;
Typology: Assignments
1 / 2
This page cannot be seen from the preview
Don't miss anything!


NOTE : Please submit a hard copy of your homework. Bring it to the lecture table at the beginning of the lecture on 26th September 2007. The hard copy should be as clearly readable as possible. You may be subtracted points for unreadability and ugly presentation.
I2CS Students : You should email your solutions to the TA([email protected]) in the pdf or the word format. Please send your email with subject “ CS511-HW2 ” and solution file attached by 2PM UIUC time (CST).
Problem 1: GiST [25 Pts]
This problem asks you to build a GiST search tree for the date data type. Each date object can be either a one-day or a period. For simplicity, we assume all dates are in the year of
d1 = 01/02 // a one-day example d2 = (01/20, 02/25) // a period example d3= (01/01, 01/10) d4= (01/01, 02/10)
We want to use GiST to build a search tree for supporting the following query predicate, in which x and y are two date objects. That is, given a predicate below, we want to search the index tree, for some given y , to find all x that satisfy the predicate.
Overlap(x,y) : True if x and y overlap, and False otherwise; e.g., Overlap(d1,d2) = True, Overlap(d1,d3) = False Before(x,y): True if x ends before y starts, and False otherwise; e.g., Before(d3,d2) = True, Before(d1,d2) = False.
Describe how Consistent can be implemented for each of the above query predicates.
Problem 2: Query Optimization [25 Pts]
Consider the following relational schema and SQL query: Suppliers(sid,sname,address,category) Supply(sid,pid) Parts(pid,pname,brand)
SELECT S.sname, P.pname
FROM Suppliers S, Parts P, Supply Y WHERE sS.sid = Y.sid AND Y.pid = P.pid
(a) Enumerate all joins orderings that System R considers. Assume that the optimizer follows the heuristic of never considering orderings that require the computation of cross-products. (b) Suppose relation Supply has a hash index on attribute sid. What plan would you choose? Explain why and how this index might be helpful. (c) Give two examples orderings that System R will not consider. How can you change the optimizer to consider such ordering? If not possible, explain why not.
Problem 3: Query Optimization [15 Pts]
Consider relations r(A,B) and s(B,C). Assume that r contains 2,000 tuples, and that s contains 5,000 tuples. We want to compute v=r (^) (^) r.B=s.B s
(a) Without any further assumptions, what is the maximum number of tuples that v may contain? (b) Now assume that we know that V(B,r)=500 (That is, in r the attribute B takes on 500 different values). What is now a reasonable estimate on the size of v? (c) Finally, assume we know that s satisfies the functional dependency B C. What is now a reasonable estimate on the size of v?
Problem 4: Query Execution [35 Pts]
There are other types of join in addition to equi-join where the predicate is < (smaller than) or > (greater than). For instance, consider a query like “ For each manager, show employees whose salaries are more than the manager’s”. Database management system has to join table Manager with table Employee on Salary attribute where join predicate is < (less than). (a) Explain how the hybrid-hash join algorithm could be extended to implement join with above predicates. (b) Compare the memory usage in the extended algorithm of part (a) with the case of equi-join (c) Compare the extended algorithm of part (a) with sort-merge algorithm for new join operators in terms of speed and memory usage.