Assignment 4 for Database Systems | CS 411, Assignments of Deductive Database Systems

Material Type: Assignment; Class: Database Systems; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2007;

Typology: Assignments

Pre 2010

Uploaded on 03/11/2009

koofers-user-l5t
koofers-user-l5t 🇺🇸

5

(1)

10 documents

1 / 6

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
CS411 Database Systems
Fall 2007
HW#4
Due: 3:00pm CST, 12/05/07
Note: Print your name and NetID in the upper right corner of every page of your submission.
Handin your stapled homework to Donna Coleman in 2120 SC. In case Donna is not in office,
slide your homework under the door.
To grade homeworks faster, the homework is partitioned into two parts. Please, submit
each part separately. For each part, make sure to write down your name and NetID.
This homework is partitioned into two parts as follows:
Part 1: Problem 1 - Problem 3
Part 2: Problem 4 - Problem 6
There is a bonus credit of 2% if your homework is formatted correctly: Each part must be
separable so that it can be independently graded. If you submit a paper copy then each part
should i) be separately stapled and ii) include your netid.
Handwritten submissions will be graded but they will take longer to grade. For clarity,
machine formatted text is preferable: Expect to lose points if your handwritten answer is
unclear or misread by the grader.
pf3
pf4
pf5

Partial preview of the text

Download Assignment 4 for Database Systems | CS 411 and more Assignments Deductive Database Systems in PDF only on Docsity!

CS411 Database Systems

Fall 2007

HW#

Due: 3:00pm CST, 12/05/

Note: Print your name and NetID in the upper right corner of every page of your submission. Handin your stapled homework to Donna Coleman in 2120 SC. In case Donna is not in office, slide your homework under the door.

To grade homeworks faster, the homework is partitioned into two parts. Please, submit each part separately. For each part, make sure to write down your name and NetID.

This homework is partitioned into two parts as follows:

  • Part 1: Problem 1 - Problem 3
  • Part 2: Problem 4 - Problem 6

There is a bonus credit of 2% if your homework is formatted correctly: Each part must be separable so that it can be independently graded. If you submit a paper copy then each part should i) be separately stapled and ii) include your netid.

Handwritten submissions will be graded but they will take longer to grade. For clarity, machine formatted text is preferable: Expect to lose points if your handwritten answer is unclear or misread by the grader.

Part 1

Problem 1 Query Execution (20 points; 5 points for each)

Consider the join R ⊲⊳R.a=S.a S, giving the following information about the relations to be joined. The cost metric is the number of I/Os unless otherwise noted, and the cost of writing out the result should be ignored.

  • Relation R contains 10,000 tuples and has 10 tuples per block.
  • Relation S contains 2000 tuples and also has 10 tuples per block.
  • 51 buffer blocks are available.

a) What is the cost of joining R and S using a block-based nested loop join? (suppose relation S is used at the outer loop)

b) Can a two-pass merge-sort join be used to join R and S? What is the cost?

c) Can a a two-pass partitioned hash join be used to join R and S? What is the cost?

d) Assume S is clustered, and R has an index on a. If V(R,a)=200, what is the cost of a index based join if index is clustered? what is the cost of a index based join if index is unclustered.

Problem 3 Algebraic Laws (15 points; 5 points for each)

Exercise 16.2.4 (b,c and d)(Page 809)

Problem 4 Dynamic Programming 16 points, 8 for each question

Compute the optimal plan for R ⊲⊳ S ⊲⊳ T ⊲⊳ U using the dynamic programming algorithm, assuming the following: B(R) = 300, B(S) = 700, B(T) = 500, B(U) = 600

The size of a join is estimated as: B(A ⊲⊳ B) = 0. 01 ∗ B(A) ∗ B(B)

The cost of a join is estimated to be the cost of the subplans plus the size of the intermediate results (the same as the cost model we have in Lecture).

(a) Please draw the table for dynamic programming, to show how you compute the optimal plan for all possible join orders allowing all trees.

(b) Please draw the table for dynamic programming, to show how you compute the optimal plan for all possible join orders allowing left-deep trees only.

Problem 5 Cost Estimation 16 points, 2 points for each question

Consider two relations R(a,b,c,d) and S(d,e) with the following statistics: T(R) = 100; V(R,a) = 100; V(R,b) = 10; V(R,c)=1; V(R,d)= T(S)= 500; V(S,d)=30; V(S,e)=

(a) Estimate the number of tuples in σb=25(R)

(b) Estimate the number of tuples in σc=30(R)

(c) Estimate the number of tuples in σ(b=25)∧(c=30)(R)

(d) Estimate the number of tuples in σb> 25 (R)

(e) Estimate the number of tuples in σ(b>25)∧(b=11)(R)

(f) Estimate the number of tuples in σ(b=25)∨(d=13)(R)

(g) Estimate the number of tuples in R ⊲⊳ S

(h) Estimate the number of tuples in R ⊲⊳R.d=S.d AN D R.b=S.e S