Introduction to Database Systems-Home Work 04-Computer Science, Study notes of Introduction to Database Management Systems

This course is intended to give students a solid background in databases, with a focus on relational database management systems. Topics include data modeling, database design theory, data definition and manipulation languages, storage and indexing techniques, query processing and optimization, concurrency control and recovery, and database programming interfaces. In addition to these traditional topics, this course covers a sample of emerging topics such as XML and Web data management as well a

Typology: Study notes

2011/2012

Uploaded on 01/29/2012

arold
arold 🇺🇸

4.7

(24)

372 documents

1 / 3

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
CPS 116 Fall 2004
Homework #4 (8.75% of course grade: 100 points)
Assigned: Thursday, November 18
Due: Thursday, December 2
Problem 1 (20 points).
For each of the following modifications, show the result B+-tree obtained by applying the
modification to the B+-tree shown below. Suppose that the maximum fan-out is 4. (Always
start with the B+-tree shown below; do not apply the modifications to the result of previous
modifications.)
50
20
60 70 8 0
5 10
20 45
50 51 5 5
60 65
70 71 7 5
80 85
(a) Insert 21.
(b) Delete 50.
(c) Insert 79.
(d) Delete 10.
Problem 2 (12 points).
A table R(K, A, …) with 100,000 rows is stored in 10,000 disk blocks. The rows are sorted by
K, but not by A. There is a dense, secondary B+-tree index on R(A), which has 3 levels and
500 leaves.
Suppose we want to sort R by A. We have 101 memory blocks at our disposal. Method 1
performs an external-memory merge sort using all memory available. Method 2 takes
advantage of the fact that the values of A are already sorted in the B+-tree index on R(A): It
simply scans the leaves of the index to retrieve and output R rows in order.
How many disk I/O’s do these two methods require? Which one is the winner?
pf3

Partial preview of the text

Download Introduction to Database Systems-Home Work 04-Computer Science and more Study notes Introduction to Database Management Systems in PDF only on Docsity!

CPS 116 Fall 2004 Homework #4 (8.75% of course grade: 100 points) Assigned: Thursday, November 18 Due: Thursday, December 2

Problem 1 (20 points).

For each of the following modifications, show the result B +-tree obtained by applying the modification to the B +-tree shown below. Suppose that the maximum fan-out is 4. (Always start with the B+-tree shown below; do not apply the modifications to the result of previous modifications.)

50

20 60 70 8 0

5 10 20 45 50 51 5 5 60 65 70 71 7 5 80 85

(a) Insert 21. (b) Delete 50. (c) Insert 79. (d) Delete 10.

Problem 2 (12 points).

A table R ( K , A , …) with 100,000 rows is stored in 10,000 disk blocks. The rows are sorted by K , but not by A. There is a dense, secondary B+-tree index on R ( A ), which has 3 levels and 500 leaves.

Suppose we want to sort R by A. We have 101 memory blocks at our disposal. Method 1 performs an external-memory merge sort using all memory available. Method 2 takes advantage of the fact that the values of A are already sorted in the B +-tree index on R ( A ): It simply scans the leaves of the index to retrieve and output R rows in order.

How many disk I/O’s do these two methods require? Which one is the winner?

Problem 3 (35 points).

Consider tables R ( A , B , C ), S ( C , D ), and T ( D , E ). Transform the following query into an equivalent query that:

  • Contains no cross products;
  • Performs projections and selections as early as possible.

(a) π R. B , S. D , T. E σ( R. A =10) and ( R. C = S. C ) and ( S. D = T. D ) and ( R. A > T. E ) ( R × S × T )

Suppose we have the following statistics:

  • | R | = 1,000; | π A R | = 1,000; | π B R | = 100; | π C R | = 500;
  • | S | = 5,000; | π C S | = 300; | π D S | = 10;
  • | T | = 4,000; | π D T | = 4,000; | π E T | = 1,500.

Estimate the number of the tuples returned by the following queries:

(b) σ A =10 R (c) σ A =10 and B = “Bart” R (d) σ A =10 or B = “Bart” R (e) R  S (f) R  S  T

For the following question, further suppose that:

  • Each disk/memory block can hold up to 10 tuples;
  • All tables are stored compactly on disk (10 tuples per block) in no particular order;
  • No indexes are available;
  • 11 memory blocks are available for query processing.

(g) What is the best execution plan (in terms of number of I/O’s performed) you can come up with for the query σ R. B = “Bart” and S. D = 100 ( R  S )? Describe your plan and show the calculation of its I/O cost.

Problem 4 (15 points).

For each schedule below, tell whether it is conflict-serializable. If yes, also tell:

  • Whether it is recoverable;
  • Whether it avoids cascading rollbacks;
  • Whether it is possible under strict 2PL.

(a) T 1 .write( B ), T 2 .read( A ), T 2 .write( A ), T 1 .read( A ), T 1 .write( A ), T 1 .commit, T 2 .commit (b) T 1 .write( B ), T 2 .read( A ), T 2 .write( A ), T 1 .read( A ), T 1 .write( A ), T 2 .commit, T 1 .commit (c) T 1 .write( B ), T 2 .read( A ), T 2 .write( A ), T 2 .commit, T 1 .read( A ), T 1 .write( A ), T 1 .commit (d) T 1 .write( B ), T 2 .read( A ), T 1 .read( A ), T 2 .write( A ), T 1 .write( A ), T 2 .commit, T 1 .commit (e) T 2 .write( B ), T 2 .read( A ), T 2 .write( A ), T 1 .write( B ), T 2 .commit, T 1 .read( A ), T 1 .commit