



Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
The instructions and questions for Exam 2 of the Database Systems course, offered as CPSC 310 and CPSC 603 at Texas A&M University. The exam covers topics such as disk I/O, indexing, query execution, and query compilation. It includes True/False questions, as well as problems that require calculations and analysis.
Typology: Exams
1 / 7
This page cannot be seen from the preview
Don't miss anything!




CPSC 310: Database Systems / CSPC 603: Database Systems and Applications Exam 2 November 16, 2005
Name:
Instructions:
Please sign the academic integrity statement: “On my honor, as an Aggie, I have neither given nor received unauthorized aid on this academic work. In particular, I certify that I have not received or given any assistance that is contrary to the letter or the spirit of the guidelines for this exam.”
Signature:
References for these problems:
(a) (3 pts) On average, repeated random disk I/O’s are faster than repeated sequential disk I/O’s because random I/O’s tend to access different cylinders and therefore cause less contention.
(b) (3 pts) For any data file, it is possible to construct two separate sparse indexes on different keys.
(c) (3 pts) There is a benefit to constructing a two-level index that has a dense first level and a dense second level.
(d) (3 pts) RAID Level 5 (having each data disk serve as the spare disk for some blocks) allows the system to tolerate up to 5 disk crashes at the same time.
(e) (3 pts) Since there is no downside to having indexes, we may as well build an index on every attribute of every relation to speed up as many queries as possible.
(f) (3 pts) A drawback of hashing indexes is that they are slower than B-trees for queries of the form “attribute equals constant”.
(h) (3 pts) Say a randomly chosen block is being fetched from disk. The overall access delay per byte decreases as the block size increases.
execute the query than to perform a sequential scan on the file.
(j) (3 pts) Since natural join is associative and commutative, the same number of disk I/O’s are executed no matter what order a series of joins are done in.
(c) (6 pts) Which storage organization is better for queries in which all Company records need to be scanned in order of company id? Justify your answer.
Now consider indexing these relations. Suppose we want a primary index on Company.Number. Each index entry associates a 10 byte pointer with a key. The blocks available for use by the index hold 4096 bytes each, with 96 bytes reserved for header information.
(d) (7 pts) For the sequential organization described above, how many index blocks are needed for a sparse primary index? Justify your answer.
that they would be output by the join algorithm.
(b) (10 pts) Hash join. Assume there are two hash buckets, numbered 0 and 1, and that the hash function sends even values to bucket 0 and odd values to bucket 1. In the second phase, assume that bucket 0 is processed before bucket 1 and that the contents of a bucket are read in the same order as they were written.
(c) (9 pts) Consider the following logical query plan, which finds the names of all customers who ordered a book that was shipped after Jan 1, 2005.
Label each relational operator in the tree with the expected result size in terms of the number of tuples. Assume Containment of Value Sets and Preservation of Value Sets. Use the heuristics discussed in lecture, which are from the textbook.
Use these statistics: