






Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Material Type: Exam; Class: Database Systems; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2005;
Typology: Exams
1 / 11
This page cannot be seen from the preview
Don't miss anything!







Problem 1 (12 points) Misc. Concepts
For each of the following statements, indicate whether it is TRUE or FALSE by circling your choice. You will get 1 point for each correct answer, -0.5 point for each incorrect answer, and 0 point for each answer left blank.
(1) T rue F alse Transaction management was one of the concepts Ted Codd created in his seminal work of defining the relational model.
(2) T rue F alse A relation R in 3NF is also in 4NF.
(3) T rue F alse The second and higher levels must be sparse in an index of multiple levels on sequential files.
(4) T rue F alse Hash index is efficient in answering range queries, such as “finding products with price higher than 100”.
(5) T rue F alse The order of insertions into a B+ tree will affect the tree’s final structure at the end.
(6) T rue F alse The time needed to access a page on disk is comprised of two components: disk seek time and data transfer time.
(7) T rue F alse If possible, we should always use a bushy join tree instead of a left-deep join tree, because the former is more efficient than the latter.
(8) T rue F alse In SQL, without GROUP BY, we cannot use HAVING.
(9) T rue F alse We can optimize query plans by pushing a selection down an expression tree, but not by moving a selection up the tree.
(10) T rue F alse Query optimization is a major concept that enables declarative SQL queries, because it au- tomatically generates query plans, without having users to write procedural queries.
(11) T rue F alse For buffer management, a policy like LRU may be inefficient for transaction processing in RDBMS, but it will not affect correctness.
(12) T rue F alse Pointer swizzling refers to the techniques that convert disk pointers to memory addresses.
Problem 3 (10 points) Schema Decomposition [HW ]
Consider a relation R with five attributes ABCDE. The following dependencies are given: A → B, BC → E, ED → A.
(a) List all keys for R. (3 points)
(b) Is R in 3NF? Briefly explain why. (3 points)
(c) Is R in BCNF? If yes, please explain why. Otherwise, decompose R into relations that are in BCNF. (4 points)
Problem 4 (10 points) Query Languages [HW ]
The following questions refer to the database schema below: Product(pid, price), Order (oid, cid, pid, quantity), Customer (cid, name, age).
(a) Write a query, in relational algebra, to return the names of senior customers (older than 60). (2 points)
(b) Write a query, in relational algebra, to return the names of customers who didn’t order any product. (3 points)
(c) Write a SQL query, to list each product along with its total ordered quantities, in descending order of the amount. (5 points)
Problem 6 (10 points) Query Processing [HW ]
Consider joining two relations R(x,y) and S (x,z) on their common attribute x. The size of relation R is 150 blocks and the size of relation S is 100 blocks. Attribute x has 50 different values and is evenly distributed in both R and S. Suppose that both relations are not sorted by attribute x.
(a) Suppose the memory buffer has 15 blocks, compute the cost of join using a block-nested loop join. (3 points)
(b) Suppose the memory buffer has 15 blocks, compute the cost of join using a sort-merge join. (3 points)
(c) Can you estimate the size of the output relation by joining R and S on x. If yes, estimate it. If no, explain why. (4 points)
Problem 7 (15 points) Query Optimization
Consider the following query that joins Student(sid, sname, sdept), Enrollment(sid, cid), Courses(cid, ctitle, iid), Instructor (iid, iname, iaddr).
select sname, ctitle, iname from Student S, Enrollment E, Course C, Instructor I where join-conditions AND selection-conditions
In the where-clause, we have the following conditions:
(a) Use this example, explain why join ordering is important for minimizing the cost of a query plan. Give an example scenario to support your explanation. (5 points)
(b) Consider dynamic programming for generating the optimal join order. Without actually working through the whole process, calculate how many subqueries each iteration needs to consider. (Note: we are asking for logical queries, NOT physical plans) (5 points)
Problem 8 (15 points) Failure Recovery [HW ]
Consider the following logs. Note that, in this log sequence, we also show various Output actions (at the time points when the actions are carried out).
Log ID Log 1 〈START T 1 〉
2 〈T 1, A, 10〉
3 〈START T 2 〉
4 〈T 1, B, 10〉
5 〈COMMIT T 1 〉
6 〈T 2, B, 10〉
7 〈COMMIT T 2 〉
// Output(A); Output(B);
8 〈START T 3 〉
9 〈T 3, A, 10〉
10 〈START T 4 〉
11 〈T 3, B, 20〉
12 〈COMMIT T 3 〉
13 〈T 4, C, 10〉
14 〈START T 5 〉
// Output(A); Output(B);
15 〈COMMIT T 4 〉
// Output(C)
16 〈T 5, D, 10〉
17 〈COMMIT T 5 〉
// Output(D)
(a) Can this be an undo log? Explain why or why not. (3 points)
Note: For the questions below, let’s assume that this log sequence is a redo log.
(b) Suppose we want to perform the outputs as early as possible, show on the figure how we should change the output actions. (4 points)
(c) Suppose we want to start checkpointing right after Action 13: First, show on the figure this start checkpointing log record. (2 points) Then, show on the figure the end checkpointing log record. (2 points)
(d) Continue from above. Suppose the system crashes after Action 16. Specify the log ID that we need to start to do redo actions. (4 points)