



Material Type: Exam; Professor: Chang; Class: Advanced Data Management; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2010;




∙ For each True/False choice question, indicate whether the statement is true or false by circling your choice, and provide a brief explanation.
∙ For each Short Answer question, provide a brief answer.
∙ You will get 4 points for each correct answer, and 0 points otherwise. There is no penalty for wrong answers.
(1) True / False [HW] In Gamma, all queries start their processing at a well-known, pre-defined host.
Answer: A dispatch process is used to assign a node to process a query. This aspect distinguishes Gamma from a true distributed DBMS.
(2) True / False [HW] Postgres is an example of an object-relational database system.
Answer: PostgreSQL is a powerful, open source object-relational database system. It supports OO concepts, such as classes and ADTs.
(3) Short Answer [HW] The buffer replacement policy Most Recently Used (MRU) discards the most recently used items first. Give an example to show that it may not be the right policy for a DBMS.
Answer: For sort-merge joins, it is better to use the LRU policy rather than MRU.
(4) Short Answer What do you think is the most significant difference between an OS and a DBMS? Why?
Answer: There are several differences. A DBMS primarily manages data, while an OS primarily manages hardware. A DBMS is a specific service to manage data, while an OS provides general services.
(5) True / False [HW] In comparison, SEQUEL is more declarative than SQUARE, its predecessor.
Answer: SEQUEL introduces an English-keyword format, which is even more convenient for users than the terse mathematical notation of SQUARE. SQUARE is essentially relational algebra and is thus more procedural than SEQUEL.
(6) True / False [HW] In comparison, Predicate Calculus is more declarative than SQUARE.
Answer: Predicate Calculus is a logical representation, which is purely declarative. In contrast, SQUARE is based on relational algebra and is thus more procedural than Predicate Calculus.
(13) Short Answer [HW] Who invented the Probability Ranking Principle?
A. Stephen E. Robertson
B. Thomas Bayes
C. William Cooper
D. Some of the above
E. None of the above
Answer: “Some of the above” or “None of the above”. No one really invented it entirely, but Cooper first communicated the probability ranking hypothesis. Later, Robertson turned Cooper’s hypothesis into a principle.
(14) True / False [HW] When justifying the Probability Ranking Principle, Robertson measures the “overall effectiveness” of an information retrieval system by recall and fallout.
Answer: Either False or True is accepted. If the answer is False, the explanation is: “overall effectiveness” is measured by expected recall and expected fallout.
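As a reminder of what the two measures mean, here is a minimal Python sketch of recall and fallout for a single retrieved set. The document ids and the toy collection are hypothetical, and the sketch computes observed values for one query; Robertson's argument concerns their expected values.

```python
from typing import Set

def recall(retrieved: Set[int], relevant: Set[int]) -> float:
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

def fallout(retrieved: Set[int], relevant: Set[int], collection: Set[int]) -> float:
    """Fraction of the non-relevant documents that were retrieved."""
    non_relevant = collection - relevant
    return len(retrieved & non_relevant) / len(non_relevant)

# Hypothetical collection of 10 documents, 4 of them relevant.
collection = set(range(10))
relevant = {0, 1, 2, 3}
retrieved = {0, 1, 4}  # two relevant hits, one non-relevant hit

print(recall(retrieved, relevant))               # 0.5
print(fallout(retrieved, relevant, collection))  # 1 of 6 non-relevant documents
```

A good ranking pushes recall up while keeping fallout low at every cutoff.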
(15) Short Answer Does any database system that is available today support R-tree indexing?
Answer: Yes. E.g. MySQL, SQLite, and PostgreSQL.
For node splitting in R-tree, the paper suggests minimizing the total area of the bounding boxes after the split. Let’s call this strategy Total-Area.
Part 1 (6 points) Why is the Total-Area strategy desirable? Give an example scenario in which Total-Area is desirable.
Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer.
A good strategy makes it less likely that subsequent searches will visit both new nodes. Since a new node is visited if its rectangle overlaps with the search area, a good heuristic is to minimize the total area of the new nodes’ rectangles.
Figure 1: example
As Figure 1 shows, split a is better than split b, as a has a smaller total area. A search rectangle is less likely to overlap with both new nodes if split a is used.
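The Total-Area comparison can be sketched in a few lines of Python. Figure 1 is not reproduced here, so the four entry rectangles and the two candidate splits below are made-up values chosen only to illustrate the idea; the `(xmin, ymin, xmax, ymax)` layout is likewise an assumption.

```python
from typing import List, Tuple

# A rectangle is (xmin, ymin, xmax, ymax); this layout is an assumption.
Rect = Tuple[float, float, float, float]

def bounding_box(rects: List[Rect]) -> Rect:
    """Smallest axis-aligned rectangle enclosing all given rectangles."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def total_area(group_1: List[Rect], group_2: List[Rect]) -> float:
    """Total-Area cost of a split: sum of the two bounding-box areas."""
    return area(bounding_box(group_1)) + area(bounding_box(group_2))

# Hypothetical entries: two unit squares near the origin, two far away.
e1, e2 = (0, 0, 1, 1), (1, 1, 2, 2)
e3, e4 = (8, 8, 9, 9), (9, 9, 10, 10)

split_a = ([e1, e2], [e3, e4])  # groups nearby entries together
split_b = ([e1, e3], [e2, e4])  # mixes near and far entries

print(total_area(*split_a))  # 8
print(total_area(*split_b))  # 162
```

With these values, split a produces two small, well-separated boxes, while split b produces two large boxes that cover almost the whole space, so nearly every search would have to visit both nodes.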
Part 2 (14 points) The strategy does not always work best. Give a counterexample to show when Total-Area may not be desirable. (7 points)
Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer. If the rectangles bounding the new nodes overlap too much, the split might not work well despite the small total area: if a search area overlaps with one of them, it is very likely that it also overlaps with the other.
Further, suggest a strategy that would work better for the scenario. (7 points)
Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer. One strategy for this scenario is to minimize the overlapping area of the rectangles of the two new nodes.
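The following Python sketch puts the counterexample and the overlap-minimizing strategy side by side. The three entries (one long horizontal bar and two small squares above and below it) are hypothetical, and the exhaustive enumeration of splits is used only because the example is tiny; it is not meant as the paper's split algorithm.

```python
from itertools import combinations
from typing import Callable, Iterator, List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout
Split = Tuple[List[Rect], List[Rect]]

def bounding_box(rects: List[Rect]) -> Rect:
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def area(r: Rect) -> float:
    return (r[2] - r[0]) * (r[3] - r[1])

def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0

def total_area_cost(g1: List[Rect], g2: List[Rect]) -> float:
    return area(bounding_box(g1)) + area(bounding_box(g2))

def overlap_cost(g1: List[Rect], g2: List[Rect]) -> float:
    return overlap_area(bounding_box(g1), bounding_box(g2))

def candidate_splits(entries: List[Rect]) -> Iterator[Split]:
    """Yield every split into two non-empty groups (exponential; toy sizes only)."""
    first, rest = entries[0], entries[1:]
    # Keep the first entry in group 1 so mirrored splits are not repeated.
    for k in range(len(rest)):
        for extra in combinations(range(len(rest)), k):
            g1 = [first] + [rest[i] for i in extra]
            g2 = [rest[i] for i in range(len(rest)) if i not in extra]
            yield g1, g2

def best_split(entries: List[Rect], cost: Callable[..., float]) -> Split:
    # Ties go to the first candidate generated.
    return min(candidate_splits(entries), key=lambda s: cost(*s))

# Hypothetical entries: a long bar with one small square below and one above.
bar, below, above = (0, 4, 10, 6), (4, 0, 6, 2), (4, 8, 6, 10)
entries = [bar, below, above]

by_area = best_split(entries, total_area_cost)
by_overlap = best_split(entries, overlap_cost)

print(total_area_cost(*by_area), overlap_cost(*by_area))        # 40 4
print(total_area_cost(*by_overlap), overlap_cost(*by_overlap))  # 64 0
```

Here Total-Area groups the two squares together, so the two node rectangles cross and any search touching the crossing region visits both nodes. Minimizing overlap accepts a larger total area (64 instead of 40) in exchange for disjoint node rectangles.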