Solved Problems for Midterm Exam 2 - Advanced Data Management | CS 511, Exams of Deductive Database Systems

Material Type: Exam; Professor: Chang; Class: Advanced Data Management; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2010;

Typology: Exams

2010/2011

Uploaded on 06/14/2011

NetID:
CS511 Advanced Database Systems
Fall 2010, Prof. Chang
Department of Computer Science
University of Illinois at Urbana-Champaign
Midterm Examination 2
November 19, 2010
Time Limit: 75 minutes
∙ Print your name and NetID below. In addition, print your NetID in the upper right corner of every page.
Name: NetID:
∙ Including this cover page, this exam booklet contains 6 pages. Check if you have missing pages.
∙ The exam is open book and open notes (any and all books/notes). Scientific calculators of any kind are allowed. No other electronic devices are permitted. Any form of cheating on the examination will result in a zero grade.
∙ Please write your solutions in the spaces provided on the exam. You may use the blank areas and backs of the exam pages for additional space or scratch work.
∙ Please make your answers clear and succinct; you will lose credit for verbose, convoluted, or confusing answers. Simplicity does count!
∙ Each problem has different weight. You should look through the entire exam before getting started, to plan your strategy.
∙ Problems that are related to homework or study-guide problems (e.g., in terms of concepts covered) are marked with [HW].
Problem   1    2    3    Total
Points    60   20   20   100
Score
Grader

Problem 1 (60 points) Misc. Concepts

∙ For each True/False question, indicate whether the statement is true or false by circling your choice, and provide a brief explanation.
∙ For each Short Answer question, provide a brief answer.
∙ You will get 4 points for each correct answer, and 0 points otherwise. There is no penalty for wrong answers.

(1) True False [HW] In Gamma, all queries start their processing at a well-known, pre-defined host.

Answer: A dispatch process is used to assign a node to process a query. This aspect distinguishes Gamma from a true distributed DBMS.

(2) True False [HW] Postgres is an example of an object-relational database system.

Answer: PostgreSQL is a powerful, open source object-relational database system. It supports OO concepts, such as classes and ADTs.

(3) Short Answer [HW] The buffer replacement policy Most Recently Used (MRU) discards the most recently used items first. Give an example to show that it may not be the right policy for a DBMS.

Answer: For sort-merge joins, it is better to use an LRU policy rather than MRU.

(4) Short Answer What do you think is the most significant difference between an OS and a DBMS? Why?

Answer: There are several differences. A DBMS primarily manages data, while an OS primarily manages hardware. A DBMS is a specific service to manage data, while an OS provides general services.

(5) True False [HW] In comparison, SEQUEL is more declarative than SQUARE, its predecessor.

Answer: SEQUEL introduces an English-keyword format, which is more convenient for users than the terse mathematical notation of SQUARE. SQUARE is essentially relational algebra and thus more procedural than SEQUEL.

(6) True False [HW] In comparison, Predicate Calculus is more declarative than SQUARE.

Answer: Predicate Calculus is a logic representation, which is purely declarative. In contrast, SQUARE is based on relational algebra and is thus more procedural than Predicate Calculus.
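Returning to question (3), the difference between the two buffer policies can be made concrete with a small simulation. This is only a sketch: the function name `count_misses`, the buffer capacity, and the access pattern (a page "H" that is re-read between the pages of a one-pass scan) are assumptions chosen for illustration, and whether a particular sort-merge join implementation produces such a pattern depends on that implementation.

```python
from collections import OrderedDict


def count_misses(accesses, capacity, policy):
    """Simulate a page buffer and return the number of buffer misses.

    policy is "LRU" (evict the least recently used page) or
    "MRU" (evict the most recently used page).
    """
    buffer = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for page in accesses:
        if page in buffer:
            buffer.move_to_end(page)  # hit: mark as most recently used
            continue
        misses += 1
        if len(buffer) >= capacity:
            # last=True pops the most recently used page, last=False the least
            buffer.popitem(last=(policy == "MRU"))
        buffer[page] = True
    return misses


# Hypothetical access pattern: page "H" is re-read between scan pages s0..s7.
accesses = []
for i in range(8):
    accesses += ["H", f"s{i}"]

lru_misses = count_misses(accesses, capacity=3, policy="LRU")
mru_misses = count_misses(accesses, capacity=3, policy="MRU")
print("LRU misses:", lru_misses)
print("MRU misses:", mru_misses)
```

On this pattern, once the buffer is full MRU evicts "H" right after it was used, so "H" misses on every later access, while LRU keeps it resident. A purely sequential, repeated scan larger than the buffer would favor MRU instead, so the comparison depends on the access pattern.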

(13) Short Answer [HW] Who invented the Probability Ranking Principle?
A. Stephen E. Robertson
B. Thomas Bayes
C. William Cooper
D. Some of the above
E. None of the above

Answer: “Some of the above” or “None of the above”. No one really invented it entirely, but Cooper first communicated the probability ranking hypothesis; later, Robertson turned Cooper’s hypothesis into a principle.

(14) True False [HW] When justifying the Probability Ranking Principle, Robertson measures the “overall effectiveness” of an information retrieval system by recall and fallout.

Answer: Either false or true is right. If the answer is false, the explanation is that “overall effectiveness” is measured by expected recall and expected fallout.

(15) Short Answer Does any database system that is available today support R-tree indexing?

Answer: Yes, e.g., MySQL, SQLite, and PostgreSQL.

Problem 2 (20 points) R-Trees [HW]

For node splitting in an R-tree, the paper suggests minimizing the total area of the bounding boxes after the split. Let’s call this strategy Total-Area.

Part 1 (6 points) Why is the Total-Area strategy desired? Give an example scenario in which Total-Area is desired.

Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer.

A good strategy makes it less likely that subsequent searches will visit both new nodes. Since a new node is visited whenever its rectangle overlaps the search area, a good heuristic is to minimize the total area of the new nodes’ rectangles.

Figure 1: example

As Figure 1 shows, split a is better than split b, as a has a smaller total area. A search rectangle is less likely to overlap both new nodes if split a is used.

Part 2 (14 points) The strategy does not always work best. Give a counterexample to show when Total-Area may not be desired. (7 points)

Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer.

If the rectangles bounding the new nodes overlap too much, the split might not work well despite the small total area: if a search area overlaps one of them, it is very likely to overlap the other as well.

Further, suggest a strategy that would work better for the scenario. (7 points)

Answer: There are many ways to answer this question, and we accept any reasonable answer. Below is a sample answer.

One strategy for this scenario is to minimize the overlapping area of the rectangles of the two new nodes.
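The two criteria discussed in this problem, total area and overlap area, can be computed for any candidate split. The sketch below is illustrative only: rectangles are assumed to be `(x1, y1, x2, y2)` tuples, the helper names are made up, and the four entries are hypothetical values chosen so that the two criteria rank the candidate splits differently. They are not the rectangles of Figure 1.

```python
def bounding_box(rects):
    """Minimum bounding rectangle of a list of (x1, y1, x2, y2) rectangles."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def area(r):
    """Area of a rectangle (x1, y1, x2, y2)."""
    return (r[2] - r[0]) * (r[3] - r[1])


def overlap_area(a, b):
    """Area of the intersection of two rectangles, or 0 if they do not overlap."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def split_metrics(group1, group2):
    """Return (total area, overlap area) of the two groups' bounding boxes."""
    b1, b2 = bounding_box(group1), bounding_box(group2)
    return area(b1) + area(b2), overlap_area(b1, b2)


# Hypothetical entries of an overflowing node.
left, right = (0, 9, 1, 10), (20, 10, 21, 11)
bottom, top = (9, 0, 10, 1), (10, 20, 11, 21)

# Candidate split A pairs opposite entries; candidate split B pairs neighbors.
total_a, overlap_a = split_metrics([left, right], [bottom, top])
total_b, overlap_b = split_metrics([left, bottom], [right, top])

print("split A: total area", total_a, "overlap", overlap_a)
print("split B: total area", total_b, "overlap", overlap_b)
```

Here Total-Area prefers split A, whose two thin bounding boxes cross each other, while an overlap-minimizing strategy prefers split B, whose boxes only touch at a corner. Which split serves searches better in practice depends on the query workload, since split B also covers far more empty space.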