



Material Type: Exam; Professor: Chang; Class: Advanced Data Management; Subject: Computer Science; University: University of Illinois - Urbana-Champaign; Term: Fall 2009;




For each of the following statements, indicate whether it is True or False and explain your answer.
You will get 4 points for each correct answer with a correct explanation; there is no penalty (no negative points) for wrong answers.
(1) True False [HW] Predicate calculus is higher-level than relational algebra. ⇒ Explain:
(2) True False [HW] When deleting a node from an R-tree, reinsertion is chosen as a way to deal with "orphaned entries" because merging (as in a B-tree) is infeasible for an R-tree. ⇒ Explain:
(3) True False We can use an R-tree to index multiple-attribute data items like (salary, age), but the indexing will not be effective. ⇒ Explain:
(4) True False [HW] If a transaction releases its read lock before the end of the transaction, there is a danger of cascading rollback. ⇒ Explain:
(5) True False [HW] Precision and recall, the two major IR metrics, were coined in the SMART project in the 1960s. ⇒ Explain:
(6) True False [HW] In the discrimination value model, the value of an index term is based on its "discrimination value", which is predicted by its IDF. ⇒ Explain:
This problem tests your insight into the notion of PageRank. Each of the following graphs represents the Web: nodes represent pages and directed edges represent hyperlinks. For simplicity, use the same simplified PageRank definition as given in class.
Our question will be based on the following graph, which we call Circular, as the starting point, upon which we will make some changes.
[Figure: the graph Circular, with nodes a, b, c, d, e, f; edges not shown]
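As a reference point for experimenting with such graphs, here is a minimal Python sketch of one common simplified PageRank. Two things are assumptions, not taken from the exam: the definition itself (each page splits its current rank evenly among its out-links, with no damping factor; the class definition is not reproduced here), and the reading of Circular as a single directed cycle a → b → c → d → e → f → a (the figure's edges are not available). The function name `simplified_pagerank` is hypothetical.

```python
def simplified_pagerank(graph, iterations=100):
    """Iterate rank(p) = sum over in-links q of rank(q) / outdegree(q).

    `graph` maps each page to the list of pages it links to.
    Assumption: no damping factor and no special handling of dangling pages.
    """
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        new_rank = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            if not targets:
                continue  # a page with no out-links passes no rank on
            share = rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Assumed shape of Circular: one directed cycle through all six pages.
circular = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "e": ["f"], "f": ["a"]}
ranks = simplified_pagerank(circular)
for page in sorted(ranks):
    print(page, round(ranks[page], 4))
```

Under these assumptions every page on the cycle receives exactly the rank of its single predecessor, so the uniform starting distribution is already a fixpoint. Editing the `circular` dictionary is a quick way to check an intuition about a modified graph.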
Part 1 (8 points)
For the following Web graph, which is a slight change to Circular, speculate what the relative PageRank of each page would be by identifying the pages with non-zero PageRank and their rank ratios, and explain why. Note that we ask you only to speculate intuitively, without performing an iterative fixpoint computation. If there is no clear intuition on which to base a speculation, state so and explain why.
[Figure: a slightly modified version of Circular, with nodes a, b, c, d, e, f; edges not shown]
Part 2 (12 points)
Can you change the graph Circular so that some nodes have twice the PageRank of others? If this is possible, propose a minimal change that achieves it. Explain why.
Part 2 (18 points)
Let's continue such generalization for different types of data. Consider data type S as a set of integers, e.g., s1 = {1, 2, 6, 8}, s2 = {-1, 4, 15}, s3 = {20}, s4 = {48, 60, 102}.
Sketch, concisely, your design of an index tree for such data by further generalizing the concepts of the R-tree. Describe your design clearly: what types of queries are reasonable to support, what each node means, how to split a node, and how to search for a query.
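To make the design dimensions concrete, the following Python sketch shows one possible generalization, offered as an illustration and not as the expected answer. Every choice in it is an assumption: each node is summarized by a bounding interval [lo, hi] covering all integers in its subtree (the one-dimensional analogue of a bounding rectangle), insertion follows the child whose interval needs the least enlargement, an overflowing node is split by sorting its entries on their lower bound and halving, and the supported query is "find all sets with at least one element in [qlo, qhi]". The names `Node`, `insert`, `search`, and `MAX_ENTRIES` are hypothetical, and sets are assumed non-empty.

```python
MAX_ENTRIES = 2  # assumed node capacity, kept tiny so the example splits


class Node:
    def __init__(self, leaf=True):
        self.leaf = leaf
        self.entries = []  # leaf: sets of ints; internal node: child Nodes
        self.lo = None     # bounding interval [lo, hi] of everything below
        self.hi = None

    def bounds(self, entry):
        return (min(entry), max(entry)) if self.leaf else (entry.lo, entry.hi)

    def recompute(self):
        spans = [self.bounds(e) for e in self.entries]
        self.lo = min(lo for lo, _ in spans)
        self.hi = max(hi for _, hi in spans)


def _split(node):
    # Assumed split rule: order entries by lower bound and cut in half.
    node.entries.sort(key=lambda e: node.bounds(e)[0])
    half = len(node.entries) // 2
    sibling = Node(leaf=node.leaf)
    sibling.entries = node.entries[half:]
    node.entries = node.entries[:half]
    node.recompute()
    sibling.recompute()
    return sibling


def _insert(node, s):
    """Insert set s below node; return a new sibling if node was split."""
    if node.leaf:
        node.entries.append(s)
    else:
        lo, hi = min(s), max(s)
        # Follow the child whose interval grows least when s is added.
        child = min(
            node.entries,
            key=lambda c: (max(c.hi, hi) - min(c.lo, lo)) - (c.hi - c.lo),
        )
        sibling = _insert(child, s)
        if sibling is not None:
            node.entries.append(sibling)
    if len(node.entries) > MAX_ENTRIES:
        return _split(node)
    node.recompute()
    return None


def insert(root, s):
    """Insert s and return the (possibly new) root."""
    sibling = _insert(root, s)
    if sibling is None:
        return root
    new_root = Node(leaf=False)
    new_root.entries = [root, sibling]
    new_root.recompute()
    return new_root


def search(node, qlo, qhi):
    """Return all sets with at least one element in [qlo, qhi]."""
    if node.lo is None or node.hi < qlo or node.lo > qhi:
        return []  # empty tree, or interval disjoint from the query: prune
    if node.leaf:
        return [s for s in node.entries if any(qlo <= x <= qhi for x in s)]
    found = []
    for child in node.entries:
        found.extend(search(child, qlo, qhi))
    return found


root = Node()
for s in [{1, 2, 6, 8}, {-1, 4, 15}, {20}, {48, 60, 102}]:
    root = insert(root, frozenset(s))
print(search(root, 5, 10))
```

A bounding interval is a loose summary: a set such as {-1, 4, 15} overlaps the query [5, 10] by interval but has no element in it, so the leaf-level check against the actual set is what keeps the result exact. Tighter summaries (for example, storing the union of elements or a signature per node) trade node size for pruning power.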