



Material Type: Exam; Professor: McQuain; Class: Data Structs & OO Development; Subject: Computer Science; University: Virginia Polytechnic Institute And State University; Term: Fall 2007;
Typology: Exams




Name (Last, First) printed
Pledge: On my honor, I have neither given nor received unauthorized aid on this examination.
signed
For questions 1 and 2, consider the following graph G1:
a) [12 points] Carefully describe how the program should use the buffer pool in order to take advantage of spatial locality.
Spatial locality is exhibited if, when a particular record is accessed, there is a high probability that a nearby record will be accessed in the near future. Taking advantage of spatial locality requires that the program prefetch some nearby records whenever a record is retrieved from the file to satisfy a query. Naturally, the prefetched records would be stored in the buffer pool. The replacement policy might seem to be a concern. However, nothing special needs to be done; if the client tells the buffer pool to load a collection of records, rather than just one, then each of those "extra" records will naturally reflect either a proper recent-use time (LRU) or a proper frequency count (LFU).
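The prefetching idea can be sketched as follows. This is only an illustration under stated assumptions: the class name `PrefetchingPool`, the `read_record` callback, the `num_records` bound, and the `window` size are all hypothetical, records are assumed to be addressed by integer index, and LRU replacement is assumed. On a miss, the pool loads the following `window` records first and the requested record last, so the requested record ends up as the most recently used.

```python
from collections import OrderedDict


class PrefetchingPool:
    """LRU buffer pool that also loads a few following records on a miss."""

    def __init__(self, capacity, read_record, num_records, window=3):
        self.capacity = capacity        # maximum number of records held
        self.read_record = read_record  # hypothetical callback: index -> record
        self.num_records = num_records  # records in the (assumed) file
        self.window = window            # how many following records to prefetch
        self.slots = OrderedDict()      # index -> record, least recently used first
        self.file_reads = 0             # number of records read from the file

    def _load(self, index):
        """Bring one record into the pool, evicting the LRU entry if full."""
        if index in self.slots:
            return
        if len(self.slots) >= self.capacity:
            self.slots.popitem(last=False)  # drop the least recently used record
        self.slots[index] = self.read_record(index)
        self.file_reads += 1

    def get(self, index):
        """Return the record, prefetching its neighbors on a miss."""
        if index in self.slots:
            self.slots.move_to_end(index)   # hit: mark as most recently used
            return self.slots[index]
        last = min(index + self.window, self.num_records - 1)
        for neighbor in range(index + 1, last + 1):
            self._load(neighbor)            # the "extra" records
        self._load(index)                   # requested record loaded last
        return self.slots[index]
```

With `window=3`, a call to `get(10)` on an empty pool performs four file reads, after which `get(11)`, `get(12)`, and `get(13)` are served from the pool without touching the file.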
b) [12 points] Carefully describe how the program should use the buffer pool in order to take advantage of temporal locality.
Temporal locality is exhibited if, when a particular record is accessed, there is a high probability that the same record will be accessed again in the near future. Simply using the buffer pool in the usual manner will take advantage of temporal locality, so long as the replacement policy is reasonable.
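As a small illustration of "the usual manner", here is a minimal sketch of a buffer pool with LRU replacement that counts hits and misses. The names `LRUPool` and `read_record` are hypothetical, and LRU is only one example of a reasonable replacement policy.

```python
from collections import OrderedDict


class LRUPool:
    """Plain buffer pool with LRU replacement and hit/miss counters."""

    def __init__(self, capacity, read_record):
        self.capacity = capacity        # maximum number of records held
        self.read_record = read_record  # hypothetical callback: index -> record
        self.slots = OrderedDict()      # index -> record, least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, index):
        if index in self.slots:
            self.hits += 1
            self.slots.move_to_end(index)   # refresh the recent-use position
            return self.slots[index]
        self.misses += 1
        if len(self.slots) >= self.capacity:
            self.slots.popitem(last=False)  # evict the least recently used record
        record = self.read_record(index)
        self.slots[index] = record
        return record


pool = LRUPool(capacity=2, read_record=lambda i: i * 10)
for index in [5, 5, 7, 5, 9, 5]:
    pool.get(index)
# Record 5 is re-accessed often, so it is never evicted; record 7 is.
```

In this access sequence the repeatedly used record stays in the pool even though the pool holds only two records, which is the benefit temporal locality is supposed to deliver.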
A number of suggestions were discussed in class, including:
Note: the question is about binsort, not radix sort.
Let B be the number of bins; in any case B must be at least as large as N and typically B is much larger than N.
The pass of binsort through the data elements will leave them in sorted order, but scattered throughout the B bins. It is still necessary to collect the elements into a single list, and that will require a linear pass through the bins, which would be Θ(B).
That makes the total cost of binsort Θ(N + B). But this will only be Θ(N) if B is bounded by kN for some constant k. This, unfortunately, isn't guaranteed, even if the number of bins is feasible.
For a specific example, suppose that we are sorting a collection of N = 1000 integers in the range 0 to 1,000,000. We would need about 10^6 bins, which might well be feasible, but since B is then about N^2, the cost of the collecting pass would be on the order of N^2.
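The Θ(N + B) cost can be made concrete with a minimal sketch that counts the steps taken by each pass. The function name `binsort` and the step counter are illustrative assumptions, and this version assumes distinct non-negative integers smaller than `num_bins`.

```python
def binsort(values, num_bins):
    """Sort distinct integers in range(num_bins); return (sorted list, steps)."""
    bins = [None] * num_bins
    steps = 0
    for v in values:        # distribution pass: one step per element, Theta(N)
        bins[v] = v
        steps += 1
    out = []
    for slot in bins:       # collection pass: one step per bin, Theta(B)
        steps += 1
        if slot is not None:
            out.append(slot)
    return out, steps


# N = 1000 values spread over B = 10**6 bins
values = list(range(0, 10**6, 1000))
result, steps = binsort(values, 10**6)
```

Here `steps` comes out as N + B = 1,001,000, and almost all of it is the collection pass over empty bins.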
The size of the integers is of absolutely no concern here. Binsort makes only one pass to sort the values and a second pass to collect them. If the integers are large but confined to a small range (e.g., 1000 integers between 900,000 and 1,000,000), then any intelligent use of binsort will take that into account and use a suitable number of bins (e.g., 100,000 instead of 1,000,000).
The occurrence of duplicate values is also irrelevant. Duplicates would simply go to the same slot; that's easily accommodated by making the slot a simple linked stack structure so insertion is still Θ(1). And, as far as binsort is concerned, it doesn't matter what order the duplicate elements occur in the slot. In the specific scenario described above, since the values being sorted are simple integers, all that's needed in each slot is actually just a counter.
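Both points, shifting the bins to fit the actual range and using a counter per slot for duplicates, can be sketched together. The function name `binsort_counts` and its `low`/`high` parameters are hypothetical, and the caller is assumed to know the bounds of the data in advance.

```python
def binsort_counts(values, low, high):
    """Sort integers known to lie in [low, high]; duplicates are allowed."""
    counts = [0] * (high - low + 1)   # one counter per possible value
    for v in values:
        counts[v - low] += 1          # Theta(1) per element, duplicates included
    out = []
    for offset, count in enumerate(counts):
        out.extend([low + offset] * count)   # emit each value count times
    return out


data = [950_000, 900_000, 999_999, 950_000]
sorted_data = binsort_counts(data, 900_000, 1_000_000)
```

Only about 100,000 counters are allocated here, rather than one bin for every value from 0 up to 1,000,000, and the duplicate 950,000 costs nothing beyond an extra increment.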