


Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Solutions to problem 1-4 of cs411 database systems fall 2005 homework 4. The problems cover query execution using block-nested loop join and b+ tree index, merge sort, and cost estimation. The document also includes answers to algebraic laws and dynamic programming subproblems.
Typology: Assignments
1 / 4
This page cannot be seen from the preview
Don't miss anything!



Note: Print your name and NetID in the upper right corner of every page of your submission. Handin your homework to Donna Coleman in 2120 SC.
Consider the following two relations:
Students(name, sid, latest-course-id) TakeCourses(sid, cid, cname) where sid, cid, cname are student id, course id, and course name, respectively. sid is the key of relation Students, and (sid,cid) form the key of TakeCourses. Students has 10,000 tuples, with 25 tuples fitting on a block. TakeCourses has 5,000 tuples, with 50 tuples fitting on a block. There is no index on any attribute of the relations. (a) Suppose the memory buffer has 101 blocks, compute the cost of using a block-nested loop join to join the above two relations. (b) Suppose the memory buffer has 61 blocks. Compute the cost of using a block-nested loop join to join the above two relations. (c) Suppose we join the above two relations on attribute sid. Suppose that there is a B+ tree index on sid for Students, with the fanout of 20 (i.e., each tree node points to 20 other nodes). Compute the worst-case cost of doing a block-nested loop join that utilizes this index. Answer: (a)
Cost = B(T ) + d
eB(S) = 100 + d
e400 = 500
(b)
Cost = B(T ) + d
eB(S) = 100 + d
e400 = 100 + 800 = 900
(c) The worse case is when all the B+ tree indices are on disk. Given that there are 10000 tuples in Students, and that the fanout is 20, we need at least 4 levels in the B+ tree. (Level 1
can have at most 20 pointers, Level 2 can have at most 400 pointers, Level 3 can have at most 8000 pointers, Level 4 can have at most 160000 pointers). Therefore to get to a tuple in the students relation, we need 5 disk I/Os (4 for fetching blocks in B+ tree and 1 for fetching the data block). Therefore, the worse-case cost is 100+5000*5=25100.
Relation R(x,y) occupies 1000 blocks, and relation S(x,z) occupies 900 blocks. First, we will use two-phase, multiply merge sort to sort relation R(x,y) and S(x,z). This will require 4(1000+900) = 7600 disk I/Os. Then will we will merge the sorted R and S. For x=1, 4, 5, we can apply the standard algorithm described on page 744, because the tuples in both relations taking these values can fit into the memory. This will induce 300 disk I/Os. For x=2, tuples in R can fit into the memory but tuples in S could not. Therefore, we will load all the tuples in R taking value x=2 into memory, and read the blocks of S taking value x=2 one at a time. This will induce 500 disk I/Os. Essentially using R as the outer relation in nested-loop join. For x=3, neither tuples in R nor tuples in S could fit into memory. Therefore, we will need to perform nested-loop join, which will induce 300 + 300 ∗ d 300 / 200 e = 900 disk I/Os. For x=6, tuples in S can fit into the memory but tuples in R could not. Therefore, we will load all the tuples in S taking value x=6 into memory, and read the blocks of R taking value x=6 one at a time. This will induce 500 disk I/Os. Essentially using S as the outer relation in nested-loop join. Therefore, altogether we need 7600 + 300 + 500 + 900 + 500 = 9800 disk I/Os.
(a) False. Counter example, R = {x}, S = {x}. However, R∪S = {x, x} 6 = S (b) True. We know that for each x that appears in R, its frequency fR(x) is less or equal to the frequency fS (x) of it appearing in S. Therefore, in R∩S, the frequency of x will be fR(x). This guarantees that all the elements in R will appear in R∩S for the same number of times. Also, according to the semantics of intersection, any elements not in R won’t be in R∩S. These imply that R∩S = R (c) True. From R ⊆ S, We know that for each x that appears in R, its frequency fR(x) is less or equal to the frequency fS (x) of it appearing in S. Then for the same x, from S ⊆ R we could knot that its frequency fS (x) of it appearing in S is less or equal to the frequency fR(x) of it appearing in R. Therefore, we know fR(x)= fS (x). Also from the semantics of ⊆, all elements in R will appear in S, and vice versa. These imply that R and S have the same elements, and each element appear the same number of times, in other words R=S.
(e) Estimate the number of tuples in σ(b>25)∧(b=11)(R) Ans: 0 (as conditions contradict) (f) Estimate the number of tuples in σ(b=25)∨(d=13)(R) Ans: T(R)/V(R,b) + T(R)/V(R,d) - T(R)/(V(R,b)*V(R,d))= (g) Estimate the number of tuples in R ./ S Ans: T(R)T(s)/max(V(R,d), V(S,d))= (h) Estimate the number of tuples in R ./R.d=S.d AN D R.b=S.e S Ans: T(R)T(s)/(max(V(R,d), V(S,d)) max(V(R,b), V(S,e)))=
and d) don’t need to be graded)
(a) Transactions that need to be undone: T5, T4, T2. The following actions needs to be undone (order is from left to right): <T5,D,60>, <T2,E,50>, <T4,F,70>, <T2,C,30>, <T2,B,20>, <T2,A,10>. (b) Transaction that needs to be undone: T5. The following actions needs to be undone (order is from left to right): <T5,F,90>, <T5,D,60> (c) Transactions that need to be redone: T1, T The following actions needs to be redone order is from left to right): <T1,A,60>, <T3,D,40> (d) Transactions that need to be redone: T2, T3, T The following actions needs to be redone order is from left to right): <T2,A,10>, <T2,B,20>, <T2,C,30>, <T3,D,40>, <T4,F,70>, <T2,E,50>, <T4,B,80>