Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Database Systems Homework 4: Query Execution and Cost Estimation, Assignments of Deductive Database Systems

University of Illinois - Urbana-Champaign Deductive Database Systems

Solutions to problem 1-4 of cs411 database systems fall 2005 homework 4. The problems cover query execution using block-nested loop join and b+ tree index, merge sort, and cost estimation. The document also includes answers to algebraic laws and dynamic programming subproblems.

Typology: Assignments

Pre 2010

Uploaded on 03/16/2009

koofers-user-nm2 🇺🇸

2.5

(1)

10 documents

1 / 4

This page cannot be seen from the preview

Don't miss anything!

CS411 Database Systems

Fall 2005

HW #4

Due: 11:30am CST, 12/9/05

Note: Print your name and NetID in the upper right corner of every page of your submission.

Handin your homework to Donna Coleman in 2120 SC.

Problem 1 Query Execution 15 points, 5 points each subproblem

Consider the following two relations:

Students(name, sid, latest-course-id)

TakeCourses(sid, cid, cname)

where sid,cid,cname are student id, course id, and course name, respectively. sid is the key

of relation Students, and (sid,cid) form the key of TakeCourses.Students has 10,000 tuples,

with 25 tuples fitting on a block. TakeCourses has 5,000 tuples, with 50 tuples fitting on a block.

There is no index on any attribute of the relations.

(a) Suppose the memory buffer has 101 blocks, compute the cost of using a block-nested loop

join to join the above two relations.

(b) Suppose the memory buffer has 61 blocks. Compute the cost of using a block-nested loop

join to join the above two relations.

(c) Suppose we join the above two relations on attribute sid. Suppose that there is a B+ tree

index on sid for Students, with the fanout of 20 (i.e., each tree node points to 20 other nodes).

Compute the worst-case cost of doing a block-nested loop join that utilizes this index.

Answer:

(a)

B(S) = 10000

25 = 400

B(T) = 5000

50 = 100

Cost =B(T) + dB(T)

M−1eB(S) = 100 + d100

101 −1e400 = 500

(b)

Cost =B(T) + dB(T)

M−1eB(S) = 100 + d100

61 −1e400 = 100 + 800 = 900

(c) The worse case is when all the B+ tree indices are on disk. Given that there are 10000

tuples in Students, and that the fanout is 20, we need at least 4 levels in the B+ tree. (Level 1

1

Discover Assignments of Deductive Database Systems University of Illinois - Urbana-Champaign

Partial preview of the text

Download Database Systems Homework 4: Query Execution and Cost Estimation and more Assignments Deductive Database Systems in PDF only on Docsity!

CS411 Database Systems

Fall 2005

HW

Due: 11:30am CST, 12/9/

Note: Print your name and NetID in the upper right corner of every page of your submission. Handin your homework to Donna Coleman in 2120 SC.

Problem 1 Query Execution 15 points, 5 points each subproblem

Consider the following two relations:

Students(name, sid, latest-course-id) TakeCourses(sid, cid, cname) where sid, cid, cname are student id, course id, and course name, respectively. sid is the key of relation Students, and (sid,cid) form the key of TakeCourses. Students has 10,000 tuples, with 25 tuples fitting on a block. TakeCourses has 5,000 tuples, with 50 tuples fitting on a block. There is no index on any attribute of the relations. (a) Suppose the memory buffer has 101 blocks, compute the cost of using a block-nested loop join to join the above two relations. (b) Suppose the memory buffer has 61 blocks. Compute the cost of using a block-nested loop join to join the above two relations. (c) Suppose we join the above two relations on attribute sid. Suppose that there is a B+ tree index on sid for Students, with the fanout of 20 (i.e., each tree node points to 20 other nodes). Compute the worst-case cost of doing a block-nested loop join that utilizes this index. Answer: (a)

B(S) =

B(T ) =

Cost = B(T ) + d

B(T )

M − 1

eB(S) = 100 + d

e400 = 500

(b)

Cost = B(T ) + d

B(T )

M − 1

eB(S) = 100 + d

e400 = 100 + 800 = 900

(c) The worse case is when all the B+ tree indices are on disk. Given that there are 10000 tuples in Students, and that the fanout is 20, we need at least 4 levels in the B+ tree. (Level 1

can have at most 20 pointers, Level 2 can have at most 400 pointers, Level 3 can have at most 8000 pointers, Level 4 can have at most 160000 pointers). Therefore to get to a tuple in the students relation, we need 5 disk I/Os (4 for fetching blocks in B+ tree and 1 for fetching the data block). Therefore, the worse-case cost is 100+5000*5=25100.

Problem 2 Query Execution Cont 20 points

Relation R(x,y) occupies 1000 blocks, and relation S(x,z) occupies 900 blocks. First, we will use two-phase, multiply merge sort to sort relation R(x,y) and S(x,z). This will require 4(1000+900) = 7600 disk I/Os. Then will we will merge the sorted R and S. For x=1, 4, 5, we can apply the standard algorithm described on page 744, because the tuples in both relations taking these values can fit into the memory. This will induce 300 disk I/Os. For x=2, tuples in R can fit into the memory but tuples in S could not. Therefore, we will load all the tuples in R taking value x=2 into memory, and read the blocks of S taking value x=2 one at a time. This will induce 500 disk I/Os. Essentially using R as the outer relation in nested-loop join. For x=3, neither tuples in R nor tuples in S could fit into memory. Therefore, we will need to perform nested-loop join, which will induce 300 + 300 ∗ d 300 / 200 e = 900 disk I/Os. For x=6, tuples in S can fit into the memory but tuples in R could not. Therefore, we will load all the tuples in S taking value x=6 into memory, and read the blocks of R taking value x=6 one at a time. This will induce 500 disk I/Os. Essentially using S as the outer relation in nested-loop join. Therefore, altogether we need 7600 + 300 + 500 + 900 + 500 = 9800 disk I/Os.

Problem 3 Algebraic Laws 15 points, 5 points each subproblem

(a) False. Counter example, R = {x}, S = {x}. However, R∪S = {x, x} 6 = S (b) True. We know that for each x that appears in R, its frequency fR(x) is less or equal to the frequency fS (x) of it appearing in S. Therefore, in R∩S, the frequency of x will be fR(x). This guarantees that all the elements in R will appear in R∩S for the same number of times. Also, according to the semantics of intersection, any elements not in R won’t be in R∩S. These imply that R∩S = R (c) True. From R ⊆ S, We know that for each x that appears in R, its frequency fR(x) is less or equal to the frequency fS (x) of it appearing in S. Then for the same x, from S ⊆ R we could knot that its frequency fS (x) of it appearing in S is less or equal to the frequency fR(x) of it appearing in R. Therefore, we know fR(x)= fS (x). Also from the semantics of ⊆, all elements in R will appear in S, and vice versa. These imply that R and S have the same elements, and each element appear the same number of times, in other words R=S.

Problem 4 Dynamic Programming 10 points, 5 points each subproblem

(e) Estimate the number of tuples in σ(b>25)∧(b=11)(R) Ans: 0 (as conditions contradict) (f) Estimate the number of tuples in σ(b=25)∨(d=13)(R) Ans: T(R)/V(R,b) + T(R)/V(R,d) - T(R)/(V(R,b)*V(R,d))= (g) Estimate the number of tuples in R ./ S Ans: T(R)T(s)/max(V(R,d), V(S,d))= (h) Estimate the number of tuples in R ./R.d=S.d AN D R.b=S.e S Ans: T(R)T(s)/(max(V(R,d), V(S,d)) max(V(R,b), V(S,e)))=

Problem 6 Failure Recovery 16 points, 8 points for (a) and 8 points for (b) (notice, c)

and d) don’t need to be graded)

(a) Transactions that need to be undone: T5, T4, T2. The following actions needs to be undone (order is from left to right): <T5,D,60>, <T2,E,50>, <T4,F,70>, <T2,C,30>, <T2,B,20>, <T2,A,10>. (b) Transaction that needs to be undone: T5. The following actions needs to be undone (order is from left to right): <T5,F,90>, <T5,D,60> (c) Transactions that need to be redone: T1, T The following actions needs to be redone order is from left to right): <T1,A,60>, <T3,D,40> (d) Transactions that need to be redone: T2, T3, T The following actions needs to be redone order is from left to right): <T2,A,10>, <T2,B,20>, <T2,C,30>, <T3,D,40>, <T4,F,70>, <T2,E,50>, <T4,B,80>

Database Systems Homework 4: Query Execution and Cost Estimation, Assignments of Deductive Database Systems

Related documents

Partial preview of the text

Download Database Systems Homework 4: Query Execution and Cost Estimation and more Assignments Deductive Database Systems in PDF only on Docsity!

CS411 Database Systems

Fall 2005

HW

Due: 11:30am CST, 12/9/

Problem 1 Query Execution 15 points, 5 points each subproblem

B(S) =

B(T ) =

B(T )

M − 1

B(T )

M − 1

Problem 2 Query Execution Cont 20 points

Problem 3 Algebraic Laws 15 points, 5 points each subproblem

Problem 4 Dynamic Programming 10 points, 5 points each subproblem

Problem 6 Failure Recovery 16 points, 8 points for (a) and 8 points for (b) (notice, c)