Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

COSC 6340 Homework Solutions - Database Systems, Assignments of Database Management Systems (DBMS)

University of Houston (UH)Database Management Systems (DBMS)

Solutions to graded homework1 for cosc 6340 database systems course offered in spring 2003. The homework includes problems on implementing joins, relational database design, reasoning with dependencies, physical database design, and object similarity assessment. Each problem is followed by detailed explanations and computations.

Typology: Assignments

Pre 2010

Uploaded on 08/18/2009

koofers-user-9sb 🇺🇸

9 documents

1 / 2

This page cannot be seen from the preview

Don't miss anything!

Dr. Christoph F. Eick

Graded Homework1 COSC 6340

Spring 2003

Available Points: 72

Due Dates:

Problems 1-3: Submit solutions electronically to [email protected] (deadline We., March

19, 9p; no late solutions accepted!) and bring a hardcopy to the Th., March 20 class.

Some solutions will be discussed in the March 20, 2003 class.

Problems 4-5: Tu., April 1, 2003 in class or during my office hours (no late solutions

accepted!)

1) Implementing Joins [15]

Assume we have to implement a natural join, denoted by ‘



‘, between two relations

R1(A, B, C) and R2(A, D). A is the primary key for R2 and R1.A is a foreign key

(referring to R2.A) in R1 (R1[A] . R2[A]). R1 and R2 are stored as an unordered file

(blocks are assumed to be 100% full); R1 contains 200000 tuples that are stored in 500

blocks, and R2 has 50000 tuples that are stored in 1000 blocks (that is, every R2.A value

occurs at an average four time as a value of R1.A). Moreover, assume that only a small

buffer size of 8 blocks is available. Based on these assumptions answer the following

questions (indicate, if you assume in your computations that the output relation is written

back to disk or not):

a. How many tuples will the output relation R=R1



R2 contain? [1]

b. What is the cost for implementing the natural join using the block-nested loop join? [2]

c. Now assume that either a hashed index on A of R1 or a hashed index on A of R2 is

available (assume that there are no overflow pages). Compute the cost for using the index

nested loops join using the index for R1.A and for using the index nested loops join for

R2.A (2 computations][4]

d. Is it possible to apply the hash-join in this case (explain your answer!)? [2]

e. Which of the 4 proposed implementations is the best? [1]

f. Now assume that the response time of the methods you evaluated is not satisfactory.

What else could be done, to speed up R1



R2? [5]

2) Relational Database Design [12]

Assume a relation R(A, B, C, D, E) with the following functional dependencies is given:

(1) A  BC

(2) B  A

(3) D  A

(4) E  D

a) What is (are) the candidate key(s) for R? [1]

1

Discover Assignments of Database Management Systems (DBMS) University of Houston (UH)

Partial preview of the text

Download COSC 6340 Homework Solutions - Database Systems and more Assignments Database Management Systems (DBMS) in PDF only on Docsity!

Dr. Christoph F. Eick

Graded Homework1 COSC 6340

Spring 2003

Available Points: 72 Due Dates : Problems 1-3 : Submit solutions electronically to [email protected] (deadline We., March 19, 9p; no late solutions accepted! ) and bring a hardcopy to the Th., March 20 class. Some solutions will be discussed in the March 20, 2003 class. Problems 4-5 : Tu., April 1, 2003 in class or during my office hours ( no late solutions accepted! )

1) Implementing Joins [15]

Assume we have to implement a natural join, denoted by ‘ ‘, between two relations

R1(A, B, C) and R2(A, D). A is the primary key for R2 and R1.A is a foreign key (referring to R2.A) in R1 (R1[A] . R2[A]). R1 and R2 are stored as an unordered file (blocks are assumed to be 100% full); R1 contains 200000 tuples that are stored in 500 blocks, and R2 has 50000 tuples that are stored in 1000 blocks (that is, every R2.A value occurs at an average four time as a value of R1.A). Moreover, assume that only a small buffer size of 8 blocks is available. Based on these assumptions answer the following questions (indicate, if you assume in your computations that the output relation is written back to disk or not):

a. How many tuples will the output relation R=R1  R2 contain? [1]

b. What is the cost for implementing the natural join using the block-nested loop join? [2] c. Now assume that either a hashed index on A of R1 or a hashed index on A of R2 is available (assume that there are no overflow pages). Compute the cost for using the index nested loops join using the index for R1.A and for using the index nested loops join for R2.A (2 computations][4] d. Is it possible to apply the hash-join in this case (explain your answer!)? [2] e. Which of the 4 proposed implementations is the best? [1] f. Now assume that the response time of the methods you evaluated is not satisfactory.

What else could be done, to speed up R1  R2? [5]

2) Relational Database Design [12]

Assume a relation R(A, B, C, D, E) with the following functional dependencies is given: (1) A  BC (2) B  A (3) D  A (4) E  D a) What is (are) the candidate key(s) for R? [1] 1

b) Transform R into a relational schema that is in BCNF and which does not have any lost dependencies; if this is not possible decompose R into a schema that is in BCNF and has the fewest number of lost functional dependencies. Give the derivation tree for your solution and specify clearly, if and what functional dependencies are lost for a particular relational schema. [11]

3) Reasoning with Dependencies [13]

Consider the following relation R(A,B,C,D,E) with the following dependencies is given (  denotes a functional dependency and  denotes a multi-valued dependency): (1) A  C (2) B  DE a) Assume we decompose R into R1(A,B, C) and R2(C, D, E). Does this decomposition have the lossless join property --- is it possible to reconstruct R from R1 and R2 using a natural join? Give reasons for your answer! [6] b) What is (are) the candidate key(s) of R? [1] c) Does ED hold for R? Give reasons for your answer (either show that the dependency can be inferred from (1) and (2), or give a counter example of the contents of R that satisfies (1) and (2) but which violates ED)! [6]

4) Physical Database Design and Database Tuning [18]

Solve problem 20.6 (problem 16.8 second edition) of the textbook (pages 689/493)

5) Object Similarity Assessment [14]

Assume the following relation Students(ssn, age, gpa, avg_class_rank, Country, Continent) that contains students that were admitted in the year 2000 into our undergraduate program is given. You can assume that  age is an integer; the maximum age is 50 the minimum age is 20, and the average age is 28 and the mean absolute deviation is 10.  gpa denotes the UH COSC gpa; the average gpa is 2.8 and the mean absolute deviation is 0.6; the maximum gpa is 4.0 the minimum gpa is 0.  Avg_class_rank has 5 values (4=top-5%, 3=top-15%, 2=top-25% 1=top_half, 0=bottom half)  Assume that students from the same continent are more similar than students from different continents. a) Define a student similarity measure that considers gpa and class_rank of being of major importance, and age and country/continent of being of minor importance. [11] b) Using your (dis)similarity measure compute the (dis)similarity for the following 4 student tuples [3] : (111111111, 25, 2.8, 2, China, Asia) (222222222, 24, 3.7, 3, India, Asia) (333333333, 37, 2.9, 4, USA, America) (444444444, 24, 3.8, 4, USA, America) 2

COSC 6340 Homework Solutions - Database Systems, Assignments of Database Management Systems (DBMS)

Related documents

Partial preview of the text

Download COSC 6340 Homework Solutions - Database Systems and more Assignments Database Management Systems (DBMS) in PDF only on Docsity!

Graded Homework1 COSC 6340

Spring 2003

1) Implementing Joins [15]

Assume we have to implement a natural join, denoted by ‘ ‘, between two relations

a. How many tuples will the output relation R=R1  R2 contain? [1]

What else could be done, to speed up R1  R2? [5]

2) Relational Database Design [12]

3) Reasoning with Dependencies [13]

4) Physical Database Design and Database Tuning [18]

5) Object Similarity Assessment [14]