Prepare for your exams
Get points
Guidelines and tips
Sell on Docsity
Docsity AI

Prepare for your exams

Study with the several resources on Docsity

Earn points to download

Earn points by helping other students or get them with a premium plan

Guidelines and tips

Sell on Docsity

Docsity AI

Log in Sign up

Prepare for your exams

Study with the several resources on Docsity

Find documents

Prepare for your exams with the study notes shared by other students like you on Docsity

Search for your university

Find the specific documents for your university's exams

Docsity AINEW

Summarize your documents, ask them questions, convert them into quizzes and concept maps

Explore questions

Clear up your doubts by reading the answers to questions asked by your fellow students

Earn points to download

Earn points by helping other students or get them with a premium plan

Share documents

20 Points

For each uploaded document

Answer questions

5 Points

For each given answer (max 1 per day)

All the ways to get free points

Get points immediately

Choose a premium plan with all the points you need

Study Opportunities

Choose your next study program

Get in touch with the best universities in the world. Search through thousands of universities and official partners

Community

Ask the community

Ask the community for help and clear up your study doubts

Free resources

Our save-the-student-ebooks!

Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors

Database Systems Homework 3: Problem Set, Assignments of Deductive Database Systems

University of Illinois - Urbana-Champaign Deductive Database Systems

Problems related to database systems, including merge-sort for large relations, pointer swizzling for data storage, data representation for student records, indexing on sequential files, and b+tree operations. Problems involve calculating minimum memory size, expected pointer following cost, record layout, minimal number of tuples for an index, and b+tree modifications.

Typology: Assignments

Pre 2010

Uploaded on 03/10/2009

koofers-user-ctn-1 🇺🇸

9 documents

1 / 4

This page cannot be seen from the preview

Don't miss anything!

CS411 Database Systems

Fall 2005

HW#3

Due: 11:30am CST, 11/18/05

Note: Print your name and NetID in the upper right corner of every page of your submission.

Problem 1 Merge-Sort

Suppose we have a relation with 15000 tuples and each tuple requires 10 bytes. We have a

machine whose main memory and disk-block size (80 bytes) are sufficient to sort the 15000

tuples using TPMMS (Two-Phase, Multiway Merge-Sort). What is the minimum size of the

main memory?

Problem 2 Pointer Swizzling

Suppose that the important actions related to data storage take the following times, in some

arbitrary time units:

•On-demand swizzling of a pointer: 30;

•Automatic swizzling of pointers: 20 per pointer;

•Following a swizzled pointer: 1;

•Following a unswizzled pointer: 10.

Suppose we design a pointer-swizzling control scheme like the following. At the beginning,

we automatically swizzle 20% of the pointers and leave the rest unswizzled. Once a pointer

is followed, we swizzle it by probability 0.5. If an unswizzled pointer has been followed twice,

we swizzle it. Suppose there are 100 pointers in our data. The number of times that they

are followed by a program is distributed according to the following histogram.

times of being followed 0 1 2 3

number of pointers 10 40 30 20

What’s the expected cost of this program in terms of pointer following?

Problem 3 Data Representation

A student record consists of the following fixed-length fields: date of birth and SSN, each

8 bytes long. It also has the following variable-length fields: name and office. The records

are augmented by an additional repeating field that represents exams. Each exam requires a

date (12 bytes) and an integer result (4 bytes) of the exam. Pointers within a record require

4 bytes, and the record length is a 4-byte integer. You may assume that no alignment of

fields is required.

(a) Show the layout of students records if:

1

Discover Assignments of Deductive Database Systems University of Illinois - Urbana-Champaign

Partial preview of the text

Download Database Systems Homework 3: Problem Set and more Assignments Deductive Database Systems in PDF only on Docsity!

CS411 Database Systems

Fall 2005

HW#

Due: 11:30am CST, 11/18/

Note: Print your name and NetID in the upper right corner of every page of your submission.

Problem 1 Merge-Sort

Suppose we have a relation with 15000 tuples and each tuple requires 10 bytes. We have a machine whose main memory and disk-block size (80 bytes) are sufficient to sort the 15000 tuples using TPMMS (Two-Phase, Multiway Merge-Sort). What is the minimum size of the main memory?

Problem 2 Pointer Swizzling

Suppose that the important actions related to data storage take the following times, in some arbitrary time units:

On-demand swizzling of a pointer: 30;
Automatic swizzling of pointers: 20 per pointer;
Following a swizzled pointer: 1;
Following a unswizzled pointer: 10.

Suppose we design a pointer-swizzling control scheme like the following. At the beginning, we automatically swizzle 20% of the pointers and leave the rest unswizzled. Once a pointer is followed, we swizzle it by probability 0.5. If an unswizzled pointer has been followed twice, we swizzle it. Suppose there are 100 pointers in our data. The number of times that they are followed by a program is distributed according to the following histogram.

times of being followed 0 1 2 3 number of pointers 10 40 30 20

What’s the expected cost of this program in terms of pointer following?

Problem 3 Data Representation

A student record consists of the following fixed-length fields: date of birth and SSN, each 8 bytes long. It also has the following variable-length fields: name and office. The records are augmented by an additional repeating field that represents exams. Each exam requires a date (12 bytes) and an integer result (4 bytes) of the exam. Pointers within a record require 4 bytes, and the record length is a 4-byte integer. You may assume that no alignment of fields is required.

(a) Show the layout of students records if:

(1) The variable-length fields and repeating exams are kept within the record itself.

(2) The variable-length fields and repeating exams are stored outside of the record, with pointers to them in the record.

(b) The exam results of a student has a probability p of being inqueried. There are n test results per student in average. The average lengths of name and office are 8 bytes and 12 bytes, respectively. Suppose the cost function of the above scheme (1) is len, where len is the record length. The cost of scheme (2) is len + 100 × p, for penalizing accessing exam results on seperate blocks. What is the minimal value of n as a function of p, so that the above scheme (2) is better than scheme (1).

Problem 4 Indexes on Sequential Files

Given a relation of 200,000 tuples. Suppose each block could hold 10 tuples or 20 key-pointer pairs. The index is built on the key field of the relation (thus no duplicate search keys) and the file is sorted according to the key. Answer the following questions:

a) How many blocks do we need for a dense index of this relation?

b) How many blocks do we need for a sparse index of this relation?

c) How many blocks do we need for a 2-level index of this relation (given that the first level is dense)?

d) Suppose we use a 3-level index whose first level is sparse. If at most 12000 blocks can be used to store the index, what is the maximal number of tuples that the relation can have?

Problem 5 B+tree

Execute the following operations on Fig. 13.23 (page 635 of the textbook). Show the detailed steps, using Example 13.23 and Example 13.24 as examples.

(a) Lookup the record with key 29.

(b) Lookup the record with key 38.

(c) Lookup all records in the range 25 to 38.

(d) Lookup all records with keys less than 25.

Problem 6 B+tree

Consider the B+ tree index of degree 2 shown in the figure below. For each problem below, you only need to show the final resulting B+ Tree.

a) Show the tree that would result from inserting a data entry with key 40 into this tree.

b) Show the B+ tree that would result from inserting a data entry with key 30 into the original tree.

c) Show the B+ tree that would result from deleting the data entry with key 45 from the original tree.

documents that contain the keyword ”database”, all the documents that contain both ”in- formation” and ”retrieval”, and so on. Indexing documents is essential in processing IR queries efficiently.

The following questions (a) and (b) require you to do some research on the textbook, liter- ature and Web.

a) What kind of index structure is often used for documents?

b) Briefly summarize how such index is utilized to index documents and why it is helpful in IR queries.

The following question require you to think and compare.

c) Why general indexing structures such as B+tree and hash table are inappropriate for indexing documents?

Database Systems Homework 3: Problem Set, Assignments of Deductive Database Systems

Related documents

Partial preview of the text

Download Database Systems Homework 3: Problem Set and more Assignments Deductive Database Systems in PDF only on Docsity!

CS411 Database Systems

Fall 2005

HW#

Due: 11:30am CST, 11/18/

Problem 1 Merge-Sort

Problem 2 Pointer Swizzling

Problem 3 Data Representation

Problem 4 Indexes on Sequential Files

Problem 5 B+tree

Problem 6 B+tree