Programming Assignment 1: A Data Structure For VLSI Applications | CMSC 420, Assignments of Data Structures and Algorithms

Material Type: Assignment; Professor: Samet; Class: Data Structures; Subject: Computer Science; University: University of Maryland; Term: Fall 2003;

Uploaded on 07/30/2009

Fall 2003 CMSC 420

Hanan Samet

Programming Assignment 1: A Data Structure For VLSI Applications^1

Abstract

In this assignment you are required to implement an information management system for handling data similar to that used in VLSI (very large scale integration) applications. In such an environment the primary entities are small rectangles and the problem in which we are interested is how to manage a large collection of them. In the following we trace the development of a variant of the quadtree data structure that has been found to be useful for such a problem. Your task is to implement this data structure in such a way that a number of operations can be efficiently handled. An example JAVA applet for the data structure can be found on the home page of the class.

This assignment is divided into four parts. PASCAL is the preferred programming language although you may use C or C++. For the first two parts, you must read the attached description of the problem and data structure. A detailed explanation of the assignment including the specification of the operations which you are to implement is found at the end of the description. After you have done this, you are to turn in a proposed implementation of the data structure using PASCAL’s (or C or C++) record (structure) definition facility. One week later you must turn in a PASCAL (or C or C++) program for the command decoder (i.e., scanner for the commands corresponding to the operations which are to be performed on the data structure). For the third part, you are to write a PASCAL (or C or C++) program to implement the data structure and operations (1)-(8). For the fourth part, you are to implement operations (9)-(13). Operations (14)-(16) are optional and you will get extra credit if you turn them in with part four.

(^1) Copyright © 2003 by Hanan Samet. No part of this document may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express prior permission of the author.

1 Region-Based Quadtrees

The quadtree is a member of a class of hierarchical data structures that are based on the principle of recursive decomposition. As an example, consider the point quadtree of Finkel and Bentley [1] which should be familiar to you as it is simply a multidimensional generalization of a binary search tree. In two dimensions each node has four subtrees corresponding to the directions NW, NE, SW, and SE. Each subtree is commonly referred to as a quadrant or subquadrant. For example, see Figure 1.6^2 where a point quadtree of 8 nodes is presented. In our presentation we shall only discuss two-dimensional quadtrees although it should be clear that what we say can be easily generalized to more than two dimensions. For the point quadtree the points of decomposition are the data points themselves (i.e., in Figure 1.6, Chicago at location (35,40) subdivides the two-dimensional space into four rectangular regions). Requiring the regions to be of equal size leads to the region quadtree of Klinger [5,6,7]. This data structure was developed for representing homogeneous spatial data and is used in computer graphics, image processing, geographical information systems, pattern recognition, and other applications. For a history and review of the quadtree representation, see pp. 1–16 in [6].

As an example of the region quadtree, consider the region shown in Figure 1.1a which is represented by a 2^3 by 2^3 binary array in Figure 1.1b. Observe that 1’s correspond to picture elements (termed pixels) which are in the region and 0’s correspond to picture elements that are outside the region. The region quadtree representation is based on the successive subdivision of the array into four equal-size quadrants. If the array does not consist entirely of 1’s or 0’s (i.e., the region does not cover the entire array), then we subdivide it into quadrants, subquadrants, ... until we obtain blocks (possibly single pixels) that consist entirely of 1’s or entirely of 0’s. For example, the resulting blocks for the region of Figure 1.1b are shown in Figure 1.1c. This process is represented by a quadtree in which the root node corresponds to the entire array, the four sons of the root node represent the quadrants, and the leaf nodes correspond to those blocks for which no further subdivision is necessary. Leaf nodes are said to be BLACK or WHITE depending on whether their corresponding blocks are entirely within or outside of the region respectively. All non-leaf nodes are said to be GRAY. The region quadtree for Figure 1.1c is shown in Figure 1.1d.
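The subdivision rule just described can be sketched in C as follows. This is only an illustration under stated assumptions: the names `QNode`, `uniform`, `build`, `count_leaves`, and `free_tree` are hypothetical, row 0 of the array is taken to be its north edge, and the sons are ordered NW, NE, SW, SE. It does not reproduce the array of Figure 1.1b.

```c
#include <stdlib.h>

#define N 8  /* side length of the array: 2^3, as in the example */

typedef enum { WHITE, BLACK, GRAY } Color;

typedef struct QNode {
    Color color;
    struct QNode *son[4];  /* NW, NE, SW, SE; all NULL in a leaf */
} QNode;

/* Return 1 if every pixel of the size x size block at (row, col) is equal. */
static int uniform(int a[N][N], int row, int col, int size) {
    for (int r = row; r < row + size; r++)
        for (int c = col; c < col + size; c++)
            if (a[r][c] != a[row][col]) return 0;
    return 1;
}

/* Build the quadtree of the size x size block whose top-left pixel is
   (row, col). Row 0 is assumed to be the top (north) edge of the array. */
QNode *build(int a[N][N], int row, int col, int size) {
    QNode *n = calloc(1, sizeof *n);
    if (n == NULL) abort();  /* keep the sketch free of error paths */
    if (uniform(a, row, col, size)) {
        n->color = a[row][col] ? BLACK : WHITE;
        return n;
    }
    int h = size / 2;
    n->color = GRAY;
    n->son[0] = build(a, row,     col,     h);  /* NW */
    n->son[1] = build(a, row,     col + h, h);  /* NE */
    n->son[2] = build(a, row + h, col,     h);  /* SW */
    n->son[3] = build(a, row + h, col + h, h);  /* SE */
    return n;
}

/* Count the BLACK and WHITE leaves, i.e. the blocks of the decomposition. */
int count_leaves(const QNode *n) {
    if (n->color != GRAY) return 1;
    int total = 0;
    for (int i = 0; i < 4; i++) total += count_leaves(n->son[i]);
    return total;
}

/* Release a tree built by build(). */
void free_tree(QNode *n) {
    if (n == NULL) return;
    for (int i = 0; i < 4; i++) free_tree(n->son[i]);
    free(n);
}
```

Calling `build(a, 0, 0, N)` on an all-zero array yields a single WHITE leaf. Setting one corner pixel to 1 forces subdivision down to the pixel level along that corner only, which is the behaviour the paragraph above describes.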

2 MX Quadtrees

There are a number of ways of adapting the region quadtree to represent point data. If the domain of data points is discrete, then we can treat data points as if they were BLACK pixels in a region quadtree. An alternative characterization is to treat the data points as non-zero elements in a square matrix. We shall use this characterization in the subsequent discussion. To avoid confusion with the point and region quadtrees, we call the resulting data structure an MX quadtree (MX for matrix).

The MX quadtree is organized in a similar way to the region quadtree. The difference is that leaf nodes are BLACK or empty (i.e., WHITE) corresponding to the presence or absence, respectively, of a data point in the appropriate position in the matrix. For example, Figure 2.30 is the 2^3 × 2^3 MX quadtree corresponding to the data of Figure 1.6. It is obtained by applying the mapping f such that f(z) = z ÷ 12.5 to both x and y coordinates. The result of the mapping is reflected in the coordinate values in the figure.
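A minimal sketch of this mapping in C is given below. It assumes that ÷ denotes integer division (the quotient is truncated) and that coordinates are non-negative integers below 100, so that the results fall in the range 0 to 7 of a 2^3 × 2^3 matrix. The function name `mx_cell` is hypothetical.

```c
/* Map a coordinate z (assumed to satisfy 0 <= z < 100) to a matrix index
   by integer division by 12.5. Scaling both operands by 2 keeps the
   arithmetic in integers: z / 12.5 has the same quotient as (2 * z) / 25. */
int mx_cell(int z) {
    return (2 * z) / 25;
}
```

Under these assumptions, Chicago at (35,40) maps to the matrix position (`mx_cell(35)`, `mx_cell(40)`), that is, column 2 and row 3.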

Each data point in an MX quadtree corresponds to a 1 × 1 square. For ease of notation and operation using modulo and integer division operations, the data point is associated with the lower left

(^2) All numbered figures and page numbers refer to [6].

tion. One major difference is that in the MX-CIF quadtree, unlike the MX quadtree, all nodes are of the same type. Thus, data is associated with both leaf and non-leaf nodes of the MX-CIF quadtree. Empty nodes in the MX-CIF quadtree are analogous to WHITE nodes in the MX quadtree. An empty node is like an empty son and is represented by a NIL pointer in the direction of a quadrant that contains no rectangles.

The set of rectangles that intersect the lines passing through a subdivision point is subdivided into two sets. For example, consider subdivision point P centered at (CX, CY) which partitions a 2·LX × 2·LY area. All input rectangles that intersect the line x = CX form one set and all input rectangles that intersect the line y = CY form the other set. Equivalently, these sets correspond to the rectangles intersecting the y and x axes, respectively, passing through (CX, CY). If a rectangle intersects both axes (i.e., it contains the subdivision point P), then we adopt the convention that it is stored with the set associated with the y-axis.
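The convention above can be sketched as a small classification routine in C. The names `Rect`, `Placement`, and `classify` are hypothetical, and the sketch assumes that a rectangle whose border merely touches an axis line counts as intersecting it; the text does not settle that boundary case.

```c
/* A rectangle given by its centroid (cx, cy) and the horizontal and
   vertical distances (lx, ly) from the centroid to its borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

typedef enum { ON_Y_AXIS, ON_X_AXIS, IN_QUADRANT } Placement;

/* Decide where a rectangle belongs relative to subdivision point (CX, CY).
   The line x = CX is the y axis through the point, and y = CY is the x axis.
   Assumption: touching an axis line counts as intersecting it. */
Placement classify(Rect r, int CX, int CY) {
    int crosses_y_axis = (r.cx - r.lx <= CX) && (CX <= r.cx + r.lx);
    int crosses_x_axis = (r.cy - r.ly <= CY) && (CY <= r.cy + r.ly);
    if (crosses_y_axis) return ON_Y_AXIS;  /* also covers rectangles containing P */
    if (crosses_x_axis) return ON_X_AXIS;
    return IN_QUADRANT;                    /* lies wholly inside one quadrant */
}
```

Testing the y axis first is what implements the tie-breaking rule: a rectangle that contains P satisfies both conditions and is placed with the y-axis set.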

These subsets are implemented as binary trees, which in actuality are one-dimensional analogs of the MX-CIF quadtree. For example, Figure 3.21 illustrates the binary tree associated with the x and y axes passing through A, the root of the MX-CIF quadtree of Figure 3.20. The subdivision points of the axis lines are shown by the tick marks in Figure 3.20.

Insertion and deletion of rectangles in an MX-CIF quadtree are described on pp. 202–209. The most common search query is one that seeks to determine if a given rectangle overlaps (i.e., intersects) any of the existing rectangles. This operation is a prerequisite to the successful insertion of a rectangle. Range queries can also be performed. However, they are more usefully cast in terms of finding all the rectangles in a given area (i.e., a window query). Another popular query is one that seeks to determine if one collection of rectangles can be overlaid on another collection without any of the component rectangles intersecting one another.
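The overlap test at the heart of this query can be sketched in C for rectangles specified by centroid and distances to the borders. The names `Rect` and `intersects` are hypothetical. The sketch assumes that two rectangles sharing only a side or a corner do not intersect, since touching is treated as a separate notion in operation (9).

```c
#include <stdlib.h>

/* Centroid (cx, cy) and distances (lx, ly) from the centroid to the borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

/* Return 1 if the interiors of a and b overlap, 0 otherwise.
   Assumption: rectangles that only share a side or corner do not intersect. */
int intersects(Rect a, Rect b) {
    return abs(a.cx - b.cx) < a.lx + b.lx &&
           abs(a.cy - b.cy) < a.ly + b.ly;
}
```

Two rectangles overlap exactly when the distance between their centroids along each axis is smaller than the sum of their extents along that axis.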

These two operations can be implemented by using variants of algorithms developed for handling set operations (i.e., union and intersection) in region-based quadtrees [3,8]. The range query is answered by intersecting the query rectangle with the MX-CIF quadtree. The overlay query is answered by a two-step process. First, intersect the two MX-CIF quadtrees. If the result is empty, then they can be safely overlaid and we merely need to perform a union of the two MX-CIF quadtrees. It should be clear that Boolean queries can be easily handled. An example JAVA applet for the MX-CIF quadtree data structure can be found on the home page of the class.

4 Assignment

This assignment has four parts. It is to be programmed in PASCAL or C or C++. The first part is concerned with data structure selection. The second part requires the construction of a command decoder. The third and fourth parts require that you implement a given set of operations.

The first part is to be turned in one week after this assignment has been distributed to you. It is worth 10 points. The second part is also worth 10 points. It is to be turned in two weeks after this assignment has been distributed to you. There will be NO late submissions accepted for these two parts of the assignment. While doing parts one and two you are also to start thinking and coding the program necessary to implement the operations. This should be done in such a way that the data structure is a BLACK BOX. Thus you need to specify your primitives in such a way that they are independent of the data structure finally chosen. You are strongly advised to begin implementing some of the operations. For example, you should implement an output routine so that you can see whether your program is working properly.

For the third and fourth parts of the assignment, you are to write a PASCAL (or C or C++) program to implement the data structure and the specified operations. Together they are worth 60 points. Part three consists of operations (1)-(8) given below. They are worth 30 points. Part four consists of operations (9)-(13) given below. They are worth 30 points. Operations (14)-(16) are for extra credit and are to be turned in with part four. They are worth up to 7 points apiece.

In order to facilitate grading and your task, you are to use the data structure implementation that will be given to you in class on the first meeting date after you turn in the first two parts of the assignment. For any operation that is not implemented, your command decoder must output a message stating that the operation is not implemented.

In order to facilitate your program as well as lend some realism to your task you are to implement the MX-CIF quadtree in a raster-based graphics environment. This means that you are dealing with a world of pixels. The size of the world can be varied, and is a 2^w × 2^w array of pixels. As a default, you should assume w = 7, i.e., a size of 128 × 128. The pixel at the lower left corner has coordinate values (0,0) and the pixel at the upper right corner has coordinate values (2^w - 1, 2^w - 1). Each pixel serves as the center of a square of size 1 × 1. This is the smallest unit into which our quadtrees will decompose the world. In order to simplify the project and for optional operation (16) (i.e., for connected component labeling) to be meaningful, we stipulate that the centroids and the distances from the centroids to the borders of the rectangles are integers. All rectangles are of size i × j, where 3 ≤ i ≤ 2^w and 3 ≤ j ≤ 2^w. In other words, the smallest rectangle is of size 3 by 3 and the largest is 2^w × 2^w. One class meeting date before the due date of each part of the project you will be informed of the availability of and name of the test data file which you are to use in exercising your program for grading purposes. You should also prepare your own test data. A sample file for this purpose will also be provided.

4.1 Data Structure Selection

You are to select a data structure to implement the MX-CIF quadtree. Turn in a definition in the form of a set of PASCAL (or C or C++) records (structures). In doing this part of the assignment you should bear in mind the type of data that is being represented and the type of operations that will be performed on it. In order to ease your task, remember that the primitive entity is the rectangle. We specify a rectangle by giving the x and y coordinate values of its centroid, and the horizontal and vertical distances from the centroid to its borders. The rest of your task is to build on this entity adding any other information that is necessary. The nature of the operations is described in Sections 4.3–4.5.

From the description of the operations you will see that a name is associated with each rectangle. At times, the operations are specified in terms of these names. Thus you will also need a mechanism to efficiently keep track of these names. It should be integrated with the rest of your data structures.
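As a starting point for thinking about this part, one possible set of C structures is sketched below. It is an assumption-laden illustration, not the implementation that will be handed out in class: the type names `Rectangle`, `BNode`, and `CNode`, the field names, and the helper `make_rectangle` are all hypothetical. It follows the description in the text, where each quadtree node carries one binary tree per axis, and it leaves out the mechanism for looking rectangles up by name.

```c
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 6  /* names are at most 6 characters long */

typedef struct Rectangle {
    char name[NAME_LEN + 1];
    int cx, cy;  /* centroid */
    int lx, ly;  /* horizontal and vertical distances from centroid to borders */
} Rectangle;

/* Node of a one-dimensional binary tree along one axis of a quadtree node. */
typedef struct BNode {
    struct BNode *son[2];
    Rectangle *rect;      /* NULL if no rectangle is stored at this node */
} BNode;

/* Node of the MX-CIF quadtree. A NULL son stands for an empty quadrant. */
typedef struct CNode {
    struct CNode *son[4]; /* NW, NE, SW, SE (ordering is an assumption) */
    BNode *axis[2];       /* binary trees for the two axes through the center */
} CNode;

/* Allocate a rectangle record; returns NULL if the name is too long
   or memory is exhausted. */
Rectangle *make_rectangle(const char *name, int cx, int cy, int lx, int ly) {
    if (strlen(name) > NAME_LEN) return NULL;
    Rectangle *r = malloc(sizeof *r);
    if (r == NULL) return NULL;
    strcpy(r->name, name);
    r->cx = cx; r->cy = cy;
    r->lx = lx; r->ly = ly;
    return r;
}
```

Keeping the rectangle record separate from the tree nodes means a rectangle can exist without being in the quadtree, which matches operations that accept a rectangle that "need not necessarily be in the MX-CIF quadtree".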

4.2 Command Decoder

You are to turn in a working command decoder written in PASCAL (or C or C++) for all the commands (including the optional ones) given in Sections 4.3–4.5. You are not expected to do error recovery and can assume that the commands are syntactically correct. All commands will fit on one line. Lengths of names are restricted to 6 characters or less and can be any combination of letters or digits. However, for your own safety you may wish to incorporate some primitive error handling. Test data for this part of the assignment will be found in a file specified by
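The restriction on names lends itself to a small piece of the primitive error handling mentioned above. The following C sketch checks a candidate name; the function name `is_valid_name` is hypothetical, and treating the empty string as invalid is an assumption that the text does not state.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if s is 1 to 6 characters long and consists only of letters
   and digits, 0 otherwise. Rejecting the empty name is an assumption. */
int is_valid_name(const char *s) {
    size_t n = strlen(s);
    if (n == 0 || n > 6) return 0;
    for (size_t i = 0; i < n; i++)
        if (!isalnum((unsigned char)s[i])) return 0;
    return 1;
}
```

The cast to `unsigned char` avoids undefined behaviour when `isalnum` is given a negative `char` value.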

false. You are only to check against the rectangles that are in the MX-CIF quadtree of existing rectangles, and not the rectangles that existed at some time in the past and have been deleted by the time this command is executed.

(6) Insert a rectangle in the MX-CIF quadtree. If the rectangle intersects an existing rectangle, then do not make the insertion and report this fact by returning the name of the intersecting rectangle. Also, if any part of the rectangle is outside the space spanned by the MX-CIF quadtree, then do not make the insertion and report this fact by a suitable message. Otherwise, return the name of the rectangle that is being inserted as well as output a message indicating that this has been done. It is invoked by a command whose argument is the name of a rectangle. It should be clear that the MX-CIF quadtree is built by a sequence of such operations.

(7) Given a point, return the name of the rectangle that contains it. It is invoked by a command whose two arguments are the x and y coordinate values, respectively, of the point. If no such rectangle exists, then output a message indicating that the point is not contained in any of the rectangles.
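The idea behind a point search can be sketched in C as a descent from the root toward the quadrant containing the point, checking the rectangles stored at each node on the way. This is a simplified illustration: each node holds a plain array of rectangles as a stand-in for the two axis binary trees, and the names `Rect`, `Node`, `contains`, and `search_point` are hypothetical. It also assumes that points on a rectangle's border count as contained, and that sons are ordered NW, NE, SW, SE with y increasing to the north.

```c
#include <stdlib.h>

typedef struct { const char *name; int cx, cy, lx, ly; } Rect;

typedef struct Node {
    struct Node *son[4];  /* 0 = NW, 1 = NE, 2 = SW, 3 = SE; NULL if empty */
    const Rect *rects;    /* rectangles stored at this node (stand-in for */
    int nrects;           /* the two binary trees of the real structure)  */
} Node;

/* Assumption: a point on the border of a rectangle is contained in it. */
static int contains(const Rect *r, int px, int py) {
    return abs(px - r->cx) <= r->lx && abs(py - r->cy) <= r->ly;
}

/* Search the tree rooted at n, whose block is centered at (cx, cy) with
   half-width half, for a rectangle containing (px, py). Returns its name,
   or NULL if no rectangle contains the point. */
const char *search_point(const Node *n, int cx, int cy, int half,
                         int px, int py) {
    while (n != NULL) {
        for (int i = 0; i < n->nrects; i++)
            if (contains(&n->rects[i], px, py)) return n->rects[i].name;
        int east = px >= cx, north = py >= cy;
        half /= 2;                       /* the son's block is half as wide */
        cx += east ? half : -half;
        cy += north ? half : -half;
        n = n->son[(north ? 0 : 2) + (east ? 1 : 0)];
    }
    return NULL;
}
```

Only one root-to-leaf path is followed, because a rectangle stored in a quadrant cannot contain a point that lies in a different quadrant.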

(8) Delete a rectangle or a set of rectangles from the MX-CIF quadtree. This operation has two variants: one takes the name of a rectangle as its argument and the other takes a point. The first variant deletes the rectangle with the given name. If it was successful in deleting the specified rectangle, it outputs a message indicating this. Otherwise, it outputs an appropriate message. The second variant has as its argument a point within the rectangle to be deleted, given by its x and y coordinate values. It returns as its value the name of the rectangle that has been deleted and prints an appropriate message indicating its name. If the point is not in any rectangle, then an appropriate message indicating this is output. The code for the second variant should make use of the point search of operation (7). Note that the rectangle is only deleted from the MX-CIF quadtree and not from the database of rectangles.

4.4 Part Four: Advanced Operations

(9) Determine if a given rectangle touches (i.e., is adjacent along a side or a corner) an existing rectangle in the MX-CIF quadtree. It is invoked by a command whose argument is the name of a rectangle. The command returns the names of all the touched rectangles together with an accompanying message. If no rectangle is touched, the command reports this instead. The named rectangle need not necessarily be in the MX-CIF quadtree. For each rectangle r that touches the named rectangle, display (i.e., highlight) the point in r for which the x and y coordinate values are minimum (i.e., the lower-leftmost corner).
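The touching predicate and the corner to be highlighted can be sketched in C as follows. The names `Rect`, `touches`, and `lower_left` are hypothetical. The sketch assumes that "touches" means the two rectangles share at least one border point while their interiors stay disjoint.

```c
#include <stdlib.h>

/* Centroid (cx, cy) and distances (lx, ly) from the centroid to the borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

/* Return 1 if a and b are adjacent along a side or at a corner.
   Assumption: rectangles whose interiors overlap do not count as touching. */
int touches(Rect a, Rect b) {
    int dx = abs(a.cx - b.cx), sx = a.lx + b.lx;
    int dy = abs(a.cy - b.cy), sy = a.ly + b.ly;
    int share_point = dx <= sx && dy <= sy;  /* closed rectangles meet       */
    int on_border   = dx == sx || dy == sy;  /* but only along their borders */
    return share_point && on_border;
}

/* Corner of r with minimum x and y, the point to be highlighted. */
void lower_left(Rect r, int *x, int *y) {
    *x = r.cx - r.lx;
    *y = r.cy - r.ly;
}
```

A side contact makes exactly one of the two axis gaps zero, and a corner contact makes both zero, so the single `on_border` condition covers both cases.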

(10) Determine if there exists a rectangle in the MX-CIF quadtree within a given distance of a given rectangle. This is the so-called ‘lambda’ problem. Given a distance d, it is invoked by a command whose arguments are the name of the query rectangle and a number (integer here although it could also be a real number in the more general case) corresponding to the distance. In essence, this operation constructs an expanded rectangle with the same centroid as the query rectangle and with its horizontal and vertical distances to the border each increased by d. Now, the query returns the identity of all rectangles whose intersection with the region formed by the difference of the expanded rectangle and the query rectangle is not empty (i.e., any rectangle r that has at least one point in common with this difference). In other words, we have a shell of width d around the query rectangle and we want all the rectangles that have a point in common with this shell. The query rectangle need not necessarily be in the MX-CIF quadtree. Note that for this operation you must recursively traverse the tree to find the rectangles that overlap the query region. You will NOT be given credit for a solution that uses neighbor finding, such as one (but not limited to) that starts at the centroid of the query rectangle and finds its neighbors in increasing order of distance. This is the basis of another operation.
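The shell test for a single candidate rectangle can be sketched in C as below. The names `Rect` and `within` are hypothetical. The sketch assumes the shell is the closed expanded rectangle minus the closed query rectangle, so a candidate at distance exactly d qualifies, and a candidate lying entirely inside the query rectangle does not. It covers only the per-rectangle predicate, not the required recursive traversal of the tree.

```c
#include <stdlib.h>

/* Centroid (cx, cy) and distances (lx, ly) from the centroid to the borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

/* Return 1 if r has a point in common with the shell of width d around q.
   Assumption: shell = closed expanded rectangle minus closed q. */
int within(Rect r, Rect q, int d) {
    int dx = abs(r.cx - q.cx), dy = abs(r.cy - q.cy);
    /* r meets the expanded rectangle (q's distances increased by d) */
    int hits_expanded = dx <= r.lx + q.lx + d && dy <= r.ly + q.ly + d;
    /* r lies entirely inside q, so it cannot reach the shell */
    int inside_q = dx + r.lx <= q.lx && dy + r.ly <= q.ly;
    return hits_expanded && !inside_q;
}
```

A rectangle that meets the expanded rectangle either pokes out of the query rectangle, in which case it reaches the shell, or is wholly contained in it, which is the one case to exclude.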

(11) Find the nearest neighboring rectangle in the horizontal and vertical directions, respectively, to a given rectangle. There are two commands, one that locates a horizontal neighbor and one that locates a vertical neighbor. Each takes the name of the query rectangle as its argument. By “nearest” horizontal (vertical) neighboring rectangle, it is meant the rectangle whose vertical (horizontal) side, or extension, is closest to a vertical (horizontal) side of the query rectangle. If the vertical (horizontal) extension of a side of rectangle r causes the extended side of r to intersect the query rectangle, then r is deemed to be at distance 0 and is thus not a candidate neighbor. In other words, the distance has to be greater than zero. The commands return as their value the name of the neighboring rectangle if one exists, and output an appropriate message in either case. The query rectangle need not necessarily be in the MX-CIF quadtree. If more than one rectangle is at the same distance, then return the name of just one of them.
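One literal reading of the horizontal distance defined above is sketched in C below. The names `Rect` and `horiz_distance` are hypothetical, and the reading itself is an assumption: the distance is the smallest gap between a vertical side of r and a vertical side of the query rectangle q, and it is 0 as soon as the extension of a vertical side of r would pass through q. The vertical case is symmetric.

```c
#include <limits.h>
#include <stdlib.h>

/* Centroid (cx, cy) and distances (lx, ly) from the centroid to the borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

/* Horizontal distance from r to query rectangle q under the reading that
   only the x positions of the vertical sides matter. A result of 0 means
   r is not a candidate horizontal neighbor. */
int horiz_distance(Rect r, Rect q) {
    int q_left = q.cx - q.lx, q_right = q.cx + q.lx;
    int side[2] = { r.cx - r.lx, r.cx + r.lx };  /* vertical sides of r */
    int best = INT_MAX;
    for (int i = 0; i < 2; i++) {
        /* the extended side would cut through (or touch) q */
        if (side[i] >= q_left && side[i] <= q_right) return 0;
        int to_left = abs(side[i] - q_left);
        int to_right = abs(side[i] - q_right);
        int d = to_left < to_right ? to_left : to_right;
        if (d < best) best = d;
    }
    return best;
}
```

Because sides are extended indefinitely, the y coordinates of r play no part in this reading; a rectangle far above or below q can still be its nearest horizontal neighbor.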

(12) Given a point, return the name of the nearest rectangle. By “nearest,” it is meant the rectangle whose side or corner is closest to the point. Note that this rectangle could also be a rectangle that contains the point. In this case, the distance is zero. It is invoked by a command whose two arguments are the x and y coordinate values, respectively, of the point. If no such rectangle exists (e.g., when the tree is empty), then output an appropriate message (i.e., that the tree is empty). If more than one rectangle is at the same distance, then return the name of just one of them.
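The distance from a point to a rectangle, as used in this definition, can be sketched in C as follows. The names `Rect` and `dist2_point_rect` are hypothetical, and Euclidean distance is assumed since the text does not name a metric. The function returns the squared distance so that candidates can be compared with integer arithmetic only.

```c
#include <stdlib.h>

/* Centroid (cx, cy) and distances (lx, ly) from the centroid to the borders. */
typedef struct { int cx, cy, lx, ly; } Rect;

/* Squared Euclidean distance from point (px, py) to the closest side or
   corner of r, or 0 if the point lies inside r or on its border. */
long dist2_point_rect(Rect r, int px, int py) {
    long dx = labs((long)px - r.cx) - r.lx;  /* horizontal gap, if positive */
    long dy = labs((long)py - r.cy) - r.ly;  /* vertical gap, if positive   */
    if (dx < 0) dx = 0;
    if (dy < 0) dy = 0;
    return dx * dx + dy * dy;
}
```

When only one gap is positive the closest feature is a side, and when both are positive it is a corner, which is why a single formula handles "side or corner".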

(13) Find all rectangles in a rectangular window anchored at a given point. It is invoked by a command with four arguments: the x and y coordinate values, respectively, of the lower left corner of the window, and the horizontal and vertical distances, respectively, to its borders from the corner. Your output is a list of the names of the rectangles that are completely inside the window, and a display of the MX-CIF quadtree that only shows the rectangles that are in the window. This is similar to a clipping operation. Draw the boundary of the window using a dashed rectangle. Do not show quadrant lines within the window. All four arguments are integers. Note that for this operation you must recursively traverse the tree to find the rectangles that overlap the query region. You will NOT be given credit for a solution that uses neighbor finding, such as one (but not limited to) that starts at the centroid of the window and finds its neighbors in increasing order of distance. This is the basis of another operation.
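The recursive traversal asked for here can be sketched in C on a simplified tree. Each node holds a plain array of rectangles as a stand-in for the two axis binary trees, and the names `Rect`, `Node`, `inside`, and `window_count` are hypothetical. The sketch assumes that a rectangle whose border coincides with the window border is still completely inside, and that sons are ordered NW, NE, SW, SE with y increasing to the north. It counts matches instead of printing names to stay short.

```c
#include <stddef.h>

typedef struct { const char *name; int cx, cy, lx, ly; } Rect;

typedef struct Node {
    struct Node *son[4];  /* 0 = NW, 1 = NE, 2 = SW, 3 = SE; NULL if empty */
    const Rect *rects;    /* rectangles stored at this node (stand-in for */
    int nrects;           /* the two binary trees of the real structure)  */
} Node;

/* 1 if r lies completely inside the window [x0, x1] x [y0, y1]. */
static int inside(const Rect *r, int x0, int y0, int x1, int y1) {
    return r->cx - r->lx >= x0 && r->cx + r->lx <= x1 &&
           r->cy - r->ly >= y0 && r->cy + r->ly <= y1;
}

/* Count the rectangles completely inside the window whose lower left
   corner is (llx, lly) and whose extents are lx and ly. The block of
   node n is centered at (cx, cy) with half-width half. */
int window_count(const Node *n, int cx, int cy, int half,
                 int llx, int lly, int lx, int ly) {
    int x1 = llx + lx, y1 = lly + ly;
    if (n == NULL) return 0;
    /* prune: this block does not overlap the window at all */
    if (cx + half < llx || cx - half > x1 ||
        cy + half < lly || cy - half > y1) return 0;
    int count = 0;
    for (int i = 0; i < n->nrects; i++)
        if (inside(&n->rects[i], llx, lly, x1, y1)) count++;
    int h = half / 2;
    count += window_count(n->son[0], cx - h, cy + h, h, llx, lly, lx, ly);
    count += window_count(n->son[1], cx + h, cy + h, h, llx, lly, lx, ly);
    count += window_count(n->son[2], cx - h, cy - h, h, llx, lly, lx, ly);
    count += window_count(n->son[3], cx + h, cy - h, h, llx, lly, lx, ly);
    return count;
}
```

The pruning test is what makes this a tree traversal and not a scan of all rectangles: subtrees whose blocks miss the window are never entered.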

4.5 Optional Operations

(14) Find the nearest neighbor in all directions to the boundary of a given rectangle. It is invoked by a command whose argument is the name of a rectangle. By “nearest,” it is meant the rectangle C with a point on its side, say P, such that the distance from P to a side of the query rectangle is a minimum. The command returns as its value the name of the neighboring rectangle if one exists, and outputs an appropriate message in either case. The query rectangle need not necessarily be in the MX-CIF quadtree. If more than one rectangle is at the same distance, then return the name of just one of them. Note that rectangles that are inside the query rectangle are not considered by this query. In order to facilitate grading of this operation, please provide a trace output of the execution of the operation which lists the nodes (both leaf and nonleaf) that have been visited as well as the nearest neighbor. In order for the output to be concise, you are to represent each node that has been visited by a unique number which is formed as follows. The root of the quadtree is assigned