Programming Assignment 1 for Data Structures - Fall 2007 | CMSC 420, Assignments of Data Structures and Algorithms

Material Type: Assignment; Professor: Samet; Class: Data Structures; Subject: Computer Science; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

Fall 2007 CMSC 420

Hanan Samet

Programming Assignment 1:

A Data Structure For Game Programming^1

Abstract

In this assignment you are required to implement a system for handling data similar to that used in game programming. In such an environment the primary entities are polygons and the problem we are interested in is how to manage a large collection of them. In this assignment, we will restrict input to a collection of small rectangles. In the following we trace the development of a variant of the quadtree data structure that has been found to be useful for such problems. Your task is to implement this data structure in such a way that a number of operations can be handled efficiently. An example JAVA applet for the data structure can be found on the home page of the class.

This assignment is divided into four parts. C++ is the preferred programming language although you may use C or PASCAL. For the first two parts, you must read the attached description of the problem and data structure. A detailed explanation of the assignment including the specification of the operations which you are to implement is found at the end of the description. After you have done this, you are to turn in a proposed implementation of the data structure using C++ (or C or PASCAL) class definitions. One week later you must turn in a C++ (or C or PASCAL) program for the command decoder (i.e., scanner for the commands corresponding to the operations which are to be performed on the data structure). For the third part, you are to write a C++ (or C or PASCAL) program to implement the data structure and operations (1)-(9). For the fourth part, you are to implement operations (10)-(15). Operations (16)-(18) are optional and you will get extra credit if you turn them in with part four. If you are a graduate student, part four is not optional.

(^1) Copyright © 2007 by Hanan Samet. No part of this document may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the express prior permission of the author.

1 Region-Based Quadtrees

The quadtree is a member of a class of hierarchical data structures that are based on the principle of recursive decomposition. As an example, consider the point quadtree of Finkel and Bentley [1] which should be familiar to you as it is simply a multidimensional generalization of a binary search tree. In two dimensions each node has four subtrees corresponding to the directions NW, NE, SW, and SE. Each subtree is commonly referred to as a quadrant or subquadrant. For example, see Figure 1.14^2 where a point quadtree of 8 nodes is presented. In our presentation we shall only discuss two-dimensional quadtrees although it should be clear that what we say can be easily generalized to more than two dimensions. For the point quadtree the points of decomposition are the data points themselves (i.e., in Figure 1.14, Chicago at location (35,40) subdivides the two-dimensional space into four rectangular regions). Requiring the regions to be of equal size leads to the region quadtree of Klinger [7–9]. This data structure was developed for representing homogeneous spatial data and is used in computer graphics, image processing, geographical information systems, pattern recognition, and other applications. For a history and review of the quadtree representation, see pp. 28–48 and 423–426 in [10].

As an example of the region quadtree, consider the region shown in Figure 1.28a which is represented by a 2^3 × 2^3 binary array in Figure 1.28b. Observe that 1’s correspond to picture elements (termed pixels) which are in the region and 0’s correspond to picture elements that are outside the region. The region quadtree representation is based on the successive subdivision of the array into four equal-size quadrants. If the array does not consist entirely of 1’s or 0’s (i.e., the region does not cover the entire array), then we subdivide it into quadrants, subquadrants, ... until we obtain blocks (possibly single pixels) that consist entirely of 1’s or entirely of 0’s. For example, the resulting blocks for the region of Figure 1.28b are shown in Figure 1.28c. This process is represented by a quadtree in which the root node corresponds to the entire array, the four sons of the root node represent the quadrants, and the leaf nodes correspond to those blocks for which no further subdivision is necessary. Leaf nodes are said to be BLACK or WHITE depending on whether their corresponding blocks are entirely within or outside of the region respectively. All non-leaf nodes are said to be GRAY. The region quadtree for Figure 1.28c is shown in Figure 1.28d.
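The recursive decomposition described above can be sketched in C++ as follows. This is only an illustration under assumptions that are not part of the text: the image is a square array whose side is a power of two, stored as `img[row][col]` with row 0 at the top, and the names `QNode`, `build`, and `countLeaves` are hypothetical. The sketch builds the tree bottom-up and merges four leaf children of the same color, which produces the same blocks as the top-down subdivision.

```cpp
#include <memory>
#include <vector>

enum class Color { WHITE, BLACK, GRAY };

struct QNode {
    Color color;
    std::unique_ptr<QNode> child[4];  // order NW, NE, SW, SE; all null for a leaf
};

using Image = std::vector<std::vector<int>>;  // img[row][col], 1 = inside the region

// Build the quadtree for the size x size block whose top-left pixel is (row, col).
std::unique_ptr<QNode> build(const Image& img, int row, int col, int size) {
    auto node = std::make_unique<QNode>();
    if (size == 1) {  // a single pixel is always a leaf
        node->color = img[row][col] ? Color::BLACK : Color::WHITE;
        return node;
    }
    int h = size / 2;
    node->child[0] = build(img, row,     col,     h);  // NW
    node->child[1] = build(img, row,     col + h, h);  // NE
    node->child[2] = build(img, row + h, col,     h);  // SW
    node->child[3] = build(img, row + h, col + h, h);  // SE

    // If all four children are leaves of one color, the block is homogeneous.
    Color c = node->child[0]->color;
    bool uniform = (c != Color::GRAY);
    for (int i = 1; i < 4 && uniform; ++i) {
        uniform = (node->child[i]->color == c);
    }
    if (uniform) {
        for (auto& ch : node->child) ch.reset();  // collapse into one leaf
        node->color = c;
    } else {
        node->color = Color::GRAY;
    }
    return node;
}

// Number of BLACK and WHITE leaf blocks in the tree rooted at n.
int countLeaves(const QNode& n) {
    if (n.color != Color::GRAY) return 1;
    int total = 0;
    for (const auto& ch : n.child) total += countLeaves(*ch);
    return total;
}
```

For a 4 × 4 array whose NE quadrant is all 1’s and whose SE quadrant contains a single 1, the root and the SE son are GRAY and the tree has seven leaves.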

2 MX Quadtrees

There are a number of ways of adapting the region quadtree to represent point data. If the domain of data points is discrete, then we can treat data points as if they were BLACK pixels in a region quadtree. An alternative characterization is to treat the data points as non-zero elements in a square matrix. We shall use this characterization in the subsequent discussion. To avoid confusion with the point and region quadtrees, we call the resulting data structure an MX quadtree (MX for matrix).

The MX quadtree is organized in a similar way to the region quadtree. The difference is that leaf nodes are BLACK or empty (i.e., WHITE) corresponding to the presence or absence, respectively, of a data point in the appropriate position in the matrix. For example, Figure 1.29 is the 2^3 × 2^3 MX quadtree corresponding to the data of Figure 1.1. It is obtained by applying the mapping f such that f(z) = z ÷ 12.5 to both x and y coordinates. The result of the mapping is reflected in the coordinate values in the figure.
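As a small worked example of the mapping, the sketch below assumes that ÷ denotes division followed by truncation to an integer cell index, so that coordinates in [0, 100) land in cells 0 through 7. The function name `mxCell` is hypothetical.

```cpp
#include <cmath>

// Map a coordinate value z to its cell index under f(z) = z / 12.5,
// keeping only the integer part (an assumption about the meaning of the division).
int mxCell(double z) {
    return static_cast<int>(std::floor(z / 12.5));
}
```

Under this reading, Chicago at (35,40) maps to cell (2,3), since 35 ÷ 12.5 = 2.8 and 40 ÷ 12.5 = 3.2.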

Each data point in an MX quadtree corresponds to a 1 × 1 square. For ease of notation and operation using modulo and integer division operations, the data point is associated with the lower left

(^2) All numbered figures and page numbers refer to [10].

Figure A: (a) Collection of rectangles and the decomposition induced by the MX-CIF quadtree; (b) the tree representation of (a); (c) the binary trees for the y axes passing through the root of the tree in (b), and (d) the NE son of the root of the tree in (b).

tion. One major difference is that in the MX-CIF quadtree, unlike the MX quadtree, all nodes are of the same type. Thus, data is associated with both leaf and non-leaf nodes of the MX-CIF quadtree. Empty nodes in the MX-CIF quadtree are analogous to WHITE nodes in the MX quadtree. An empty node is like an empty son and is represented by a NIL pointer in the direction of a quadrant that contains no rectangles. For more details on MX quadtrees, see pp. 466–473.

One of the main drawbacks of the MX-CIF quadtree is that the size of the block c corresponding to the minimum enclosing quadtree block of object o ’s minimum enclosing bounding box b is not a function of the size of b or o. Instead, it is dependent on the position of o. In fact, c is often considerably larger than b thereby causing inefficiency in search operations due to a reduction in the ability to prune objects from further consideration. This situation arises whenever b overlaps the axes lines that pass through the center of c.

The cover fieldtree [2, 3], and the equivalent loose quadtree (loose octree in three dimensions) [12], overcome this drawback by expanding the size of the space that is spanned by each quadtree block c of width w by a block expansion factor p (p > 0) so that the expanded block is of width (1 + p) · w. Thus, instead of associating (inserting) objects with (into) their minimum enclosing quadtree blocks, they are associated with (inserted into) their minimum enclosing expanded quadtree blocks.
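The effect of the expansion factor can be sketched in C++ as follows. The sketch rests on two assumptions that the text does not spell out: the expanded block shares its center with the original block, and at each level the search follows the quadrant that contains the centroid of the rectangle. The names `Rect`, `Block`, `expandedContains`, `childToward`, and `associationDepth` are hypothetical.

```cpp
struct Rect  { double cx, cy, lx, ly; };  // centroid and distances to the borders
struct Block { double cx, cy, w; };       // center and width of an unexpanded block

// True if r lies inside block b once b's width is expanded to (1 + p) * w
// about the same center.
bool expandedContains(const Block& b, double p, const Rect& r) {
    double half = (1.0 + p) * b.w / 2.0;
    return r.cx - r.lx >= b.cx - half && r.cx + r.lx <= b.cx + half &&
           r.cy - r.ly >= b.cy - half && r.cy + r.ly <= b.cy + half;
}

// The child block of b in the quadrant that contains the point (x, y).
Block childToward(const Block& b, double x, double y) {
    double q = b.w / 4.0;
    return { b.cx + (x >= b.cx ? q : -q), b.cy + (y >= b.cy ? q : -q), b.w / 2.0 };
}

// Depth (root = 0) of the smallest block on the centroid's path whose expanded
// version still contains r. The descent stops at maxDepth.
int associationDepth(Block block, double p, const Rect& r, int maxDepth) {
    int depth = 0;
    while (depth < maxDepth) {
        Block c = childToward(block, r.cx, r.cy);
        if (!expandedContains(c, p, r)) break;
        block = c;
        ++depth;
    }
    return depth;
}
```

For a 16 × 16 space and a 2 × 2 rectangle centered at (8,8), this sketch descends three levels with p = 1. With no expansion at all (p = 0, shown only for comparison) the rectangle straddles the center lines and stays at the root, which is the drawback described above.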

The ideal value for p is 1 [12]. The rationale is that using block expansion factors much smaller than 1 increases the likelihood that the minimum enclosing expanded quadtree block is large, and that letting p be much larger than 1 results in the areas spanned by the expanded quadtree blocks being too large, thereby having much overlap. For example, letting p = 1, Figure B is the loose quadtree corresponding to the collection of objects in Figure A(a) and its MX-CIF quadtree in Figure A(b). In this example, there are only two differences between the loose and MX-CIF quadtrees:

  1. Rectangle object E is associated with the SW child of the root of the loose quadtree instead of with the root of the MX-CIF quadtree.
  2. Rectangle object B is associated with the NW child of the NE child of the root of the loose quadtree instead of with the NE child of the root of the MX-CIF quadtree.

Figure B: (a) Block decomposition induced by the loose quadtree for a collection of rectangle objects identical to those in Figure A(a), and (b) its tree representation.

Insertion and deletion of rectangles in a loose quadtree are similar to those of the MX-CIF quadtree, which are described on pp. 466–473 in [10] and the solutions to the exercises. The most common search query is one that seeks to determine if a given rectangle overlaps (i.e., intersects) any of the existing rectangles. Such an operation is similar to a “range query”. However, such queries are more usefully cast in terms of finding all the rectangles in a given area (i.e., a window query). Another popular query is one that seeks to determine if one collection of rectangles can be overlaid on another collection without any of the component rectangles intersecting one another.

These two operations can be implemented by using variants of algorithms developed for handling set operations (i.e., union and intersection) in region-based quadtrees [5, 11]. The range query is answered by intersecting the query rectangle with the loose quadtree. An example JAVA applet for the loose quadtree data structure can be found on the home page of the class.

4 Assignment

This assignment has four parts. It is to be programmed in C++ (or C or PASCAL). The first part is concerned with data structure selection. The second part requires the construction of a command decoder. The third and fourth parts require that you implement a given set of operations.

The first part is to be turned in one week after this assignment has been distributed to you. It is worth 10 points. The second part is also worth 10 points. It is to be turned in two weeks after this assignment has been distributed to you. There will be NO late submissions accepted for these two parts of the assignment. While doing parts one and two you are also to start thinking and coding the program necessary to implement the operations. This should be done in such a way that the data structure is a BLACK BOX. Thus you need to specify your primitives in such a way that they are independent of the data structure finally chosen. You are strongly advised to begin implementing some of the operations. For example, you should implement an output routine so that you can see whether your program is working properly.

For the third and fourth parts of the assignment, you are to write a C++ (or C or PASCAL) program to implement the data structure and the specified operations. Together they are worth 60 points. Part three consists of operations (1)-(9) given below. They are worth 30 points. Part four consists of operations (10)-(15) given below. They are worth 30 points. Operations (16)-(18) are for extra credit and are to be turned in with part four. They are worth up to 5 points apiece.

4.3 Part Three: Basic Operations

In order to facilitate grading of these operations as well as the optional operations in Section 4.5, please provide a trace output of the execution of the operations which lists the nodes (both leaf and nonleaf) that have been visited while executing the operation. This trace is initiated by the command TRACE ON and is terminated by the command TRACE OFF. In order for the trace output to be concise, you are to represent each node of the loose quadtree that has been visited by a unique number which is formed as follows. The root of the quadtree is assigned the number 0. Given a node with number N, its NW, NE, SW, and SE children are labeled 4 · N + 1, 4 · N + 2, 4 · N + 3, and 4 · N + 4, respectively. For example, starting at the root, the NE child is numbered 2, while the SE child of the NW child of the root is numbered 4 · (4 · 0 + 1) + 4 = 8.
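The numbering scheme can be sketched in C++ as follows. The names `Quadrant`, `childNumber`, `parentNumber`, and `nodeNumber` are hypothetical, and the choice of `long long` is an assumption made so that deep trees do not overflow a 32-bit integer.

```cpp
#include <vector>

// Quadrant codes chosen so that a child's number is 4 * N + code.
enum Quadrant { NW = 1, NE = 2, SW = 3, SE = 4 };

// Number of the child of node n in quadrant q.
long long childNumber(long long n, Quadrant q) {
    return 4 * n + q;
}

// Number of the parent of a non-root node n (inverse of childNumber).
long long parentNumber(long long n) {
    return (n - 1) / 4;
}

// Number of the node reached from the root (number 0) by following path.
long long nodeNumber(const std::vector<Quadrant>& path) {
    long long n = 0;
    for (Quadrant q : path) n = childNumber(n, q);
    return n;
}
```

With these definitions, `nodeNumber({NE})` is 2 and `nodeNumber({NW, SE})` is 8, matching the two examples in the text.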

(1) Initialize the quadtree. The command INIT_QUADTREE(WIDTH,P) is always the first command in the input stream. WIDTH determines the length of each side of the square area covered by the quadtree. Each side has the length 2^WIDTH. It also has the effect of starting with a fresh data set. P is the expansion factor of the loose quadtree.

(2) Generate a display of a 2^WIDTH × 2^WIDTH square from the loose quadtree. It is invoked by the command DISPLAY(). To draw the loose quadtree, you are to use the drawing routines provided. We will provide you with a handout that describes their use, and the working of the utilities SHOWQUAD and PRINTQUAD, which can be used to render the output of your programs on a screen or a printer. A dashed (broken) line should be used to draw quadrant lines, but the rectangles should be solid (i.e., not dashed). Rectangle names should be output somewhere near the rectangle or within the rectangle. Along with the name of a rectangle R, you should also print the node number of the node containing R. When this convention causes the output of a quadrant line to coincide with the output of the boundary of a rectangle, then the output of the rectangle takes precedence and the coincident part of the quadrant line is not output.

(3) List all the rectangles in the database in alphanumerical order. This means that letters come before digits in the collating sequence. Similarly, shorter identifiers precede longer ones. For example, a sorted list is A, AB, A3D, 3DA, 5. It is invoked by the command LIST_RECTANGLES() and yields for each rectangle its name, the x and y coordinate values of its centroid, and the horizontal and vertical distances to its borders from the centroid. This is of use in interpreting the display since sometimes it is not possible to distinguish the boundaries of the rectangles from the display. You should list all of the rectangles in the database whether or not they have been deleted.
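One reading of this ordering that reproduces the sample list is a character-by-character comparison in which every letter ranks before every digit and a name that is a prefix of another precedes it. The C++ sketch below follows that reading. The names `collateRank` and `nameLess` are hypothetical, and the placement of characters other than letters and digits after both is an assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Rank of a character in the collating sequence: letters, then digits,
// then anything else (the last group is an assumption).
int collateRank(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return c;
    if (std::isdigit(c)) return 256 + c;
    return 512 + c;
}

// Strict weak ordering on rectangle names, usable with std::sort.
bool nameLess(const std::string& a, const std::string& b) {
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        int ra = collateRank(a[i]);
        int rb = collateRank(b[i]);
        if (ra != rb) return ra < rb;
    }
    return a.size() < b.size();  // a prefix precedes the longer name
}

// Return the names sorted in the assignment's alphanumerical order.
std::vector<std::string> sortedNames(std::vector<std::string> names) {
    std::sort(names.begin(), names.end(), nameLess);
    return names;
}
```

Sorting the names 5, 3DA, A3D, AB, A with this comparator yields A, AB, A3D, 3DA, 5, which agrees with the example above.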

(4) Create a rectangle by specifying the coordinate values of its centroid and the distances from the centroid to its borders, and assign it a name for subsequent use. It is invoked by the command CREATE_RECTANGLE(N,CX,CY,LX,LY) where N is the name to be associated with the rectangle, CX and CY are the x and y coordinate values, respectively, of its centroid, and LX and LY are the horizontal and vertical distances, respectively, to its borders from the centroid. CX, CY, LX, and LY may be real or integer numbers. However, in the case of this assignment, we stipulate that the centroids and the distances from the centroids to the borders of the rectangles are integers. Output an appropriate message indicating that the rectangle has been created as well as its name and endpoints. Note that any rectangle can be created — even if it is outside the space spanned by the loose quadtree.

(5) Insert a rectangle in the loose quadtree. If any part of the rectangle is outside the space spanned by the loose quadtree, then do not make the insertion and report this fact by a suitable message such as INSERTION OF RECTANGLE N FAILED AS N LIES PARTIALLY OUTSIDE SPACE SPANNED BY LOOSE QUADTREE. Otherwise, return the name of the rectangle that is being inserted as well as output a message indicating that this has been done. It is invoked by the command INSERT(N) where N is the name of a rectangle. It should be clear that the loose quadtree is built by a sequence of CREATE_RECTANGLE and INSERT operations.
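The bounds check that precedes an insertion can be sketched in C++ as follows. The sketch assumes that the space spanned by the quadtree has its lower left corner at (0,0) and extends to 2^WIDTH in both directions, and that a rectangle whose border lies exactly on the boundary of the space is still acceptable. Neither assumption is stated in the text, so adjust the comparisons if the drawing handout says otherwise. The names `Rect` and `insideSpace` are hypothetical.

```cpp
struct Rect { int cx, cy, lx, ly; };  // centroid and distances to the borders

// True if r lies entirely within [0, 2^widthExp] x [0, 2^widthExp].
bool insideSpace(const Rect& r, int widthExp) {
    long long side = 1LL << widthExp;  // side length 2^WIDTH
    return r.cx - r.lx >= 0 && r.cy - r.ly >= 0 &&
           r.cx + r.lx <= side && r.cy + r.ly <= side;
}
```

With WIDTH = 3, a rectangle with centroid (4,4) and distances (2,2) passes the check, while one with centroid (7,4) and distances (2,1) fails because its right border is at x = 9.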

(6) Move a rectangle in the loose quadtree. The command is invoked by MOVE(N,CX,CY) where N is the name of the rectangle, and CX and CY are the translations of the centroid of N along the x and y coordinate axes, respectively. The command returns N if it was successful in moving the specified rectangle and outputs a message indicating it. Otherwise, output appropriate error messages if N was not found in the loose quadtree, or if after the operation N lies outside the space spanned by the loose quadtree.

(7) Given a point, return the names of the rectangles that contain it. The names of the rectangles are returned in lexicographical order. It is invoked by the command SEARCH_POINT(PX,PY) where PX and PY are the x and y coordinate values, respectively, of the point. If no such rectangle exists, then output a message indicating that the point is not contained in any of the rectangles.
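When testing a quadtree-based SEARCH_POINT, it can help to compare its answers against a brute-force scan over all inserted rectangles. The C++ sketch below is such a reference check and not the required implementation, since it ignores the quadtree entirely. It assumes that a point on the border of a rectangle counts as contained and that plain `std::string` comparison is an acceptable lexicographical order. The names `NamedRect`, `containsPoint`, and `searchPoint` are hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct NamedRect {
    std::string name;
    int cx, cy, lx, ly;  // centroid and distances to the borders
};

// True if (px, py) lies in r, borders included (an assumption).
bool containsPoint(const NamedRect& r, int px, int py) {
    return px >= r.cx - r.lx && px <= r.cx + r.lx &&
           py >= r.cy - r.ly && py <= r.cy + r.ly;
}

// Names of all rectangles containing (px, py), sorted lexicographically.
std::vector<std::string> searchPoint(const std::vector<NamedRect>& rects,
                                     int px, int py) {
    std::vector<std::string> names;
    for (const NamedRect& r : rects) {
        if (containsPoint(r, px, py)) names.push_back(r.name);
    }
    std::sort(names.begin(), names.end());
    return names;
}
```

An empty result corresponds to the case in which the point is not contained in any rectangle.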

(8) Delete a rectangle or a set of rectangles from the loose quadtree. This operation has two variants, DELETE_RECTANGLE and DELETE_POINT. The command DELETE_RECTANGLE(N) deletes the rectangle named N. It returns N if it was successful in deleting the specified rectangle and outputs a message indicating it. Otherwise, it outputs an appropriate message. The command DELETE_POINT(PX,PY) has as its argument a point within the rectangle to be deleted whose x and y coordinate values are given by PX and PY, respectively. DELETE_POINT returns as its value the names of the rectangles that have been deleted and prints an appropriate message indicating their names. If the point is not in any rectangle, then an appropriate message indicating this is output. The code for DELETE_POINT should make use of SEARCH_POINT. Note that rectangle N is only deleted from the loose quadtree and not from the database of rectangles.

(9) Determine whether a query rectangle intersects (i.e., overlaps) any of the existing rectangles. This operation is invoked by the command REGION_SEARCH(N) where N is a name of a rectangle. If the rectangle does not intersect an existing rectangle, then REGION_SEARCH returns a value of false and outputs an appropriate message such as “N DOES NOT INTERSECT AN EXISTING RECTANGLE”. Otherwise, it returns the value true and the names of the intersecting rectangles (i.e., if it intersects more than one rectangle) to output one of the following two messages: “N INTERSECTS RECTANGLE [NAMES OF RECTANGLES]”. The names of the intersecting rectangles in the output are sorted in a lexicographical order. Note that if an endpoint of the query rectangle touches the endpoint of an existing rectangle, then REGION_SEARCH returns false. You are only to check against the rectangles that are in the loose quadtree of existing rectangles, and not the rectangles that existed at some time in the past and have been deleted by the time this command is executed.
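The pairwise test at the heart of REGION_SEARCH can be sketched in C++ as follows. The sketch reads the note about touching endpoints as meaning that two rectangles which share only boundary points (a side or a corner) do not intersect, which is an interpretation rather than something the text states in full. The names `Rect` and `overlaps` are hypothetical.

```cpp
#include <cstdlib>

struct Rect { int cx, cy, lx, ly; };  // centroid and distances to the borders

// True if a and b share interior points. Rectangles that merely touch
// along a side or at a corner are reported as not overlapping.
bool overlaps(const Rect& a, const Rect& b) {
    return std::abs(a.cx - b.cx) < a.lx + b.lx &&
           std::abs(a.cy - b.cy) < a.ly + b.ly;
}
```

The strict inequalities are what exclude touching rectangles: two rectangles whose centroids are exactly lx_a + lx_b apart in x share at most a vertical side.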

4.4 Part Four: Advanced Operations

(10) Determine all the rectangles in the loose quadtree that touch (i.e., are adjacent along a side or a corner) a given rectangle. This operation is invoked by the command TOUCH(N) where N is the name of a rectangle. Since rectangle N is referenced by name, N thus must be in the database for the operation to work but it need not necessarily be in the loose quadtree. The command returns the names of all the touched rectangles in conjunction with the following message “N SHARES ENDPOINT [X AND Y COORDINATE VALUES OF ENDPOINT] WITH THE RECTANGLES [NAME OF RECTANGLES]”. Otherwise, the command returns NIL. For each rectangle r that touches N, display (i.e., highlight) the point in r for which the x and y coordinate values are minimum (i.e., the lower-leftmost corner). It should be clear that the intersection of r with N is

4.5 Optional Operations

(16) Find the nearest neighbor in all directions to the boundary of a given rectangle. It is invoked by the command NEAREST_NEIGHBOR(N) where N is the name of a rectangle. By “nearest,” it is meant the rectangle C with a point on its side or corner, say P , such that the distance from P to a side or corner of the query rectangle is a minimum. NEAREST_NEIGHBOR returns as its value the name of the neighboring rectangle if one exists and NIL otherwise as well as an appropriate message. Rectangle N need not necessarily be in the loose quadtree. If more than one rectangle is at the same distance, then return the name of just one of them. Note that rectangles that are inside N are not considered by this query.
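The distance used to rank candidate neighbors can be sketched in C++ as follows. The sketch computes the gap between two axis-aligned rectangles along each axis and combines the gaps with the Euclidean metric, which is an assumption since the text does not name a metric. It returns 0 for rectangles that touch or overlap, so it does not by itself handle one rectangle nested inside the other, and the exclusion of rectangles inside N must be applied separately. The names `Rect` and `rectDistance` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>

struct Rect { int cx, cy, lx, ly; };  // centroid and distances to the borders

// Smallest Euclidean distance between a point of a and a point of b.
double rectDistance(const Rect& a, const Rect& b) {
    // Gap along each axis; 0 when the projections touch or overlap.
    int gapX = std::max(0, std::abs(a.cx - b.cx) - (a.lx + b.lx));
    int gapY = std::max(0, std::abs(a.cy - b.cy) - (a.ly + b.ly));
    return std::hypot(static_cast<double>(gapX), static_cast<double>(gapY));
}
```

For example, rectangles with centroids (2,2) and (7,8) and all distances equal to 1 are separated by gaps of 3 and 4, giving a distance of 5.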

(17) Given a rectangle, find its nearest neighbor with a name that is lexicographically greater. It is invoked by the command LEXICALLY_GREATER_NEAREST_NEIGHBOR(N) where N is the name of a rectangle. By “lexically greater nearest” it is meant the rectangle C whose name is lexically greater than that of N with a point on C’s side, say P , such that the distance from P to a side of the query rectangle is a minimum. LEXICALLY_GREATER_NEAREST_NEIGHBOR returns as its value the name of the neighboring rectangle if one exists and NIL otherwise as well as an appropriate message. Rectangle N need not necessarily be in the loose quadtree. If more than one rectangle is at the same distance, then return the name of just one of them. Note that rectangles that are inside N are not considered by this query. This operation should not examine more than the minimum number of rectangles that are necessary to determine the lexically greater nearest neighbor. Thus you should use an incremental nearest neighbor algorithm (e.g., [4] which is described on pages 490–501 in [10]).

(18) Perform a simple ray tracing operation on the loose quadtree. To keep things simple, let us reduce this problem to that of finding rectangles in the loose quadtree that intersect a line segment. In particular, we are only interested in finding the “first” rectangle that the “ray” intersects, assuming the ray enters the scene from (A,0) and leaves the scene from (B, 2^w − 1), where A and B are positive numbers. This operation is invoked by the command RAYTRACE(A,B). Output the first rectangle that intersects the ray. If no such rectangle exists, then output a suitable message.
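One way to test a line segment against a rectangle is slab-style clipping: intersect the parameter interval [0,1] of the segment with the interval during which it is between the two vertical borders, and then with the interval during which it is between the two horizontal borders. The C++ sketch below uses this technique and picks the rectangle with the smallest entry parameter by scanning a list. It is a reference check only, since the assignment expects the search to be driven by the loose quadtree. It assumes that a segment grazing a border counts as a hit. The names `Rect`, `entryParam`, and `firstHit` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

struct Rect { double cx, cy, lx, ly; };  // centroid and distances to the borders

// Parameter t in [0,1] at which the segment (x0,y0)-(x1,y1) first enters r,
// or -1 if the segment misses r.
double entryParam(const Rect& r, double x0, double y0, double x1, double y1) {
    const double p0[2] = {x0, y0};
    const double d[2]  = {x1 - x0, y1 - y0};
    const double lo[2] = {r.cx - r.lx, r.cy - r.ly};
    const double hi[2] = {r.cx + r.lx, r.cy + r.ly};
    double tEnter = 0.0, tExit = 1.0;
    for (int k = 0; k < 2; ++k) {
        if (d[k] == 0.0) {
            // Segment is parallel to this slab: it must start inside it.
            if (p0[k] < lo[k] || p0[k] > hi[k]) return -1.0;
        } else {
            double t1 = (lo[k] - p0[k]) / d[k];
            double t2 = (hi[k] - p0[k]) / d[k];
            if (t1 > t2) std::swap(t1, t2);
            tEnter = std::max(tEnter, t1);
            tExit  = std::min(tExit, t2);
            if (tEnter > tExit) return -1.0;  // intervals do not overlap
        }
    }
    return tEnter;
}

// Index of the rectangle the segment enters first, or -1 if it hits none.
int firstHit(const std::vector<Rect>& rects,
             double x0, double y0, double x1, double y1) {
    int best = -1;
    double bestT = 2.0;  // larger than any valid parameter
    for (std::size_t i = 0; i < rects.size(); ++i) {
        double t = entryParam(rects[i], x0, y0, x1, y1);
        if (t >= 0.0 && t < bestT) {
            bestT = t;
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

A quadtree-based version would apply the same clipping test to the expanded blocks in order to skip subtrees that the ray cannot reach.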

References

[1] R. A. Finkel and J. L. Bentley. Quad trees: a data structure for retrieval on composite keys. Acta Informatica, 4(1):1–9, 1974.

[2] A. Frank. Problems of realizing LIS: storage methods for space related data: the fieldtree. Technical Report 71, Institute for Geodesy and Photogrammetry, ETH, Zurich, Switzerland, June 1983.

[3] A. U. Frank and R. Barrera. The Fieldtree: a data structure for geographic information systems. In A. Buchmann, O. Günther, T. R. Smith, and Y.-F. Wang, editors, Design and Implementation of Large Spatial Databases—1st Symposium, SSD’89, vol. 409 of Springer-Verlag Lecture Notes in Computer Science, pages 29–44, Santa Barbara, CA, July 1989.

[4] G. R. Hjaltason and H. Samet. Distance browsing in spatial databases. ACM Transactions on Database Systems, 24(2):265–318, June 1999. Also University of Maryland Computer Science Technical Report TR–3919, July 1998.

[5] G. M. Hunter and K. Steiglitz. Operations on images using quad trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1(2):145–153, Apr. 1979.

[6] G. Kedem. The quad-CIF tree: a data structure for hierarchical on-line algorithms. In Proceedings of the 19th Design Automation Conference, pages 352–357, Las Vegas, NV, June 1982. Also University of Rochester Computer Science Technical Report TR–91, September 1981.

[7] A. Klinger. Patterns and search statistics. In J. S. Rustagi, editor, Optimizing Methods in Statistics, pages 303–337. Academic Press, New York, 1971.

[8] H. Samet. Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS. Addison-Wesley, Reading, MA, 1990.

[9] H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA, 1990.

[10] H. Samet. Foundations of Multidimensional and Metric Data Structures. Morgan-Kaufmann, San Francisco, 2006.

[11] M. Shneier. Calculations of geometric properties using quadtrees. Computer Graphics and Image Processing, 16(3):296–302, July 1981. Also University of Maryland Computer Science Technical Report TR–770, May 1979.

[12] T. Ulrich. Loose octrees. In M. A. DeLoura, editor, Game Programming Gems, pages 444–453. Charles River Media, Rockland, MA, 2000.