Problem Set 7 Solutions for Data Structures | CS 230, Assignments of Data Structures and Algorithms

Material Type: Assignment; Class: LAB: Data Structures; Subject: Computer Science; University: Wellesley College; Term: Fall 2002;

CS230 Data Structures Handout # 17
Prof. Lyn Turbak Sunday, November 10
Wellesley College

Problem Set 7

Due: Friday, November 15

Exam 2 Notice: In class on Friday, November 15, the second take-home exam will be handed out. It will be due at 11:59pm on Monday, November 25. This is a hard deadline. No extensions will be given after this time. The exam will cover the material in lecture through Lecture 19 (Tue. Nov. 12) and the material in problem sets through PS7, including immutable and mutable lists, binary trees, and binary search trees; doubly-linked lists; “hand-wavy” performance comparisons of data structures; and use and implementation of standard data structures, including stacks, queues, priority queues, sets, bags, sequences, and tables. Because you should focus on the exam, it is strongly recommended that you submit PS7 on time (11:59pm on Friday, November 15).

Overview: In this problem set, you will get experience with using and implementing sets and bags.

Download: To begin this assignment, you should download a copy of the directory ~cs230/download/ps7.

Submission:

  • For Problem 1, your hardcopy submission should be your final version of Frequency.java.
  • For Problem 2, your hardcopy submission should include:
    • your final version of BagDLLSortedFrontBack.java;
    • your testing transcripts showing that this works as expected;
    • your list of the running times requested in part (b).
  • For Problem 3, your hardcopy submission should include:
    • your final version of BagMBSTEntries.java;
    • your testing transcripts showing that this works as expected;
    • your list of the running times requested in part (b).

Your softcopy submission for this problem set should be your entire ps7 directory. Remember to include a signed cover sheet (found at the end of this problem set description) at the beginning of your hardcopy submission.

Problem 1 [30]: Frequency Revisited

In this problem, we revisit the problem of determining the frequency with which words appear in a file, which we first considered in Problem 3 of PS2. This time around, the problem is simplified by our ability to use powerful abstract data types like sets, bags, and priority queues. You should begin this problem by creating from scratch a new public class named Frequency that will be used as a repository for many of the static methods you define in this problem.

a. [10]: byAlpha

In your Frequency class, implement the following method:

public static void byAlpha (String filename);
Prints the lower case version of each distinct word appearing in the file named filename, one per line, along with the frequency with which it appears in the file. The words should be listed in alphabetical order.

In your implementation, you should use the standard abstract data types we have been studying (e.g., sets, bags, priority queues) to simplify your implementation. You should also implement a main method in Frequency that allows byAlpha to be tested on a file named filename by invoking java Frequency byAlpha filename. For example, Fig. 1 shows the result of executing byAlpha on the file initial.txt, which contains an initial segment of Dr. Seuss’s timeless classic Green Eggs and Ham, and Fig. 2 shows the result on the entire poem, which is in the file green.txt.

Notes:

  • Don’t forget to import java.util.Enumeration.
  • Use the FileWords class to extract words from a file.
  • Be sure to convert all words to lower case. Which String method should you use for this?
  • Think carefully about where you need to use ordered sets and bags, and where unordered ones will do.
  • For this problem, the following are the “default” implementations of sets, bags, and priority queues that you should use: BagVectorSorted, MinPQVector, MaxPQVector, OrderedBagVectorSorted, OrderedSetVectorSorted, and SetVectorSorted. Note that you do not have to use all of these, only some of them. Indeed, some solutions only need to use one such structure.
  • In general, if you first make an enumeration of a collection and then delete elements from the collection, the effect of the deletions on the enumeration is unpredictable. For instance, consider the following:

    OrderedSet oset = new OrderedSetVectorSorted();
    oset.insert("a");
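The hazard described in the last note can be reproduced with the standard java.util collections. The sketch below uses java.util.TreeSet purely as a stand-in for the course's OrderedSet classes, whose behavior may well differ; the class and method names are hypothetical. TreeSet iterators are fail-fast on a best-effort basis, so the first method is expected to hit a ConcurrentModificationException, while the second iterates over a snapshot copy and is unaffected by the deletions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class; TreeSet is a stand-in, not the course's OrderedSet.
class EnumerationHazard {

    // Deletes from the set while iterating over it directly.
    // Returns true if the iterator detected the modification.
    static boolean deleteWhileIterating(TreeSet<String> set) {
        try {
            for (String s : set) {
                set.remove(s); // mutates the collection in mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot copy, so deletions cannot disturb the traversal.
    // Returns the elements in the order they were visited.
    static List<String> deleteViaSnapshot(TreeSet<String> set) {
        List<String> visited = new ArrayList<>();
        for (String s : new ArrayList<>(set)) {
            visited.add(s);
            set.remove(s);
        }
        return visited;
    }

    public static void main(String[] args) {
        TreeSet<String> first = new TreeSet<>(Arrays.asList("a", "b", "c"));
        System.out.println("modification detected: " + deleteWhileIterating(first));

        TreeSet<String> second = new TreeSet<>(Arrays.asList("a", "b", "c"));
        System.out.println("visited via snapshot: " + deleteViaSnapshot(second));
    }
}
```

The general remedy is the same whatever the collection class: finish (or copy) the enumeration before deleting anything from the underlying collection.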

$ java Frequency byAlpha green.txt
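As a rough illustration of the idea behind byAlpha, the sketch below counts lower-cased words with java.util.TreeMap, which keeps its keys in sorted order. It is not the intended solution: TreeMap is only a stand-in for the course's sorted bag classes, the words arrive as an in-memory list rather than through FileWords, and the class name, method name, and "word: count" line format are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; TreeMap stands in for a sorted bag of words.
class ByAlphaSketch {

    // Returns one "word: count" line per distinct lower-cased word,
    // in alphabetical order (the line format is an assumption).
    static List<String> byAlphaLines(Iterable<String> words) {
        Map<String, Integer> bag = new TreeMap<>();
        for (String w : words) {
            // merge inserts 1 for a new word, otherwise adds 1 to its count
            bag.merge(w.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : bag.entrySet()) {
            lines.add(e.getKey() + ": " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("I", "am", "Sam", "Sam", "I", "am");
        for (String line : byAlphaLines(words)) {
            System.out.println(line);
        }
    }
}
```

A sorted bag gives the same effect directly: inserting every lower-cased word and then enumerating the distinct elements yields the words in order, with the occurrence count of each available from the bag.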

b. [20]: byFreq

In your Frequency class, implement the following method:

public static void byFreq (String filename);
Prints the lower case version of each distinct word appearing in the file named filename, one per line, along with the frequency with which it appears in the file. The words should be listed in order from high frequency words to low frequency words; words with the same frequency should be listed in alphabetical order.

In your implementation, you should use the standard abstract data types we have been studying (e.g., sets, bags, priority queues) to simplify your implementation.

You should also extend the main method in Frequency to allow byFreq to be tested on a file named filename by invoking java Frequency byFreq filename.

For example, Fig. 3 shows the result of executing byFreq on initial.txt and Fig. 4 shows the result on the file green.txt.

Notes:

  • All the notes from part (a) hold here as well.
  • You have been provided with the following WordEntry class, which you may find useful:

    public class WordEntry {

      public String word;
      public int freq;

      public WordEntry (String w, int f) {
        word = w;
        freq = f;
      }

    }

    If you use this class, you will probably also want to define a comparator class that compares instances of WordEntry. The details of such a class are left up to you.
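One possible shape for such a comparator is sketched below. It uses the generic java.util.Comparator interface from current Java rather than whatever comparator interface the course library provides, so treat the class names WordEntryComparator and ByFreqSketch, and the sample entries, as illustrative assumptions. The ordering follows the byFreq specification: higher frequency first, with ties broken alphabetically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Copy of the WordEntry class from the handout, repeated so the sketch is self-contained.
class WordEntry {
    public String word;
    public int freq;

    public WordEntry(String w, int f) {
        word = w;
        freq = f;
    }
}

// Hypothetical comparator: high frequency before low, then alphabetical.
class WordEntryComparator implements Comparator<WordEntry> {
    @Override
    public int compare(WordEntry a, WordEntry b) {
        if (a.freq != b.freq) {
            return Integer.compare(b.freq, a.freq); // reversed: larger freq sorts first
        }
        return a.word.compareTo(b.word);
    }
}

class ByFreqSketch {

    // Returns the words of the given entries in byFreq order, leaving the input untouched.
    static List<String> sortedWords(List<WordEntry> entries) {
        List<WordEntry> copy = new ArrayList<>(entries);
        copy.sort(new WordEntryComparator());
        List<String> words = new ArrayList<>();
        for (WordEntry e : copy) {
            words.add(e.word);
        }
        return words;
    }

    public static void main(String[] args) {
        // Made-up frequencies, for illustration only.
        List<WordEntry> entries = Arrays.asList(
            new WordEntry("ham", 2), new WordEntry("i", 5),
            new WordEntry("eggs", 2), new WordEntry("am", 3));
        System.out.println(sortedWords(entries));
    }
}
```

The same comparator could instead drive a priority queue or an ordered set of entries, which is closer to the abstract data types the problem asks for.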

$ java Frequency byFreq green.txt

Figure 2: Frequency of words in green.txt, ordered alphabetically. Words in the order listed:
a, am, and, anywhere, are, be, boat, box, car, could, dark, do, eat, eggs, fox, goat, good, green, ham, here, house, i, if, in, let, like, may, me, mouse, not, on, or, rain, sam, sam-i-am, say, see, so, thank, that, the, them, there, they, train, tree, try, will, with, would, you

Figure 3: Frequency of words in initial.txt, ordered by frequency. Words in the order listed:
i, do, like, sam-i-am, am, not, sam, that, and, eggs, green, ham, them, you

Figure 4: Frequency of words in green.txt, ordered by frequency. Words in the order listed:
not, i, them, a, like, in, do, you, would, and, eat, will, with, could, sam-i-am, eggs, green, ham, here, the, there, train, anywhere, house, mouse, or, box, car, dark, fox, on, sam, tree, say, so, be, goat, let, may, me, rain, see, try, am, boat, that, are, good, thank, they, if

Problem 2 [30]: Doubly-linked List Implementation of Bags

In this problem, you are to implement a class BagDLLSortedFrontBack that represents bags as doubly-linked lists of sorted elements. It turns out to be helpful to maintain pointers to both the first and last nodes of the doubly-linked list, much as in the front/back implementation of a queue as a mutable list. To improve the running time of size() and count(), the values to be returned by these methods should be cached in instance variables. So instances of BagDLLSortedFrontBack should have the following instance variables:

  • comp: a comparator for determining the order of elements.
  • front: the first node of a doubly-linked list of elements sorted from low to high by comp.
  • back: the last node of the doubly-linked list of elements.
  • size: the number of elements currently in the bag (includes duplicates).
  • count: the number of distinct elements currently in the bag (does not include duplicates).

Doubly-linked lists are composed out of instances of the following DLLNode class:

public class DLLNode {

  public DLLNode prev, next;
  public Object value;

}

Note that this class does not have an explicit constructor method, but it still has a default constructor method. Invoking new DLLNode() creates an instance in which all three instance variables are null (indicated by a box filled with a “/”):

DLLNode
  prev   /
  value  /
  next   /

For example, Fig. 5 shows the BagDLLSortedFrontBack representation of a bag that has two As, three Bs and one C.
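To make the representation concrete, the sketch below builds the bag described above (two As, three Bs and one C) by hand and then walks the list to recompute what the cached size and count fields should hold. It is only an illustration of the data layout: the class Fig5Sketch and its methods are hypothetical, there is no comparator or sorted insertion, and the real class is meant to cache size and count rather than recompute them.

```java
// Copy of the DLLNode class from the handout, repeated so the sketch is self-contained.
class DLLNode {
    public DLLNode prev, next;
    public Object value;
}

// Hypothetical sketch of the front/back list layout; not the required class.
class Fig5Sketch {
    DLLNode front, back;

    // Appends v at the back. The caller must supply values already in sorted order.
    void append(Object v) {
        DLLNode n = new DLLNode();
        n.value = v;
        n.prev = back;
        if (back == null) {
            front = n;       // first node: it is both front and back
        } else {
            back.next = n;
        }
        back = n;
    }

    // Number of elements, duplicates included (what the size field should cache).
    int sizeByWalking() {
        int size = 0;
        for (DLLNode n = front; n != null; n = n.next) {
            size++;
        }
        return size;
    }

    // Number of distinct elements (what the count field should cache).
    // Because equal values are adjacent in a sorted list, a node starts a new
    // run exactly when its value differs from its predecessor's.
    int countByWalking() {
        int count = 0;
        for (DLLNode n = front; n != null; n = n.next) {
            if (n.prev == null || !n.value.equals(n.prev.value)) {
                count++;
            }
        }
        return count;
    }

    // Concatenates the values from back to front, using the prev links.
    String backward() {
        StringBuilder sb = new StringBuilder();
        for (DLLNode n = back; n != null; n = n.prev) {
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Fig5Sketch bag = new Fig5Sketch();
        for (String s : new String[] {"A", "A", "B", "B", "B", "C"}) {
            bag.append(s);
        }
        System.out.println("size = " + bag.sizeByWalking());
        System.out.println("count = " + bag.countByWalking());
        System.out.println("back to front = " + bag.backward());
    }
}
```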

  • You are encouraged to define private auxiliary methods that will help you implement the required methods. One particularly useful auxiliary instance method is the following:

private DLLNode find (Object x, DLLNode L);
Assume that L is a sorted doubly-linked list. If x appears at or to the right of node L, returns the first node encountered in a left-to-right linear search of the list whose head is x. If x does not appear at or to the right of node L, returns the first node encountered in a left-to-right linear search of the list whose head is greater than x (according to the bag comparator), if it exists. If there is no node at or to the right of L whose head is greater than x, returns null.

The above method is a useful utility for all the other methods that require searching through the doubly-linked list. It is a good idea to abstract the search process into a single find method rather than writing specialized versions of it several times in different methods.

  • The empty bag must be represented specially as an instance of BagDLLSortedFrontBack whose front and back instance variables are both null. (Compare this to the implementation of QueueFrontBack.)
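The find specification above amounts to locating the first node at or to the right of L whose value is not less than x. A minimal sketch of that search follows, under several assumptions: find is written as a static method that takes the comparator as a parameter (the handout's version is a private instance method using the bag's own comparator), the comparator is the generic java.util.Comparator, and listOf and STRING_ORDER are hypothetical helpers added only for the demonstration.

```java
import java.util.Comparator;

// Copy of the DLLNode class from the handout, repeated so the sketch is self-contained.
class DLLNode {
    public DLLNode prev, next;
    public Object value;
}

// Hypothetical sketch of the search helper; not the required implementation.
class FindSketch {

    // Orders String values alphabetically (assumes every value is a String).
    static final Comparator<Object> STRING_ORDER =
        (a, b) -> ((String) a).compareTo((String) b);

    // Skips every node whose value is less than x. In a sorted list, the node
    // where the scan stops is the first x if x is present, otherwise the first
    // greater value, otherwise null.
    static DLLNode find(Object x, DLLNode L, Comparator<Object> comp) {
        DLLNode n = L;
        while (n != null && comp.compare(n.value, x) < 0) {
            n = n.next;
        }
        return n;
    }

    // Builds a doubly-linked list from values given in sorted order; returns its first node.
    static DLLNode listOf(Object... values) {
        DLLNode front = null, back = null;
        for (Object v : values) {
            DLLNode n = new DLLNode();
            n.value = v;
            n.prev = back;
            if (back == null) {
                front = n;
            } else {
                back.next = n;
            }
            back = n;
        }
        return front;
    }

    public static void main(String[] args) {
        DLLNode front = listOf("A", "A", "B", "B", "B", "D");
        DLLNode b = find("B", front, STRING_ORDER); // first B node
        DLLNode c = find("C", front, STRING_ORDER); // C is absent, so the D node
        DLLNode e = find("E", front, STRING_ORDER); // nothing is >= E
        System.out.println(b.value + " " + c.value + " " + (e == null));
    }
}
```

With a helper of this shape, an insertion can splice a new node in just before the returned node (or at the back when the result is null), and a membership test reduces to checking whether the returned node's value equals x.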

Problem 3 [40]: Mutable BST of Bag Entries Implementation of Bags

In this problem, you are to implement a class BagMBSTEntries that represents bags as a mutable binary search tree of entries pairing elements with their number of occurrences. Each entry should be an instance of the following BagEntry class:

public class BagEntry {

  public Object elt;
  public int num;

  public BagEntry (Object elt, int num) {
    this.elt = elt;
    this.num = num;
  }

}

To improve the running time of size() and count(), the values to be returned by these methods should be cached in instance variables. So instances of BagMBSTEntries should have the following instance variables:

  • comp: a comparator for determining the order of elements.
  • entries: a mutable binary search tree whose elements are instances of BagEntry.
  • size: the number of elements currently in the bag (includes duplicates).
  • count: the number of distinct elements currently in the bag (does not include duplicates).

For example, Fig. 6 shows one possible representation of an instance of BagMBSTEntries that contains two As, three Bs and one C. (The shape of the tree depends on the order in which the elements are inserted.)

BagMBSTEntries
  entries  (root of the tree below)
  count    3
  size     6
  comp     ...

               BagEntry (elt B, num 3)
              /                       \
  BagEntry (elt A, num 2)     BagEntry (elt C, num 1)

Figure 6: An example of a BagMBSTEntries instance.
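The way the cached size and count fields evolve during insertion can be sketched as follows. This is not the course's mutable binary search tree class, whose interface is not shown here: the private Node class, the class name BagBSTSketch, and the helpers occurrences, rootElt, and STRING_ORDER are all hypothetical, and the comparator is the generic java.util.Comparator. Inserting B, A, C, B, A, B in that order yields a bag with two As, three Bs and one C, with B at the root.

```java
import java.util.Comparator;

// Copy of the BagEntry class from the handout, repeated so the sketch is self-contained.
class BagEntry {
    public Object elt;
    public int num;

    public BagEntry(Object elt, int num) {
        this.elt = elt;
        this.num = num;
    }
}

// Hypothetical sketch; the tree node type is an assumption, not the course's MBST.
class BagBSTSketch {

    static final Comparator<Object> STRING_ORDER =
        (a, b) -> ((String) a).compareTo((String) b);

    private static class Node {
        BagEntry entry;
        Node left, right;
        Node(BagEntry entry) { this.entry = entry; }
    }

    private final Comparator<Object> comp;
    private Node entries;   // root of the tree of entries
    private int size;       // all elements, duplicates included
    private int count;      // distinct elements only

    BagBSTSketch(Comparator<Object> comp) { this.comp = comp; }

    void insert(Object x) {
        size++;                                   // every insertion adds one element
        if (entries == null) {
            entries = new Node(new BagEntry(x, 1));
            count++;
            return;
        }
        Node n = entries;
        while (true) {
            int c = comp.compare(x, n.entry.elt);
            if (c == 0) {                         // already present: bump its entry
                n.entry.num++;
                return;
            }
            Node child = (c < 0) ? n.left : n.right;
            if (child == null) {                  // new distinct element: add a leaf
                Node fresh = new Node(new BagEntry(x, 1));
                if (c < 0) { n.left = fresh; } else { n.right = fresh; }
                count++;
                return;
            }
            n = child;
        }
    }

    // Number of occurrences of x in the bag (0 if absent).
    int occurrences(Object x) {
        Node n = entries;
        while (n != null) {
            int c = comp.compare(x, n.entry.elt);
            if (c == 0) { return n.entry.num; }
            n = (c < 0) ? n.left : n.right;
        }
        return 0;
    }

    int size() { return size; }
    int count() { return count; }
    Object rootElt() { return entries == null ? null : entries.entry.elt; }

    public static void main(String[] args) {
        BagBSTSketch bag = new BagBSTSketch(STRING_ORDER);
        for (String s : new String[] {"B", "A", "C", "B", "A", "B"}) {
            bag.insert(s);
        }
        System.out.println("size = " + bag.size() + ", count = " + bag.count());
    }
}
```

Because size and count are updated at the moment of insertion, both accessors run in constant time; a deletion would need the mirror-image bookkeeping, decrementing count only when an entry's num drops to zero.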

Problem Set Header Page

Please make this the first page of your hardcopy submission.

CS230 Problem Set 7

Due Friday, November 15

Name:

Date & Time Submitted:

Collaborators (anyone you worked with on the problem set):

By signing below, I attest that I have followed the collaboration policy as specified in the Course Information handout.

Signature:

In the Time column, please estimate the time you spent on the parts of this problem set. Please try to be as accurate as possible; this information will help me design future problem sets. I will fill out the Score column when grading your problem set.

Part               Time    Score
General Reading
Problem 1 [30]
Problem 2 [30]
Problem 3 [40]
Total