




CS138 Programming Assignment 2: students build a simple replicated database system using Berkeley DB and write a metadata server. The system consists of a master database server, a metadata server, and secondary servers. Students write a new server thread that uses the comm package to communicate, and implement the metadata server interface. The handout describes the Berkeley DB wrapper functions and the metadata server interface.





Assignment Out: Feb. 3, 2003
Help Session: Feb. 5, 2003 (CIT 165, 8 pm)
Assignment Due: Feb. 21, 2003 (10:00 pm)
Now that you’ve built a basic message passing system, we’re going to use that infrastructure to build a simple replicated database system. This setup will entail a single master database server, a metadata server and a number of secondary, or slave, servers. While there are no updates in this assignment, in later parts the master will be in charge of handling updates and will act as the central repository of all data in the system. However, both the master and any slave can handle read requests. The metadata server keeps tabs on what data the master has so that slaves know when they need to get the data from the master.
For this assignment, you’ll be implementing the above system. Don’t worry, your loving TAs have provided you with database software so that you don’t have to write it yourself. You will, however, have to write the metadata server. Descriptions of the database software and the metadata server specification are provided below. You’ll also want to concentrate on leveraging your code from last time: write a new server thread that uses the comm package to communicate. This server thread will listen for and respond to client and slave requests.
The database software you’ll be using for this class is Berkeley DB (BDB). Rather than a separate program you run, it’s actually compiled into your Java executable. This makes communicating with BDB much easier. To make things even easier, we have written a wrapper around BDB so that you don’t have to mess with the mundane tasks of serializing and converting everything to byte arrays. This wrapper is BDBUtil.java, and it abstracts out the basic functionality of BDB. It initializes BDB, shuts it down, and takes care of the calling conventions. In addition, it will automatically serialize all the data passed in.
void init(String db_file, boolean overwrite)
    Before you do anything else, you’re going to want to initialize the database. This entails telling BDB where its database file is located. You should call this function at the beginning of your program.

boolean put(String key, Serializable data)
boolean reput(String key, Serializable data)
Serializable get(String key)
    These commands are pretty straightforward. put and reput associate the given key with the given piece of data. get retrieves the piece of data associated with the given key. Note that if you try to put an item with a key that’s already in BDB, you’ll get an error. In this case, you’ll need to use reput, which deletes the data associated with the current key and then puts your new data.

boolean del(String key)
    Deletes the data that is associated with the given key.

void clear()
    Deletes all keys and associated data from the database.

void shutdown(boolean clear)
    When you’re all done, you need to call shutdown to cleanly close the database. This ensures that all changes that were made to the database are saved.
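The difference between put and reput can be illustrated with a small in-memory stand-in. This is only a sketch, not BDBUtil itself: the class InMemoryStore is hypothetical, it keeps data in a HashMap rather than in Berkeley DB, and it assumes the "error" on a duplicate key shows up as a false return value, which the handout does not actually specify.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in that mirrors the method names listed above.
// It is NOT the real wrapper: no file is opened and nothing is persisted.
class InMemoryStore {
    private final Map<String, Serializable> data = new HashMap<>();

    // Assumption: a duplicate key is reported by returning false.
    boolean put(String key, Serializable value) {
        if (data.containsKey(key)) {
            return false;
        }
        data.put(key, value);
        return true;
    }

    // Deletes whatever is stored under the key, then puts the new value.
    boolean reput(String key, Serializable value) {
        data.remove(key);
        return put(key, value);
    }

    Serializable get(String key) {
        return data.get(key);
    }

    // Returns true only if something was actually removed.
    boolean del(String key) {
        return data.remove(key) != null;
    }

    void clear() {
        data.clear();
    }
}

class InMemoryStoreDemo {
    public static void main(String[] args) {
        InMemoryStore db = new InMemoryStore();
        System.out.println(db.put("song1", "first"));    // true
        System.out.println(db.put("song1", "second"));   // false: key already present
        System.out.println(db.reput("song1", "second")); // true: old value replaced
        System.out.println(db.get("song1"));             // second
    }
}
```

With the real wrapper you would additionally call init before the first operation and shutdown at the end, as described above.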
You’ll be writing a metadata server (mds) for this assignment. The metadata server, as its name implies, maintains data about the data in the database as well as information about the state of the system as a whole. Specifically, the metadata server tracks who the master server is and what keys the master server has. Thus the metadata server needs to implement this interface:
void setMaster(String host, int port)
    Sets the server that is returned in subsequent calls to getMaster().

String getMaster()
    Returns the current master server’s host name.

int getMasterPort()
    Returns the current master server’s port.

void addKey(String host, int port, String key, String[] keywords)
    Associates the given key with the given server. Also associates the specified keywords with the given key.
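As a starting point, the four methods listed above could be collected into a Java interface with a simple in-memory implementation behind it. This is a sketch under assumptions: the names MetadataServer and InMemoryMetadata, the "host:port" string used to identify a server, and the two lookup helpers at the bottom are all hypothetical and not part of the handout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// The methods listed in the handout, gathered into one interface.
interface MetadataServer {
    void setMaster(String host, int port);
    String getMaster();
    int getMasterPort();
    void addKey(String host, int port, String key, String[] keywords);
}

// Hypothetical in-memory implementation. Methods are synchronized because
// requests may be handled by more than one thread.
class InMemoryMetadata implements MetadataServer {
    private String masterHost;     // null until setMaster is called
    private int masterPort = -1;   // -1 until setMaster is called
    private final Map<String, String> keyToServer = new HashMap<>();
    private final Map<String, List<String>> keyToKeywords = new HashMap<>();

    public synchronized void setMaster(String host, int port) {
        masterHost = host;
        masterPort = port;
    }

    public synchronized String getMaster() {
        return masterHost;
    }

    public synchronized int getMasterPort() {
        return masterPort;
    }

    public synchronized void addKey(String host, int port, String key, String[] keywords) {
        keyToServer.put(key, host + ":" + port);
        keyToKeywords.put(key, new ArrayList<>(Arrays.asList(keywords)));
    }

    // Helpers for illustration only; not in the handout's interface.
    synchronized String serverFor(String key) {
        return keyToServer.get(key);
    }

    synchronized List<String> keywordsFor(String key) {
        return keyToKeywords.get(key);
    }
}
```

In the actual assignment these methods are invoked in response to messages arriving over the comm package rather than by direct calls.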
For the master and slave servers, you’re going to want to write a common thread that does the work of both and uses the comm package to communicate. The constructor for this class should take an argument telling it whether it’s a master or a slave. The expected behavior for a master and a slave is slightly different and is described below:
The master server initializes itself by telling the metadata server that it’s the master. It then tells the metadata server about all the keys in its database and their associated keywords. Then, it simply waits for incoming messages and processes them appropriately. It does not handle search requests; those are handled only by the metadata server.
A slave server initializes itself by first asking the metadata server who the master is. It then downloads the master’s keys from the metadata server and directly contacts the master to get the data associated with each key. It adds those keys and the data to its database. Finally, like the master, it sits and waits for incoming messages. Like the master, it does not handle search requests.
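The slave start-up sequence described above (get the master's keys, fetch each value from the master, store it locally) might be sketched as follows. Everything here is a stand-in: SlaveSync is a hypothetical name, the key list is assumed to have already been obtained from the metadata server, the fetchFromMaster function replaces a request/reply exchange over the comm package, and a HashMap replaces the slave's fresh database.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SlaveSync {
    // keys: the master's keys as reported by the metadata server (assumed to be a list).
    // fetchFromMaster: stand-in for asking the master for the data behind one key.
    static Map<String, Serializable> copyAll(List<String> keys,
                                             Function<String, Serializable> fetchFromMaster) {
        Map<String, Serializable> local = new HashMap<>();
        for (String key : keys) {
            Serializable value = fetchFromMaster.apply(key);
            if (value != null) {
                // Only store keys the master could actually supply.
                local.put(key, value);
            }
        }
        return local;
    }
}
```

Skipping keys for which no data comes back is a design choice of this sketch; you may prefer to treat that case as an error.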
The common server functionality is to listen for requests and then service them. The types of requests that the master services are a superset of those that the slaves service (i.e., the master acts on both slave download requests and client query requests^1). Thus you can simply write one big receive loop that both the master and slaves will use. This does create one problem: a slave now contains the master functionality and will try to perform those functions if another server requests them. You should therefore add checks when parsing messages intended for the master to ensure that the current server is a master, and report an error otherwise.
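One way to picture the shared receive logic with its master-only check is a single dispatch method that consults a role flag. This is a minimal sketch: the class name, the message type MSG_SLAVE_DOWNLOAD and its value, and the status strings are assumptions for illustration, and a real server would send reply packets instead of returning strings.

```java
class RoleCheckingServer {
    static final int MSG_CLIENT_QUERY = 1;   // any server may answer this
    static final int MSG_SLAVE_DOWNLOAD = 2; // hypothetical master-only request

    private final boolean isMaster;

    RoleCheckingServer(boolean isMaster) {
        this.isMaster = isMaster;
    }

    // Returns a short status string describing what the server did.
    String handle(int msgType) {
        switch (msgType) {
            case MSG_CLIENT_QUERY:
                return "query served";
            case MSG_SLAVE_DOWNLOAD:
                if (!isMaster) {
                    // A slave received a request meant for the master.
                    return "error: not the master";
                }
                return "download served";
            default:
                return "error: unknown message type";
        }
    }
}
```

If your design treats slave downloads and client queries as the same message, the role check simply moves to whichever requests remain master-only.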
(^1) It may be the case that in your implementation a “slave download request” and a “client query request” are the same thing.

Before we define how servers and clients communicate, we need to define the format for our packets. At the lowest level, we are sending CommPackets. For all server communication, we add a second layer. The m_data field for the CommPacket will always be a GenericData object. GenericData has two fields:
m_type    The message type.
m_data    The data associated with this message (the actual type depends on the message and can be null for some message types).
Here is an example of how to create a packet:
    GenericData gd = new GenericData();
    gd.m_type = myType;
    gd.m_data = myData;

    CommPacket p = new CommPacket();
    p.setSrcPort(m_comm.srcPort());
    p.setSrc(m_comm.iaddr().getHostName());
    p.setDest(myDest);
    p.setDestPort(myDestPort);
    p.setData(gd);
The message type field is simply an integer. You’re going to need to define how the integer values map to the type of message. For instance, the value 1 might mean “client query request”. Instead of using these bare numbers all over your code, it’s best to define a meaningfully named constant and use that instead, e.g. public static final int MSG_CLIENT_QUERY = 1; Now anytime you want to check for that message type, you just check whether it’s equal to MSG_CLIENT_QUERY.
So what are the message types you’ll be needing for this assignment? Well, that’s something you need to figure out. Thinking about it, you’ll notice that there are two sets of message types: messages to database servers and messages to the metadata server. Currently, the database servers do little more than return the associated data item for a given key. For the metadata server, you’ll need to create message types for each of the functions listed in the interface provided above. Also, since some of those functions return data, you’ll need to create return message types as well.
Finally, since your servers will run forever unless you CTRL-C them, you need a way for them to safely and gently shut down. Therefore, you’re also going to need to implement a “kill server” message type. This way, a client can send the kill message to a server and the server will shut itself down.
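A receive loop that honors a kill message might look roughly like this. It is a sketch under assumptions: a BlockingQueue of integers stands in for packets arriving through the comm package, and the class name, the MSG_KILL constant, and its value are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class KillableLoop implements Runnable {
    static final int MSG_CLIENT_QUERY = 1; // value taken from the example above
    static final int MSG_KILL = 99;        // hypothetical value

    // Stand-in for the comm package's incoming packets.
    private final BlockingQueue<Integer> inbox = new LinkedBlockingQueue<>();
    private int handled = 0; // only read after run() has returned

    void deliver(int msgType) {
        inbox.add(msgType);
    }

    int handledCount() {
        return handled;
    }

    @Override
    public void run() {
        try {
            while (true) {
                int type = inbox.take(); // blocks until a message arrives
                if (type == MSG_KILL) {
                    break; // leave the loop so the cleanup below can run
                }
                handled++; // a real server would dispatch on the type here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        // A real server would close its sockets and shut down the database here.
    }
}
```

Breaking out of the loop, rather than exiting the process on the spot, leaves room for the clean database shutdown the handout asks for.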
Remember to define each of your message types explicitly. Each packet type should have a strict definition of who can send it, who can receive it, and what type of data it holds. Check every incoming packet against that definition and drop the packet if you find something incorrect.
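One way to do this kind of checking on a packet's payload is to test its type with instanceof before casting, so a malformed packet is rejected instead of causing an exception. The sketch below assumes a message whose payload is expected to be a String key; the class and method names are hypothetical.

```java
class PayloadCheck {
    // Returns the payload as a String key, or null if the payload is
    // missing or has the wrong type. A null result means "drop the packet".
    static String asKey(Object payload) {
        if (payload instanceof String) {
            return (String) payload; // safe: the type was checked first
        }
        return null;
    }
}
```

Catching the exception from a failed cast is an alternative; instanceof simply makes the check explicit and also handles a null payload.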
Technical note: when checking the type of data sent with a packet, you’re going to want to attempt to cast the m_data member of the GenericData packet to the type you think the data should be. However, if it isn’t what you assumed, you’ll get a ClassCastException. So, you
port            the port that the database server listens on
master/slave    specifies the type of server
db_file         only used if this is a master server; specifies the location of the db file used when initializing BDB
mdshost         the hostname of the metadata server
mdsport         the port of the metadata server
As mentioned above, we’ve included some scripts to make your lives easier. While it is highly recommended that you use them, you don’t have to. If you choose not to, be sure to take a look at the scripts because you will need to set CLASSPATH and LD_LIBRARY_PATH in order for everything to work properly. All of these scripts are located in /course/cs138/bin:
start-mds       Starts up the metadata server. You must run this script within the directory that has the MDSMain.class file.
start-master    Starts up a master server. You should provide the hostname and port of the metadata server. You must run this script in the directory that has the DBMain.class file.
start-slave     Starts up a slave server. You need to give it the hostname and port of the metadata server. Again, you must run it in the directory that has the DBMain.class file.
In order for these scripts to work, you’re going to need to detect whether a specified port p is already bound. You’ll know this has happened because you’ll get a BindException when you attempt to construct your ServerSocket. If this happens, simply try binding to port p + 1. You’ll probably have to modify your comm implementation to get this to work.
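The retry on BindException could be sketched like this. The class name PortProbe, the method name bindFrom, and the cap on the number of attempts are assumptions added for illustration; the handout only says to try p + 1.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;

class PortProbe {
    // Tries port p, then p + 1, and so on, for at most maxTries ports.
    // Returns the first ServerSocket that binds successfully.
    static ServerSocket bindFrom(int p, int maxTries) throws IOException {
        BindException last = null;
        for (int i = 0; i < maxTries; i++) {
            try {
                return new ServerSocket(p + i);
            } catch (BindException e) {
                last = e; // port already in use; move on to the next one
            }
        }
        // Every attempt failed (or maxTries was not positive).
        throw last != null ? last : new BindException("no ports tried");
    }
}
```

The caller can ask the returned socket which port it ended up on with getLocalPort(), which is the port other servers and clients then need to be told about.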
A distributed database isn’t much good without some data. So, we’re setting you up with a preloaded BDB database file. You’ll get it when you copy over the sddb assignment directory. When starting the master, give it the filename of this db file. Then, when you start up slaves, they’ll create a fresh db file and download the master’s data into that db file. This is one way you know everything’s working properly.
Specifically, the data in the database is of type MediaData. As you can see from the class definition, it has all the information you need to pass to the metadata server. Thus, when the master starts up, it will have to call BDB’s getKeys function, then iterate through each key, getting that key from BDB, then parsing the MediaData object and sending off the appropriate information to the metadata server.
You’re also responsible for creating a client that can query your database servers. It should also be able to connect to the metadata server to do keyword searches. This client will be one of the major factors in our determination of whether your distributed database works or not, so you should be sure the client works.
To get started, copy the /course/cs138/asgn/sddb directory to your 138 directory.
When you’re ready to hand in, be sure to write a README documenting any bugs or quirks in your implementation as well as any extra credit features you’ve implemented. Then run /course/cs138/bin/cs138_handin sddb.
There aren’t many opportunities for extra credit at this point; the next assignment will be much more fruitful, since this assignment is setting up the basic infrastructure of your data store. However, a few ideas: