EECS 484 Assignment 2: Database-backed File Search Engine - Prof. Kristen R. Lefevre, Assignments of Database Management Systems (DBMS)

An assignment for a database systems course where students are required to create a database, enforce constraints using oracle, and develop a java application that interacts with the database using jdbc. The assignment involves creating tables and views, enforcing constraints, and developing a java application that searches for words in multiple files and stores the data in a relational dbms.

Typology: Assignments

Pre 2010

Uploaded on 09/02/2009

koofers-user-pry-1
koofers-user-pry-1 🇺🇸

9 documents

1 / 8

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Page 1 of 8
EECS 484 Assignment 2 – Database-backed File Search Engine
(Due: February 11, 2009, 10:30AM)
Overview
In this assignment, you will extend the functionality of Project 1 to find words in multiple files (kind of
like Apple Spotlight or Google Desktop search). Also, you will store your word data in a Relational
DBMS (Oracle), rather than in Java data structures.
This project has two main goals:
1. You will get hands-on experience creating a database and enforcing constraints using a
commercial DBMS.
2. You will become familiar with ways of interacting with a DBMS via SQL from a procedural
program. (In this case, the interaction will be done from Java via JDBC.)
Each student should already have an account on the CAEN Oracle server; CAEN sent you instructions
containing your login and password. If you have trouble accessing your account, please contact the GSI.
Before starting the project, please change your password to one that you can share with your project
partner. This password should not be the same as your UM Kerberos, CAEN, or EECS passwords!
Instructions for changing your password are at the end of this document in the SQL*Plus section.
The keys to success on this project are planning wisely and starting early. Read the assignment carefully,
and discuss how you will divide the work between team members. You should already have the tools
necessary to start Part 1. By the end of January, we will have covered all material (SQL, JDBC, triggers)
necessary to complete the entire project.
Part 1 – Database Tables and Constraints
The first part of the assignment is to create and initialize the tables and constraints in the database. You
will use SQL*Plus scripts to create, initialize, and destroy your tables and constraints. All of the
commands to create and initialize the database should be stored in a single file: DBINIT.txt.
Below, we describe the required tables and constraints. You should use the basic constructs of CREATE
TABLE (e.g., PRIMARY KEY, FOREIGN KEY, UNIQUE, and NOT NULL) as much as possible to
enforce constraints. However, in some cases, you will find that you cannot enforce a constraint using
these basic mechanisms. In these cases, you will need to create triggers to enforce the constraints. Your
file should include commands to populate the database with the minimum amount of data necessary to
satisfy the constraints. You may not create any tables, views, sequence variables, or stored procedures
other than those listed below.
Required Tables
At a high level, the entities in your database will be users, files, and words. More specifically, you should
use the CREATE command to create the following tables and views.
filetable(fileid INTEGER, filename VARCHAR(256), ownerid VARCHAR(8)): contains a fileid
(primary key), filename (candidate key), and the userid of the user who owns the file. The fileid
value should be “auto incremented” by using a sequence variable that is named fileidsequence
(See the last section of the handout for suggestions about how to do this in Oracle).
useridtable(userid VARCHAR(8)): contains the set of valid userids in the system.
pf3
pf4
pf5
pf8

Partial preview of the text

Download EECS 484 Assignment 2: Database-backed File Search Engine - Prof. Kristen R. Lefevre and more Assignments Database Management Systems (DBMS) in PDF only on Docsity!

EECS 484 Assignment 2 – Database-backed File Search Engine

(Due: February 11, 2009, 10:30AM)

Overview

In this assignment, you will extend the functionality of Project 1 to find words in multiple files (kind of like Apple Spotlight or Google Desktop search). Also, you will store your word data in a Relational DBMS (Oracle), rather than in Java data structures. This project has two main goals:

  1. You will get hands-on experience creating a database and enforcing constraints using a commercial DBMS.
  2. You will become familiar with ways of interacting with a DBMS via SQL from a procedural program. (In this case, the interaction will be done from Java via JDBC.) Each student should already have an account on the CAEN Oracle server; CAEN sent you instructions containing your login and password. If you have trouble accessing your account, please contact the GSI. Before starting the project, please change your password to one that you can share with your project partner. This password should not be the same as your UM Kerberos, CAEN, or EECS passwords! Instructions for changing your password are at the end of this document in the SQL*Plus section. The keys to success on this project are planning wisely and starting early. Read the assignment carefully, and discuss how you will divide the work between team members. You should already have the tools necessary to start Part 1. By the end of January, we will have covered all material (SQL, JDBC, triggers) necessary to complete the entire project.

Part 1 – Database Tables and Constraints

The first part of the assignment is to create and initialize the tables and constraints in the database. You will use SQL*Plus scripts to create, initialize, and destroy your tables and constraints. All of the commands to create and initialize the database should be stored in a single file: DBINIT.txt. Below, we describe the required tables and constraints. You should use the basic constructs of CREATE TABLE (e.g., PRIMARY KEY, FOREIGN KEY, UNIQUE, and NOT NULL) as much as possible to enforce constraints. However, in some cases, you will find that you cannot enforce a constraint using these basic mechanisms. In these cases, you will need to create triggers to enforce the constraints. Your file should include commands to populate the database with the minimum amount of data necessary to satisfy the constraints. You may not create any tables, views, sequence variables, or stored procedures other than those listed below.

Required Tables

At a high level, the entities in your database will be users, files , and words. More specifically, you should use the CREATE command to create the following tables and views.

  • filetable(fileid INTEGER, filename VARCHAR(256), ownerid VARCHAR(8)): contains a fileid (primary key), filename (candidate key), and the userid of the user who owns the file. The fileid value should be “auto incremented” by using a sequence variable that is named fileidsequence (See the last section of the handout for suggestions about how to do this in Oracle).
  • useridtable(userid VARCHAR(8)): contains the set of valid userids in the system.
  • readertable(fileid INTEGER, userid VARCHAR(8)): An entry in this table gives a userid explicit read access to the file. You should make permission changes to this table, not to readerview.
  • wordtable(fileid INTEGER, word VARCHAR(12), count INTEGER): contains the word, the file it occurs in, and the number of occurrences in that file.
  • A VIEW readerview(fileid INTEGER, userid VARCHAR(8)): An entry in this view gives the userid who has either explicit or implicit access to the file. You will use this view, rather than readertable, for queries that attempt to locate words in files.

Required Constraints

Your database must also enforce the following constraints. (Please note that these constraints should be enforced by the DBMS, rather than in the Java code you write for Part 2.)

  1. There should be no duplicate rows in any of the tables, except the readerview view.
  2. There should be no NULL values in any of the tables or views.
  3. In filetable , the fileid attribute should be auto-incremented. This attribute identifies files in the system, so designate it as a foreign key in appropriate tables.
  4. In useridtable , the userid attribute uniquely identifies users. The ownerid and userid in other tables should always reference this attribute via foreign key constraints.
  5. There is a special userid value 'admin' , which must always exist in the useridtable. It should not be possible to delete or replace this value.
  6. The ‘ admin’ user always has implicit read access to all files. The ownerid for a file also has implicit read access to the owned file. These rights must be reflected in readerview.
  7. readertable contains userids who have been given explicit read permissions to a particular file. These rights must also be reflected in readerview. Any valid userid , including ‘admin’ or a file owner, can be given explicit read permissions to a file (do not prevent that).
  8. Any user (specified by its userid ), except for 'admin' , can be deleted at any time from the useridtable. If a userid is deleted, all files owned by that user must become owned by 'admin'. All read rights belonging to the deleted userid must go away from all the relevant tables and views.
  9. A file can be deleted from filetable at any time. Any dependent entries should go away from all the tables and views.
  10. If a file’s owner is updated (e.g., via an UPDATE SQL call on the ownerid field of the filetable ), that should work, as long as the new owner exists in the useridtable. If the new owner is different from the previous owner for the file, all the explicit read rights on that file in readertable should be deleted. If the new owner is the same, then there should be no change to the existing read rights on the file.
  11. Each count in wordtable must be greater than 0. (You can use CHECK to enforce this).

Destroying the Database

You should write the SQL*Plus commands to destroy the database completely. They should drop all of the tables, views, triggers, etc. that were created by DBINIT.txt. You should put all of these commands in a single file, DBDELETE.txt. There is one easy test for this – Executing the commands in DBINIT.txt, followed by the commands in DBDELETE.txt (and repeating this process) should not give any errors.

on success and false on failure. But for some methods, an exception must be thrown to indicate an error.

Running the Sample Code

We provide you a Makefile that contains the appropriate locations of the Jar files for Oracle’s JDBC driver and the Junit testing module. To compile the program, do: % make all To run the FileFinderTest’s main program, type: % make run To just see what commands are executed, without actually executing them, you can do % make – n all % make – n run The relevant jar files can be found at: /usr/lib64/oracle/10.2.0.3/client/lib64/ojdbc14.jar: Needed for all files. /afs/umich.edu/class/eecs484/public/junit-4.1.jar: Needed if you decide to extend FileFinderTest.java for doing the testing. This file relies on the junit package, which is provided by the jar file. More information on junit is available within Eclipse and at http://www.junit.org. You can also compile and develop your program, including using junit, outside Eclipse using the Makefile.

Transactions

Each public method must provide “transactional” semantics. In other words, if it interacts with the database system and returns successfully, then all the lookups and updates on the database must commit as one transaction. If the method fails (e.g., SQL constraint violation), then none of the updates should commit. For this assignment, you must ensure the JDBC’s auto-commit mode for the connection is always ON outside the procedures. In other words, if we were to check the mode’s value in the main program at any time, we should find it on. However, within the procedures, you will need to turn auto-commit on and off to provide transactional semantics. For example, if you issue multiple SQL commands from one of the procedures, if you leave auto-commit on, it could happen that the first update commits, but the second fails (or that an intervening transaction occurs from another session).

Debugging & Testing

You can put print statements in your code, as you wish, to help you debug, etc. (Though, please don’t use print statements as you only debugging tool! Eclipse has a nice debugging environment.) But when you are ready to submit the assignment, please remove any debugging print statements.

You will find it useful to use e.printStackTrace(), in the catch blocks for exceptions. That can help identify the problem line quickly. You will need to test your code (this requires significant effort). You do not have to submit tests in this project, but we encourage you to look at JUnit package for writing test cases. It is available within Eclipse. You can use SQL*Plus commands to check integrity constraints.

Part 3 – Report

Finally, you should create a brief design document to accompany your group’s project (in a file named REPORT.pdf) that contains the following:

  1. An ER diagram that reflects the various relationships and as many integrity constraints as possible. Describe (below the diagram) any integrity constraints that could not be reflected in the diagram’s notation. You can use any tools you like to create the ER diagram, but please use the notation from class and the textbook. If you want, you can hand-draw the diagram and scan it in.
  2. How are the constraints enforced by your design? For each constraint, point to the schema/trigger element(s) (from the DBINIT.txt file) that enforce the constraint. If some constraint is not enforced by the database, please state that also.
  3. For each public method, a brief explanation of design and an example of the actual SQL query (or queries) that will be sent to the database during execution. (This is so that we don’t have to reverse engineer your code.)

What to Hand In

To summarize, you are expected to hand in the following material for this project:

  1. A file DBINIT.txt that contains SQL*Plus commands to create and initialize the database, including constraints and triggers.
  2. A file DBDELETE.txt that contains SQL*Plus commands to destroy the database completely.
  3. A file FileFinder.java that should be a modification of the sample file that we give you. The file contains public methods in the FileFinder class that you need to fill in. (Again, do not change the signature of these methods.) You can add additional classes to the file. You can also add any class variables or private methods to help implement the functionality of the public methods.
  4. (Optional) Any additional helper .java files, which should be in the package “filefinder” and written by you. The methods in these classes should be protected, so they are not visible outside the filefinder package.
  5. A file REPORT.pdf containing your report.

Submission instructions (using CTools)

1: Log on to a campus LINUX machine. 2: Create a new directory and call it "filefinder" (You should already have this directory, as it is the required name for the filefinder package.) 3: In this directory put the following files. Please leave out any other files (.class, etc.) from your submission.

  1. **DBINIT.txt
  2. DBDELETE.txt**

END;

RUN;

sqlplus does not have command history like the UNIX or DOS shell. Keeping longer commands that you use frequently in a text file is a good idea. sqlplus commands are terminated by a semi-colon. If you don’t type the semi-colon, it will prompt you to enter the remaining command on the next lines, by showing line numbers 2, 3, etc. To change your Oracle password, start sqlplus and then type “password”. Follow the prompts to change the password.

Transactions within SQL*Plus

By default, sqlplus does not commit transactions as you type. You either need to issue an explicit COMMIT or quit the session normally for the transactions to be committed. You can also rollback everything since the last commit by issuing the ROLLBACK command. Unfortunately, this means that you can get strange lockup problems if you try to run multiple sqlplus sessions simultaneously. For example, suppose you connect to CAEN via SSH, run sqlplus, and make some updates without issuing a COMMIT. If the SSH connection is dropped, the Oracle server does not always know the sqlplus session is aborted. The commands you issued may still hold locks on tables, which will prevent a new sqlplus session from updating the tables. If this happens to you, you can wait a few hours until the old sqlplus session (and its locks) times out. Be patient in that case, or switch to your partner’s account. If you are frequently accessing the CAEN machines remotely, and are prone to dropped SSH connections, then issuing the following command at the beginning of you session may prevent many lockouts: SQL> SET AUTOCOMMIT ON; The above causes each transaction to commit as soon as it is typed. A commit should cause the database to release any locks. If the AUTOCOMMIT is not on, typing SQL> COMMIT will also release the locks and allow parallel sessions to proceed.

Oracle Mutating Table Errors

A common problem encountered in this project is the following: You have a table A that references table B with a FOREIGN KEY / CASCADE constraint. In this case, deleting a tuple in table B with a trigger on table B produces a “mutating table” error. There are various ways to fix this. The simplest solution is to not cascade the original foreign key, but to use a trigger instead.

Oracle Sequence Variables and Auto-Generated Keys

In some commercial DBMS systems, there is a feature allowing you to specify that a column should be incremented automatically for each new record. (This is useful, for example, for generating key values.)

In Oracle, you can accomplish this using a sequence variable and a trigger. When you declare the sequence variable, you can specify the initial start value, as well as amount it is incremented. For example, the following declares a sequence variable with initial value 1, and an increment value of 2. CREATE SEQUENCE myid INCREMENT BY 2 START WITH 1 You can access the next value in the sequence variable with a call to myid.NEXTVAL. Once you have defined your sequence variable, you should be able to create a trigger to populate the key field appropriately each time a new record is inserted.

Oracle Constraint Idiosyncrasies

You will soon discover that SQL is not entirely standard across vendors. You are likely to discover differences between what is supported by Oracle and what is described in the textbook. Nonetheless, you should be able to implement all constraints required for this project using supported features. Here are some examples of differences you might encounter:

  • Oracle does not implement assertions (i.e., CREATE ASSERTION)
  • Oracle does not allow subqueries in CHECK constraints
  • For foreign keys, Oracle allows an optional ON DELETE CASCADE or ON DELETE SET NULL, but does not implement the ON DELETE SET DEFAULT or ON UPDATE options described in the textbook.

Oracle Error Messages

Triggers can be used to enforce certain constraints, and errors can be handled via the built-in function RAISE_APPLICATION_ERROR. When this happens, the command that activated the trigger is aborted, and the effects of both the trigger and the triggering statement are rolled back. For example, the following trigger enforces the constraint that Students.age >= 18: CREATE TRIGGER CheckAge AFTER INSERT OR UPDATE OF age ON Students FOR EACH ROW BEGIN IF (:new.age < 18) THEN RAISE_APPLICATION_ERROR(-2000, ‘All students must be at least 18’); END IF; END; . RUN;