CPS 116 Fall 2004 Homework #3, Study notes of Introduction to Database Management Systems

The homework assignment for cps 116 fall 2004, including 3 problems related to xml programming and database management. Problem 1 requires completing a gradiance homework titled 'homework 3.1 (xml basics)'. Problem 2 involves writing xquery queries to answer questions about an xml document modeling data from an internet auction site. Problem 3 requires converting an xml document describing a thesaurus into a new format and designing a relation schema to store the thesaurus data.

Typology: Study notes

2011/2012

Uploaded on 01/29/2012

arold
arold 🇺🇸

4.7

(24)

372 documents

1 / 4

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
CPS 116 Fall 2004
Homework #3 (8.75% of course grade: 100 points)
Assigned: Thursday, October 21
Due: Tuesday, November 2
Problem 1 (10 points).
Complete the Gradiance homework titled “Homework 3.1 (XML Basics).”
Problem 2 (56 points).
Consider an XML document (from a project called XMark) modeling the data maintained by
an Internet auction site in /home/dbcourse/examples/xml-xmark/auction.xml. The
main entities modeled are: items, persons, open auctions, closed auctions, and categories.
iteminfo elements describe items that are for sale or that already have been sold.
Each item carries a unique identifier and bears properties like payment method (credit
card, money order, etc.), a reference to the seller, a description, etc., all encoded as
subelements. Each item belongs to a world region represented by the iteminfo’s
parent.
Open auctions are auctions in progress. Their properties include the bid history (i.e.,
increases over time) along with references to the bidders, a reference to the seller, a
reference to the iteminfo being sold, etc.
Closed auctions are auctions that are finished. Their properties include references to
the seller and the buyer, a reference to the respective iteminfo, the price, the
quantity of items sold, etc.
Persons are characterized by name, email address, phone number, mail address, profile
of their interests, a set of open auctions they watch, etc.
Categories feature a name and a description; they are used to implement classification
of items. A category graph links categories into a network.
The figure below illustrates the part of the document structure that is relevant to the problem:
site
regions people open_auctions closed_auctions
{africa,asia,...}
iteminfo
description
address
name
street
person
name
city country
zipcode
open_auction
bidder itemref seller
date
time personref
increase
initial
closed_auction
itemref
seller bu yer
price
incategory
pf3
pf4

Partial preview of the text

Download CPS 116 Fall 2004 Homework #3 and more Study notes Introduction to Database Management Systems in PDF only on Docsity!

CPS 116 Fall 2004 Homework #3 (8.75% of course grade: 100 points) Assigned: Thursday, October 21 Due: Tuesday, November 2

Problem 1 (10 points).

Complete the Gradiance homework titled “Homework 3.1 (XML Basics).”

Problem 2 (56 points).

Consider an XML document (from a project called XMark) modeling the data maintained by an Internet auction site in /home/dbcourse/examples/xml-xmark/auction.xml. The main entities modeled are: items, persons, open auctions, closed auctions, and categories.

  • iteminfo elements describe items that are for sale or that already have been sold. Each item carries a unique identifier and bears properties like payment method (credit card, money order, etc.), a reference to the seller, a description, etc., all encoded as subelements. Each item belongs to a world region represented by the iteminfo’s parent.
  • Open auctions are auctions in progress. Their properties include the bid history (i.e., increases over time) along with references to the bidders, a reference to the seller, a reference to the iteminfo being sold, etc.
  • Closed auctions are auctions that are finished. Their properties include references to the seller and the buyer, a reference to the respective iteminfo, the price, the quantity of items sold, etc.
  • Persons are characterized by name, email address, phone number, mail address, profile of their interests, a set of open auctions they watch, etc.
  • Categories feature a name and a description; they are used to implement classification of items. A category graph links categories into a network. The figure below illustrates the part of the document structure that is relevant to the problem: site

regions people open_auctions closed_auctions

{africa,asia,...}

iteminfo description

address

name

street

person name

city country

zipcode

open_auction bidder itemref seller date time personref

increase

initial

closed_auction itemref buyer seller

price

incategory

Fire up QuiP’s GUI and type in the following query { document("/home/dbcourse/examples/xml-xmark/auction.xml") } to get an overview of the document. Please refer to the document “XML Programming Notes” on the course Web site for instructions on running QuiP.

Write queries in XQuery to answer the following questions. You may test your queries in QuiP’s GUI, but please use the batch mode to generate the output to turn in. Because QuiP does not have a sophisticated optimizer, query performance may be heavily influenced by the way you write your queries. If a particular query takes forever to run, consider reordering loops and evaluating selections (filters) as early as possible.

For each question below, say (a), write your XQuery in a file named hw3-1-a.xquery, and generate the output file hw3-1-a.xml by running quip hw3-1-a.xquery > hw3-1- a.xml. Turn in printout of all your .xquery and output .xml files. If the output file is too long, print out only the first page of the output.

(a) Find names of all items in “namerica” region. (b) Find names of all items that belong to “category0.” (c) Find names of all persons whose address is in “United States.” (d) Find all buyers who paid for more than $50 in a closed auction. (Note: To get this query working, you might want to consult the “QuiPified” XQuery examples presented in lecture, which can be downloaded from the course Web site under the “Lecture Notes” section.) (e) Find names of all person who has bidden in an open auction for an item whose name contains the string “cow.” (Note: To use the XPath built-in function contains() in QuiP, you need to convert any node argument to a string explicitly, e.g., contains(string(name), “cow”).) (f) Find names of all persons with address in “United States” who never bought anything in closed auctions. (g) For each open auction whose seller's name is “Venkatavasu Takano,” print out the following information: <open_auction id=“...”> <bidder_name>...</bidder_name> <bidder_name>...</bidder_name> ... </open_auction>

Problem 3 (34 points).

Consider an XML document describing a thesaurus (from http://zthes.z3950.org/) in /home/dbcourse/examples/xml-zthes/thatt.xml. The XML document contains a list of terms. Each term has a list of relations with other terms. For example, the following excerpt

  1. Write a Java program using SAX API to perform the conversion.
  2. Write a Java program using DOM API to perform the conversion.
  3. Write an XQuery to perform the conversion.
  4. Write an XSLT program to perform the conversion. Please refer to the document “XML Programming Notes” on the course Web site for instructions on how to compile and run these programs and queries. You can test your code first on a smaller input (thatt-short.xml), and compare it with the sample output (thatt-new-short.xml). After you are done testing on the small document, test on thatt.xml, and validate the output with thatt-new.dtd. Instructions on how to validate an XML document can be found in “XML Programming Notes.” Validation will automatically check the consistency of all IDREF’s.

You must implement two out of the four options. For each option you implement, turn in a printout of the source code and the first twenty term elements in the result of transforming thatt.xml. Show the raw output generated by your code; do not manually edit the output (other than truncating it for printout).

(b) (14 points.) Design and create a relation schema to store the thesaurus data using the schema-aware approach discussed in lecture. Your design should not have the redundancy described in (a). Populate the tables with the data from thatt.xml. You should write a program or script to convert the data into flat files and import them into DB2. Do not use individual INSERT statements. The choice of programming/scripting language and XML API is up to you. Instructions on how to load a DB2 database can be found in the document “DB2 SQL Programming Notes” on the course Web site; examples can be found in /home/dbcourse/examples/db- */ directories.

Turn in the following:

  • CREATE TABLE statements you used in creating the relational schema, with keys and foreign keys properly declared.
  • Printout of the program/script you used to generate DB2 data files for loading.
  • Printout of the script you used to import data files into DB2.
  • Printout of the result of running the following query for each table you have loaded: SELECT * FROM table FETCH FIRST 10 ROWS ONLY;