The Document Object Model, Programming with XML | CMPS 180, Study notes of Database Management Systems (DBMS)

Material Type: Notes; Class: Database Systems I; Subject: Computer Science; University: University of California-Santa Cruz; Term: Unknown 2003;


Uploaded on 08/19/2009


Winter 2003 1

Today’s Lecture

  • Document Object Model (XML APIs)
  • An XML query language: XQuery
  • These slides were adapted from slides developed at the University of Pennsylvania (by Peter Buneman and Susan Davidson)

Part III

The Document Object Model (DOM)

Programming with XML

XML Parsers

  • traditional: build (main memory) data structure (DOM)
  • event based: SAX (Simple API for XML)
    • http://www.megginson.com/SAX
    • write handler for start tag and for end tag
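The event-based style can be sketched with the SAX classes that ship with the JDK. This is a minimal illustration, not the code used in the lecture: the class name `SaxTagTrace` and the sample element names are assumptions made for the example. The handler only records one event per start tag and one per end tag, and no tree is built in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: traces the start/end tag events of a document.
public class SaxTagTrace {

    /** Parses the XML text and returns one entry per start or end tag event. */
    static List<String> trace(String xml) throws Exception {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                events.add("start " + qName);   // handler for a start tag
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);     // handler for an end tag
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        // Sample input; the element names are assumptions for illustration.
        String xml = "<employee><name>John Doe</name><dept>Math</dept></employee>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Because the parser pushes events to the handler as it reads, any state the application needs (for example, "am I inside a dept element?") has to be tracked by the handler itself.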

DOM: the Document Object Model

  • Interface to parsed XML
  • “… language-neutral...” interface (IDL)
  • “With the Document Object Model, programmers can build documents, navigate their structure, and add, modify, or delete elements and content. Anything found in an HTML or XML document can be accessed, changed, deleted, or added using the Document Object Model…”

http://www.w3.org/DOM/

public interface NodeList {
    public Node item(int index);
    public int getLength();
}

public interface NamedNodeMap {
    public Node getNamedItem(String name);
    ...
}

Sub-elements -- an “array”

Attributes -- a “dictionary”
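The two access patterns can be tried out with the DOM implementation bundled with the JDK. The sketch below is only an illustration: the class name `NodeListDemo`, the sample document, and its `id` attribute are assumptions. Children are reached by position through a `NodeList`, attributes by name through a `NamedNodeMap`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class contrasting NodeList and NamedNodeMap access.
public class NodeListDemo {

    /** Parses the XML text and returns its root element. */
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    /** Counts element children by walking the NodeList by index ("array"). */
    static int countElementChildren(Element e) {
        NodeList kids = e.getChildNodes();
        int count = 0;
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    /** Looks an attribute up by name ("dictionary"); null when absent. */
    static String attribute(Element e, String name) {
        NamedNodeMap atts = e.getAttributes();
        Node a = atts.getNamedItem(name);
        return a == null ? null : a.getNodeValue();
    }

    public static void main(String[] args) throws Exception {
        // The id attribute is an assumption added for the example.
        Element emp = parseRoot(
                "<employee id=\"e1\"><name>John Doe</name><dept>Math</dept></employee>");
        System.out.println(countElementChildren(emp));
        System.out.println(attribute(emp, "id"));
    }
}
```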

A common form of data extraction

Input:

John Doe <contact-info> 123 7456 [email protected] </contact-info> Math ...

Output:

John Doe 123 7456 Jane Dee 234 5678 ...

Find the names and telephones of all employees in Math

Top-level traversal in DOM

public class Test {
    public static void main(String args[]) throws Exception {
        Parser parser = new Parser(args[0]);
        Document doc = parser.readStream(new FileInputStream(args[0]));
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            // coercion
            // select Math depts
        }
    }
}


Selecting the math departments

NodeList ndl = n.getChildNodes();
for (int idl = 0; idl < ndl.getLength(); idl++) {
    Node nd = ndl.item(idl);
    // compare strings with equals(), not with = or ==
    if (nd.getNodeName().equals("dept") &&
        ((CharacterData) nd.getFirstChild()).getData().equals("Math")) {  // coercion
        // inner code
        return;
    }
}
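Putting the two loops together, the whole extraction can be sketched as one runnable program using the JDK's own parser. This is an illustration under stated assumptions rather than the lecture's code: the class name `MathEmployees`, the root element `db`, and the `tel` element inside `contact-info` are hypothetical, since the slides do not show those tag names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: names and telephones of all employees in Math.
public class MathEmployees {

    /** Text of the first child element with the given tag, or null if none. */
    static String childText(Element parent, String tag) {
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node k = kids.item(i);
            if (k.getNodeType() == Node.ELEMENT_NODE && k.getNodeName().equals(tag)) {
                return k.getTextContent().trim();
            }
        }
        return null;
    }

    static List<String> namesAndPhonesInMath(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {          // top-level loop
            Node n = nodes.item(i);
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue;                                      // skip text nodes
            }
            Element emp = (Element) n;                         // coercion
            if (!"Math".equals(childText(emp, "dept"))) {
                continue;                                      // select Math depts
            }
            // "tel" is an assumed tag name; it is searched among descendants.
            NodeList tels = emp.getElementsByTagName("tel");
            String tel = tels.getLength() > 0
                    ? tels.item(0).getTextContent().trim() : "";
            out.add(childText(emp, "name") + " " + tel);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<db>"
                + "<employee><name>John Doe</name>"
                + "<contact-info><tel>123 7456</tel></contact-info>"
                + "<dept>Math</dept></employee>"
                + "<employee><name>Jane Dee</name>"
                + "<contact-info><tel>234 5678</tel></contact-info>"
                + "<dept>Physics</dept></employee>"
                + "</db>";
        for (String line : namesAndPhonesInMath(xml)) {
            System.out.println(line);
        }
    }
}
```

Even for this small query the program has to handle node types, casts, and missing children explicitly, which is part of the motivation for a declarative query language.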


Constructing data

Input:

John Doe <contact-info> 123 7456 [email protected] </contact-info> Math
...

Output:

John Doe 123 7456 Jane Dee 234 5678 ...

Constructing Data using the DOM

Document d = new DocumentImplementation …
Element root = d.createElement("doc2");
// set root of document
// top-level loop
{
    Element emp = d.createElement("employee");
    root.appendChild(emp);
    // innermost loop
    {
        ...
        Element name = d.createElement("name");
        // set s to appropriate character string
        name.appendChild(d.createCDATASection(s));  // factory methods belong to the Document
        emp.appendChild(name);
        ...
    }
    …
}
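A runnable variant of this construction, using the JDK's `DocumentBuilder` to create the empty document, might look as follows. It is a sketch with several assumptions: the class name `BuildDoc2` and the `tel` element are hypothetical, and it uses `createTextNode` where the slide uses `createCDATASection` (either produces character data under the element).

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: builds the output document node by node.
public class BuildDoc2 {

    /** Each row is {name, telephone}; returns a doc2 document with one employee per row. */
    static Document build(String[][] rows) throws Exception {
        Document d = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = d.createElement("doc2");
        d.appendChild(root);                              // set root of document
        for (String[] row : rows) {                       // top-level loop
            Element emp = d.createElement("employee");
            root.appendChild(emp);
            Element name = d.createElement("name");
            name.appendChild(d.createTextNode(row[0]));
            emp.appendChild(name);
            Element tel = d.createElement("tel");         // assumed tag name
            tel.appendChild(d.createTextNode(row[1]));
            emp.appendChild(tel);
        }
        return d;
    }

    /** Serializes the document to a string without the XML declaration. */
    static String serialize(Document d) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(d), new StreamResult(w));
        return w.toString();
    }

    public static void main(String[] args) throws Exception {
        String[][] rows = { {"John Doe", "123 7456"}, {"Jane Dee", "234 5678"} };
        System.out.println(serialize(build(rows)));
    }
}
```

Note that every node is created through the `Document` and then attached explicitly; a node that is created but never appended does not appear in the output.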


A join

First input:

John Doe <contact-info>...</contact-info> Math ...

Second input:

Math A123 ...

Output:

John Doe A123 Jane Dee B456 ...

The names of employees and their department buildings

Implementing a Join in the DOM??

nl1 = r1.getChildNodes();  // r1 is the root of the first doc
for (int i1 = 0; i1 < nl1.getLength(); i1++) {
    nl2 = r2.getChildNodes();  // r2 is the root of the second doc
    for (int i2 = 0; i2 < nl2.getLength(); i2++)
        …
}

  • Even if we can get both documents in core, this is not the most efficient method
  • If not?
  • This is a typical database query!
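One standard database alternative to the nested loops is a hash join: read the smaller document once into a map, then make a single pass over the other. The sketch below illustrates that idea on top of the DOM. It is not from the lecture, and the class name `DomHashJoin` as well as the tag names of the second document (`department`, `name`, `building`) are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: joins employees with department buildings.
public class DomHashJoin {

    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    /** Element children of parent, skipping text and comment nodes. */
    static List<Element> elementChildren(Element parent) {
        List<Element> out = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                out.add((Element) kids.item(i));
            }
        }
        return out;
    }

    /** Text of the first child element with the given tag, or null if none. */
    static String childText(Element parent, String tag) {
        for (Element k : elementChildren(parent)) {
            if (k.getNodeName().equals(tag)) {
                return k.getTextContent().trim();
            }
        }
        return null;
    }

    static List<String> join(String employeesXml, String departmentsXml) throws Exception {
        // Build phase: one pass over the departments document.
        Map<String, String> buildingOf = new HashMap<>();
        for (Element d : elementChildren(parseRoot(departmentsXml))) {
            buildingOf.put(childText(d, "name"), childText(d, "building"));
        }
        // Probe phase: one pass over the employees document.
        List<String> out = new ArrayList<>();
        for (Element e : elementChildren(parseRoot(employeesXml))) {
            String building = buildingOf.get(childText(e, "dept"));
            if (building != null) {
                out.add(childText(e, "name") + " " + building);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String emps = "<db><employee><name>John Doe</name><dept>Math</dept></employee>"
                + "<employee><name>Jane Dee</name><dept>CS</dept></employee></db>";
        String depts = "<depts><department><name>Math</name><building>A123</building></department>"
                + "<department><name>CS</name><building>B456</building></department></depts>";
        for (String line : join(emps, depts)) {
            System.out.println(line);
        }
    }
}
```

With n employees and m departments this visits n + m elements instead of n × m, but it still assumes both documents fit in main memory, which is exactly the limitation the slide points at.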

Part IV: Query Languages

Why a query language? Extracting, Restructuring, Integration, Browsing…

  • XQuery
    • http://www.w3.org/TR/xquery/
    • http://www.xml.com/pub/a/2002/10/16/xquery.html
  • XPath (part of a query language)
    • http://www.w3.org/TR/xpath
  • XSLT
    • http://www.w3.org/TR/xslt
    • http://www.mulberrytech.com/quickref/XSLTquickref.pdf
  • and many others…

XQuery -- likely to gain acceptance...

… and, like other things in W3C, not necessarily the best.

Ingredients:

  • A concrete syntax: http://www.w3.org/TR/xquery
  • Based on XPath: http://www.w3.org/TR/xpath.html
  • A formal semantics and “algebra”: http://www.w3.org/TR/query-semantics/
  • Some test cases: http://www.w3.org/TR/xmlquery-use-cases

XQuery: basic syntax

  • FLWR: For, Let, Where, Return
  • FOR: loops through each node in a sequence, binding that node to a variable
    for $x in document("company.xml")//department
  • LET: binds a variable to an entire sequence of nodes
    let $y := $x//employee
  • WHERE: evaluates a condition for each binding; only bindings that satisfy it reach the RETURN clause
  • Each of these clauses may use XPath expressions.

XPath: some basics

  • “/” denotes traversal to a child
    • ./title (where “.” denotes the current context)
  • “//” denotes traversal to a descendant
    • .//book
  • “[…]” denotes restrictions on a node; they can be boolean expressions (and, or) and include XPath functions
    • .//book[title='Data on the Web']: book with a title subelement with value 'Data on the Web'
    • .//person[@id='1234']: person with an id attribute with value '1234'
    • .//person/child[1]: first child subelement of person
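These path expressions can be tried from Java through the `javax.xml.xpath` package in the JDK. The following is a small sketch, not part of the lecture: the class name `XPathBasics` and the sample document (root `bib`, the second book, the names inside `child`) are assumptions made so that each expression has something to match.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: evaluates an XPath expression against a document.
public class XPathBasics {

    /** Returns the trimmed text content of every node selected by expr. */
    static List<String> select(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xp = XPathFactory.newInstance().newXPath();
        // The document node is the context, so "." refers to the whole document.
        NodeList hits = (NodeList) xp.evaluate(expr, doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            out.add(hits.item(i).getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Sample data; everything except the query shapes is an assumption.
        String xml = "<bib>"
                + "<book><title>Data on the Web</title></book>"
                + "<book><title>Another Book</title></book>"
                + "<person id=\"1234\"><child>Ann</child><child>Bob</child></person>"
                + "</bib>";
        System.out.println(select(xml, ".//book[title='Data on the Web']/title"));
        System.out.println(select(xml, ".//person[@id='1234']/child[1]"));
        System.out.println(select(xml, ".//book").size());
    }
}
```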

Address Book Revisited

<addrbook>
  <person …>
    <name> Caesar </name>
    <greet> Caesar Imperator </greet>
    <addr> The Capitol </addr>
    <addr> Rome … </addr>
    <tel> … </tel>
    <fax> … </fax>
    <tel> … </tel>
    <email> … </email>
  </person>
</addrbook>

Pattern Matching

Data Extraction

Manipulating sequences

Print all the telephone numbers listed for Caesar:

for $person in document("address.xml")//person[name="Caesar"]
let $t := $person/tel
return <person> { $t } </person>

Manipulating sequences, cont.

Print the first telephone number listed for Caesar:

for $person in document("address.xml")//person[name="Caesar"]
let $t := $person/tel[1]
return <person> { $t } </person>

Joins (cont’d)

Data Integration

XQuery: Beyond FLWR

  • XQuery has many built-in functions and predicates, such as
    • count(), sum(), min(), max(), position(), first(…), last(), which work over sequences
    • index-of() finds the position of a node in a sequence
    • distinct-values(), distinct-nodes() remove duplicates
    • Set operations: union, intersection
  • If-then-else statements and function definition (“define function name (params) returns result”) are also included

Equality

  • Equality
    • node-equal: same node
    • deep-equal: same value
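The same distinction exists in the Java DOM API, which gives a convenient way to see it in action. The sketch below uses `Node.isSameNode` (identity) and `Node.isEqualNode` (structural comparison); these are DOM methods that are analogous to, not the same as, the XQuery notions above, and the class name `NodeEquality` and the sample document are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: identity versus value comparison of DOM nodes.
public class NodeEquality {

    /**
     * Compares the first two children of the root element.
     * Returns {same node?, equal by value?}.
     */
    static boolean[] compareFirstTwoChildren(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
        Node a = root.getChildNodes().item(0);
        Node b = root.getChildNodes().item(1);
        return new boolean[] {
            a.isSameNode(b),    // identity: is it the very same node?
            a.isEqualNode(b)    // value: same name, attributes, and content?
        };
    }

    public static void main(String[] args) throws Exception {
        // Two distinct tel elements that happen to carry the same value.
        boolean[] r = compareFirstTwoChildren(
                "<person><tel>123 7456</tel><tel>123 7456</tel></person>");
        System.out.println("same node:  " + r[0]);
        System.out.println("equal node: " + r[1]);
    }
}
```

Two elements with identical content are equal by value but are still different nodes, which matters for operations such as duplicate elimination.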


XQuery use cases

http://support.x-hive.com/xquery/index.html