XML Query Processing: Navigational and Structural Approaches for Querying XML Data, Slides of Database Management Systems (DBMS)

Duke University Database Management Systems (DBMS)

These slides cover the processing of XML queries using both navigational and structural approaches, including the Lore data model, navigational plans, and the Niagara unnest algorithm. They also explore stack-based algorithms and twig joins, and discuss the advantages and disadvantages of each approach and the importance of choosing the optimal join order.

Typology: Slides

2011/2012

Uploaded on 01/29/2012

arold 🇺🇸


XML Query Processing

CPS 216

Advanced Database Systems

Announcements (March 31)

Course project milestone 2 due today
  Hardcopy in class or otherwise email please
I will be out of town next week
  No class on Tuesday (April 5); will make up during reading period
  Badrish Chandramouli will give the lecture on Thursday (April 7)
Homework #3 in less than two weeks (April 12)
Reading assignment for next week will be assigned through email

Overview

Recall that XML queries based on path expressions can be expressed by joins
Node/edge-based representation (graphs)
  Equi-join on id’s
  Chasing pointers ≈ index nested-loop joins
  ⇒ “Navigational” approach
Interval-based representation (trees)
  “Containment” joins involving left and right
  Sort-merge joins, zig-zag joins with indexes
  ⇒ “Structural” approach

Navigational processing in Lore

VLDB 1999

Lore data model peculiarity: labels on edges instead of labels on nodes
Access paths in Lore
  Base representation: (parent, label) → child
  Label index: (child, label) → parent
  Edge index: label → (parent, child)
  Value index: (value, label) → node
  Path index: path expression → node
These correspond to the following in a label-on-node model
  label/value → node
  (parent, label) → child
  child → parent
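To make the access paths concrete, here is a minimal sketch that models four of them as in-memory Python dictionaries over a tiny edge-labeled graph. The node ids, the sample edges, and the dictionary names are all hypothetical, and the path index is left out; this is not how Lore actually stores its indexes.

```python
from collections import defaultdict

# Hypothetical edge-labeled graph: (parent, label, child) triples.
edges = [
    ("root", "A", "n1"),
    ("n1", "B", "n2"),
    ("n2", "C", "n3"),
    ("n1", "B", "n4"),
]
values = {"n3": 5}  # atomic values stored at some nodes (assumed)

base = defaultdict(list)         # (parent, label) -> children
label_index = defaultdict(list)  # (child, label) -> parents
edge_index = defaultdict(list)   # label -> (parent, child) pairs
value_index = defaultdict(list)  # (value, label) -> nodes

for parent, label, child in edges:
    base[(parent, label)].append(child)
    label_index[(child, label)].append(parent)
    edge_index[label].append((parent, child))
    if child in values:
        # The label of the incoming edge qualifies the value.
        value_index[(values[child], label)].append(child)

b_children = base[("n1", "B")]         # follow B edges down from n1
c_parents = label_index[("n3", "C")]   # follow the C edge up from n3
five_nodes = value_index[(5, "C")]     # nodes with value 5 reached via C
```

Each dictionary answers exactly one direction of lookup, which is why a plan that navigates upward needs the label index in addition to the base representation.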

Navigational plans in Lore

//A/B/C[.=5]

Top down: pointer chasing
  Start with //A, navigate down to //A/B and then to //A/B/C, and then check values of C
Bottom up: reverse pointer chasing
  Start with //C[.=5], navigate up to //B[/C[.=5]] and then to //A[/B/C[.=5]]
Hybrid: top down and bottom up, meet in middle
  Start with //A, navigate down to //A/B
  Start with //C[.=5], navigate up to //B[/C[.=5]]
  Intersect B nodes
⇒ In general, hybrid can combine multiple top-down and bottom-up plans starting from anywhere in the path expression
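The three plans can be sketched over a small label-on-node tree as follows. The tree, the node ids, and the index names (by_label, by_value, children, parent) are assumptions made for illustration; the sketch only shows that all three plans return the same C nodes while touching different intermediate nodes.

```python
from collections import defaultdict

# Hypothetical tree: node id -> (label, parent id, value).
nodes = {
    0: ("R", None, None),
    1: ("A", 0, None), 2: ("B", 1, None), 3: ("C", 2, 5), 4: ("C", 2, 7),
    5: ("B", 1, None), 6: ("C", 5, 9),
    7: ("A", 0, None), 8: ("B", 7, None), 9: ("C", 8, 5),
    10: ("B", 0, None), 11: ("C", 10, 5),  # C = 5, but not under an A
}

label = {n: rec[0] for n, rec in nodes.items()}
parent = {n: rec[1] for n, rec in nodes.items()}   # child -> parent
value = {n: rec[2] for n, rec in nodes.items()}
by_label = defaultdict(list)                       # label -> nodes
by_value = defaultdict(list)                       # (value, label) -> nodes
children = defaultdict(list)                       # (parent, label) -> children
for n, (lab, par, val) in nodes.items():
    by_label[lab].append(n)
    if val is not None:
        by_value[(val, lab)].append(n)
    if par is not None:
        children[(par, lab)].append(n)

def top_down():
    # //A, then /B, then /C, then check the value of C.
    return [c for a in by_label["A"]
              for b in children[(a, "B")]
              for c in children[(b, "C")] if value[c] == 5]

def bottom_up():
    # //C[.=5], then up to a B parent, then up to an A grandparent.
    out = []
    for c in by_value[(5, "C")]:
        b = parent[c]
        if label[b] == "B" and parent[b] is not None and label[parent[b]] == "A":
            out.append(c)
    return out

def hybrid():
    # Navigate down to //A/B, navigate up from //C[.=5], intersect on B.
    b_down = {b for a in by_label["A"] for b in children[(a, "B")]}
    return [c for c in by_value[(5, "C")]
            if label[parent[c]] == "B" and parent[c] in b_down]

results = (top_down(), bottom_up(), hybrid())
```

On this data the bottom-up plan starts from three candidate C nodes and discards one, while the top-down plan visits every B under an A before looking at values, which is the kind of difference in intermediate result size that decides between the plans.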

Comparison of Lore navigational plans

Which plan is best depends on the size of the intermediate results it generates
  Choose the optimal join order!
Top down and bottom up are essentially index nested-loop joins (“pure” navigation)
Hybrid can use any join strategy to combine subplans

Structural approach

Binary containment joins (Al-Khalifa et al., ICDE 2002)
  Given Alist and Dlist, two lists of elements encoded with (left, right), with each list sorted by left
  Find all pairs (a, d), where a ∈ Alist and d ∈ Dlist, such that a is a parent (or ancestor) of d

Example query processing scenario: //book/author
  Using an inverted-list index, retrieve the list of book elements sorted by left, and the list of author elements sorted by left
  Find pairs that actually form parent-child relationships
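The containment test behind these joins can be sketched as below. The Elem type and the sample positions are hypothetical, and the level field is an added assumption: (left, right) alone identifies ancestors, so some extra information such as depth is needed to tell a parent from a more distant ancestor.

```python
from typing import NamedTuple

class Elem(NamedTuple):
    tag: str
    left: int    # position where the element starts
    right: int   # position where the element ends
    level: int   # depth in the tree (assumed extra field)

def is_ancestor(a: Elem, d: Elem) -> bool:
    # a contains d if d's interval lies strictly inside a's interval.
    return a.left < d.left and d.right < a.right

def is_parent(a: Elem, d: Elem) -> bool:
    # A parent is an ancestor exactly one level above.
    return is_ancestor(a, d) and d.level == a.level + 1

# Hypothetical document: a book with one direct author and one author
# nested inside a chapter.
book = Elem("book", 1, 10, 1)
authors = [Elem("author", 4, 5, 2), Elem("author", 7, 8, 3)]

descendants = [d for d in authors if is_ancestor(book, d)]  # //book//author
direct = [d for d in authors if is_parent(book, d)]         # //book/author
```

Both authors qualify for //book//author, but only the one at level 2 forms a parent-child pair with the book.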

Tree-based algorithms

Algorithm Tree-Merge-Anc

BeginJoinable = 0;
For each a in Alist:
  Start from BeginJoinable and skip Dlist until the first element with left > a.left; update BeginJoinable;
  Start from BeginJoinable and join each d from Dlist with a; stop at the first d with left > a.right;

An alternative algorithm, Tree-Merge-Desc, uses Dlist as the outer table instead of Alist, and requires minor tweaks to conditions
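A minimal Python sketch of Tree-Merge-Anc for the ancestor-descendant case follows. The Elem type and the sample lists are hypothetical, and the sketch assumes properly nested elements with distinct positions, so a d that starts inside a also ends inside a.

```python
from typing import NamedTuple

class Elem(NamedTuple):
    name: str
    left: int
    right: int

def tree_merge_anc(alist, dlist):
    """Join ancestors with descendants; both lists are sorted by left."""
    out = []
    begin = 0  # BeginJoinable: index of the first d that may still join
    for a in alist:
        # Skip d's that start before a; no later a can contain them either,
        # because later a's start even further right.
        while begin < len(dlist) and dlist[begin].left <= a.left:
            begin += 1
        # Join every d that starts inside a; stop at the first d past a.right.
        i = begin
        while i < len(dlist) and dlist[i].left < a.right:
            out.append((a.name, dlist[i].name))
            i += 1
    return out

# Hypothetical layout: a2 is nested in a1; a3 and a4 are disjoint.
alist = [Elem("a1", 1, 10), Elem("a2", 4, 9),
         Elem("a3", 11, 16), Elem("a4", 17, 20)]
dlist = [Elem("d1", 2, 3), Elem("d2", 5, 6), Elem("d3", 7, 8),
         Elem("d4", 12, 13), Elem("d5", 14, 15), Elem("d6", 18, 19)]

pairs = tree_merge_anc(alist, dlist)
```

Note that d2 and d3 are scanned twice, once for a1 and once for a2; the inner loop restarts from BeginJoinable for every a, which is where rescanning comes from.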

Tree-Merge-Anc example

a1: BeginJoinable = d1; stops at d4
a2: BeginJoinable = d2; stops at d4
a3: BeginJoinable = d4; stops at d6
a4: BeginJoinable = d6
⇒ Further optimization is possible to avoid unnecessary rescanning, though in general rescanning cannot be avoided

[Figure: positions of a1 to a4 and d1 to d6 in the document]

Worst case of Tree-Merge-Anc

Optimal (up to a constant factor) for //
Not optimal for /

Worst case of Tree-Merge-Desc

Not even optimal for //
⇒ Problem: linear access to Alist forces unnecessary scanning
⇒ Idea: create another representation that corresponds more closely to a tree traversal

Stack-based algorithms

Algorithm Stack-Tree-Desc
Start with an empty stack Astack
While Astack or Alist or Dlist is not empty:
  If heads of both Alist and Dlist come after the top of Astack, pop Astack;
  Else if the head of Alist is contained by the top of Astack, push it onto Astack and advance Alist;
  Else join the head of Dlist with everything on Astack and advance Dlist;
⇒ Output is ordered by Dlist

An alternative algorithm, Stack-Tree-Anc, orders output by Alist but requires more bookkeeping
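The stack-based join can be sketched in Python as follows. Two details are assumptions that fill in corner cases the pseudocode leaves open: the loop ends once Dlist is exhausted (no further output is possible), and an ancestor is pushed whenever it starts before the head of Dlist, which also covers the empty-stack case. The Elem type and sample data are hypothetical.

```python
from typing import NamedTuple

class Elem(NamedTuple):
    name: str
    left: int
    right: int

def stack_tree_desc(alist, dlist):
    """Ancestor-descendant join; both inputs sorted by left."""
    out, stack = [], []   # stack holds a chain of nested ancestors
    i = j = 0
    while j < len(dlist):
        d = dlist[j]
        a = alist[i] if i < len(alist) else None
        top_ended = (bool(stack) and stack[-1].right < d.left
                     and (a is None or stack[-1].right < a.left))
        if top_ended:
            stack.pop()                    # top ends before both heads
        elif a is not None and a.left < d.left:
            stack.append(a)                # a starts first: it may contain d
            i += 1
        else:
            # Every ancestor still on the stack contains d.
            out.extend((anc.name, d.name) for anc in stack)
            j += 1
    return out

alist = [Elem("a1", 1, 10), Elem("a2", 4, 9),
         Elem("a3", 11, 16), Elem("a4", 17, 20)]
dlist = [Elem("d1", 2, 3), Elem("d2", 5, 6), Elem("d3", 7, 8),
         Elem("d4", 12, 13), Elem("d5", 14, 15), Elem("d6", 18, 19)]

pairs = stack_tree_desc(alist, dlist)
```

Each element of either list is examined once and each ancestor is pushed and popped at most once, so the work outside of producing output is linear in the input sizes; no part of Dlist is rescanned.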

Compact encoding using stacks

One stack for each node in the query twig

Elements in a stack form a containment chain

Each stack element points to one in the parent stack

Specifically, the top one that contains it

PathStack

Handles twigs with no branches: q1//q2//…//qn

Input lists T_q1, T_q2, …, T_qn and stacks S_q1, S_q2, …, S_qn
While T_qn is not empty:
  Let T_qmin be the list whose head has smallest left;
  Clean all stacks: pop while top’s right < head(T_qmin).left;
  Push head(T_qmin) on S_qmin, with pointer to top(S_parent(qmin));
  If qmin is the leaf (qn), output results and pop S_qmin;

Check properties
  Elements in a stack form a containment chain
  Each stack element points to the top one in the parent stack that contains it
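The following is a compact sketch of PathStack for a path q1//q2//…//qn. It deviates from the pseudocode in two labeled ways: pointers are stored as indexes into the parent stack, and an element whose parent stack is empty is skipped rather than pushed, since it cannot take part in any solution. Leaf elements are output directly instead of being pushed and popped. The Elem type, the function names, and the sample document are hypothetical, and the query nodes are assumed to have distinct tags.

```python
from typing import NamedTuple

class Elem(NamedTuple):
    name: str
    left: int
    right: int

def path_stack(lists):
    """lists[k] holds the elements for query node q(k+1), sorted by left."""
    n = len(lists)
    pos = [0] * n
    stacks = [[] for _ in range(n)]  # entries: (elem, index into parent stack)
    results = []

    def expand(q, top):
        # Enumerate partial solutions ending in stacks[q][0..top].
        if q < 0:
            yield ()
            return
        for k in range(top + 1):
            elem, ptr = stacks[q][k]
            for prefix in expand(q - 1, ptr):
                yield prefix + (elem.name,)

    while pos[n - 1] < len(lists[n - 1]):
        live = [q for q in range(n) if pos[q] < len(lists[q])]
        qmin = min(live, key=lambda q: lists[q][pos[q]].left)
        head = lists[qmin][pos[qmin]]
        pos[qmin] += 1
        for s in stacks:  # clean: drop entries that end before head starts
            while s and s[-1][0].right < head.left:
                s.pop()
        if qmin > 0 and not stacks[qmin - 1]:
            continue      # assumption: no containing parent, skip the element
        ptr = len(stacks[qmin - 1]) - 1 if qmin > 0 else -1
        if qmin == n - 1:
            results.extend(p + (head.name,) for p in expand(qmin - 1, ptr))
        else:
            stacks[qmin].append((head, ptr))
    return results

# Hypothetical document for //A//B//C:
# a1[ a2[ b1[ c1 ] c2 ] ]  b2[ c3 ]
a_list = [Elem("a1", 1, 20), Elem("a2", 2, 15)]
b_list = [Elem("b1", 3, 10), Elem("b2", 21, 30)]
c_list = [Elem("c1", 4, 5), Elem("c2", 11, 12), Elem("c3", 22, 23)]

solutions = path_stack([a_list, b_list, c_list])
```

Here c2 has A ancestors but no B ancestor, and c3 has a B ancestor but no A ancestor, so neither produces a solution; the single stack entry for b1 stands for both (a1, b1) and (a2, b1), which is the compact encoding at work.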

Extending PathStack to TwigStack

A first cut
  Decompose a twig into root-to-leaf paths
  Process each path using PathStack
  Merge solutions for all paths

Problem: intermediate results may be big

All authors will be returned by PathStack, though only the last one should be in the final result
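The merge step of this first cut amounts to joining the per-path solutions on the query nodes they share. The sketch below assumes a hypothetical twig with a book root and two branches (title and author) and made-up path solutions; the function name and its shared parameter are illustrative only.

```python
from collections import defaultdict

def merge_paths(left_paths, right_paths, shared=1):
    """Join two lists of root-to-leaf solutions on their first `shared` nodes."""
    by_prefix = defaultdict(list)
    for p in right_paths:
        by_prefix[p[:shared]].append(p[shared:])
    return [lp + tail
            for lp in left_paths
            for tail in by_prefix.get(lp[:shared], [])]

# Hypothetical per-path solutions: only book3 has a matching title,
# yet the author path returns an author for every book.
title_paths = [("book3", "title3")]
author_paths = [("book1", "author1"), ("book2", "author2"),
                ("book3", "author3")]

twig_solutions = merge_paths(title_paths, author_paths)
wasted = len(author_paths) - len(twig_solutions)
```

Two of the three author path solutions are computed and then thrown away by the merge, which is the intermediate-result problem the slide describes.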

TwigStack

Generate solutions for each root-to-leaf path
  Do not use PathStack, which generates all solutions
  Modify PathStack to generate only solutions that are parts of the final result (possible if twig contains only //)
  Specifically, when pushing h_q onto stack S_q, ensure that
    h_q has a descendant h_q’ in each input list T_q’ where q’ is a child of q
    Each h_q’ recursively satisfies the above property

Merge solutions for all paths

TwigStack still suboptimal for /

Example

Desired result: (A1, B2, C2), (A2, B1, C1)

Initial state: all three stacks empty; ready to push one of A1, B1, C1 onto a stack

If we want to ensure that non-contributing nodes are never pushed onto the stack, then
  Cannot decide on A1 unless we see B2 and C2
  Cannot decide on B1 or C1 unless we see A2

[Figure: document tree over A1, A2, B1, C1, B2, C2 and the query twig over A, B, C]

Optimization using an index

Idea: if there are indexes on input lists ordered by left, use these indexes to skip lists more efficiently

Example: Niagara’s ZigZag join on A//B
  After advancing to the second A, use the index on B list to go directly to the first joining B, instead of scanning B list linearly
  When processing a B, use the index on A list to skip
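The B-side skip can be illustrated with a binary search over the sorted left positions standing in for an index. This is a simplified sketch, not Niagara's ZigZag join: it only shows jumping into the B list for each A, assumes properly nested elements, and uses a hypothetical Elem type and sample data.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple

class Elem(NamedTuple):
    name: str
    left: int
    right: int

def skip_join(alist, blist):
    """A//B join that seeks into blist instead of scanning it linearly."""
    b_lefts = [b.left for b in blist]   # stand-in for an index on left
    out = []
    for a in alist:
        start = bisect_right(b_lefts, a.left)  # first b starting after a.left
        end = bisect_left(b_lefts, a.right)    # first b starting past a.right
        out.extend((a.name, b.name) for b in blist[start:end])
    return out

# Hypothetical data: a1 contains no B, b1 lies outside every A.
alist = [Elem("a1", 1, 4), Elem("a2", 7, 12)]
blist = [Elem("b1", 5, 6), Elem("b2", 8, 9), Elem("b3", 10, 11)]

pairs = skip_join(alist, blist)
```

For a2 the search lands directly on b2 without touching b1, which is the effect the index is meant to provide when long stretches of the B list cannot join.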
