Advanced Database Systems-Lecture 13 Slides-Computer Science, Slides of Database Management Systems (DBMS)

Query Processing, Systems View, Physical Plan Execution, Iterator Interface, Iterator for Table Scan, open(), getNext(), close(), Iterator for Nested-loop Join, open(), Iterator for 2-pass Merge Sort, Blocking vs. Non-Blocking Iterators, Execution of an Iterator Tree, Memory Management for DBMS, Buffer Manager Basics, Standard OS Replacement Policies, Problems With OS Buffer Management, Performance Problems, Replacement Policy, Prefetch Policy, Crash Recovery, Old Algorithms, Query Locality S

Typology: Slides

2011/2012

Uploaded on 01/28/2012

arold
arold šŸ‡ŗšŸ‡ø

4.7

(24)

372 documents

1 / 13

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
Query Processing: A Systems View
CPS 216
Advanced Database Systems
2
Announcements (March 1)
īš™Reading assignment due Wednesday
Buffer management
īš™Homework #2 due this Thursday
īš™Course project proposal due in one week
īš™Midterm next Thursday in class
Open book, open notes
3
Physical (execution) plan
īš™A complex query may involve multiple tables and
various query processing processing algorithms
E.g., table scan, index nested-loop join, sort-merge join,
hash-based duplicate elimination…
īš™A physical plan for a query tells the DBMS query
processor how to execute the query
A tree of physical plan operators
Each operator implements a query processing algorithm
Each operator accepts a number of input tables/streams
and produces a single output table/stream
pf3
pf4
pf5
pf8
pf9
pfa
pfd

Partial preview of the text

Download Advanced Database Systems-Lecture 13 Slides-Computer Science and more Slides Database Management Systems (DBMS) in PDF only on Docsity!

Query Processing: A Systems View

CPS 216

Advanced Database Systems

2

Announcements (March 1)

Ā™ Reading assignment due Wednesday

ƒ Buffer management

Ā™ Homework #2 due this Thursday

Ā™ Course project proposal due in one week

Ā™ Midterm next Thursday in class

ƒ Open book, open notes

3

Physical (execution) plan

Ā™ A complex query may involve multiple tables and

various query processing processing algorithms

ƒ E.g., table scan, index nested-loop join, sort-merge join,

hash-based duplicate elimination…

Ā™ A physical plan for a query tells the DBMS query

processor how to execute the query

ƒ A tree of physical plan operators

ƒ Each operator implements a query processing algorithm

ƒ Each operator accepts a number of input tables/streams

and produces a single output table/stream

Examples of physical plans

Ā™ Many physical plans for a single query

ƒ Equivalent results, but different costs and assumptions! )DBMS query optimizer picks the ā€œbestā€ possible physical plan

PROJECT ( title ) INDEX-NESTED-LOOP-JOIN ( CID )

Index on Enroll( SID )

Index on Course ( CID )

Index on Student ( name )

INDEX-SCAN ( name = ā€œBartā€)

INDEX-NESTED-LOOP-JOIN ( SID )

PROJECT ( title ) MERGE-JOIN ( CID )

SORT ( CID )SCAN (Course) MERGE-JOIN ( SID )

SCAN ( Enroll )

SORT ( SID )

SCAN ( Student )

FILTER ( name = ā€œBartā€)

SELECT Course.title FROM Student, Enroll, Course WHERE Student.name = ā€˜Bart’ AND Student.SID = Enroll.SID AND Enroll.CID = Course.CID;

5

Physical plan execution

Ā™ How are intermediate results passed from child

operators to parent operators?

ƒ Temporary files

  • Compute the tree bottom-up
  • Children write intermediate results to temporary files
  • Parents read temporary files

ƒ Iterators

  • Do not materialize intermediate results
  • Children pipeline their results to parents

6

Iterator interface

Ā™ Every physical operator maintains its own execution

state and implements the following methods:

ƒ open(): Initialize state and get ready for processing

ƒ getNext(): Return the next tuple in the result (or a null

pointer if there are no more tuples); adjust state to allow

subsequent tuples to be obtained

ƒ close(): Clean up

Blocking vs. non-blocking iterators

Ā™ A blocking iterator must call getNext()

exhaustively (or nearly exhaustively) on its children

before returning its first output tuple

ƒ Examples:

Ā™ A non-blocking iterator expects to make only a few

getNext() calls on its children before returning its

first (or next) output tuple

ƒ Examples:

11

Execution of an iterator tree

Ā™ Call root.open()

Ā™ Call root.getNext() repeatedly until it returns null

Ā™ Call root.close()

) Requests go down the tree

) Intermediate result tuples go up the tree

) No intermediate files are needed

12

Memory management for DBMS

Ā™ DBMS operations require main memory

ƒ While data resides on disk, it is manipulated in memory ƒ Sometimes the more memory the better, e.g., sort

Ā™ One approach: let each operation pre-allocate some amount

of ā€œprivateā€ memory and manage it explicitly

ƒ Not very flexible ƒ Limits sharing and reuse

Ā™ Alternative approach: use a buffer manager

ƒ Responsible for reading/writing data blocks from/to disk as needed ƒ Higher-level code can be written without worrying about whether data is in memory or not

Buffer manager basics

Ā™ Buffer pool: a global pool of frames (main-memory blocks)

)Some systems use separate pools for different objects (e.g., tables and indexes) and for different operations (e.g., sorting and others)

Ā™ Higher-level code can pin and unpin a frame

ƒ Pin: I need to work on this frame in memory ƒ Unpin: I no longer need this frame ƒ A completely unpinned frame is a candidate for replacement )In some systems you can hate a frame (i.e., suggesting it for replacement)

Ā™ A frame becomes dirty when it is modified

ƒ Only dirty frames need to be written back to disk )Related to transaction processing

14

Standard OS replacement policies

Ā™ Example

ƒ Current buffer pool: 0, 1, 2 ƒ Past requests: 0, 1, 2 ƒ Incoming requests: 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, … )Which frame to replace?

Ā™ Optimal: replace the frame that will not be used for the

longest time (2)

Ā™ Random (0, 1, or 2 with equal probability)

Ā™ LRU: least recently used (0)

Ā™ LRU approximation: clock, aging

Ā™ MRU: most recently used (2)

15

Problems with OS buffer management

Stonebraker. ā€œOperating System Support for Database Management.ā€ CACM , 1981.

Ā™ Performance problems

ƒ Getting a page from the OS to user space is usually a system call (process switch) and copy

Ā™ Replacement policy

ƒ LRU, clock, etc. often ineffective ƒ DBMS knows access pattern in advance and therefore should dictate policy → major OS/DBMS distinction

Ā™ Prefetch policy

ƒ DBMS knows of multiple ā€œordersā€ for a set of records; OS only knows physical order

Ā™ Crash recovery

ƒ DBMS needs more control

Hot set algorithm

) Exploit query behavior more!

Ā™ A set of pages that are accessed over and over form a hot set

ƒ ā€œHot pointsā€ in the graph of buffer size vs. number of page faults ƒ Example: For nested-loop join R  S , size of hot set is B ( S ) + 1 (under LRU)

Ā™ Each query is given enough memory for its hot set

Ā™ Admission control: Do not let a query into the system

unless its hot set fits in memory

Ā™ Replacement: LRU within each hot set (seems arbitrary)

Ā™ Derivation of hot set assumes LRU, which may be

suboptimal

ƒ Example: What is better for nested-loop join?

20

Query locality set model

Ā™ Observations

ƒ DBMS supports a limited set of operations

ƒ Reference patterns are regular and predictable

ƒ Reference patterns can be decomposed into simple

patterns

Ā™ Reference pattern classification

ƒ Sequential

ƒ Random

ƒ Hierarchical

21

Sequential reference patterns

Ā™ Straight sequential: read something sequentially once

ƒ Example: selection on unordered table )Each page is only touched once, so just buffer one page

Ā™ Clustered sequential: repeatedly read a ā€œchunkā€ sequentially

ƒ Example: merge join; rows with the same join column value are scanned multiple times )Keep all pages in the chunk in buffer

Ā™ Looping sequential: repeatedly read something sequentially

ƒ Example: nested-loop join )Keep as many pages as possible in buffer, with MRU replacement

Random reference patterns

Ā™ Independent random: truly random accesses

ƒ Example: index scan through a non-clustered (e.g.,

secondary) index yields random data page access

)The larger the buffer the better?

Ā™ Clustered random: random accesses that happen to

demonstrate some locality

ƒ Example: in an index nested-loop join, inner index is

non-clustered and non-unique, while outer table is

clustered and non-unique

)Try to keep in buffer data pages of the inner table

accessed in one cluster

23

Hierarchical reference patterns

Ā™ Example: operations on tree indexes

Ā™ Straight hierarchical: regular root-to-leaf traversal

Ā™ Hierarchical with straight sequential: traversal

followed by straight sequential on leaves

Ā™ Hierarchical with clustered sequential: traversal

followed by clustered sequential on leaves

Ā™ Looping hierarchical: repeatedly traverse an index

ƒ Example: index nested-loop join

)Keep the root index page in buffer

24

DBMIN algorithm

Ā™ Associate a chunk of memory with each file instance (each

table in FROM)

ƒ This chunk is called the file instance’s locality set ƒ Instances of the same table may share buffered pages ƒ But each locality set has its own replacement policy )Based on how query processing uses each relation (finally!) )No single policy for all pages accessed by a query )No single policy for all pages in a table

Ā™ Estimate locality set sizes by examining the query plan and

database statistics

Ā™ Admission control: a query is allowed to run if its locality

sets fit in free frames

Locality sets for more ref. patterns

Ā™ Straight hierarchical, hierarchical/straight sequential: just

like straight sequential

ƒ Size = 1 ƒ Just replace as needed

Ā™ Hierarchical/clustered sequential: like clustered sequential

ƒ Size = number of index pages in the largest cluster ƒ FIFO or LRU

Ā™ Looping hierarchical

ƒ At each level of the index you have random access among pages ƒ Use Yao’s formula to figure out how many pages need to be accessed at each level ƒ Size = sum over all levels that you choose to worry about ƒ LIFO with 3-4 buffers should be okay

29

Simulation study

Ā™ Hybrid simulation model

ƒ Trace-driven simulation

  • Recorded from a real system (running Wisconsin Benchmark)
  • For each query, record its execution trace
    • Page read/write, file open/close, etc.

ƒ Distribution-driven simulation

  • Generated by some stochastic model
  • Synthesize the workload by merging query execution traces

Ā™ Simulator models CPU, memory, and one disk

Ā™ Performance metric: query throughput

30

Workload

Ā™ Mix 1: all six types equally likely

Ā™ Mix 2: I and II together appear 50% of the time

Ā™ Mix 3: I and II together appear 75% of the time

Mix 1 (no data sharing)

Ā™ Thrashing is evident

for simple algorithms

with no load control

Ā™ Working set (a popular

OS choice) fails to

capture join loops for

queries with high

memory demand (types

V and VI)

ƒ It still functions (though suboptimally) with large number of current queries (NCQ)

DBMIN Hot set

Working set

32

Mix 3 (no data sharing)

Ā™ Thrashing is still

evident

Ā™ Working set fares

better because mix 3

has more simple

queries and fewer

ones of types V and

VI

DBMIN Hot set

Working set

33

Mix 1 (full data sharing)

Ā™ With full data

sharing, locality is

easier to capture

ƒ Performance

improves across the

board and the gap

disappears

ƒ Random and FIFO

do not capture

locality as effectively

as others

DBMIN and others Random and FIFO

Conclusion

Ā™ Same basic access patterns come up again and again

in query processing

Ā™ Make buffer manager aware of these access patterns

) Look at the workload, not just the content

ƒ Contents can at best offer guesses at likely workloads