Database System Implementation: Cost-based Query Optimization and Result Estimation, Study notes of Computer Science

A set of lecture notes for cs 4420 database system implementation course at georgia tech, focusing on query optimization, specifically cost-based query optimization and intermediate result estimation. The notes include an introduction, midterm questions and answers, lecture outline, and detailed explanations on estimating intermediate results and i/o costs.

Typology: Study notes

Pre 2010

Uploaded on 08/05/2009

koofers-user-n5r
koofers-user-n5r 🇺🇸

10 documents

1 / 5

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
1
DB10 Introduction
Prepared by Ling Liu
CS 4420 Database System Implementation
Query Optimization
(Part II)
Ling Liu
Associate Professor
College of Computing, Georgia Tech
2
DB10 Introduction
Prepared by Ling Liu
Midterm 1 Questions/Answers
Notes
Chapters (suggest to read)
zChapter 1
zChapter 2 – Section 2.1 to Section 2.4
zChapter 3 – Section 3.1, Section 3.2, Section 3.4, Section 3.5
zChapter 4
zChapter 5 – Section 5.1 to Section 5.3.5
zChapter 6 – Section 6.1 to Section 6.4, Section 6.6, Section
6.7
zChapter 7 – Section 7.1 to Section 7.5.3, Section 7.7 to
Section 7.7.5
pf3
pf4
pf5

Partial preview of the text

Download Database System Implementation: Cost-based Query Optimization and Result Estimation and more Study notes Computer Science in PDF only on Docsity!

1 Prepared by Ling Liu

CS 4420 Database System Implementation

Query Optimization

(Part II)

Ling Liu Associate Professor College of Computing, Georgia Tech

2

DB10 (^) Introduction

Midterm 1 Questions/Answers

„ Notes

„ Chapters (suggest to read)

z Chapter 1 z Chapter 2 – Section 2.1 to Section 2. z Chapter 3 – Section 3.1, Section 3.2, Section 3.4, Section 3. z Chapter 4 z Chapter 5 – Section 5.1 to Section 5.3. z Chapter 6 – Section 6.1 to Section 6.4, Section 6.6, Section

z Chapter 7 – Section 7.1 to Section 7.5.3, Section 7.7 to Section 7.7.

3 Prepared by Ling Liu

Lecture Outline

„ Review of Last Lecture:

ÆCost-based Query Optimization

)Intermediate Result Estimation

„ Today’s Lecture

z Cost-based Query Optimization

) I/O Blocks Estimation

) Choosing the optimal query execution

plan

4

DB10 (^) Introduction

Relational Query Optimization

„ High level (Transformation-based) Optimization

(Logical Query Plans)

z Goal: Reduce the search space of logical query plans z Mechanisms: Æ Algebraic Transformation or Query rewritting Æ Heuristics for good transformations ) Push Selection and Projection closer to the base relations in a SQL query tree. ) Reduce intermediate result size as much as possible

„ Low Level (cost-based) Optimization

(Executable Query Plans)

7 Prepared by Ling Liu

Estimate Intermediate Results

z T(R) : # tuples in R z S(R) : # of bytes in each R tuple z V(R, A) : # distinct values in R for attribute A z B(R) = # of blocks containing R tuples

T(W) = f * T(R) fA=a=

V(R,Z)

Selection: W = σCOND(R)

Max(R,Z) – a + 1

Max(R,Z) – Min(R,Z) + 1

f A>a=

f A>a= 1/2 or 1/

8

DB10 (^) Introduction

Estimate Intermediate Results

Join: W = R1 A R

T(W) = T(R2) * T(R1)

Max(V(R1,A),V(R2, A))

Projection: W = πA(R)

T(W) = T(R)

9 Prepared by Ling Liu

Estimating I/O costs

„ Keep statistics for relation R

zT(R) : # tuples in R zS(R) : # of bytes in each R tuple zV(R, A) : # distinct values in R for attribute A z B(R) = # of blocks containing R tuples z f(R) = max # of tuples of R per block z M = # memory blocks available z HT(i) = # levels in index I z B(i) = # of leaf blocks in index i

10

DB10 (^) Introduction

  • Estimating cost of query plan

(1) Estimating size of results

Selection, Projection, Join

(2) Estimating # of IOs

„ Count # of disk blocks that must be read (or

written) to execute query plan

„ Factors considered

z Relations Contiguous or not z Index or not z Hash or not