Query Processing and Optimization, Study notes of Distributed Database Management Systems

Translating SQL queries into relational algebra, Basic algorithms - Sorting: internal sorting and external sorting - Implementing the SELECT operation - Implementing the JOIN operation - Implementing the Project operation - Other operations • Heuristics for query optimization, What is a heap? • MaxHeap and Maintenance of MaxHeaps - MaxHeapify - BuildMaxHeap • Heapsort - Algorithm - Heapsort analysis

Typology: Study notes

2024/2025

Available from 06/29/2026

parineeta-deb
parineeta-deb 🇮🇳

5 documents

1 / 122

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Sept. 2023 Dr. Yangjun Chen ACS-4902 1
Query Processing and Optimization
(Ch. 18, 3rd Ch. 15, 4th ed. 5th ed., Ch. 19, 6th
ed., Ch. 18, 19, 7th ed.)
Processing a high-level query
Translating SQL queries into relational algebra
Basic algorithms
- Sorting: internal sorting and external sorting
- Implementing the SELECT operation
- Implementing the JOIN operation
- Implementing the Project operation
- Other operations
Heuristics for query optimization
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4b
pf4c
pf4d
pf4e
pf4f
pf50
pf51
pf52
pf53
pf54
pf55
pf56
pf57
pf58
pf59
pf5a
pf5b
pf5c
pf5d
pf5e
pf5f
pf60
pf61
pf62
pf63
pf64

Partial preview of the text

Download Query Processing and Optimization and more Study notes Distributed Database Management Systems in PDF only on Docsity!

Query Processing and Optimization

(Ch. 18, 3

rd

- Ch. 15, 4

th ed. 5

th ed., Ch. 19, 6

th

ed., Ch. 18, 19, 7

th ed.)

  • Processing a high-level query
  • Translating SQL queries into relational algebra
  • Basic algorithms
    • Sorting: internal sorting and external sorting
    • Implementing the SELECT operation
    • Implementing the JOIN operation
    • Implementing the Project operation
    • Other operations
  • Heuristics for query optimization
  • Steps of processing a high-level query

Scanning, Parsing, Validating

Query in a high-level language

Intermediate form of query

Query optimization

Execution plan

Query code generation

Code to execute the query

Runtime database processor

Result of query

SELECT MAX(SALARY)

FROM EMPLOYEE

WHERE DNO = 5

SELECT LNAME, FNAME

FROM EMPLOYEE

WHERE SALARY > c

F

MAX SALARY

DNO=

(EMPLOYEE))

LNAME FNAME

SALARY>C

(EMPLOYEE))

  • Translating SQL queries into relational algebra
    • translate query blocks into relational algebra expressions
  • Basic algorithms
    • sorting: internal sorting and external sorting
    • algorithm for SELECT operation
    • algorithm for JOIN operation
    • algorithm for PROJECT operation
    • algorithm for SET operations
    • implementing AGGREGATE operation
    • implementing OUTER JOIN
  • Basic algorithms
    • different internal sorting methods:

sorting numbers

Input n numbers. Sort them such that the numbers are

ordered increasingly.

3 9 1 6 5 4 8 2 10 7

1 2 3 4 5 9 7 8 9 10

main idea:

1st step: 3 9 1 6 5 4 8 2 10 7

2nd step: 1 9 3 6 5 4 8 2 10 7

1 2 3 6 5 4 8 9 10 7

… ...

swap

swap

  • Basic algorithms
    • A simple sorting algorithm

Heapsort

  • What is a heap?
  • MaxHeap and Maintenance of MaxHeaps
    • MaxHeapify
    • BuildMaxHeap
  • Heapsort
    • Algorithm
    • Heapsort analysis

Heapsort

  • Combines the better attributes of merge sort and

insertion sort.

  • Like merge sort, but unlike insertion sort, running time

is O ( n lg n ).

  • Like insertion sort, but unlike merge sort, sorts in

place.

  • Introduces an algorithm design technique
    • Create data structure ( heap ) to manage information

during the execution of an algorithm.

  • The heap has other applications beside sorting.
    • Priority Queues

Data Structure Binary Heap

  • length[ A ] – number of elements in array A.
  • heap-size[ A ] – number of elements in heap stored in A.
    • heap-size[ A ]  length[ A ]

1

(^2 )

4 5 6 7

(^8 9 )

1 2 3 4 5 6 7 8 9 10

Searching the tree in breadth-first

fashion, we will get the array.

Last row filled from left to right.

Heap Property (Max and Min)

  • Max-Heap
    • For every node excluding the root, the value stored in

that node is at most that of its parent: A [ parent [ i ]]  A [ i ]

  • Largest element is stored at the root.
  • In any subtree, no values are larger than the value

stored at the subtree’s root.

  • Min-Heap
    • For every node excluding the root, the value stored in

that node is at least that of its parent: A [ parent [ i ]]  A[ i ]

  • Smallest element is stored at the root.
  • In any subtree, no values are smaller than the value

stored at the subtree’s root

1 2 3 4 5 6 7 8 9 10

1

(^2 )

4 5 6 7

(^8 9 )

exchange

Heaps in Sorting

  • Use max-heaps for sorting.
  • The array representation of a max-heap is not sorted.
  • Steps in sorting

(i) Convert the given array of size n to a max-heap ( BuildMaxHeap )

(ii) Swap the first and last elements of the array.

  • Now, the largest element is in the last position – where it

belongs.

  • That leaves n – 1 elements to be placed in their appropriate

locations.

  • However, the array of first n – 1 elements is no longer a max-

heap.

  • Float the element at the root down one of its subtrees so that

the array remains a max-heap ( MaxHeapify )

  • Repeat step (ii) until the array is sorted.

MaxHeapify

  • Suppose two subtrees are max-heaps,

but the root violates the max-heap

property.

  • Fix the offending node by exchanging the value at the node

with the larger of the values at its children.

  • May lead to the subtree at the child not being a max

heap.

  • Recursively fix the children until all of them satisfy the max-

heap property.

MaxHeapify – Example

MaxHeapify( A , 2)

14

1

2 3

4 5 6 7

8

9 10