











Notes on parallel query processing in database systems. It covers topics such as rule-based fragmentation, localization, optimization, sorting, join algorithms, and privacy-preserving join. The notes also discuss various parallel operations like duplicate elimination and aggregates.
Typology: Slides












CS 347 Notes 03 7
E.g.: in conditions: (S.A=1) ∧ (S.A>5) ⟺ False; (S.A<10) ∧ (S.A<5) ⟺ S.A<5
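The two simplifications above can be checked mechanically by treating each range condition as an interval and intersecting them. This is a minimal sketch, assuming S.A is integer-valued so that strict bounds can be written as inclusive ones; the `conjoin` helper and the tuple encoding are illustrative, not part of the notes.

```python
import math

INF = math.inf

def conjoin(p, q):
    """AND two conditions given as inclusive intervals (lo, hi).

    Returns the intersected interval, or None when the conjunction
    is unsatisfiable (i.e. it simplifies to False).
    """
    lo = max(p[0], q[0])
    hi = min(p[1], q[1])
    return (lo, hi) if lo <= hi else None

# Assumption: integer attribute, so S.A > 5 becomes [6, inf), S.A < 10 becomes (-inf, 9].
eq_1 = (1, 1)        # S.A = 1
gt_5 = (6, INF)      # S.A > 5
lt_10 = (-INF, 9)    # S.A < 10
lt_5 = (-INF, 4)     # S.A < 5

contradiction = conjoin(eq_1, gt_5)   # no value satisfies both
simplified = conjoin(lt_10, lt_5)     # the tighter bound wins
```

Here `contradiction` is `None` (the conjunction is False) and `simplified` equals the interval for S.A<5.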
E.g.: Push conditions down
(1) Start with query
(2) Replace relations by fragments
(4) Simplify – eliminate unnecessary operations
Notation: [R: cond], where R is a fragment and cond is the condition its tuples satisfy
[R: False] = Ø
[R1 ⋈ S2 : False] = Ø (join on A)
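The rule above, that a join of two fragments with contradictory conditions is empty, lets the localized plan drop whole fragment pairs before any data moves. The sketch below assumes hypothetical fragment conditions on an integer attribute A (the actual fragment definitions are not preserved in these notes) and keeps only the pairs whose conditions can both hold.

```python
import math
from itertools import product

INF = math.inf

# Hypothetical fragment conditions on integer attribute A, as inclusive intervals.
r_frags = {"R1": (-INF, 4), "R2": (5, 10), "R3": (11, INF)}
s_frags = {"S1": (-INF, 4), "S2": (5, INF)}

def overlaps(p, q):
    """True when some value of A satisfies both fragment conditions."""
    return max(p[0], q[0]) <= min(p[1], q[1])

# Keep only the fragment joins whose combined condition is not False.
surviving = [
    (r, s)
    for (r, r_cond), (s, s_cond) in product(r_frags.items(), s_frags.items())
    if overlaps(r_cond, s_cond)
]
```

With these assumed conditions, three of the six pairings survive; R1 ⋈ S2 is among those eliminated.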
Example C (2)
[R1 ⋈ S2 : False] = Ø (join on K; K is key of R, R1)
Simplified result: (R1 ⋈ S1) ∪ (R2 ⋈ S2), joins on K
Example D (1): query π_A(R), where R is split vertically into R1(K, A, B) and R2(K, C, D)
K not really needed
Input:
(a) relation R on single site/disk
(b) R fragmented/partitioned by sort attribute
(c) R fragmented/partitioned by other attribute
Output:
(a) sorted R on single site/disk
(b) fragments/partitions sorted
[Figure: splitting vector (k0, k1) over the sort-key range]
[Figure: shared nothing vs. shared memory, with two processors sorting fragments F1 and F2; in the shared-nothing case the fragments travel over the network]
[Figure: range-partitioning sort. Fragments Ra, Rb, Rc are split by (k0, k1), each site performs a local sort, and the sorted partitions form the result]
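A range-partitioning sort of this kind can be sketched in a few lines: every fragment routes each key to the site that owns its range, and each site then sorts locally. The sample keys and the splitting vector `[5, 10]` are made-up values for illustration, and the "sites" are simulated as in-memory lists.

```python
from bisect import bisect_right

def range_partition_sort(fragments, splits):
    """Route keys to sites by a splitting vector, then sort each site locally.

    With splits = [k0, k1], site 0 gets keys < k0, site 1 gets
    k0 <= key < k1, and site 2 gets keys >= k1.
    """
    sites = [[] for _ in range(len(splits) + 1)]
    for fragment in fragments:
        for key in fragment:
            sites[bisect_right(splits, key)].append(key)
    return [sorted(site) for site in sites]   # local sort at each site

# Hypothetical fragments Ra, Rb, Rc and splitting vector (k0, k1).
fragments = [[7, 3, 12], [1, 20, 9], [15, 4, 6]]
splits = [5, 10]

parts = range_partition_sort(fragments, splits)
# Concatenating the sorted partitions in site order yields the full sort.
result = [key for part in parts for key in part]
```

Because the partitions cover disjoint, ordered key ranges, no merge step is needed after the local sorts.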
Coordinator receives:
SA: Min=5, Max=10, # = 10 tuples
SB: Min=7, Max=17, # = 10 tuples
Expected tuples: k0? [assuming we want to sort at 2 sites]
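One way to pick k0 from these statistics is to assume keys are uniformly distributed between each site's Min and Max, and then search for the key below which half of the expected tuples fall. The uniformity assumption and the bisection search are illustrative choices, not necessarily the method intended in the notes.

```python
def expected_below(k, stats):
    """Expected number of tuples with key < k, assuming uniform keys per site."""
    total = 0.0
    for lo, hi, count in stats:
        fraction = min(max((k - lo) / (hi - lo), 0.0), 1.0)
        total += count * fraction
    return total

def split_key(stats, iterations=60):
    """Bisection search for the key that balances two sort sites."""
    lo = min(s[0] for s in stats)
    hi = max(s[1] for s in stats)
    target = sum(s[2] for s in stats) / 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if expected_below(mid, stats) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# (Min, Max, tuple count) as reported by SA and SB.
stats = [(5, 10, 10), (7, 17, 10)]
k0 = split_key(stats)
```

Under this assumption k0 comes out at 9: SA is expected to hold 8 tuples below 9 and SB 2, giving 10 of the 20 tuples to each sort site.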
Input: Relations R, S; may or may not be partitioned
Output: R ⋈ S; result at one or more sites
[Figure: partitioned join. Fragments Ra, Rb of R and Sa, Sb, Sc of S are repartitioned by the same function f(A); each site performs a local join and the results are combined]
Local joins: R1 ⋈ S1, R2 ⋈ S2, R3 ⋈ S3
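The partitioned join can be sketched as follows: both relations are repartitioned on the join attribute with the same function f, so matching tuples always land at the same site, and each site joins only its own partitions. Here f(A) = A mod 3 and the sample tuples are assumptions made for the example.

```python
from collections import defaultdict

def partition(fragments, n_sites):
    """Repartition tuples by f(A) = A mod n_sites (A is the first field)."""
    sites = [[] for _ in range(n_sites)]
    for fragment in fragments:
        for t in fragment:
            sites[t[0] % n_sites].append(t)
    return sites

def local_join(r_part, s_part):
    """Hash join of one site's R and S partitions on A."""
    index = defaultdict(list)
    for r in r_part:
        index[r[0]].append(r)
    return [(r[0], r[1], s[1]) for s in s_part for r in index.get(s[0], [])]

def partitioned_join(r_frags, s_frags, n_sites=3):
    r_sites = partition(r_frags, n_sites)
    s_sites = partition(s_frags, n_sites)
    result = []
    for r_part, s_part in zip(r_sites, s_sites):   # R_i joins only S_i
        result.extend(local_join(r_part, s_part))
    return result

# Hypothetical input fragments Ra, Rb and Sa, Sb, Sc.
r_frags = [[(1, "r1"), (2, "r2")], [(3, "r3"), (4, "r4")]]
s_frags = [[(1, "s1")], [(3, "s3")], [(5, "s5"), (3, "s3b")]]
joined = partitioned_join(r_frags, s_frags)
```

This only works for equi-joins, since it relies on equal keys hashing to the same site.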
[Figure: R fragments (Ra, Rb) are partitioned by f; S fragments (Sa, Sb) are unioned; each site performs a local join and the results are combined]
[Figure: R fragments (Ra, Rb) are partitioned by f; n copies of each fragment -> 3 fragments]
Result: all n x m pairings of R, S fragments
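Joining all n x m pairings of fragments works for any join condition, because every R tuple meets every S tuple at exactly one site. The sketch below simulates this with a non-equality condition; the round-robin `split` helper, the sample values, and the condition R.A < S.A are assumptions for illustration.

```python
from itertools import product

def split(relation, k):
    """Cut a relation into k fragments, round-robin."""
    return [relation[i::k] for i in range(k)]

def fragment_replicate_join(r, s, n, m, theta):
    """Join every pairing of the n R-fragments with the m S-fragments."""
    r_frags, s_frags = split(r, n), split(s, m)
    result = []
    for r_frag, s_frag in product(r_frags, s_frags):   # one site per pairing
        result.extend((a, b) for a in r_frag for b in s_frag if theta(a, b))
    return result

r = [1, 4, 6, 9]
s = [2, 5, 8]
pairs = fragment_replicate_join(r, s, n=2, m=3, theta=lambda a, b: a < b)
```

The cost is replication: each R fragment is sent to m sites and each S fragment to n sites.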
R ⋈ S = R ⋈ (S ⋉ R) or
R ⋈ S = (R ⋉ S) ⋈ (S ⋉ R)
(joins and semi-joins on A)
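The first rewriting above can be sketched as follows: the site holding R ships only its join-attribute values, the site holding S discards tuples that cannot match, and only the reduced S is shipped for the final join. The relations and helper names below are made up for the example.

```python
def semijoin(s, r_keys):
    """S ⋉ R: keep the S tuples whose join attribute appears in R."""
    return [t for t in s if t[0] in r_keys]

def join(r, s):
    """Plain nested-loop join on the first attribute."""
    return [(a, b, c) for (a, b) in r for (a2, c) in s if a == a2]

r = [(1, "b1"), (2, "b2"), (3, "b3")]
s = [(2, "c2"), (3, "c3"), (4, "c4"), (5, "c5")]

r_keys = {a for a, _ in r}         # projection of R on A, shipped to S's site
s_reduced = semijoin(s, r_keys)    # only these tuples travel back
result = join(r, s_reduced)
```

The reduction pays off when the projection of R on A plus the reduced S is smaller than S itself.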
[Figure: bit vector with one bit per possible key]
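A bit vector with one bit per possible key can stand in for the list of join values when reducing the other relation. This is a minimal sketch, assuming keys are small non-negative integers and using a Python integer as the bit vector; the function names are illustrative.

```python
def build_bitvector(keys):
    """Set bit k for every key k present (keys assumed to be small ints >= 0)."""
    bits = 0
    for k in keys:
        bits |= 1 << k
    return bits

def filter_by_bitvector(tuples, bits):
    """Keep tuples whose key has its bit set."""
    return [t for t in tuples if (bits >> t[0]) & 1]

r_keys = [1, 2, 3]
s = [(2, "c2"), (3, "c3"), (4, "c4"), (5, "c5")]

bits = build_bitvector(r_keys)          # shipped instead of the key list
s_reduced = filter_by_bitvector(s, bits)
```

With exactly one bit per possible key the filter is exact; hashing keys into a shorter vector would save space at the price of false positives.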
Goal: R ⋈ S ⋈ T
Option 1: R’ ⋈ S’ ⋈ T where R’ = R ⋉ S; S’ = S ⋉ T
Option 2: R’’ ⋈ S’ ⋈ T where R’’ = R ⋉ S’; S’ = S ⋉ T
Many options! Number of semi-join options is exponential in # of relations in join
π_A R = (a1, a2, a3, a4)
site 1, R(A, B): (a1, b), (a2, b), (a3, b), (a4, b)
site 2, S(A, C): (a1, c), (a3, c), (a5, c), (a7, c)
π_A R = (h(a1), h(a2), h(a3), h(a4))
Site 2 sees it has h(a1), h(a3)
Matching S tuples: (a1, c1), (a3, c3)
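The hashed exchange can be sketched as follows: site 1 sends only hashes of its A values, and site 2 hashes its own keys to find which of its tuples match. SHA-256 via `hashlib` stands in for h here, and the B and C values are placeholders; both are assumptions rather than details from the notes.

```python
import hashlib

def h(value):
    """Stand-in for the hash function h (assumption: SHA-256 of the text form)."""
    return hashlib.sha256(str(value).encode()).hexdigest()

# Site 1 holds R(A, B); site 2 holds S(A, C). Non-key values are placeholders.
r = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
s = [("a1", "c1"), ("a3", "c3"), ("a5", "c5"), ("a7", "c7")]

# Site 1: ship hashes of the join values instead of the values themselves.
hashed_r_keys = {h(a) for a, _ in r}

# Site 2: it can recognise only the hashes of keys it already holds.
matching_s = [t for t in s if h(t[0]) in hashed_r_keys]
```

Site 2 learns nothing readable about a2 and a4 from their hashes alone, although a plain hash offers little protection when the set of possible keys is small enough to enumerate.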
Example: sum by dept over fragments Ra and Rb
Ra: (1, toy, 10), (2, toy, 20), (3, sales, 15)
Rb: (4, sales, 5), (5, toy, 20), (6, mgmt, 15), (7, sales, 10), (8, mgmt, 30)

Repartition the tuples by dept, then sum at each site:
Site 1 receives (1, toy, 10), (2, toy, 20), (5, toy, 20), (6, mgmt, 15), (8, mgmt, 30); sum: toy 50, mgmt 45
Site 2 receives (3, sales, 15), (4, sales, 5), (7, sales, 10); sum: sales 30

Sum locally first, then repartition the partial sums (less data!):
Site 1 receives toy 30, toy 20, mgmt 45; sum: toy 50, mgmt 45
Site 2 receives sales 15, sales 15; sum: sales 30
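The "less data" variant can be sketched as follows: each fragment computes partial sums per dept, only those partial sums are repartitioned, and the receiving sites add them up. The routing rule in `site_for` mirrors the example (toy and mgmt to one site, sales to the other); the function names are illustrative.

```python
from collections import Counter

ra = [(1, "toy", 10), (2, "toy", 20), (3, "sales", 15)]
rb = [(4, "sales", 5), (5, "toy", 20), (6, "mgmt", 15),
      (7, "sales", 10), (8, "mgmt", 30)]

def local_sums(fragment):
    """Pre-aggregate one fragment: partial sum per dept."""
    sums = Counter()
    for _, dept, amount in fragment:
        sums[dept] += amount
    return sums

def site_for(dept):
    """Routing used in the example: sales to site 1, toy and mgmt to site 0."""
    return 1 if dept == "sales" else 0

sites = [Counter(), Counter()]
shipped = 0                       # number of partial-sum rows sent over the network
for partial in (local_sums(ra), local_sums(rb)):
    for dept, total in partial.items():
        sites[site_for(dept)][dept] += total
        shipped += 1
```

Five partial-sum rows are shipped instead of the eight original tuples, and the final sums are unchanged. The same trick applies to any aggregate that can be combined from partial results, such as count, min, and max.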
But what about indexes?
[Figure: local indexes. Site 1, Site 2, Site 3 each index their own range of keys, split at k0 and k1]
[Figure: index sites kept separate from tuple sites, with the index split at k0 and k1]
As we consider query plans for optimization, we must consider various tricks: