



These notes from CS 347 cover query optimization techniques in parallel and distributed systems. Topics include exhaustive search with pruning, hill climbing, and query separation, with examples and cost comparisons.




CS 347 Notes 04 7
Query optimization (in parallel/distributed system)
[Figure: timeline of a query across Site 1, Site 2, Site 3, Site 4, with phases: startup, distribution, searching + send results, final processing]
(1) Exhaustive (with pruning)
(2) Hill climbing (greedy)
(3) Query separation
(1) Exhaustive
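Example: join R ⋈_A S ⋈_B T, with |R| > |S| > |T|
[Figure: search tree rooted at R ⋈ S ⋈ T. First level: R ⋈ S, R ⋈ T, S ⋈ R, S ⋈ T, T ⋈ S, T ⋈ R. Surviving plans: (S ⋈ R) ⋈ T, done by shipping S to R or by a semi-join, and (T ⋈ S) ⋈ R, done by shipping T to S or by a semi-join.]
Pruning marks in the figure:
1 Prune because cross-product not necessary
2 Prune because larger relation first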
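The two pruning rules in the example can be sketched in code. This is only an illustrative sketch, not the slides' algorithm: the function name `enumerate_first_joins`, the concrete sizes, and the way the join graph is encoded are all assumptions, chosen so that |R| > |S| > |T| and R joins S, S joins T.

```python
from itertools import permutations

def enumerate_first_joins(sizes, join_edges):
    """Enumerate ordered first joins over three relations and apply two prunes.

    sizes: relation name -> size (hypothetical numbers).
    join_edges: set of frozensets naming the pairs that share a join attribute.
    Returns (kept, pruned); kept holds ((left, right), remaining_relation).
    """
    kept, pruned = [], []
    for left, right in permutations(sizes, 2):
        if frozenset((left, right)) not in join_edges:
            # Rule 1: no shared attribute, so this would be a cross-product.
            pruned.append(((left, right), "cross-product not necessary"))
        elif sizes[left] > sizes[right]:
            # Rule 2: the larger relation would come first.
            pruned.append(((left, right), "larger relation first"))
        else:
            (rest,) = set(sizes) - {left, right}  # exactly one relation is left
            kept.append(((left, right), rest))
    return kept, pruned

# Assumed sizes; only the ordering |R| > |S| > |T| comes from the example.
sizes = {"R": 30, "S": 20, "T": 10}
join_edges = {frozenset("RS"), frozenset("ST")}
kept, pruned = enumerate_first_joins(sizes, join_edges)
```

Under these assumptions the surviving orders are (S ⋈ R) ⋈ T and (T ⋈ S) ⋈ R, which matches the two plans left in the example.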
e.g.: If the goal is parallelism in a system with a fast net, consider partitioning relation(s) first
e.g.: If the goal is reduction of net traffic, consider semi-joins
(2) Hill climbing
[Figure: plan space from worse plans to better plans; starting at an initial plan x, move 1 and then move 2 each lead to a better plan]
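The greedy idea (start from an initial plan, keep moving to a better neighbor) can be sketched generically. This is a minimal sketch, not the course's procedure: `hill_climb`, `toy_cost`, and `toy_neighbors` are hypothetical names, and the "plans" here are plain integers standing in for real query plans.

```python
def hill_climb(initial, neighbors, cost):
    """Greedy search: move to the cheapest neighbor while it improves the cost."""
    current = initial
    while True:
        candidates = list(neighbors(current))
        if not candidates:
            return current
        best = min(candidates, key=cost)
        if cost(best) >= cost(current):
            # No neighbor is strictly better: a local optimum, possibly not global.
            return current
        current = best

# Toy stand-ins (assumptions): a plan is an integer, the cost is lowest at 3.
def toy_cost(plan):
    return (plan - 3) ** 2

def toy_neighbors(plan):
    return [plan - 1, plan + 1]

result = hill_climb(0, toy_neighbors, toy_cost)
```

Because each move must strictly lower the cost, the search always terminates, but it can stop at a plan that is only locally best.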
Example: R ⋈_A S ⋈_B T ⋈_C V

Rel  Site  Size
R    1     10
S    2     20
T    3     30
V    4     40

tuple size = 1
Goal: minimize data transmission
What site do we send all relations to?
To site 1: cost = 20+30+40 = 90
To site 2: cost = 10+30+40 = 80
To site 3: cost = 10+20+40 = 70
To site 4: cost = 10+20+30 = 60
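The comparison above is a sum of the sizes of every relation that has to move. A short sketch, using the relation sizes and sites from the example; the names `sizes`, `ship_all_cost`, and `best_site` are illustrative only.

```python
# Site -> size of the relation stored there (R at 1, S at 2, T at 3, V at 4).
sizes = {1: 10, 2: 20, 3: 30, 4: 40}

def ship_all_cost(target):
    """Data transmitted if every relation not already at `target` is sent there."""
    return sum(size for site, size in sizes.items() if site != target)

costs = {site: ship_all_cost(site) for site in sizes}
best_site = min(costs, key=costs.get)
```

The site holding the largest relation wins, since that relation is the one that never has to be shipped.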
Compute R ⋈ S ⋈ T ⋈ V at site 4
Example: best plan could be:
PB: T (3→4), β = T ⋈ V
    β (4→2), β' = β ⋈ S
    β' (2→1), β'' = β' ⋈ R
    β'' (1→4), compute answer [optional]
33 = total
Costs could be low because β is very selective
(3) Query separation
(a) Compute A values in answer (steps 1,2)
(b) Get tuples from sites with matching A values and compute answer (step 3)
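One possible reading of phases (a) and (b), as a sketch. The figure that defines steps 1 to 3 is not available here, so this is an assumption-laden illustration: two relations at different sites, both with the join attribute A as their first field. The data and the names `matching_a_values` and `fetch_and_join` are hypothetical.

```python
# Hypothetical fragments: (A value, payload) tuples stored at two different sites.
R = [(1, "r1"), (2, "r2"), (3, "r3")]
S = [(2, "s2"), (3, "s3"), (4, "s4")]

def matching_a_values(r, s):
    """(a) Work only with the A columns to find the A values in the answer."""
    return {a for a, _ in r} & {a for a, _ in s}

def fetch_and_join(r, s, a_values):
    """(b) Fetch only tuples whose A value matches, then compute the join."""
    r_hits = [t for t in r if t[0] in a_values]
    s_hits = [t for t in s if t[0] in a_values]
    return [(a, x, y) for a, x in r_hits for b, y in s_hits if a == b]

a_values = matching_a_values(R, S)
answer = fetch_and_join(R, S, a_values)
```

The point of separating the two phases is that phase (a) moves only A values, so full tuples are shipped in phase (b) only when they are known to contribute to the answer.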
“Optimization is like chess playing”
i.e., may have to make sacrifices (move data, partition relations, build indexes) for later gains!