





Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Query Optimization, Advanced Database Systems, Search Space, Left-deep Plans, Heuristic, Greedy Algorithm, Query Optimization in System R, Bottom-up Plan Generation, The Need for “Interesting Order”, Comparing Interesting Orders, System-R Algorithm, Reasoning About Predicates, Bushy Plans, Optimizer “Blow-up”, Transformations, Iterative Improvement, Simulated Annealing, Two-phase Optimization
Typology: Slides
1 / 9
This page cannot be seen from the preview
Don't miss anything!






2
Remember to email me to sign up for a 30-minute slot
3 hours—no time pressure! Open book, open notes Comprehensive, but with emphasis on the second half of the course and materials exercised in homework
3
Join ordering is the most important subproblem
Multiway joins Different join methods Placement of selection and projection operators
R 2 R 1 R 3 R (^4) R 5
5
Heuristic: consider only “left-deep” plans, in which only the left child can be a join
How many left-deep plans are there for R 1 L R (^) n?
R 2 R 1
R 3
R 4
6
S 1 , …, Sn Say selections have been pushed down; i.e., Si = σ p Ri Start with the pair Si , Sj with the smallest estimated size for Si Sj
Repeat until no table is left: Pick Sk from the remaining tables such that the join of Sk and the current result yields an intermediate result of the smallest size
Current subplan
…, Sk , Sl , Sm , …
Remaining tables to be joined
Pick most efficient join method
Sk
Minimize expected size
Complexity?
Comparing their costs is not enough
At most one for each interesting order
11
Pass 1: Find the best single-table plans Pass 2: Find the best two-table plans by considering each single-table plan (from Pass 1) as the outer input and every other table as the inner input … Pass k : Find the best k -table plans by considering each ( k –1)-table plan (from Pass k –1) as the outer input and every other table as the inner input …
Heuristics Push selections and projections down Process cross products at the end
12
No join condition
FROM Student, Enroll, Course WHERE Student.age < 10 AND Student.SID = Enroll.SID AND Enroll.CID = Course.CID AND Course.title LIKE ‘%data%’;
Primary keys/indexes Student(SID), Enroll(CID, SID), Course(CID) Ordered, secondary indexes Student(age), Course(title)
14
Plans for { Student } S1: Table scan, then filter ( age < 10); cost 100; result ordered by SID S2: Index scan using condition ( age < 10); cost 5; result ordered by age Plans for { Enroll } E1: Table scan; cost 1000; result ordered by CID , SID
Plans for { Course } C1: Table scan, then filter ( title LIKE ’%data%’); cost 40; result ordered by CID C2: Index scan with filter ( title LIKE ’%data%’); cost 60; result ordered by title
SELECT SID, CID FROM Student, Enroll, Course WHERE Student.age < 10 AND Student.SID = Enroll.SID AND Enroll.CID = Course.CID AND Course.title LIKE ‘%data%’;
15
Extending best plans for { Student }
☻
SELECT SID, CID FROM Student, Enroll, Course WHERE Student.age < 10 AND Student.SID = Enroll.SID AND Enroll.CID = Course.CID AND Course.title LIKE ‘%data%’;
Heuristics-based query optimization Randomized query optimization (Ioannidis & Kang, SIGMOD 1990) Genetic programming (PostgreSQL)
20
Cost
Space of plans
Plan Transformations
Global optimum
Local optimum
21
Relational algebra equivalences (or query rewrite rules in general):
Join method choice: R method1 S → R method2 S
Join commutativity: R S → S R
Join associativity: ( R S ) T → R ( S T )
Left join exchange: ( R S ) T → R ( T S )
Right join exchange: R ( S T ) → S ( R T )
) Why the last two redundant rules?
Start with a random plan Repeatedly go downhill (i.e., pick a neighbor with a lower cost randomly) to get to a local optimum
23
Repeat until some equilibrium (e.g., a fixed number of iterations):
24