Scapegoat Trees: An Amortized Analysis of Sorted Set Operations
Outline
- Scapegoat Trees (O(log n) amortized time)
- 2-4 Trees (O(log n) worst-case time)
- Red-Black Trees (O(log n) worst-case time)
Scapegoat trees
- Deterministic data structure
- Lazy data structure
  - Only does work when search paths get too long
- Search in O(log n) worst-case time
- Insert/delete in O(log n) amortized time
- Starting with an empty scapegoat tree, a sequence of m insertions and deletions takes O(m log n) time
Scapegoat philosophy
- Rebuilding the tree costs O(n) time
- We cannot do it too often if we want to keep O(log n) amortized time per operation
- How do we know when we need to rebuild the tree?
- Scapegoat trees keep two counters:
  - n: the number of items in the tree (size)
  - q: an overestimate of n
- We maintain the following two invariants:
  - q/2 ≤ n ≤ q
  - No node has depth greater than log_{3/2} q
Search and Delete
- How can we perform a search in a Scapegoat tree?
  - run the standard search algorithm for binary search trees
- How can we delete a value x from a Scapegoat tree?
  - run the standard deletion algorithm for binary search trees
  - decrement n
  - if n < q/2 then
    - rebuild the entire tree and set q = n
- How can we insert a value x into a Scapegoat tree?
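The deletion steps above can be sketched in Python. This is a minimal illustration rather than the course's implementation: the names `DeleteOnlyScapegoat`, `build_balanced`, `bst_delete` and `in_order` are made up for this example, the tree is built already balanced from a sorted list, and insertion is deliberately left out.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_balanced(keys, lo, hi):
    """Build a perfectly balanced BST from the sorted slice keys[lo:hi]."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(keys[mid])
    node.left = build_balanced(keys, lo, mid)
    node.right = build_balanced(keys, mid + 1, hi)
    return node


def in_order(node, out):
    """Append the keys under node to out in sorted order and return out."""
    if node is not None:
        in_order(node.left, out)
        out.append(node.key)
        in_order(node.right, out)
    return out


def bst_delete(node, key):
    """Standard BST deletion. Returns (new_subtree_root, was_removed)."""
    if node is None:
        return None, False
    if key < node.key:
        node.left, removed = bst_delete(node.left, key)
        return node, removed
    if key > node.key:
        node.right, removed = bst_delete(node.right, key)
        return node, removed
    if node.left is None:
        return node.right, True
    if node.right is None:
        return node.left, True
    # Two children: copy the in-order successor's key, then delete it below.
    succ = node.right
    while succ.left is not None:
        succ = succ.left
    node.key = succ.key
    node.right, _ = bst_delete(node.right, succ.key)
    return node, True


class DeleteOnlyScapegoat:
    """Hypothetical helper: search and delete only, with counters n and q."""

    def __init__(self, sorted_keys):
        keys = list(sorted_keys)
        self.root = build_balanced(keys, 0, len(keys))
        self.n = self.q = len(keys)
        self.rebuilds = 0  # number of full rebuilds, kept for illustration

    def contains(self, key):
        """Standard BST search."""
        node = self.root
        while node is not None and node.key != key:
            node = node.left if key < node.key else node.right
        return node is not None

    def remove(self, key):
        self.root, removed = bst_delete(self.root, key)
        if not removed:
            return False
        self.n -= 1
        if 2 * self.n < self.q:  # n < q/2, tested without floating point
            keys = in_order(self.root, [])
            self.root = build_balanced(keys, 0, len(keys))
            self.q = self.n
            self.rebuilds += 1
        return True
```

For a tree built from `range(10)`, the first five removals leave q at 10; the sixth drops n to 4 < q/2 = 5, which triggers the full rebuild and resets q to 4.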
[Figure: example scapegoat tree on the keys 0 to 9, with n = q = 10; the value to insert is u = 3.5]
Inserting into a Scapegoat tree
( easy case )
- Create a node u and insert it in the normal way.
- Increment n and q.
- depth(u) = 4 ≤ log_{3/2} q ≈ 5.91, so no rebuilding is needed.
[Figure: the tree after inserting u = 3.5, with n = q = 11]
Inserting into a Scapegoat tree
( bad case )
[Figure: a scapegoat tree with n = q = 11 in which the new node u = 3.5 ends up at depth d(u) = 6 > log_{3/2} q ≈ 5.91]
- Walk up from u toward the root, testing size(w) > (2/3) size(w.parent) at the nodes w on the path. The checks shown on the slides:
  - size(w) = 1, size(w.parent) = 2: 1 ≤ (2/3)·2 ≈ 1.33
  - size(w) = 3, size(w.parent) = 6: 3 ≤ (2/3)·6 = 4
  - size(w) = 6, size(w.parent) = 7: 6 > (2/3)·7 ≈ 4.67 (Scapegoat)
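The easy and bad cases of insertion can be combined into one sketch. The code below is an illustration under stated assumptions, not the slides' own code: it assumes the subtree that gets rebuilt is the one rooted at w.parent for the first w with size(w) > (2/3) size(w.parent), it records the search path in a list instead of using parent pointers, and the names `ScapegoatTree`, `flatten` and `build` are invented here. The test d > log_{3/2} q is written as 3^d > q·2^d to avoid floating point.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def size(node):
    """Number of nodes in the subtree rooted at node (computed by traversal)."""
    return 0 if node is None else 1 + size(node.left) + size(node.right)


def flatten(node, out):
    """Collect the nodes of a subtree in sorted (in-order) order."""
    if node is not None:
        flatten(node.left, out)
        out.append(node)
        flatten(node.right, out)
    return out


def build(nodes, lo, hi):
    """Relink nodes[lo:hi] (already sorted) into a perfectly balanced subtree."""
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    root = nodes[mid]
    root.left = build(nodes, lo, mid)
    root.right = build(nodes, mid + 1, hi)
    return root


class ScapegoatTree:
    def __init__(self):
        self.root = None
        self.n = 0  # number of items
        self.q = 0  # overestimate of n

    def add(self, key):
        path = []  # ancestors of the new node, root first
        cur = self.root
        while cur is not None:
            if key == cur.key:
                return False  # already present
            path.append(cur)
            cur = cur.left if key < cur.key else cur.right
        u = Node(key)
        if not path:
            self.root = u
        elif key < path[-1].key:
            path[-1].left = u
        else:
            path[-1].right = u
        self.n += 1
        self.q += 1
        d = len(path)  # depth of u
        if 3 ** d > self.q * 2 ** d:  # d > log_{3/2} q: the bad case
            self._rebuild_scapegoat(path, u)
        return True

    def _rebuild_scapegoat(self, path, u):
        w, w_size = u, 1
        for i in range(len(path) - 1, -1, -1):
            parent = path[i]
            sibling = parent.right if parent.left is w else parent.left
            parent_size = w_size + 1 + size(sibling)
            if 3 * w_size > 2 * parent_size:  # size(w) > (2/3) size(w.parent)
                nodes = flatten(parent, [])
                new_root = build(nodes, 0, len(nodes))
                if i == 0:
                    self.root = new_root
                elif path[i - 1].left is parent:
                    path[i - 1].left = new_root
                else:
                    path[i - 1].right = new_root
                return
            w, w_size = parent, parent_size
```

Inserting keys in increasing order is a useful stress test: a plain binary search tree would degenerate into a path, while here the depth test keeps forcing partial rebuilds.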
Why is there always a scapegoat?
- Lemma: if d > log_{3/2} q, where d is the depth of the new node u, then there exists a scapegoat node.
- Proof by contradiction
  - Assume (for contradiction) that we don't find a scapegoat node.
  - Then size(w) ≤ (2/3) size(w.parent) for all nodes w on the path to u
  - So the size of a node at depth i is at most n(2/3)^i
  - But d > log_{3/2} q ≥ log_{3/2} n, so size(u) ≤ n(2/3)^d < n(2/3)^{log_{3/2} n} = n · (1/n) = 1
  - Contradiction! (Since size(u) = 1.) So there must be a scapegoat node.
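The inequality chain in the proof can be written out step by step. The block below restates the slide's argument with the induction on depth made explicit; the notation w_i (the node at depth i on the path from the root w_0 to u = w_d) is introduced here for illustration and is not from the slides.

```latex
\begin{align*}
\mathrm{size}(w_0) &= n \\
\mathrm{size}(w_i) &\le \tfrac{2}{3}\,\mathrm{size}(w_{i-1})
                    \le n\left(\tfrac{2}{3}\right)^{i}
                    && \text{(no scapegoat, by induction on } i\text{)} \\
\mathrm{size}(u) = \mathrm{size}(w_d) &\le n\left(\tfrac{2}{3}\right)^{d}
                    < n\left(\tfrac{2}{3}\right)^{\log_{3/2} n}
                    = n \cdot \tfrac{1}{n} = 1
                    && \text{(since } d > \log_{3/2} q \ge \log_{3/2} n\text{)}
\end{align*}
```

The strict inequality holds because 2/3 < 1, so a larger exponent gives a smaller value. Since u itself is a node, size(u) ≥ 1, which contradicts size(u) < 1.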
Summary
- So far, we know
  - Insert and delete maintain the invariants:
    - the depth of any node is at most log_{3/2} q
    - q ≤ 2n
  - So the depth of any node is at most log_{3/2} 2n ≤ 2 + log_{3/2} n
  - So, we can search in a scapegoat tree in O(log n) time
- Some issues still to resolve
  - How do we keep track of size(w) for each node w?
  - How much time is spent rebuilding nodes during deletion and insertion?
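The depth bound in the summary relies on a small logarithm step that the slide leaves implicit. Spelled out, using only the invariants q ≤ 2n and depth ≤ log_{3/2} q:

```latex
\begin{align*}
\mathrm{depth} \le \log_{3/2} q \le \log_{3/2}(2n)
  &= \log_{3/2} 2 + \log_{3/2} n \\
  &= \frac{\ln 2}{\ln(3/2)} + \log_{3/2} n \\
  &\approx 1.71 + \log_{3/2} n \;\le\; 2 + \log_{3/2} n
\end{align*}
```

Because log_{3/2} n = (ln n) / ln(3/2) is a constant multiple of log n, this gives the O(log n) worst-case search time.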
(Not) keeping track of the size
[Figure: the scapegoat tree from the bad-case insertion example]
- We only need size(w) while looking for a scapegoat
- Knowing size(w), we can compute size(w.parent) by traversing the subtree rooted at sibling(w)
- But we do O(size(v)) work when we rebuild v anyway, so this doesn't add anything to the cost of rebuilding
- So, in O(size(v)) time, we know all sizes up to the scapegoat node
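The on-the-fly size computation can be illustrated with a short sketch that also counts how many nodes it touches. It assumes nodes carry parent pointers and that v, the node to rebuild, is w.parent for the first w with size(w) > (2/3) size(w.parent); the function names and the `counter` argument are hypothetical, added only to make the cost visible.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.parent = None
        self.left = None
        self.right = None


def subtree_size(node, counter):
    """Size of the subtree at node by traversal; counter[0] tallies visits."""
    if node is None:
        return 0
    counter[0] += 1
    return 1 + subtree_size(node.left, counter) + subtree_size(node.right, counter)


def find_scapegoat(u):
    """Walk up from the new leaf u without any stored sizes.

    Returns (v, size_of_v, nodes_visited), where v is the parent of the
    first node w on the path with size(w) > (2/3) size(w.parent), or
    (None, total_size, nodes_visited) if the root is reached first.
    """
    counter = [1]  # u itself counts as visited
    w, w_size = u, 1
    while w.parent is not None:
        p = w.parent
        sibling = p.right if p.left is w else p.left
        # size(w) is already known, so only the sibling's subtree is traversed.
        p_size = w_size + 1 + subtree_size(sibling, counter)
        counter[0] += 1  # visiting p itself
        if 3 * w_size > 2 * p_size:
            return p, p_size, counter[0]
        w, w_size = p, p_size
    return None, w_size, counter[0]
```

Each node in the subtree of v is visited exactly once, so the number of visits equals size(v), which matches the O(size(v)) cost of the rebuild that follows.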
Analysis of deletion
- When deleting, if n < q/2, then we rebuild the whole tree
- This takes O(n) time
- If n < q/2 then we have done at least q - n > q/2 > n deletions since q was last set equal to n
- So the amortized (average) cost of rebuilding (due to deletions) is O(1) per deletion
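The amortized argument can be checked numerically with a sketch that tracks only the counters n and q. It is a simplification under stated assumptions: no tree is stored, a full rebuild is charged exactly n units of work, and the partial rebuilds triggered by insertions are ignored. The function name `simulate` and the '+'/'-' encoding of operations are invented for this example.

```python
def simulate(ops):
    """Replay a string of '+' (insert) and '-' (delete) operations.

    Returns (deletions, rebuild_work), where rebuild_work sums the size n
    of the tree at every full rebuild triggered by n < q/2.
    """
    n = q = 0
    deletions = 0
    rebuild_work = 0
    for op in ops:
        if op == "+":
            n += 1
            q += 1
        elif op == "-" and n > 0:
            n -= 1
            deletions += 1
            if 2 * n < q:  # n < q/2: rebuild everything
                rebuild_work += n
                q = n
    return deletions, rebuild_work


if __name__ == "__main__":
    for ops in ("+" * 1000 + "-" * 1000, "++-" * 500 + "-" * 500):
        deletions, work = simulate(ops)
        print(len(ops), deletions, work)
```

In every run the total rebuild work stays at or below the number of deletions, because each rebuild of cost n is preceded by more than n deletions since the previous one.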
Review: Maintaining Sorted Sets
- We have seen the following data structures for implementing a SortedSet
  - Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation
  - Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation
  - Scapegoat trees: find(x) in O(log n) worst-case time per operation, add(x)/remove(x) in O(log n) amortized time per operation
- No data structures course would be complete without covering
  - 2-4 trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation
  - Red-black trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation