Binary Search Trees: Implementation and Performance - Prof. Sharat Chandran
Introduction to Search Trees
We want to maintain an ordered collection (dictionary) D of n items (k, e) that supports:
- findElement(k): If D contains an item with key equal to k, then return the element. Otherwise, return a sentinel NO_SUCH_KEY
- insertItem(k,e): Duplicates may or may not be allowed.
- removeElement(k)
- closestElement(k): Find the item whose key is closest to k.
A binary search tree (BST) is a good choice (expected time O(log n) if there are no removeElement() calls).
A balanced binary search tree (e.g. an AVL tree) is the method of choice if we want to guarantee worst-case time O(log n).
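As a rough illustration of the four operations, here is a minimal sketch backed by `java.util.TreeMap`, which is a balanced (red-black) search tree. The class name `OrderedDictionary`, the use of `int` keys and `String` elements, the choice of `null` as the NO_SUCH_KEY sentinel, and the tie-breaking rule in `closestElement` are all assumptions made for this example, not part of the notes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical dictionary ADT sketch; TreeMap supplies the balanced tree.
class OrderedDictionary {
    // Assumption: null plays the role of the NO_SUCH_KEY sentinel.
    static final String NO_SUCH_KEY = null;

    private final TreeMap<Integer, String> items = new TreeMap<>();

    // Returns the element stored under k, or NO_SUCH_KEY if k is absent.
    public String findElement(int k) {
        return items.get(k);
    }

    // Assumption: duplicates are not allowed, so a repeated key overwrites.
    public void insertItem(int k, String e) {
        items.put(k, e);
    }

    // Removes k and returns its element, or NO_SUCH_KEY if k is absent.
    public String removeElement(int k) {
        return items.remove(k);
    }

    // Returns the element whose key is nearest to k (ties go to the smaller key).
    public String closestElement(int k) {
        Map.Entry<Integer, String> lo = items.floorEntry(k);   // largest key <= k
        Map.Entry<Integer, String> hi = items.ceilingEntry(k); // smallest key >= k
        if (lo == null) return hi == null ? NO_SUCH_KEY : hi.getValue();
        if (hi == null) return lo.getValue();
        long below = (long) k - lo.getKey();  // long arithmetic avoids int overflow
        long above = (long) hi.getKey() - k;
        return below <= above ? lo.getValue() : hi.getValue();
    }

    public static void main(String[] args) {
        OrderedDictionary d = new OrderedDictionary();
        d.insertItem(10, "ten");
        d.insertItem(40, "forty");
        System.out.println(d.closestElement(18));
    }
}
```

All four operations take O(log n) worst-case time here because `TreeMap` keeps the tree balanced.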
Straightforward Implementations are O(n)
- An unordered array is such that inserting takes O(1) time, but searching or removing takes O(n)
- Can be used to maintain log files (frequent insertions, rare searches or removals)
- An ordered array is such that inserting and removing takes O(n) time, but searching takes only O(log n) time.
- Can be used for lookup tables (frequent searches, with rare insertions or removals)
- A binary search tree attempts to do better
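The ordered-array trade-off can be made concrete with a small sketch: searching is a binary search over the sorted prefix, while inserting must shift every larger key one slot to the right. The class `OrderedArray`, its `int` keys, and the decision to ignore duplicate keys are assumptions for this example.

```java
import java.util.Arrays;

// Hypothetical ordered-array dictionary: O(log n) search, O(n) insert.
class OrderedArray {
    private int[] keys = new int[4];
    private int size = 0;

    // Binary search. Returns the index of k, or -(insertionPoint) - 1 if absent
    // (the same convention as java.util.Arrays.binarySearch).
    public int indexOf(int k) {
        int lo = 0, hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] == k) return mid;
            if (keys[mid] < k) lo = mid + 1;
            else hi = mid - 1;
        }
        return -(lo + 1);
    }

    public boolean contains(int k) {
        return indexOf(k) >= 0;
    }

    // Finds the slot in O(log n), then shifts the tail in O(n).
    public void insert(int k) {
        int pos = indexOf(k);
        if (pos >= 0) return;               // assumption: duplicates are ignored
        pos = -(pos + 1);
        if (size == keys.length) keys = Arrays.copyOf(keys, 2 * size);
        System.arraycopy(keys, pos, keys, pos + 1, size - pos);
        keys[pos] = k;
        size++;
    }

    public int[] toArray() {
        return Arrays.copyOf(keys, size);
    }

    public static void main(String[] args) {
        OrderedArray a = new OrderedArray();
        for (int k : new int[] {5, 1, 3, 9, 7}) a.insert(k);
        System.out.println(Arrays.toString(a.toArray()));
    }
}
```

The shift performed by `System.arraycopy` is the O(n) cost that a binary search tree tries to avoid.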
Search: Most Frequent Usage
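A successful search traverses a path that starts at the root and ends at an internal node; an unsuccessful search ends at an external node.

    // Returns the internal node holding key k, or the external
    // (placeholder) node at which the search for k leaves the tree.
    Node find(Key k, Node p) {
        if (isExternal(p)) return p;             // unsuccessful search
        if (k == p.key) return p;                // successful search
        if (k < p.key) return find(k, p.left);   // k can only be in the left subtree
        return find(k, p.right);                 // k can only be in the right subtree
    }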
Insertion Follows Find
- Let w be an external node returned by find(k, root). If we have access to a parent pointer, we can accomplish insert(k, root) by hanging the new item off the parent of w
- This is particularly easy if we have the (placeholder) external nodes found in extended binary trees: w itself becomes the new internal node.
- If w is an internal node (the key is already present) and duplicates are permitted, we call the algorithm recursively on the left child of w.
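The insertion steps above can be sketched on an extended binary tree. In this sketch an external node is marked by a `null` key, and insertion turns the external node returned by `find` into an internal node with two fresh external children, so no parent pointer is needed. The class `ExtendedBst`, the `int` keys, the `String` elements, and the `inorder` helper are assumptions for illustration.

```java
// Hypothetical extended BST: every internal node has two children,
// and the leaves are external (placeholder) nodes with a null key.
class ExtendedBst {
    static final class Node {
        Integer key;          // null marks an external (placeholder) node
        String element;
        Node left, right;
        boolean isExternal() { return key == null; }
    }

    Node root = new Node();   // an empty tree is a single external node

    // Same logic as the find() routine on the search slide.
    static Node find(int k, Node p) {
        if (p.isExternal()) return p;
        if (k == p.key) return p;
        return k < p.key ? find(k, p.left) : find(k, p.right);
    }

    // Duplicates are permitted: on a match, keep searching in the left subtree.
    public void insert(int k, String e) {
        Node w = find(k, root);
        while (!w.isExternal()) w = find(k, w.left);
        w.key = k;            // the placeholder becomes an internal node
        w.element = e;
        w.left = new Node();
        w.right = new Node();
    }

    // In-order traversal lists the keys in sorted order.
    static void inorder(Node p, StringBuilder out) {
        if (p.isExternal()) return;
        inorder(p.left, out);
        out.append(p.key).append(' ');
        inorder(p.right, out);
    }

    public static void main(String[] args) {
        ExtendedBst t = new ExtendedBst();
        for (int k : new int[] {5, 2, 8, 2}) t.insert(k, "item" + k);
        StringBuilder sb = new StringBuilder();
        inorder(t.root, sb);
        System.out.println(sb.toString().trim());
    }
}
```

Because duplicates go left, the tree invariant in this sketch is "left subtree keys are less than or equal to the node's key".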
Three Cases for Removal
- Removal also follows a find()
- If the node w to be removed is such that
- Both children of w are external, then we replace w by an external node (the parent's link to w now points to an external node)
- Only one of w’s children, v, contains a valid item, then we make v the child of the parent of w in place of w.
- All cases are easily implemented if we have parent pointers
Three Cases for Removal
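If the node w to be removed is such that both children of w contain valid items, we find a replacement for the item at w (for example its inorder successor, the leftmost internal node in the right subtree of w), copy it into w, and then remove the replacement node.
The replacement node has at most one child containing a valid item, so it satisfies one of the two conditions already mentioned.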
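The three removal cases can be sketched as follows, using parent pointers and external placeholder nodes. The class `BstRemoval`, its `int` keys, the choice of the inorder successor as the replacement, and the helper names `replace` and `inorder` are assumptions for this example; duplicates are not allowed here.

```java
// Hypothetical removal sketch on an extended BST with parent pointers.
class BstRemoval {
    static final class Node {
        Integer key;                 // null marks an external (placeholder) node
        Node left, right, parent;
        Node(Node parent) { this.parent = parent; }
        boolean isExternal() { return key == null; }
    }

    Node root = new Node(null);

    Node find(int k) {
        Node p = root;
        while (!p.isExternal() && k != p.key) p = k < p.key ? p.left : p.right;
        return p;
    }

    public void insert(int k) {
        Node w = find(k);
        if (!w.isExternal()) return; // assumption: duplicates are ignored
        w.key = k;
        w.left = new Node(w);
        w.right = new Node(w);
    }

    public boolean remove(int k) {
        Node w = find(k);
        if (w.isExternal()) return false;            // key not present
        if (!w.left.isExternal() && !w.right.isExternal()) {
            // Two valid children: copy the inorder successor's key into w,
            // then remove the successor node instead.
            Node s = w.right;
            while (!s.left.isExternal()) s = s.left;
            w.key = s.key;
            w = s;
        }
        // Now w has at most one internal child. v is that child,
        // or an external node when both children are external.
        Node v = w.left.isExternal() ? w.right : w.left;
        replace(w, v);
        return true;
    }

    // Makes v the child of w's parent in place of w.
    private void replace(Node w, Node v) {
        v.parent = w.parent;
        if (w.parent == null) root = v;
        else if (w.parent.left == w) w.parent.left = v;
        else w.parent.right = v;
    }

    static void inorder(Node p, StringBuilder out) {
        if (p.isExternal()) return;
        inorder(p.left, out);
        out.append(p.key).append(' ');
        inorder(p.right, out);
    }

    public static void main(String[] args) {
        BstRemoval t = new BstRemoval();
        for (int k : new int[] {50, 30, 70, 60}) t.insert(k);
        t.remove(50);
        StringBuilder sb = new StringBuilder();
        inorder(t.root, sb);
        System.out.println(sb.toString().trim());
    }
}
```

The first two cases collapse into the single `replace(w, v)` call, since an external `v` handles the case where both children are external.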
Good and Bad Binary Search Trees
- Given (offline) a set of keys, there exist many binary search trees that can be built based on these keys
- Questions
- Assume: All keys are equally likely candidates for searches. What is the best BST? How to construct one?
- Assume: Not all keys are equally likely candidates, but the probabilities are known. What is the best BST? How to construct one?
- Given two trees, which one is better for successful (unsuccessful) search? (Assume equally likely situation)
Good and Bad Binary Search Trees
- Metrics for n node BST
- The internal path length I is the sum of the lengths of the paths from the root to each internal node (Example:
- The external path length E is the sum of the lengths of the paths from the root to each external (placeholder) node (Example: 25)
- E = I + 2n
- Which tree is better for successful search?
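To compare two trees with these metrics, a small sketch can compute I and E directly and check the identity E = I + 2n. The class `PathLengths`, the helper constructors `external()` and `internal(...)`, and the two example shapes (a balanced tree and a right-leaning chain, each with n = 3 internal nodes) are assumptions for illustration.

```java
// Hypothetical sketch: internal and external path lengths of an extended binary tree.
class PathLengths {
    static final class Node {
        final Node left, right;      // both null for an external (placeholder) node
        Node(Node left, Node right) { this.left = left; this.right = right; }
        boolean isExternal() { return left == null; }
    }

    static Node external() { return new Node(null, null); }

    // An internal node must have two non-null children.
    static Node internal(Node left, Node right) { return new Node(left, right); }

    // n: the number of internal nodes.
    static int internalCount(Node p) {
        return p.isExternal() ? 0 : 1 + internalCount(p.left) + internalCount(p.right);
    }

    // I: sum of the depths of all internal nodes (the root has depth 0).
    static int internalPathLength(Node p, int depth) {
        if (p.isExternal()) return 0;
        return depth + internalPathLength(p.left, depth + 1)
                     + internalPathLength(p.right, depth + 1);
    }

    // E: sum of the depths of all external nodes.
    static int externalPathLength(Node p, int depth) {
        if (p.isExternal()) return depth;
        return externalPathLength(p.left, depth + 1)
             + externalPathLength(p.right, depth + 1);
    }

    public static void main(String[] args) {
        Node balanced = internal(internal(external(), external()),
                                 internal(external(), external()));
        Node chain = internal(external(),
                              internal(external(), internal(external(), external())));
        for (Node t : new Node[] {balanced, chain}) {
            int n = internalCount(t);
            int i = internalPathLength(t, 0);
            int e = externalPathLength(t, 0);
            System.out.println("n=" + n + " I=" + i + " E=" + e + " I+2n=" + (i + 2 * n));
        }
    }
}
```

Under the equally-likely assumption, a successful search for a key at depth d inspects d + 1 nodes, so the average cost is I/n + 1 and the tree with the smaller I is the better one for successful search.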
Expected Height of Binary Search Tree
- We circumvent this problem by the following reasonable model
- No find() or remove()
- n insert() calls are modeled by an array of n distinct numbers
- Any of the n! permutations of the input is equally likely
- We use the length of the leftmost branch of the resulting binary tree as a simple stand-in for its height.
- Example
Expected Height of Leftmost Branch
- Crucial observation: The number of nodes on the leftmost branch is equal to the number of minimum changes in the input array
- Expected number of minimum changes in a randomly permuted array:
- Let Y be the random variable that represents the number of minimum changes
- Let $x_i$ be the indicator random variable that is 1 if the ith number is smaller than all the numbers before it (a minimum change at position i), and 0 otherwise. Thus $Y = \sum_{i=1}^{n} x_i$.
- The probability that the ith number is the minimum of the first i numbers is $\frac{1}{i}$
- $E[Y] = \sum_{i=1}^{n} E[x_i] = \sum_{i=1}^{n} \frac{1}{i} = \ln n + O(1)$
- Expected height is Θ(log n)
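The crucial observation and the harmonic sum can be checked on small inputs. The sketch below counts minimum changes in an array, counts the nodes on the leftmost branch of the BST built by inserting the array in order, and sums the minimum changes over all n! permutations so the average can be compared with the sum of 1/i. The class `LeftmostBranch` and all method names are assumptions for illustration; the tree is stored in index arrays rather than node objects to keep the example short.

```java
import java.util.Arrays;

// Hypothetical check: minimum changes vs. leftmost branch of an insertion-built BST.
class LeftmostBranch {
    // Counts positions whose value is smaller than every value before it.
    static int minimumChanges(int[] a) {
        int count = 0;
        long min = Long.MAX_VALUE;   // long so that any int value counts as smaller
        for (int x : a) {
            if (x < min) { min = x; count++; }
        }
        return count;
    }

    // Inserts a[0], a[1], ... into a BST (keys assumed distinct) and returns
    // the number of nodes on the path root -> left -> left -> ...
    static int leftmostBranchNodes(int[] a) {
        int n = a.length;
        if (n == 0) return 0;
        int[] left = new int[n], right = new int[n];  // child indices, -1 if none
        Arrays.fill(left, -1);
        Arrays.fill(right, -1);
        for (int i = 1; i < n; i++) {
            int p = 0;
            while (true) {
                if (a[i] < a[p]) {
                    if (left[p] < 0) { left[p] = i; break; }
                    p = left[p];
                } else {
                    if (right[p] < 0) { right[p] = i; break; }
                    p = right[p];
                }
            }
        }
        int count = 0;
        for (int p = 0; p >= 0; p = left[p]) count++;
        return count;
    }

    // Sum of minimumChanges over all permutations of a[k..], with a[0..k-1] fixed.
    static long totalOverPermutations(int[] a, int k) {
        if (k == a.length) return minimumChanges(a);
        long total = 0;
        for (int i = k; i < a.length; i++) {
            int t = a[k]; a[k] = a[i]; a[i] = t;      // choose a[i] for position k
            total += totalOverPermutations(a, k + 1);
            t = a[k]; a[k] = a[i]; a[i] = t;          // undo the swap
        }
        return total;
    }

    // The sum 1/1 + 1/2 + ... + 1/n.
    static double harmonic(int n) {
        double h = 0;
        for (int i = 1; i <= n; i++) h += 1.0 / i;
        return h;
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 1, 9, 2};
        System.out.println(minimumChanges(a) + " " + leftmostBranchNodes(a));
        System.out.println(totalOverPermutations(new int[] {1, 2, 3, 4}, 0) / 24.0
                + " vs " + harmonic(4));
    }
}
```

Exhaustive enumeration is only practical for small n; it is meant as a sanity check on the derivation, not as a way to estimate the height of large trees.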