Binary Search Trees: Implementation and Performance - Prof. Sharat Chandran, Study notes of Data Structures and Algorithms

An introduction to binary search trees: their definition, search algorithm, insertion, and removal, and the importance of balanced binary search trees. It also covers the differences between good and bad binary search trees and their impact on search performance.

Uploaded on 07/30/2009

Introduction to Search Trees

Want to maintain an ordered collection (dictionary) D of n items (k, e) to support

  • findElement(k): If D contains an item with key equal to k, then return the element. Otherwise, return a sentinel NO_SUCH_KEY
  • insertItem(k, e): Duplicates may or may not be allowed.
  • removeElement(k)
  • closestElement(k): Find the item whose key value is closest to k.

A binary search tree (BST) is a good choice (expected time O(log n) if there are no removeElement() calls).

A balanced binary search tree (e.g. an AVL tree) is the method of choice if we want to guarantee worst-case time O(log n).
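To make the four operations concrete, here is a minimal sketch that leans on java.util.TreeMap, the red-black tree (a balanced BST) in the Java standard library, rather than a hand-written tree. The class name OrderedDictionary, the int keys and String elements, the use of null as the NO_SUCH_KEY sentinel, and breaking ties toward the smaller key are all assumptions made for this example.

```java
import java.util.TreeMap;

// Illustrative wrapper (hypothetical name) around the standard TreeMap.
class OrderedDictionary {
    private final TreeMap<Integer, String> map = new TreeMap<>();

    // Assumption: null plays the role of the NO_SUCH_KEY sentinel.
    String findElement(int k) {
        return map.get(k);
    }

    // Assumption: duplicates are not allowed; a repeated key overwrites.
    void insertItem(int k, String e) {
        map.put(k, e);
    }

    // Returns the removed element, or null if k was absent.
    String removeElement(int k) {
        return map.remove(k);
    }

    // Element whose key is nearest to k; ties go to the smaller key.
    String closestElement(int k) {
        Integer lo = map.floorKey(k);    // largest key <= k, or null
        Integer hi = map.ceilingKey(k);  // smallest key >= k, or null
        if (lo == null && hi == null) return null;
        if (lo == null) return map.get(hi);
        if (hi == null) return map.get(lo);
        // Use long arithmetic so the differences cannot overflow.
        return ((long) k - lo <= (long) hi - k) ? map.get(lo) : map.get(hi);
    }
}
```

With keys 10, 20 and 35 stored, closestElement(26) would pick the item with key 20, and closestElement(28) the item with key 35.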

Straightforward Implementations are O(n)

  • In an unordered array, inserting takes O(1) time, but searching or removing takes O(n)
    • Can be used to maintain log files (frequent insertions, rare searches or removals)
  • In an ordered array, inserting and removing take O(n) time, but searching takes only O(log n) time
    • Can be used for lookup tables (frequent searches, with rare insertions or removals)
  • A binary search tree attempts to do better
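The trade-off between the two array layouts can be sketched as follows. This is only an illustration: the class and method names are hypothetical, the lists are assumed to be java.util.ArrayList instances (so appending is O(1) amortized and binary search is O(log n)), and removal is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class contrasting the two array-based dictionaries.
class ArrayDictionaries {
    // Unordered array: append at the end, O(1) amortized.
    static void insertUnordered(List<Integer> keys, int k) {
        keys.add(k);
    }

    // Unordered array: linear scan, O(n). Returns -1 if k is absent.
    static int findUnordered(List<Integer> keys, int k) {
        return keys.indexOf(k);
    }

    // Ordered array: binary search, O(log n). Returns -1 if k is absent.
    static int findOrdered(List<Integer> sortedKeys, int k) {
        int pos = Collections.binarySearch(sortedKeys, k);
        return pos >= 0 ? pos : -1;
    }

    // Ordered array: locating the slot is O(log n), but shifting the
    // tail to make room is O(n), so the insertion as a whole is O(n).
    static void insertOrdered(List<Integer> sortedKeys, int k) {
        int pos = Collections.binarySearch(sortedKeys, k);
        if (pos < 0) pos = -(pos + 1); // decode the insertion point
        sortedKeys.add(pos, k);        // shifts later elements right
    }
}
```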

Search: Most Frequent Usage

A successful search traverses a path that starts at the root and ends at an internal node.

    Node find(Key k, Node p) {
        if (isExternal(p)) return p;            // unsuccessful: ended at an external node
        if (k == p.key) return p;               // successful: found at an internal node
        if (k < p.key) return find(k, p.left);
        return find(k, p.right);
    }
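A runnable version of the find routine might look like the sketch below. It assumes int keys, an extended binary tree in which every empty position is an explicit external (placeholder) node, and a convention that a node with two null children is external. The class name BstSearch and the leaf helper are hypothetical and exist only to make the example self-contained.

```java
// Extended binary tree: every internal node has two children, and
// empty positions are explicit external (placeholder) nodes.
class BstSearch {
    static class Node {
        int key;            // meaningful only for internal nodes
        Node left, right;   // both null marks an external node

        Node() {}                                  // external node
        Node(int key, Node left, Node right) {     // internal node
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isExternal(Node p) {
        return p.left == null && p.right == null;
    }

    // Returns the internal node holding k, or the external node
    // at which an unsuccessful search ends.
    static Node find(int k, Node p) {
        if (isExternal(p)) return p;
        if (k == p.key) return p;
        if (k < p.key) return find(k, p.left);
        return find(k, p.right);
    }

    // Hypothetical helper: an internal node with two external children.
    static Node leaf(int key) {
        return new Node(key, new Node(), new Node());
    }
}
```

A caller can distinguish the two outcomes by testing isExternal on the returned node.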

Insertion Follows Find

  • Let w be the node returned by find(k, root). If w is an external node and we have access to a parent pointer, we can accomplish insert(k, root) by hanging the new item off the parent of w
  • This is particularly easy if we have the (placeholder) external nodes found in extended binary trees.
  • If w is an internal node (the key is already present) and duplicates are permitted, we call the algorithm recursively on the left child of w.
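One way to realize insertion with placeholder external nodes is sketched below. Note that it departs slightly from the parent-pointer description: it converts the external node returned by find into an internal node in place, so no parent pointer is needed. The int keys, the class name BstInsert, and the inorder helper are assumptions for illustration.

```java
class BstInsert {
    static class Node {
        int key;
        Node left, right;   // both null marks an external placeholder
    }

    static boolean isExternal(Node p) {
        return p.left == null && p.right == null;
    }

    static Node find(int k, Node p) {
        if (isExternal(p)) return p;
        if (k == p.key) return p;
        return find(k, k < p.key ? p.left : p.right);
    }

    // Inserts k into the subtree rooted at p, keeping duplicates.
    static void insert(int k, Node p) {
        Node w = find(k, p);
        if (isExternal(w)) {
            // Turn the placeholder into an internal node in place.
            w.key = k;
            w.left = new Node();
            w.right = new Node();
        } else {
            // k is already stored at w: put the duplicate in w's left subtree.
            insert(k, w.left);
        }
    }

    // Inorder traversal, used here only to inspect the result.
    static void inorder(Node p, java.util.List<Integer> out) {
        if (isExternal(p)) return;
        inorder(p.left, out);
        out.add(p.key);
        inorder(p.right, out);
    }
}
```

An empty tree is a single external node, so building a tree starts from new Node() and calls insert repeatedly on that root.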

Three Cases for Removal

  • Removal also follows a find()
  • If the node w to be removed is such that
    1. Both children of w are external, then we replace w with an external node (the parent's link to w now leads to an external node)
    2. Only one of w’s children, v, contains a valid item, then we make v the child of the parent of w in place of w
  • All cases are easily implemented if we have parent pointers

Three Cases for Removal

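If the node w to be removed is such that both children of w contain valid items, we find a replacement for the item at w (for example, its inorder successor), copy it into w, and then remove the node that held the replacement.

The replacement node has at most one child containing a valid item, so it satisfies one of the two conditions mentioned.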
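The three removal cases can be sketched in Java as follows. This is a minimal illustration under several assumptions: int keys, no duplicates, placeholder external nodes, and a recursive formulation in which each call returns the new root of its subtree (used here in place of parent pointers). The inorder successor is chosen as the replacement; the predecessor would work equally well. The class name BstRemove is hypothetical.

```java
class BstRemove {
    static class Node {
        int key;
        Node left, right;   // both null marks an external placeholder
    }

    static boolean isExternal(Node p) {
        return p.left == null && p.right == null;
    }

    // Inserts k into the subtree rooted at p (duplicates are ignored).
    static void insert(int k, Node p) {
        if (isExternal(p)) {
            p.key = k;
            p.left = new Node();
            p.right = new Node();
        } else if (k < p.key) {
            insert(k, p.left);
        } else if (k > p.key) {
            insert(k, p.right);
        }
    }

    // Returns the root of the subtree after removing k (if present).
    static Node remove(int k, Node p) {
        if (isExternal(p)) return p;                  // k not found
        if (k < p.key) { p.left = remove(k, p.left); return p; }
        if (k > p.key) { p.right = remove(k, p.right); return p; }
        // Cases 1 and 2: at least one child is external, so the other
        // child (possibly external as well) takes the place of p.
        if (isExternal(p.left)) return p.right;
        if (isExternal(p.right)) return p.left;
        // Case 3: both children hold items. Copy in the inorder successor
        // (smallest key in the right subtree), then remove that node; its
        // left child is external, so it falls under case 1 or case 2.
        Node s = p.right;
        while (!isExternal(s.left)) s = s.left;
        p.key = s.key;
        p.right = remove(s.key, p.right);
        return p;
    }

    // Inorder traversal, used here only to inspect the result.
    static void inorder(Node p, java.util.List<Integer> out) {
        if (isExternal(p)) return;
        inorder(p.left, out);
        out.add(p.key);
        inorder(p.right, out);
    }
}
```

Callers should reassign the root, as in root = remove(k, root), because removing the root under case 1 or 2 changes which node is the root.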

Good and Bad Binary Search Trees

  • Given (offline) a set of keys, there exist many binary search trees that can be built based on these keys
  • Questions
    • Assume: All keys are equally likely candidates for searches. What is the best BST? How to construct one?
    • Assume: The keys are not all equally likely candidates, but the probabilities are known. What is the best BST? How to construct one?
  • Given two trees, which one is better for successful (unsuccessful) search? (Assume equally likely situation)
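For the equally likely case, one standard construction (not necessarily the one intended in these notes) is to sort the keys and root every subtree at its median, which keeps the tree as shallow as possible. The sketch below assumes int keys given in sorted order and, for brevity, uses null for empty subtrees instead of placeholder external nodes; the class name BalancedBuild is hypothetical. It does not address the case of unequal, known probabilities.

```java
class BalancedBuild {
    static class Node {
        int key;
        Node left, right;   // null marks an empty subtree in this sketch
        Node(int key) { this.key = key; }
    }

    // Builds a BST over sortedKeys[lo..hi], rooting each subtree at its median.
    static Node build(int[] sortedKeys, int lo, int hi) {
        if (lo > hi) return null;
        int mid = lo + (hi - lo) / 2;
        Node p = new Node(sortedKeys[mid]);
        p.left = build(sortedKeys, lo, mid - 1);
        p.right = build(sortedKeys, mid + 1, hi);
        return p;
    }

    // Height counted in edges; an empty tree has height -1.
    static int height(Node p) {
        return p == null ? -1 : 1 + Math.max(height(p.left), height(p.right));
    }
}
```

For the seven keys 1 through 7 this yields a root of 4 and a height of 2, whereas inserting the same keys in sorted order into an ordinary BST would give a chain of height 6.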

Good and Bad Binary Search Trees

  • Metrics for an n-node BST (n internal nodes)
    • The internal path length I is the sum of the lengths of the paths from the root to each internal node (Example:
    • The external path length E is the sum of the lengths of the paths from the root to each external (placeholder) node (Example: 25)
    • E = I + 2n
  • Which tree is better for successful search?
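The two metrics and the identity E = I + 2n can be checked on small trees with a sketch like the one below. It assumes int keys and represents each external (placeholder) node by a null link; the class name PathLengths is hypothetical, and the trees in the usage note are made-up examples, not the ones pictured in the notes.

```java
class PathLengths {
    static class Node {
        int key;
        Node left, right;   // null stands for an external (placeholder) node
        Node(int key, Node left, Node right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }
    }

    // n: the number of internal nodes.
    static int size(Node p) {
        return p == null ? 0 : 1 + size(p.left) + size(p.right);
    }

    // I: sum of the depths of all internal nodes (the root has depth 0).
    static int internalPathLength(Node p, int depth) {
        if (p == null) return 0;
        return depth
             + internalPathLength(p.left, depth + 1)
             + internalPathLength(p.right, depth + 1);
    }

    // E: sum of the depths of all external nodes.
    static int externalPathLength(Node p, int depth) {
        if (p == null) return depth;
        return externalPathLength(p.left, depth + 1)
             + externalPathLength(p.right, depth + 1);
    }
}
```

For the keys 1, 2, 3, a right-leaning chain has I = 3 and E = 9, while the balanced tree rooted at 2 has I = 2 and E = 8; both satisfy E = I + 2n with n = 3. With equally likely keys, a successful search reaches a node at depth d after d + 1 key comparisons, so the tree with the smaller I is the better one for successful search.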

Expected Height of Binary Search Tree

  • We circumvent this problem by the following reasonable model
    • No find() or remove()
    • n insert() calls are modeled by an array of n distinct numbers
    • Any of the n! permutations of the input is equally likely
    • The length of the leftmost branch of the resulting binary tree is analyzed as a stand-in for the expected height.

Expected Height of Leftmost Branch

  • Crucial observation: The number of nodes on the leftmost branch is equal to the number of minimum changes in the input array
  • Expected number of minimum changes in a randomly permuted array:
    • Let $Y$ be the random variable that represents the number of minimum changes
    • Let $x_i$ be the indicator random variable that is 1 if the $i$th number is the minimum of the first $i$ numbers. Thus $Y = \sum_{i=1}^{n} x_i$.
  • The probability that the $i$th number is the minimum of the first $i$ numbers is $\frac{1}{i}$
  • $E[Y] = \sum_{i=1}^{n} E[x_i] = \sum_{i=1}^{n} \frac{1}{i} = \ln n + O(1)$

  • Expected height is Θ(log n)
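Both the crucial observation and the harmonic sum can be checked by brute force for small n. The sketch below is an illustration only: it assumes distinct int keys, uses null for empty subtrees, and enumerates permutations by simple swap-based backtracking. The class and method names are hypothetical.

```java
class LeftmostBranch {
    static class Node {
        int key;
        Node left, right;   // null marks an empty subtree in this sketch
        Node(int key) { this.key = key; }
    }

    static Node insert(Node p, int k) {
        if (p == null) return new Node(k);
        if (k < p.key) p.left = insert(p.left, k);
        else p.right = insert(p.right, k);
        return p;
    }

    // Number of nodes on the leftmost branch of the BST built by
    // inserting the elements of a in order.
    static int leftmostNodes(int[] a) {
        Node root = null;
        for (int k : a) root = insert(root, k);
        int count = 0;
        for (Node p = root; p != null; p = p.left) count++;
        return count;
    }

    // Number of positions i where a[i] is smaller than everything before it.
    static int minimumChanges(int[] a) {
        int count = 0;
        int min = 0;
        for (int i = 0; i < a.length; i++) {
            if (i == 0 || a[i] < min) { min = a[i]; count++; }
        }
        return count;
    }

    // Sums minimumChanges over every permutation of a, and checks on each
    // permutation that it equals the number of leftmost-branch nodes.
    static long totalOverPermutations(int[] a, int start) {
        if (start == a.length) {
            int changes = minimumChanges(a);
            if (leftmostNodes(a) != changes) {
                throw new IllegalStateException("observation violated");
            }
            return changes;
        }
        long total = 0;
        for (int i = start; i < a.length; i++) {
            swap(a, start, i);
            total += totalOverPermutations(a, start + 1);
            swap(a, start, i);   // undo the swap before the next choice
        }
        return total;
    }

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

For n = 5 the total over all 120 permutations is 274, an average of 274/120 ≈ 2.283, which matches 1 + 1/2 + 1/3 + 1/4 + 1/5.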