CS61B Lecture #29: Balanced Search Trees and Probabilistic Balancing with Skip Lists, Slides of Data Structures and Algorithms

A lecture note from the CS61B course at the University of California, Berkeley. It covers balanced search trees, specifically B-trees and red-black trees, and their importance for fast insertion, deletion, and search operations. The document also introduces skip lists, a probabilistic data structure that offers fast searches with high probability.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

CS61B Lecture #29

Administrative

• Project 2 autograder trial run Wednesday night, 5 April (last run before Friday deadline).

• Be sure to keep your eye on the newsgroup for news about big test data set, important project suggestions, etc.

Today:

• Balanced search structures (DS(IJ), Chapter 9)

Coming Up:

• Pseudo-random Numbers (DS(IJ), Chapter 11)

Public Service Announcement: The TBΠ Engineering Honor Society is sponsoring Telebears peer-advising sessions. Free food and drinks! Tu 4/6, 7–9PM in 430/438 Soda, Tu 4/11, 8–10PM in Unit 2 Rec Room, Wed. 4/12, 6–8PM in 120 Bechtel, Th. 4/13, 8–10PM in Foothill. Questions? [email protected]

Last modified: Thu Apr 6 20:18:19 2006

Balanced Search: The Problem

• Why are search trees important?

– Insertion/deletion fast (on every operation, unlike hash table, which has to expand from time to time).

– Support range queries, sorting (unlike hash tables).

• But O(lg N) performance from binary search tree requires remaining keys be divided ≈ by 2 at each node.

• In other words, that tree be “bushy.”

• “Stringy” trees (many nodes with one side much longer than other) perform like linked lists.

• Suffices that heights of two subtrees always differ by no more than constant K.
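
To make the bushy/stringy contrast concrete, here is a minimal sketch, not code from the lecture. The class `BstHeightDemo` and its helper names are hypothetical. It inserts the same 1023 keys into a plain, non-rebalancing binary search tree in two different orders and compares the resulting heights.

```java
// Hypothetical sketch (names are not from the lecture): a plain BST of ints
// with no rebalancing, used only to compare tree heights.
class BstHeightDemo {
    /** A tree node; null represents the empty tree. */
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Ordinary BST insertion; duplicates are ignored. */
    static Node insert(Node t, int key) {
        if (t == null) return new Node(key);
        if (key < t.key) t.left = insert(t.left, key);
        else if (key > t.key) t.right = insert(t.right, key);
        return t;
    }

    /** Number of nodes on the longest root-to-bottom path (0 if empty). */
    static int height(Node t) {
        return t == null ? 0 : 1 + Math.max(height(t.left), height(t.right));
    }

    /** Inserts lo..hi in ascending order, so every node has only a right child. */
    static Node buildStringy(int lo, int hi) {
        Node t = null;
        for (int k = lo; k <= hi; k++) t = insert(t, k);
        return t;
    }

    /** Inserts the middle key first, then recurses on each half. */
    static Node buildBushy(Node t, int lo, int hi) {
        if (lo > hi) return t;
        int mid = lo + (hi - lo) / 2;
        t = insert(t, mid);
        t = buildBushy(t, lo, mid - 1);
        return buildBushy(t, mid + 1, hi);
    }

    public static void main(String[] args) {
        // 1023 = 2^10 - 1 keys: the stringy tree has height 1023, the bushy one 10.
        System.out.println("stringy: " + height(buildStringy(1, 1023)));
        System.out.println("bushy:   " + height(buildBushy(null, 1, 1023)));
    }
}
```

A search in the stringy tree may visit all 1023 nodes, while the bushy tree needs at most 10 comparisons for the same keys.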

Sample Order 4 B-tree ((2,4) Tree)

• Crossed lines show path when finding 40.

• Keys on either side of each child pointer in path bracket 40.

• Each node has at least 2 children, and all leaves (little circles) are at the bottom, so height must be O(lg N).

• In real-life B-tree, order typically much bigger

– comparable to size of disk sector, page, or other convenient unit of I/O
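
The search described above can be sketched as follows. This is an illustration under stated assumptions rather than the lecture's code: the class `BTreeSearchDemo`, the node layout, and the sample keys are all hypothetical (the tree on the slide is not reproduced here), and the data-less leaves are modeled by giving bottom-level nodes no children.

```java
// Hypothetical sketch of B-tree search; the tree below is made up for illustration.
class BTreeSearchDemo {
    /** A node with sorted keys and either keys.length + 1 children or none. */
    static final class Node {
        final int[] keys;
        final Node[] children;
        Node(int[] keys, Node... children) {
            this.keys = keys;
            this.children = children;
        }
    }

    /** Follows, at each node, the child pointer whose neighboring keys bracket target. */
    static boolean contains(Node node, int target) {
        while (true) {
            int i = 0;
            // Skip keys smaller than the target; keys[i-1] < target <= keys[i] afterwards.
            while (i < node.keys.length && target > node.keys[i]) i++;
            if (i < node.keys.length && node.keys[i] == target) return true;
            if (node.children.length == 0) return false; // reached the bottom
            node = node.children[i];
        }
    }

    static Node leaf(int... keys) { return new Node(keys); }

    /** A small order-4 tree: every non-bottom node has between 2 and 4 children. */
    static Node sampleTree() {
        Node left = new Node(new int[] {10, 20}, leaf(5), leaf(15), leaf(22));
        Node right = new Node(new int[] {35, 50}, leaf(30), leaf(40, 45), leaf(60));
        return new Node(new int[] {25}, left, right);
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(contains(root, 40)); // true: path 25 -> (35, 50) -> (40, 45)
        System.out.println(contains(root, 41)); // false
    }
}
```

The linear scan inside a node is fine for small orders; with the large orders used in practice, a binary search within each node would be a natural substitute.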

Inserting in B-tree (Simple Case)

• Start:

• Insert 7:

Deleting Keys from B-tree

• Remove 20 from last tree.

[Figure: B-tree during removal of 20. Annotations in the figure: “(too small)”, “(combine)”, “(too big)”. Keys shown: 5 10 15 27 30 40 50.]

Red-Black Trees

• Red-black tree is a binary search tree with additional constraints that limit how unbalanced it can be.

• Thus, searching is always O(lg N).

• Used for Java’s TreeSet and TreeMap types.

• When items are inserted or deleted, tree is rotated as needed to restore balance.

• Constraints:

1. Each node is (conceptually) colored red or black.

2. Root is black.

3. Every leaf node contains no data (as for B-trees) and is black.

4. Every leaf has same number of black ancestors.

5. Every internal node has two children.

6. Every red node has two black children.

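• Conditions 4, 5, and 6 guarantee O(lg N) searches: all root-to-leaf paths contain the same number of black nodes and no two red nodes are adjacent, so no path is more than about twice as long as any other.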
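
As an illustration of how these constraints can be checked mechanically, here is a minimal sketch. It is an assumption-laden example, not the lecture's code or the `TreeMap` implementation: the class `RedBlackCheckDemo` and its methods are hypothetical, the data-less black leaves are represented by `null`, and the binary-search-tree ordering of keys is not verified.

```java
// Hypothetical sketch: checks conditions 2, 4, and 6 on a tree whose
// data-less black leaves are represented by null (an assumption).
class RedBlackCheckDemo {
    static final class Node {
        final boolean red;
        final int key;
        final Node left, right;
        Node(boolean red, int key, Node left, Node right) {
            this.red = red; this.key = key; this.left = left; this.right = right;
        }
    }

    static Node red(int key, Node l, Node r) { return new Node(true, key, l, r); }
    static Node black(int key, Node l, Node r) { return new Node(false, key, l, r); }

    static boolean isRed(Node t) { return t != null && t.red; } // leaves are black

    /**
     * Returns the number of black nodes on every path from t down to a leaf
     * (counting the leaf), or -1 if the subtree violates condition 4 or 6.
     */
    static int blackHeight(Node t) {
        if (t == null) return 1;                       // a black leaf
        if (t.red && (isRed(t.left) || isRed(t.right))) return -1; // condition 6
        int l = blackHeight(t.left);
        int r = blackHeight(t.right);
        if (l < 0 || r < 0 || l != r) return -1;       // condition 4
        return l + (t.red ? 0 : 1);
    }

    /** Condition 2 (black root) plus conditions 4 and 6 for the whole tree. */
    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) > 0;
    }

    public static void main(String[] args) {
        Node good = black(10, red(5, null, null), red(15, null, null));
        Node redRed = black(10, red(5, red(3, null, null), null), null);
        Node lopsided = black(10, black(5, null, null), null);
        System.out.println(isValid(good));     // true
        System.out.println(isValid(redRed));   // false: red node with a red child
        System.out.println(isValid(lopsided)); // false: unequal black counts
    }
}
```

Condition 5 holds automatically in this representation, because every non-null node has exactly two (possibly null) children.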

Red-Black Insertion and Rotations

• Insert at bottom just as for binary tree (color red except when tree initially empty).

• Then rotate (and recolor) to restore red-black property, and thus balance.

• Rotation of trees preserves binary tree property, but changes balance.

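[Figure: two trees related by rotation. First tree: root D with children B and E, where B has children A and C. Second tree: root B with children A and D, where D has children C and E. D.rotateRight() turns the first tree into the second; B.rotateLeft() turns the second back into the first.]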
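
The two rotations can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the method names `rotateRight` and `rotateLeft` come from the slide, while the `Node` class, the convention that each method returns the new subtree root, and the traversal helpers are assumptions.

```java
// Hypothetical sketch of tree rotations; only the method names come from the slide.
class RotationDemo {
    static final class Node {
        final String label;
        Node left, right;
        Node(String label, Node left, Node right) {
            this.label = label; this.left = left; this.right = right;
        }

        /** Makes the left child the subtree root; returns that new root. */
        Node rotateRight() {
            Node newRoot = left;
            left = newRoot.right;   // middle subtree moves across
            newRoot.right = this;
            return newRoot;
        }

        /** Makes the right child the subtree root; returns that new root. */
        Node rotateLeft() {
            Node newRoot = right;
            right = newRoot.left;   // middle subtree moves across
            newRoot.left = this;
            return newRoot;
        }
    }

    static String inorder(Node t) {
        return t == null ? "" : inorder(t.left) + t.label + inorder(t.right);
    }

    static String preorder(Node t) {
        return t == null ? "" : t.label + preorder(t.left) + preorder(t.right);
    }

    /** Root D with children B and E; B has children A and C. */
    static Node sample() {
        Node b = new Node("B", new Node("A", null, null), new Node("C", null, null));
        return new Node("D", b, new Node("E", null, null));
    }

    public static void main(String[] args) {
        Node b = sample().rotateRight();
        System.out.println(preorder(b) + " " + inorder(b)); // BADCE ABCDE
        Node d = b.rotateLeft();
        System.out.println(preorder(d) + " " + inorder(d)); // DBACE ABCDE
    }
}
```

The inorder sequence is the same before and after each rotation, which is what it means for rotation to preserve the search-tree property; only the shape, and hence the balance, changes.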

Example of Red-Black Insertion (I)

• Insert 7:

– Here, sibling of offending node (10) is black, so rotate and recolor.

– In corresponding (2,4) tree, new node fits in existing node.

Example of Red-Black Insertion (II)

• Insert 27, recolor to restore red-black property. Doesn’t do any rebalancing, but sets things up to cause future insertions to rebalance.

• In corresponding (2,4) tree, this recoloring splits nodes (adds extra black nodes). We don’t have to recolor the root to red, as we did 25, because we are increasing the height of this (2,4) tree.

Really Efficient Use of Keys: the Trie

• Have been silent about cost of comparisons.

• For strings, worst case is length of string.

• Therefore should throw extra factor of key length, L, into costs:

– Θ(M) comparisons really means Θ(ML) operations.

– So to look for key X, keep looking at same chars of X M times.

• Can we do better? Can we get search cost to be O(L)?

Idea: Make a multi-way decision tree, with one decision per character of key.

The Trie: Example

• Set of keys {a, abase, abash, abate, abbas, axolotl, axe, fabric, facet}

• Ticked lines show paths followed for “abash” and “fabric.”

• Each internal node corresponds to a possible prefix.

• Characters in path to node = that prefix.

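[Figure: trie for the keys above. Internal nodes are labeled with the prefixes a, ab, aba, abas, ax, f, and fa; each edge is labeled with the next character; the leaves hold the keys a, abase, abash, abate, abbas, axe, axolotl, fabric, and facet.]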
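
A minimal trie over this key set might look like the sketch below. It is a simplified illustration, not the lecture's code: the class `TrieDemo` and its members are hypothetical, keys are assumed to contain only the letters 'a' through 'z', and the sketch keeps one node per character all the way down instead of stopping at the first character that makes a key unique.

```java
// Hypothetical sketch of a trie; assumes keys use only the letters 'a'..'z'.
class TrieDemo {
    static final int ALPHABET = 26;

    /** One node per prefix; next[c] is the child reached by character 'a' + c. */
    static final class Node {
        final Node[] next = new Node[ALPHABET];
        boolean isKey; // true if the prefix leading here is itself a key
    }

    private final Node root = new Node();

    void insert(String key) {
        Node n = root;
        for (int i = 0; i < key.length(); i++) {
            int c = key.charAt(i) - 'a';
            if (n.next[c] == null) n.next[c] = new Node();
            n = n.next[c];
        }
        n.isKey = true;
    }

    /** One array lookup per character of key: O(L) for a key of length L. */
    boolean contains(String key) {
        Node n = root;
        for (int i = 0; i < key.length() && n != null; i++) {
            n = n.next[key.charAt(i) - 'a'];
        }
        return n != null && n.isKey;
    }

    /** Builds a trie holding the key set from the slide. */
    static TrieDemo sample() {
        TrieDemo t = new TrieDemo();
        String[] keys = {"a", "abase", "abash", "abate", "abbas",
                         "axolotl", "axe", "fabric", "facet"};
        for (String k : keys) t.insert(k);
        return t;
    }

    public static void main(String[] args) {
        TrieDemo t = sample();
        System.out.println(t.contains("abash")); // true
        System.out.println(t.contains("ab"));    // false: a prefix, but not a key
    }
}
```

The `isKey` flag is what lets "a" be a key while "ab" is only a prefix of other keys.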

A Side-Trip: Scrunching

• For speed, obvious implementation for internal nodes is array indexed by character.

• Gives O(L) performance, L length of search key.

• [Looks as if independent of N, number of keys. Is there a dependence?]

• Problem: arrays are sparsely populated by non-null values, a waste of space.

Idea: Put the arrays on top of each other!

• Use null (0, empty) entries of one array to hold non-null elements of another.

• Use extra markers to tell which entries belong to which array.

Scrunching Example

Small example: (unrelated to Tries on preceding slides)

• Three leaf arrays, each indexed 0..

A1: bass trout pike

A2: ghee milk oil

A3: salt cumin mace

• Now overlay them, but keep track of original index of each item:

A123: bass salt ghee trout cumin pike milk oil mace

A1: 0* 1 2 3 4 5* 6 7* 8 9

A2: 0 1 2* 3 4 5 6* 7* 8 9

A3: 0 1* 2 3 4 5* 6 7 8 9*
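
One way to realize the overlay is sketched below; treat it as an illustration under assumptions rather than the lecture's method. The class `ScrunchDemo` is hypothetical, the first-fit placement strategy is a choice made here, and the occupied positions (A1 at 0, 5, 7; A2 at 2, 6, 7; A3 at 1, 5, 9) come from reading the asterisks in the index rows above as the non-null entries of arrays indexed 0 to 9.

```java
import java.util.Arrays;

// Hypothetical sketch: overlays sparse arrays of equal length into one shared
// array, using first-fit offsets and an owner marker per slot.
class ScrunchDemo {
    final String[] values; // shared storage for all logical arrays
    final int[] owner;     // marker: which logical array owns a slot, -1 if free
    final int[] offset;    // offset[a]: where logical array a starts in values

    ScrunchDemo(String[][] arrays) {
        int width = arrays[0].length;            // assumes all arrays have this length
        values = new String[width * arrays.length]; // enough even with no overlap
        owner = new int[values.length];
        Arrays.fill(owner, -1);
        offset = new int[arrays.length];
        for (int a = 0; a < arrays.length; a++) {
            int off = 0;
            while (!fits(arrays[a], off)) off++; // first offset with no collision
            offset[a] = off;
            for (int i = 0; i < width; i++) {
                if (arrays[a][i] != null) {
                    values[off + i] = arrays[a][i];
                    owner[off + i] = a;
                }
            }
        }
    }

    private boolean fits(String[] arr, int off) {
        for (int i = 0; i < arr.length; i++) {
            if (arr[i] != null && owner[off + i] != -1) return false;
        }
        return true;
    }

    /** Element i of logical array a, or null if that entry is empty. */
    String get(int a, int i) {
        int slot = offset[a] + i;
        return owner[slot] == a ? values[slot] : null;
    }

    /** The three arrays from the example, with positions assumed as described above. */
    static ScrunchDemo sample() {
        String[] a1 = new String[10], a2 = new String[10], a3 = new String[10];
        a1[0] = "bass"; a1[5] = "trout"; a1[7] = "pike";
        a2[2] = "ghee"; a2[6] = "milk";  a2[7] = "oil";
        a3[1] = "salt"; a3[5] = "cumin"; a3[9] = "mace";
        return new ScrunchDemo(new String[][] {a1, a2, a3});
    }

    public static void main(String[] args) {
        ScrunchDemo s = sample();
        System.out.println(Arrays.toString(s.offset)); // [0, 2, 1]
        System.out.println(s.get(1, 7));               // oil
        System.out.println(s.get(1, 5));               // null: slot belongs to A1
    }
}
```

With these offsets the nine items occupy slots 0 through 10 of the shared array, in the order bass, salt, ghee, trout, cumin, pike, milk, oil, mace, instead of three separate 10-element arrays. The owner check in `get` plays the role of the extra markers: without it, a lookup of an empty entry would return some other array's element.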