



An overview of balanced trees, including AVL trees and red-black trees, and of multi-way search trees. It covers the concepts, history, and algorithms for maintaining tree balance, as well as the properties and search algorithms for multi-way search trees. The document also includes examples and explanations of tree rotations.
Typology: Study notes




Balance Rotation
Search Insert
Worst case: degenerate binary tree, search in O(n) time
Average case: balanced binary tree, search in O( log(n) ) time
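A small sketch of why insertion order matters in a plain (unbalanced) binary search tree. The class and method names (`BstNode`, `BstHeightDemo`, `build`) are made up for illustration, and height is counted in nodes.

```java
// Minimal unbalanced binary search tree; all names are illustrative only.
class BstNode {
    int key;
    BstNode left, right;
    BstNode(int key) { this.key = key; }
}

class BstHeightDemo {
    // Standard BST insertion with no rebalancing.
    static BstNode insert(BstNode root, int key) {
        if (root == null) return new BstNode(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root; // duplicate keys are ignored
    }

    // Height counted in nodes: an empty tree has height 0.
    static int height(BstNode n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Builds a tree by inserting the keys in the given order.
    static BstNode build(int... keys) {
        BstNode root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }
}
```

Under these assumptions, `build(1, 2, 3, 4, 5, 6, 7)` gives a degenerate tree of height 7 (a search may visit every node), while `build(4, 2, 6, 1, 3, 5, 7)` gives a balanced tree of height 3.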
Can we keep the tree (mostly) balanced?
AVL trees, red-black trees
Select invariant (that keeps tree balanced)
Fix tree after each insertion / deletion
Maintain invariant using rotations
Provides operations with O( log(n) ) worst case
AVL Trees
Binary search tree
Heights of children for each node differ by at most 1
Example AVL tree (height of each subtree shown in parentheses):
44 (4)
  left: 17 (2)
    right: 32 (1)
  right: 78 (3)
    left: 50 (2)
      left: 48 (1)
      right: 62 (1)
    right: 88 (1)
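The height invariant can be checked with one recursive pass. This is a minimal sketch, not a full AVL implementation: `AvlNode`, `AvlCheck`, and `exampleTree` are hypothetical names, the tree shape is one reading of the example figure (keys 44, 17, 78, 32, 50, 88, 48, 62), and only the height rule is verified, not the search-tree ordering.

```java
// Illustrative node type; not taken from the slides.
class AvlNode {
    int key;
    AvlNode left, right;
    AvlNode(int key, AvlNode left, AvlNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class AvlCheck {
    // Returns the height (in nodes) of the subtree, or -1 if some node
    // has children whose heights differ by more than 1.
    static int checkedHeight(AvlNode n) {
        if (n == null) return 0;
        int lh = checkedHeight(n.left);
        int rh = checkedHeight(n.right);
        if (lh < 0 || rh < 0 || Math.abs(lh - rh) > 1) return -1;
        return 1 + Math.max(lh, rh);
    }

    static boolean isHeightBalanced(AvlNode root) {
        return checkedHeight(root) >= 0;
    }

    static AvlNode leaf(int key) {
        return new AvlNode(key, null, null);
    }

    // Assumed shape of the example tree from the figure.
    static AvlNode exampleTree() {
        return new AvlNode(44,
            new AvlNode(17, null, leaf(32)),
            new AvlNode(78, new AvlNode(50, leaf(48), leaf(62)), leaf(88)));
    }
}
```

With this shape, `checkedHeight(exampleTree())` is 4 and `isHeightBalanced` holds; a three-node chain fails the check because its root has child heights 0 and 2.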
AVL trees were discovered in 1962 by two Russian mathematicians, Adelson-Velskii and Landis
Red-black Trees
Binary search tree
Every node is red or black
The root is black
Every leaf is black
All children of red nodes are black
For each leaf, same # of black nodes on path to root
These properties ensure that no leaf is more than twice as far from the root as any other leaf
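The color properties above can be verified recursively. A minimal sketch under stated assumptions: `RbNode` and `RedBlackCheck` are hypothetical names, `null` references stand in for the black leaves, and the search-tree ordering is not checked.

```java
// Illustrative node type; null children play the role of black leaves.
class RbNode {
    int key;
    boolean red;
    RbNode left, right;
    RbNode(int key, boolean red, RbNode left, RbNode right) {
        this.key = key;
        this.red = red;
        this.left = left;
        this.right = right;
    }
}

class RedBlackCheck {
    static boolean isRed(RbNode n) {
        return n != null && n.red;
    }

    // Returns the number of black nodes on every path from n down to a
    // leaf (counting the null leaf), or -1 if a property is violated.
    static int blackHeight(RbNode n) {
        if (n == null) return 1; // null leaves count as black
        // A red node must not have a red child.
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;
        int lh = blackHeight(n.left);
        int rh = blackHeight(n.right);
        // Both sides must be valid and contain the same number of black nodes.
        if (lh < 0 || rh < 0 || lh != rh) return -1;
        return lh + (n.red ? 0 : 1);
    }

    // The root must be black and every subtree must pass the checks above.
    static boolean isValid(RbNode root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }
}
```

For example, a black root with two red children passes, while a black root with a single black child fails because the two sides have different black counts.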
Discovered in 1972 by Rudolf Bayer
Insert / delete may require complicated bookkeeping & rotations
Java's TreeMap and TreeSet use red-black trees
Tree Rotations
Move nodes, change edges
Single rotation: left, right
Double rotation: left-right, right-left
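The four rotations can be sketched on a bare node type. `RotNode` and `Rotations` are hypothetical names; each method assumes the child it pivots on is non-null and returns the new root of the rotated subtree, which the caller must link back into the parent.

```java
// Illustrative node type; not taken from the slides.
class RotNode {
    int key;
    RotNode left, right;
    RotNode(int key) { this.key = key; }
}

class Rotations {
    // Single left rotation: the right child y becomes the subtree root.
    static RotNode rotateLeft(RotNode x) {
        RotNode y = x.right;
        x.right = y.left; // y's left subtree moves under x
        y.left = x;
        return y;
    }

    // Single right rotation: the left child x becomes the subtree root.
    static RotNode rotateRight(RotNode y) {
        RotNode x = y.left;
        y.left = x.right; // x's right subtree moves under y
        x.right = y;
        return x;
    }

    // Double left-right rotation: rotate the left child left, then z right.
    static RotNode rotateLeftRight(RotNode z) {
        z.left = rotateLeft(z.left);
        return rotateRight(z);
    }

    // Double right-left rotation: rotate the right child right, then z left.
    static RotNode rotateRightLeft(RotNode z) {
        z.right = rotateRight(z.right);
        return rotateLeft(z);
    }
}
```

For a right-leaning chain 1, 2, 3, `rotateLeft` on node 1 returns node 2 with children 1 and 3. For a tree where 3 has left child 1 and 1 has right child 2, `rotateLeftRight` on node 3 gives the same balanced shape.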
Tree Rotation Example
[Figure: rotation of a three-node tree (nodes 1, 2, 3), before and after]
[Figure: rotation of a six-node tree (nodes 1 to 6), before and after]
Insert( 4 )
Insert( 1 ): split node, then split parent
Height of tree is O( log_T(n) )
Reduces number of nodes accessed
Wasted space for non-full nodes
1 node = 1 disk block
Reduces number of disk blocks read
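A rough worked example of the height bound, assuming T denotes the maximum number of children per node (the slide defining T is not part of this excerpt) and ignoring constant factors:

```latex
\[
h \approx \log_T n = \frac{\ln n}{\ln T}, \qquad
\log_{100} 10^{6} = \frac{6}{2} = 3, \qquad
\log_{2} 10^{6} \approx 19.9
\]
```

So for about a million keys, a tree with T = 100 needs roughly 3 node (disk block) reads per search, compared with roughly 20 for a balanced binary tree.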
Key C can be decomposed into a sequence of subkeys C1, C2, …, Cn
Redundancy exists between subkeys
Store subkey at each node
Path through trie yields full key
Huffman tree
String decomposes into sequence of letters
Example: “ART” ⇒ “A” “R” “T”
Can be fast: less overhead than hashing
May reduce memory by exploiting redundancy
May require more memory when explicitly storing substrings
Standard trie: single character per node
Compressed trie: eliminates chains of nodes
Compact trie: stores indices into original string(s)
Suffix trie: stores all suffixes of a string
Standard Tries
Each node (except root) is labeled with a character
Children of node are ordered (alphabetically)
Paths from root to leaves yield all input strings
[Figure: trie for Morse code]
Standard Trie Example
{ a, an, and, any, at }
Standard Trie Example
{ bear, bell, bid, bull, buy, sell, stock, stop }
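[Figure: standard trie for the set above, one letter per node.
 Root branches to b and s.
 Under b: e (then a-r for bear, l-l for bell), i (then d for bid), u (then l-l for bull, y for buy).
 Under s: e (then l-l for sell), t-o (then c-k for stock, p for stop).]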
Standard Tries
Node structure:
Value between 1…m
References to m children (array or linked list)
class Node {
    char value;                  // letter V drawn from the alphabet { V1, V2, …, Vm }
    Node[] child = new Node[m];  // references to up to m children, one per letter
}
Standard Tries
Uses O(n) space
Supports search / insert / delete in O(d×m) time
where
n = total size of strings indexed by trie
d = length of the parameter string
m = size of the alphabet
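A minimal sketch of a standard trie with insert and search. The names `TrieNode` and `StandardTrie`, the fixed alphabet of lowercase letters (m = 26), and the `endOfWord` flag are assumptions for illustration. Here a node's letter is implied by its index in the parent's child array, so each step takes constant time; the O(d×m) bound corresponds to scanning up to m children per step, as with a linked list.

```java
// Illustrative trie node: the letter is implied by the slot in the parent's array.
class TrieNode {
    static final int M = 26; // assumed alphabet: 'a'..'z'
    TrieNode[] child = new TrieNode[M];
    boolean endOfWord; // true if a stored word ends at this node
}

class StandardTrie {
    private final TrieNode root = new TrieNode();

    // Walks down the trie, creating missing nodes, one letter per level.
    void insert(String word) {
        TrieNode node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (i < 0 || i >= TrieNode.M) {
                throw new IllegalArgumentException("unsupported character: " + c);
            }
            if (node.child[i] == null) node.child[i] = new TrieNode();
            node = node.child[i];
        }
        node.endOfWord = true;
    }

    // True only if the exact word was inserted.
    boolean contains(String word) {
        TrieNode node = find(word);
        return node != null && node.endOfWord;
    }

    // True if some stored word starts with the given prefix.
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Follows the path spelled by s; returns null if the path does not exist.
    private TrieNode find(String s) {
        TrieNode node = root;
        for (char c : s.toCharArray()) {
            int i = c - 'a';
            if (i < 0 || i >= TrieNode.M) return null;
            node = node.child[i];
            if (node == null) return null;
        }
        return node;
    }
}
```

After inserting bear, bell, bid, bull, buy, sell, stock, and stop, `contains("bell")` is true, `contains("be")` is false, and `hasPrefix("be")` is true.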
Word Matching Trie
Insert words into trie
Each leaf stores occurrences of word in the text
Text (characters indexed 0 to 88):
see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!
Occurrences stored at the leaves (starting index of each occurrence):
bear: 6
bell: 78
bid: 47, 58
bull: 30
buy: 36
hear: 69
see: 0, 24
sell: 12
stock: 17, 40, 51, 62
stop: 84
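A word-matching trie can be sketched by recording the start index of each word at the node where the word ends. `OccNode` and `WordMatchingTrie` are hypothetical names, and the tokenization rule (a word is a maximal run of lowercase letters) is an assumption. Unlike the figure, this sketch also indexes short words such as "a" and "the".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative node: occurrences holds the start index of every
// occurrence of the word that ends at this node.
class OccNode {
    OccNode[] child = new OccNode[26];
    List<Integer> occurrences = new ArrayList<>();
}

class WordMatchingTrie {
    private final OccNode root = new OccNode();

    // Scans the text once, inserting each word together with its start index.
    WordMatchingTrie(String text) {
        int i = 0;
        while (i < text.length()) {
            if (!isLower(text.charAt(i))) { // skip spaces and punctuation
                i++;
                continue;
            }
            int start = i;
            OccNode node = root;
            while (i < text.length() && isLower(text.charAt(i))) {
                int c = text.charAt(i) - 'a';
                if (node.child[c] == null) node.child[c] = new OccNode();
                node = node.child[c];
                i++;
            }
            node.occurrences.add(start);
        }
    }

    private static boolean isLower(char c) {
        return c >= 'a' && c <= 'z';
    }

    // Returns the start indices of the word, or an empty list if it is absent.
    List<Integer> find(String word) {
        OccNode node = root;
        for (char ch : word.toCharArray()) {
            if (!isLower(ch)) return new ArrayList<>();
            node = node.child[ch - 'a'];
            if (node == null) return new ArrayList<>();
        }
        return node.occurrences;
    }
}
```

Built from the sample text in the figure, `find("stock")` returns the indices 17, 40, 51, 62 and `find("see")` returns 0, 24; a prefix such as "be" that is not itself a word returns an empty list.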
Suffix Trie Example
String: minimize (characters indexed 0 to 7)
Compressed suffix trie, each edge labeled with a substring:
root
  e
  i
    mize
    nimize
    ze
  mi
    nimize
    ze
  nimize
  ze
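A minimal sketch of how a suffix trie answers substring queries: every substring of the text is a prefix of some suffix, so a pattern occurs exactly when it spells a path from the root. `SuffixNode` and `SuffixTrie` are hypothetical names. This version is the plain, uncompressed form (one character per node, up to O(n²) nodes), not the compressed trie drawn in the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative node: children are keyed by the next character.
class SuffixNode {
    Map<Character, SuffixNode> child = new HashMap<>();
}

class SuffixTrie {
    private final SuffixNode root = new SuffixNode();

    // Inserts every suffix text[i..] character by character.
    SuffixTrie(String text) {
        for (int i = 0; i < text.length(); i++) {
            SuffixNode node = root;
            for (int j = i; j < text.length(); j++) {
                node = node.child.computeIfAbsent(text.charAt(j), k -> new SuffixNode());
            }
        }
    }

    // A pattern is a substring of the text iff it labels a path from the root.
    boolean containsSubstring(String pattern) {
        SuffixNode node = root;
        for (char c : pattern.toCharArray()) {
            node = node.child.get(c);
            if (node == null) return false;
        }
        return true;
    }
}
```

For the string minimize, `containsSubstring("nimi")` and `containsSubstring("mize")` are true, while `containsSubstring("zim")` is false.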
Tries and Web Search Engines
Collection of all searchable words, stored in a compressed trie
Each word is associated with a list of pages (URLs) containing that word, called an occurrence list
Pages in an occurrence list are ranked by relevance
Computational Biology
DNA: sequence of 4 different nucleotides (ATCG); portions of the DNA sequence produce proteins (genes)
Genome: master DNA sequence for an organism; for humans, 46 chromosomes and 3 billion nucleotides
Tries and Computational Biology
ESTs: fragments of expressed DNA; indicator for genes (& location); 5.5 million sequences at NIH
Build suffix trie of genome: 8 hours, 60 Gbytes
Search for ESTs in suffix trie: 11 hours w/ 8-processor Sun
5+ years (predicted)
[Figure: genome and ESTs, suffix tree, mapping, gene]