Advanced Data Structures: Balanced Trees and Multi-way Search Trees - Prof. Nelson Padua-P, Study notes of Computer Science

An overview of balanced trees, including AVL trees and red-black trees, and of multi-way search trees. It covers the concepts, history, and algorithms for maintaining tree balance, as well as the properties and search algorithms for multi-way search trees. The document also includes examples and explanations of tree rotations.

Uploaded on 02/13/2009

CMSC 132:
Object-Oriented Programming II
Advanced Tree Structures
Department of Computer Science
University of Maryland, College Park
2
Overview
Binary trees
Balance
Rotation
Multi-way trees
Search
Insert
Indexed tries
3
Tree Balance
Degenerate
Worst case
Search in O(n) time
Balanced
Average case
Search in O( log(n) ) time
[Figure: a degenerate binary tree beside a balanced binary tree]
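
To make the contrast concrete, here is a minimal sketch of a plain binary search tree with no rebalancing. The class `BalanceDemo` and its method names are illustrative assumptions, not part of the slides. Inserting keys in sorted order yields the degenerate shape, while inserting the middle key of each range first yields a balanced shape.

```java
// Illustrative sketch: a binary search tree that never rebalances.
class BalanceDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard BST insertion; duplicates are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Height counted in nodes: empty tree = 0, single node = 1.
    static int height(Node root) {
        return root == null ? 0 : 1 + Math.max(height(root.left), height(root.right));
    }

    // Sorted insertion order: every node ends up with only a right child.
    static Node buildDegenerate(int n) {
        Node root = null;
        for (int k = 1; k <= n; k++) root = insert(root, k);
        return root;
    }

    // Inserts the middle key of [lo, hi] first, then recurses on both halves.
    static Node buildBalanced(Node root, int lo, int hi) {
        if (lo > hi) return root;
        int mid = (lo + hi) / 2;
        root = insert(root, mid);
        root = buildBalanced(root, lo, mid - 1);
        return buildBalanced(root, mid + 1, hi);
    }
}
```

With 15 keys, `buildDegenerate(15)` produces a tree of height 15, while `buildBalanced(null, 1, 15)` produces a tree of height 4, so a search visits at most 15 nodes in the first case and at most 4 in the second.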
4
Tree Balance
Question
Can we keep tree (mostly) balanced?
Self-balancing binary search trees
AVL trees
Red-black trees
Approach
Select invariant (that keeps tree balanced)
Fix tree after each insertion / deletion
Maintain invariant using rotations
Provides operations with O( log(n) ) worst case
5
AVL Trees
Properties
Binary search tree
Heights of children for node differ by at most 1
Example
[Figure: AVL tree with keys 17, 32, 44, 48, 50, 62, 78, 88; heights (1 to 4) shown in red]
6
AVL Trees
History
Discovered in 1962 by two Russian
mathematicians, Adelson-Velskii & Landis
Algorithm
1. Find / insert / delete as a binary search tree
2. After each insertion / deletion
   a) If heights of children differ by more than 1
   b) Rotate children until subtrees are balanced
   c) Repeat check for parent (until root reached)
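
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the names `AvlSketch`, `update`, and `rebalance` are invented here, heights are counted in nodes (a leaf has height 1), and deletion is omitted. The recursion unwinding from the insertion point back to the root plays the role of "repeat check for parent".

```java
// Illustrative AVL insertion sketch; names are assumptions, not from the slides.
class AvlSketch {
    static class Node {
        int key, height = 1;   // height counted in nodes, so a leaf has height 1
        Node left, right;
        Node(int key) { this.key = key; }
    }

    static int height(Node n) { return n == null ? 0 : n.height; }

    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Left child becomes the subtree root; its right subtree is re-attached to y.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    // Mirror image of rotateRight.
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    // Step 1: insert as in a BST. Step 2: rebalance each ancestor on the way back up.
    static Node insert(Node n, int key) {
        if (n == null) return new Node(key);
        if (key < n.key) n.left = insert(n.left, key);
        else if (key > n.key) n.right = insert(n.right, key);
        else return n;                       // duplicate key: nothing to do
        update(n);
        return rebalance(n);
    }

    static Node rebalance(Node n) {
        int balance = height(n.left) - height(n.right);
        if (balance > 1) {                   // left subtree too tall
            if (height(n.left.left) < height(n.left.right)) {
                n.left = rotateLeft(n.left); // left-right case: double rotation
            }
            return rotateRight(n);
        }
        if (balance < -1) {                  // right subtree too tall
            if (height(n.right.right) < height(n.right.left)) {
                n.right = rotateRight(n.right); // right-left case: double rotation
            }
            return rotateLeft(n);
        }
        return n;                            // heights differ by at most 1
    }
}
```

Inserting the keys 1 through 15 in sorted order, which would degenerate a plain binary search tree, leaves this sketch with key 8 at the root and a height of 4.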

7

Red-black Trees

Properties

Binary search tree
Every node is red or black
The root is black
Every leaf is black
All children of red nodes are black
For each leaf, same number of black nodes on path to root

Characteristics

Properties ensure that no leaf is more than twice as far from the root as any other leaf
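
One way to make the properties concrete is a checker that verifies them on a given tree. The sketch below is illustrative only: the class `RedBlackCheck` and its methods are assumptions, `null` references stand in for the black leaves, and no insertion or deletion logic is shown.

```java
// Illustrative checker for the red-black properties; names are assumptions.
class RedBlackCheck {
    static class Node {
        final int key;
        final boolean red;
        final Node left, right;   // null stands for a (black) leaf
        Node(int key, boolean red, Node left, Node right) {
            this.key = key; this.red = red; this.left = left; this.right = right;
        }
    }

    static boolean isRed(Node n) { return n != null && n.red; }

    // Binary search tree property: every key lies strictly between lo and hi.
    static boolean isBst(Node n, long lo, long hi) {
        if (n == null) return true;
        if (n.key <= lo || n.key >= hi) return false;
        return isBst(n.left, lo, n.key) && isBst(n.right, n.key, hi);
    }

    // Returns the number of black nodes on every path from n down to a leaf,
    // or -1 if a red node has a red child or two paths disagree.
    static int blackHeight(Node n) {
        if (n == null) return 1;                        // every leaf is black
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l == -1 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    static boolean isRedBlack(Node root) {
        return !isRed(root)                             // the root is black
            && isBst(root, Long.MIN_VALUE, Long.MAX_VALUE)
            && blackHeight(root) != -1;
    }
}
```

For example, a black root 2 with red children 1 and 3 passes the check, while a black root 2 with a single black child 1 fails it because the two sides have different black counts.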

8

Red-black Trees

Example

9

Red-black Trees

History

Discovered in 1972 by Rudolf Bayer

Algorithm

Insert / delete may require complicated bookkeeping & rotations

Java collections

TreeMap, TreeSet use red-black trees
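
As a small usage sketch, the ordered operations of `java.util.TreeMap` look like this. The class `TreeMapDemo`, the method `sample`, and the stored values are illustrative assumptions; the keys are borrowed from the AVL example earlier in these notes.

```java
import java.util.TreeMap;

// Illustrative use of the sorted-map operations that TreeMap provides.
class TreeMapDemo {
    // Builds a map whose keys arrive in no particular order.
    static TreeMap<Integer, String> sample() {
        TreeMap<Integer, String> map = new TreeMap<>();
        for (int key : new int[] {44, 17, 78, 32, 50, 88, 48, 62}) {
            map.put(key, "value-" + key);   // put, get, remove take O(log n) time
        }
        return map;
    }

    // Largest key <= target, or null if there is none.
    static Integer floor(TreeMap<Integer, String> map, int target) {
        return map.floorKey(target);
    }

    // Smallest key >= target, or null if there is none.
    static Integer ceiling(TreeMap<Integer, String> map, int target) {
        return map.ceilingKey(target);
    }
}
```

On the sample map, `firstKey()` returns 17, `lastKey()` returns 88, `floor(map, 49)` returns 48, and `ceiling(map, 51)` returns 62; iterating over `keySet()` visits the keys in ascending order.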

10

Tree Rotations

Changes shape of tree

Move nodes
Change edges

Types

Single rotation: left, right
Double rotation: left-right, right-left

11

Tree Rotation Example

Single right rotation

[Figure: a tree with nodes 1, 2, 3 before and after a single right rotation]

12

Tree Rotation Example

Single right rotation

[Figure: a tree with nodes 1 to 6 before and after a single right rotation; node 4 is attached to a new parent]
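
A single right rotation can be sketched in a few lines. The tree built by `sample` is a hypothetical six-node tree chosen for illustration (the exact shape on the slide is not recoverable from the extracted text), and the names `RotationSketch`, `pivot`, and `inOrder` are assumptions.

```java
// Illustrative single right rotation on a hypothetical six-node tree.
class RotationSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key, Node left, Node right) {
            this.key = key; this.left = left; this.right = right;
        }
    }

    // Rotates the subtree rooted at `root` to the right; returns the new subtree root.
    static Node rotateRight(Node root) {
        Node pivot = root.left;      // left child moves up
        root.left = pivot.right;     // pivot's right subtree is attached to a new parent
        pivot.right = root;          // old root moves down to the right
        return pivot;
    }

    // In-order listing of keys, each followed by a space.
    static String inOrder(Node n) {
        return n == null ? "" : inOrder(n.left) + n.key + " " + inOrder(n.right);
    }

    // Root 5 with children 3 and 6; node 3 has children 2 and 4; node 2 has left child 1.
    static Node sample() {
        Node three = new Node(3, new Node(2, new Node(1, null, null), null),
                                 new Node(4, null, null));
        return new Node(5, three, new Node(6, null, null));
    }
}
```

After `rotateRight(sample())`, node 3 is the root, node 5 is its right child, and node 4 has moved from being the right child of 3 to being the left child of 5. The in-order sequence of keys is unchanged, which is why rotations preserve the binary search tree property.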

19

Multi-way Search Trees

Insert Example (for 2-3 tree)

Insert( 4 )

20

Multi-way Search Trees

Insert Example (for 2-3 tree)

Insert( 1 )

Split node
Split parent
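
The split steps named above can be sketched as follows. This is one possible way to code 2-3 tree insertion, not necessarily the method used in the course: the names `TwoThreeTreeSketch`, `Split`, and `splitNode` are assumptions, keys are plain integers, and a node briefly holds three keys before it is split and its middle key is pushed up to the parent.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative 2-3 tree insertion; class and method names are assumptions.
class TwoThreeTreeSketch {
    static class Node {
        List<Integer> keys = new ArrayList<>();   // 1 or 2 keys (3 only until split)
        List<Node> children = new ArrayList<>();  // empty for a leaf
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Result of a split: the middle key moves up together with the new right sibling.
    static class Split {
        final int middle;
        final Node right;
        Split(int middle, Node right) { this.middle = middle; this.right = right; }
    }

    Node root = new Node();

    void insert(int key) {
        Split split = insert(root, key);
        if (split != null) {                  // the root itself split: tree grows one level
            Node newRoot = new Node();
            newRoot.keys.add(split.middle);
            newRoot.children.add(root);
            newRoot.children.add(split.right);
            root = newRoot;
        }
    }

    private Split insert(Node node, int key) {
        int i = 0;
        while (i < node.keys.size() && key > node.keys.get(i)) i++;
        if (i < node.keys.size() && key == node.keys.get(i)) return null; // duplicate
        if (node.isLeaf()) {
            node.keys.add(i, key);
        } else {
            Split split = insert(node.children.get(i), key);
            if (split == null) return null;
            node.keys.add(i, split.middle);   // child split: parent absorbs the middle key
            node.children.add(i + 1, split.right);
        }
        return node.keys.size() > 2 ? splitNode(node) : null; // may split the parent too
    }

    // Splits a node holding 3 keys (and 4 children if internal) around its middle key.
    private Split splitNode(Node node) {
        Node right = new Node();
        right.keys.add(node.keys.remove(2));
        int middle = node.keys.remove(1);
        if (!node.isLeaf()) {
            right.children.add(node.children.remove(2));
            right.children.add(node.children.remove(2));
        }
        return new Split(middle, right);
    }

    // Number of levels; all leaves of a 2-3 tree are at the same depth.
    int height() {
        int h = 1;
        for (Node n = root; !n.isLeaf(); n = n.children.get(0)) h++;
        return h;
    }

    void inOrder(Node node, List<Integer> out) {
        for (int i = 0; i < node.keys.size(); i++) {
            if (!node.isLeaf()) inOrder(node.children.get(i), out);
            out.add(node.keys.get(i));
        }
        if (!node.isLeaf()) inOrder(node.children.get(node.keys.size()), out);
    }
}
```

Inserting 1 through 7 in order triggers both cases: leaf splits push keys into the root, and the seventh insertion overfills the root, which splits in turn and leaves key 4 alone in a new root three levels above the leaves.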

21

B-Trees

Characteristics

Height of tree is O( log_T(n) )
Reduces number of nodes accessed
Wasted space for non-full nodes

Popular for large databases

1 node = 1 disk block
Reduces number of disk blocks read
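
A rough calculation shows why a large branching factor matters for disk access. The sketch below assumes that T in the height bound is the branching factor of a node (the slide defining T is not part of this excerpt) and ignores constant factors and partially filled nodes; `BTreeHeightEstimate` and `levels` are illustrative names.

```java
// Illustrative estimate of tree levels for a given branching factor.
class BTreeHeightEstimate {
    // Smallest h such that t^h >= n, i.e. the ceiling of log base t of n.
    // Assumes n is small enough that the running product does not overflow a long.
    static int levels(long n, long t) {
        if (t < 2) throw new IllegalArgumentException("branching factor must be at least 2");
        int h = 0;
        long reach = 1;              // number of keys reachable with h levels
        while (reach < n) {
            reach *= t;
            h++;
        }
        return h;
    }
}
```

For one million keys, `levels(1_000_000, 2)` is 20 while `levels(1_000_000, 100)` is 3, so a lookup touches roughly 3 disk blocks instead of 20 under these simplifying assumptions.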

22

Indexed Search Tree (Trie)

Special case of tree

Applicable when

Key C can be decomposed into a sequence of subkeys C1, C2, …, Cn
Redundancy exists between subkeys

Approach

Store subkey at each node
Path through trie yields full key

Example

Huffman tree

[Figure: trie whose nodes hold the subkeys C1, C2, C3, C4]

23

Tries

Useful for searching strings

String decomposes into sequence of letters
Example: “ART” ⇒ “A” “R” “T”

Can be very fast

Less overhead than hashing

May reduce memory

Exploiting redundancy

May require more memory

Explicitly storing substrings

[Figure: example trie with nodes S, A, R, E, T, including the string “ART”]

24

Types of Tries

Standard

Single character per node

Compressed

Eliminating chains of nodes

Compact

Stores indices into original string(s)

Suffix

Stores all suffixes of string

25

Standard Tries

Approach

Each node (except root) is labeled with a character
Children of node are ordered (alphabetically)
Paths from root to leaves yield all input strings

[Figure: trie for Morse code]

26

Standard Trie Example

For strings

{ a, an, and, any, at }

27

Standard Trie Example

For strings

{ bear, bell, bid, bull, buy, sell, stock, stop }

[Figure: standard trie for these strings, one character per node; the root's children are b and s]

28

Standard Tries

Node structure

Value between 1…m
Reference to m children
Array or linked list

Example

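class Node {
    static final int m = 26;     // alphabet size; 26 (letters a to z) is an assumed value
    char value;                  // letter drawn from the alphabet V = { V1, V2, …, Vm }
    Node[] child = new Node[m];  // one reference per possible next letter
}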
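
Building on this node structure, insertion and lookup in a standard trie can be sketched as below. Several details are assumptions made for the example: the alphabet is the lowercase letters a to z, children are held in an array, and an `endOfWord` flag (not shown on the slide) marks where a stored string ends so that a word such as "an" can coexist with "and".

```java
// Illustrative standard trie over lowercase a to z; names are assumptions.
class StandardTrie {
    static final int M = 26;          // alphabet size m (assumed: a to z)

    static class Node {
        char value;                   // the character labeling this node
        boolean endOfWord;            // true if a stored string ends here
        Node[] child = new Node[M];   // child[i] is the node for letter 'a' + i
        Node(char value) { this.value = value; }
    }

    private final Node root = new Node('\0');   // the root carries no character

    // Walks down one node per character, creating nodes that are missing.
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            int i = c - 'a';
            if (node.child[i] == null) node.child[i] = new Node(c);
            node = node.child[i];
        }
        node.endOfWord = true;
    }

    // Follows the path for the word; succeeds only if it ends on a marked node.
    boolean contains(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.child[c - 'a'];
            if (node == null) return false;
        }
        return node.endOfWord;
    }
}
```

After inserting a, an, and, any, and at, `contains("an")` and `contains("any")` are true, while `contains("ant")` is false because no path spells it. With array-indexed children each step costs constant time; a linked list of children would instead require scanning up to m entries per step.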

29

Standard Tries

Efficiency

Uses O(n) space
Supports search / insert / delete in O(d×m) time (up to m child references may be examined at each of the d levels)
where
n = total size of the strings indexed by the trie
d = length of the parameter string
m = size of the alphabet

30

Word Matching Trie

Insert words into trie
Each leaf stores occurrences of word in the text

Text (characters indexed 0 to 88):
“see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!”

Occurrences stored at the leaves (starting index of each occurrence):
bear: 6
bell: 78
bid: 47, 58
bull: 30
buy: 36
hear: 69
see: 0, 24
sell: 12
stock: 17, 40, 51, 62
stop: 84
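
A word matching trie of this kind can be sketched as follows. The names `WordMatchingTrie`, `index`, and `find` are assumptions, children are kept in a hash map rather than an array, and a word is taken to be any maximal run of letters. Unlike the slide's figure, this sketch also indexes short words such as "a" and "the".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative word matching trie; names are assumptions.
class WordMatchingTrie {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
        List<Integer> occurrences = new ArrayList<>();  // start indices of the word ending here
    }

    private final Node root = new Node();

    // Scans the text once and records the start index of every run of letters.
    void index(String text) {
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetter(text.charAt(i))) { i++; continue; }
            int start = i;
            Node node = root;
            while (i < text.length() && Character.isLetter(text.charAt(i))) {
                node = node.children.computeIfAbsent(text.charAt(i), c -> new Node());
                i++;
            }
            node.occurrences.add(start);
        }
    }

    // Returns the recorded start indices of the word, or an empty list if absent.
    List<Integer> find(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return List.of();
        }
        return node.occurrences;
    }
}
```

Indexing the text "see a bear? sell stock! see a bull? buy stock! bid stock! bid stock! hear the bell? stop!" and calling `find("stock")` returns the positions 17, 40, 51, and 62, while `find("see")` returns 0 and 24.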

37

Suffix Trie Example

[Figure: compressed suffix trie for the string “minimize” (characters indexed 0 to 7), with edge labels e, i, mi, mize, nimize, ze]
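
The idea behind a suffix trie can be sketched by inserting every suffix of the string into an ordinary trie, after which any substring can be found by walking down from the root. This version is uncompressed (one character per node, so it can use far more nodes than the compressed form in the example), and the names `SuffixTrieSketch` and `containsSubstring` are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative uncompressed suffix trie; names are assumptions.
class SuffixTrieSketch {
    static class Node {
        Map<Character, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    // Inserts every suffix text[start..] into the trie, one character per node.
    SuffixTrieSketch(String text) {
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                node = node.children.computeIfAbsent(text.charAt(i), c -> new Node());
            }
        }
    }

    // A pattern occurs in the text exactly when it is a prefix of some suffix.
    boolean containsSubstring(String pattern) {
        Node node = root;
        for (char c : pattern.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return false;
        }
        return true;
    }
}
```

For the string "minimize", `containsSubstring("nimi")` and `containsSubstring("mize")` are true, while `containsSubstring("zim")` is false. The lookup time depends on the pattern length, not on the length of the indexed text.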

38

Tries and Web Search Engines

Search engine index

Collection of all searchable words
Stored in compressed trie

Each leaf of trie

Associated with a word
List of pages (URLs) containing that word
Called occurrence list

Trie is kept in memory (fast)

Occurrence lists kept in external memory

Ranked by relevance

39

Computational Biology

DNA

Sequence of 4 different nucleotides (ATCG)
Portions of DNA sequence produce proteins (genes)

Genome

Master DNA sequence for organism
For humans: 46 chromosomes, 3 billion nucleotides

41

Tries and Computational Biology

ESTs

Fragments of expressed DNA
Indicator for genes (& location)
5.5 million sequences at NIH

ESTmapper

Build suffix trie of genome
8 hours, 60 Gbytes
Search for ESTs in suffix trie
11 hours w/ 8-processor Sun

Search genome w/ BLAST

5+ years (predicted)

[Figure: diagram relating Genome, ESTs, Suffix tree, Mapping, and Gene]