






















Material Type: Notes; Professor: Padua-Perez; Class: OBJECT-ORIENTED PROG II; Subject: Computer Science; University: University of Maryland; Term: Spring 2006;























Compression
Reduce size of data
(number of bits needed to represent data)
Reduce storage needed
Reduce transmission cost / latency / bandwidth
Sources of Compressibility
Redundancy
Recognize repeating patterns
Exploit using a dictionary or variable length encoding
Human perception
Less sensitive to some information
Can discard less important data
Types of Compression
Lossless
Preserves all information
Exploits redundancy in data
Applied to general data
Lossy
May lose some information
Exploits redundancy & human perception
Applied to audio, image, video
Effectiveness of Compression
Random data → hard
Example: 1001110100 → ?
Organized data → easy
Example: 1111111111 → 1 × 10
No universally best compression algorithm
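The "1 × 10" notation above resembles run-length encoding. The sketch below assumes that technique (the slides do not name it); the function name `run_length_encode` is made up for illustration. It shows why the organized string collapses to a single pair while the irregular one does not.

```python
from itertools import groupby


def run_length_encode(bits):
    """Collapse each run of identical characters into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


# Organized data: one long run becomes a single pair.
organized = run_length_encode("1111111111")
# Irregular data: many short runs, so the encoding is no smaller than the input.
irregular = run_length_encode("1001110100")

print(organized)       # [('1', 10)]
print(len(irregular))  # 6
```

Six (symbol, count) pairs for a 10-bit input is not a saving, which is one way to see that no single scheme works best on all data.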
Effectiveness of Compression
Pigeonhole principle
Reduce size by 1 bit → can only represent ½ of the possible data
Example
000, 001, 010, 011, 100, 101, 110, 111 → 00, 01, 10, 11
If compression were always possible (alternative view)
Compress file (reduce size by 1 bit)
Recompress output
Repeat (until we can store data with 0 bits)
This is impossible, so some inputs cannot be compressed
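The counting argument can be checked directly for the 3-bit example. This is a small sketch, not part of the original notes; `drop_last_bit` is a hypothetical stand-in for any scheme that always saves one bit.

```python
from itertools import product


def bit_strings(n):
    """All bit strings of length n, in lexicographic order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def drop_last_bit(s):
    """Hypothetical 'compressor' that always shortens its input by one bit."""
    return s[:-1]


inputs = bit_strings(3)   # 8 possible inputs
outputs = bit_strings(2)  # only 4 possible outputs

# Group the inputs by the output they map to.
collisions = {}
for s in inputs:
    collisions.setdefault(drop_last_bit(s), []).append(s)

print(len(inputs), len(outputs))  # 8 4
print(collisions["00"])           # ['000', '001']
```

Two inputs share every output, so the original cannot be recovered. Any other mapping from 8 inputs to 4 outputs has the same problem.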
Huffman Code
Variable length encoding of symbols
Exploit statistical frequency of symbols
Efficient when symbol probabilities vary widely
Use fewer bits to represent frequent symbols
Use more bits to represent infrequent symbols
Huffman Code Example
Symbol             Dog      Cat      Bird     Fish
Frequency          1/8      1/4      1/2      1/8
Original encoding  2 bits   2 bits   2 bits   2 bits
Huffman encoding   110      10       0        111
Huffman length     3 bits   2 bits   1 bit    3 bits
Original → 1/8×2 + 1/4×2 + 1/2×2 + 1/8×2 = 2 bits / symbol
Huffman → 1/8×3 + 1/4×2 + 1/2×1 + 1/8×3 = 1.75 bits / symbol
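The two averages can be reproduced with exact fractions. The sketch below assumes the symbols pair with the columns in the order Dog, Cat, Bird, Fish (frequencies 1/8, 1/4, 1/2, 1/8), which is the reading that matches the sums in the example. The helper name `average_bits` is made up for illustration.

```python
from fractions import Fraction

# Assumed pairing of symbols to frequencies and Huffman codes.
frequency = {
    "Dog": Fraction(1, 8),
    "Cat": Fraction(1, 4),
    "Bird": Fraction(1, 2),
    "Fish": Fraction(1, 8),
}
huffman_code = {"Dog": "110", "Cat": "10", "Bird": "0", "Fish": "111"}


def average_bits(frequency, code_length):
    """Expected bits per symbol: sum of frequency x code length."""
    return sum(frequency[s] * code_length[s] for s in frequency)


fixed = average_bits(frequency, {s: 2 for s in frequency})
huffman = average_bits(frequency, {s: len(c) for s, c in huffman_code.items()})

print(fixed)           # 2
print(float(huffman))  # 1.75
```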
Huffman Code Algorithm Overview
Calculate frequency of symbols in file
Create binary tree representing "best" encoding
Use binary tree to encode compressed file
For each symbol, output path from root to leaf
Size of encoding = length of path
Save binary tree
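A minimal sketch of the first and third steps: counting symbols and reading codes off as root-to-leaf paths. It assumes a left edge emits 0 and a right edge emits 1 (a convention not stated in the overview) and represents the tree as nested tuples. The tree here is written by hand, and the names `symbol_frequencies` and `codes_from_tree` are hypothetical.

```python
from collections import Counter


def symbol_frequencies(data):
    """Count how often each symbol occurs in the input."""
    return Counter(data)


def codes_from_tree(tree, path=""):
    """Map each leaf symbol to its root-to-leaf path.

    A tree is either a symbol (leaf) or a (left, right) tuple.
    Assumption: left edge = '0', right edge = '1'.
    """
    if not isinstance(tree, tuple):
        return {tree: path}
    left, right = tree
    codes = codes_from_tree(left, path + "0")
    codes.update(codes_from_tree(right, path + "1"))
    return codes


# Hand-written tree: the most frequent symbol sits closest to the root.
tree = ("Bird", ("Cat", ("Dog", "Fish")))
codes = codes_from_tree(tree)

print(symbol_frequencies("aab"))  # Counter({'a': 2, 'b': 1})
print(codes)  # {'Bird': '0', 'Cat': '10', 'Dog': '110', 'Fish': '111'}
```

The length of each code equals the depth of its leaf, which is the "size of encoding = length of path" rule above.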
Huffman Code - Creating Tree
Place each symbol in leaf
Weight of leaf = symbol frequency
Select two trees L and R (initially leaves)
Such that L, R have the lowest frequencies among the remaining trees
Create new (internal) node
Left child → L
Right child → R
New frequency → frequency( L ) + frequency( R )
Repeat until all nodes merged into one tree
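The merge loop above can be sketched with a priority queue. This is one possible implementation, not necessarily the one used in the course: it assumes Python's `heapq`, string symbols, nested tuples for internal nodes, and integer weights in eighths standing in for the frequencies 1/8, 1/4, 1/2, 1/8. The function names are hypothetical.

```python
import heapq
from itertools import count


def build_huffman_tree(frequency):
    """Repeatedly merge the two lowest-weight trees until one remains.

    Leaves are symbols (assumed to be strings); internal nodes are
    (left, right) tuples.
    """
    tie_breaker = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tie_breaker), s) for s, w in frequency.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        weight_l, _, left = heapq.heappop(heap)   # lowest weight
        weight_r, _, right = heapq.heappop(heap)  # second lowest
        merged = (weight_l + weight_r, next(tie_breaker), (left, right))
        heapq.heappush(heap, merged)
    return heap[0][2]


def leaf_depths(tree, depth=0):
    """Depth of each leaf, which equals the length of its code."""
    if not isinstance(tree, tuple):
        return {tree: depth}
    depths = {}
    for child in tree:
        depths.update(leaf_depths(child, depth + 1))
    return depths


tree = build_huffman_tree({"Dog": 1, "Cat": 2, "Bird": 4, "Fish": 1})
depths = leaf_depths(tree)
print(depths)  # {'Bird': 1, 'Cat': 2, 'Dog': 3, 'Fish': 3}
```

When weights tie, different tie-breaking rules can produce different trees, but the average code length stays the same.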
Huffman Tree Construction (figures for steps 2, 3, and 5 not reproduced)
Codes from the final tree:
E = 01
I = 00
C = 10
A = 111
H = 110
Huffman Coding Example
E = 01
I = 00
C = 10
A = 111
H = 110
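With the code table above, encoding is a lookup per symbol. Decoding reads bits until they match a code, which works because no code is a prefix of another. The input "ACE" below is an assumed example, not taken from the notes, and the function names are hypothetical.

```python
# Code table from the example above.
CODES = {"E": "01", "I": "00", "C": "10", "A": "111", "H": "110"}


def encode(text):
    """Concatenate the code of each symbol."""
    return "".join(CODES[symbol] for symbol in text)


def decode(bits):
    """Accumulate bits until they match a code, emit the symbol, repeat."""
    symbol_for = {code: symbol for symbol, code in CODES.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in symbol_for:
            decoded.append(symbol_for[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(decoded)


encoded = encode("ACE")
print(encoded)          # 1111001
print(decode(encoded))  # ACE
```

Three symbols take 7 bits here, compared with 9 bits for a fixed 3-bit code over five symbols.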