























An overview of data compression, focusing on the concepts of compression, its benefits, and the use of Huffman codes. It covers the principles of compression, the sources of compressibility, the types of compression, and the effectiveness of lossless and lossy compression. It also covers the Huffman code approach, its algorithm, and its properties.
Typology: Study notes
























Overview
Examples, sources, types, effectiveness
Properties, Huffman tree (encoding), decoding
Compression Examples
Tools: winzip, pkzip, compress, gzip
Images: .jpg, .gif
Audio: .wav (CD), .mp3, .wma, .aac
Video: mpeg1 (LD, VCD), mpeg2 (DVD), mpeg4 (Divx)
General: .zip, .gz
Sources of Compressibility
Recognize repeating patterns
Exploit using a dictionary or variable length encoding
Human perception is less sensitive to some information
Can discard less important data
Effectiveness of Compression
Bits per byte (8 bits)
2 bits / byte: ¼ original size
8 bits / byte: no compression
Percentage
75% compression: ¼ original size
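The two measures above describe the same ratio in different units. A minimal sketch of the arithmetic, assuming hypothetical file sizes of 1000 bytes before and 250 bytes after compression (the function names are illustrative, not from the notes):

```python
def bits_per_byte(original_size, compressed_size):
    """Compressed bits spent on each original byte (8 means no compression)."""
    return 8 * compressed_size / original_size


def percent_compression(original_size, compressed_size):
    """Percentage of the original size that was removed."""
    return 100 * (1 - compressed_size / original_size)


# Hypothetical sizes in bytes: the compressed file is 1/4 of the original.
original, compressed = 1000, 250
print(bits_per_byte(original, compressed))        # 2.0
print(percent_compression(original, compressed))  # 75.0
```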
Effectiveness of Compression
Random data: hard to compress
Example: 1001110100
Organized data: easy to compress
Example: 1111111111
No universally best compression algorithm
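The difference between random and organized data can be observed with any general-purpose compressor. The sketch below uses Python's standard `zlib` module as a stand-in; the notes do not name a specific tool for this comparison, and the 4096-byte input size and fixed seed are arbitrary choices.

```python
import random
import zlib


def compressed_size(data):
    """Number of bytes zlib needs to store the given bytes."""
    return len(zlib.compress(data))


n = 4096
rng = random.Random(0)  # fixed seed so the run is repeatable
random_data = bytes(rng.getrandbits(8) for _ in range(n))
organized_data = b"\xff" * n  # every bit is 1, like the 1111111111 example

print("random:   ", compressed_size(random_data), "bytes from", n)
print("organized:", compressed_size(organized_data), "bytes from", n)
```

The random input should come out at roughly its original size, while the organized input should shrink to a few dozen bytes.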
Lossless Compression Techniques
Build pattern dictionary
Replace patterns with index into dictionary
Find & compress repetitive sequences
Use variable length codes based on frequency
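The dictionary technique can be illustrated with LZW, one well-known dictionary-based scheme. The notes do not name a specific algorithm, so LZW is an assumption here, and the sketch is limited to text whose characters have code points below 256.

```python
def lzw_compress(text):
    """Replace repeated patterns with indexes into a growing dictionary."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    next_code = 256
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep extending the known pattern
        else:
            output.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new pattern
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output


def lzw_decompress(codes):
    """Rebuild the same dictionary while reading the indexes back."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Pattern used immediately after the compressor created it.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        result.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)


text = "ABABABABAB"
codes = lzw_compress(text)
print(len(text), "symbols ->", len(codes), "codes")
print(lzw_decompress(codes) == text)
```

The dictionary is never stored: the decompressor rebuilds it from the code stream, which is why both functions must add entries in the same order.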
Huffman Code
Variable length encoding of symbols
Exploit statistical frequency of symbols
Efficient when symbol probabilities vary widely
Use fewer bits to represent frequent symbols
Use more bits to represent infrequent symbols
Huffman Code Data Structures
Binary tree: represents Huffman code
Edge: code (0 or 1)
Leaf: symbol
Path to leaf: encoding
Priority queue: to efficiently build binary tree
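A small sketch of the tree representation, assuming a nested-tuple layout (an illustrative choice, not taken from the notes): an internal node is a `(left, right)` pair, a leaf is a one-character string, the left edge is labelled 0 and the right edge 1. The tree is written by hand so that its paths reproduce the code table listed at the end of these notes.

```python
# Internal node: (left, right) tuple. Leaf: a one-character symbol.
# Left edge = "0", right edge = "1".
tree = (("I", "E"), ("C", ("H", "A")))


def codes_from_tree(node, path=""):
    """Return {symbol: encoding}, where the encoding is the path to the leaf."""
    if isinstance(node, str):  # leaf: the path so far is the symbol's code
        return {node: path}
    left, right = node
    table = {}
    table.update(codes_from_tree(left, path + "0"))
    table.update(codes_from_tree(right, path + "1"))
    return table


for symbol, code in sorted(codes_from_tree(tree).items()):
    print(symbol, "=", code)
```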
Huffman Code Algorithm Overview
Calculate frequency of symbols in file
Create binary tree representing “best” encoding
Use binary tree to encode compressed file
For each symbol, output path from root to leaf
Size of encoding = length of path
Save binary tree
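The steps above can be sketched with a priority queue from Python's `heapq` module. The symbol counts (A: 3, C: 5, E: 8, H: 2, I: 7) are hypothetical; they are not given in these notes and were chosen so that the code lengths agree with the table at the end. Saving the tree alongside the compressed output is left out of the sketch.

```python
import heapq
from collections import Counter
from itertools import count


def build_codes(text):
    """Build a Huffman code table {symbol: bit string} for the given text."""
    freq = Counter(text)  # step 1: frequency of each symbol
    if not freq:
        return {}
    tiebreak = count()  # keeps heap entries comparable when counts are equal
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {heap[0][2]: "0"}
    # Step 2: repeatedly merge the two least frequent subtrees.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, path):
        if isinstance(node, tuple):  # internal node: follow both edges
            walk(node[0], path + "0")
            walk(node[1], path + "1")
        else:  # leaf: the path from the root is the encoding
            codes[node] = path

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    """Step 3: output the root-to-leaf path for each symbol."""
    return "".join(codes[s] for s in text)


# Hypothetical input with counts A:3, C:5, E:8, H:2, I:7.
text = "A" * 3 + "C" * 5 + "E" * 8 + "H" * 2 + "I" * 7
codes = build_codes(text)
print(codes)
print(len(encode(text, codes)), "bits instead of", 8 * len(text))
```

The exact bit patterns depend on how ties are broken and which child is placed on the left, so they may differ from the table in these notes while the code lengths stay the same.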
Huffman Tree Construction
E = 01
I = 00
C = 10
A = 111
H = 110
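No code in this table is a prefix of another, so a bit stream can be decoded from left to right without separators. A minimal sketch of encoding and decoding with the table above; the message "ACE" and the function names are illustrative.

```python
# Code table copied from the notes.
CODES = {"E": "01", "I": "00", "C": "10", "A": "111", "H": "110"}


def encode(message):
    """Concatenate the code of each symbol."""
    return "".join(CODES[s] for s in message)


def decode(bits):
    """Read bits until they match a code, emit the symbol, then start over."""
    symbols = {code: s for s, code in CODES.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in symbols:
            out.append(symbols[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


bits = encode("ACE")
print(bits)          # 1111001
print(decode(bits))  # ACE
```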