























An overview of data compression, focusing on Huffman codes. It covers the benefits of compression, the different types, and the effectiveness of lossless and lossy methods, then introduces the principles of Huffman coding and its advantages when symbol probabilities vary.
Typology: Study notes
























winzip, pkzip, compress, gzip
Images: .jpg, .gif
Audio: .mp3, .wav
Video: mpeg1 (VCD), mpeg2 (DVD), mpeg4 (DivX)
General: .zip, .gz
Recognize repeating patterns
Exploit them using a dictionary or variable-length encoding
Less sensitive to some information
Can discard less important data
Bits per byte (8 bits)
2 bits / byte: ¼ original size
8 bits / byte: no compression
Percentage
75% compression: ¼ original size
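The two measures above can be checked with a little arithmetic. The sketch below is only an illustration; the function names and the 1000-byte file size are assumptions, not part of the notes.

```python
def bits_per_byte(original_size, compressed_size):
    """Average number of compressed bits used per original byte."""
    return 8 * compressed_size / original_size


def percent_compression(original_size, compressed_size):
    """Percentage of the original size that was removed."""
    return 100 * (1 - compressed_size / original_size)


# Hypothetical file: 1000 bytes compressed down to 250 bytes (1/4 original size).
print(bits_per_byte(1000, 250))        # 2.0 bits / byte
print(percent_compression(1000, 250))  # 75.0 percent compression
```

Both measures describe the same result: a file at ¼ of its original size is 2 bits / byte, or 75% compression.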
Random data: hard to compress
Example: 1001110100
Organized data: easy to compress
Example: 1111111111
No universally best compression algorithm
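The difference between random and organized data can be seen by running both through a general-purpose compressor. This sketch uses Python's zlib module as a stand-in (an assumption; the notes do not name a specific library), and the 10,000-byte inputs are made up for illustration.

```python
import random
import zlib

# Pseudo-random bytes (fixed seed so the run is repeatable) versus one repeated byte.
rng = random.Random(0)
random_data = bytes(rng.getrandbits(8) for _ in range(10_000))
organized_data = b"1" * 10_000

random_size = len(zlib.compress(random_data))
organized_size = len(zlib.compress(organized_data))

print("random data:   ", len(random_data), "->", random_size, "bytes")
print("organized data:", len(organized_data), "->", organized_size, "bytes")
```

The organized input should shrink to a small fraction of its size, while the random input should stay roughly the same size or grow slightly.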
Build pattern dictionary
Replace patterns with index into dictionary
Find & compress repetitive sequences
Use variable length codes based on frequency
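As one illustration of finding and compressing repetitive sequences, here is a minimal run-length encoding sketch. Run-length encoding is chosen here as an example of the idea; the notes do not say which scheme they have in mind, and the function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(data):
    """Collapse each run of identical symbols into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]


def run_length_decode(runs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in runs)


print(run_length_encode("1111111111"))  # one long run
print(run_length_encode("1001110100"))  # many short runs
```

The organized string collapses to a single pair, while the random-looking string needs six pairs, more entries than would justify the encoding.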
Variable length encoding of symbols
Exploit statistical frequency of symbols
Efficient when symbol probabilities vary widely
Use fewer bits to represent frequent symbols
Use more bits to represent infrequent symbols
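To see why shorter codes for frequent symbols help, the sketch below compares a fixed-length code with a variable-length one. The symbol counts and both code tables are hypothetical values chosen for illustration.

```python
# Hypothetical symbol counts for a 16-symbol message.
counts = {"A": 8, "B": 4, "C": 2, "D": 2}

# Fixed-length code: every symbol costs 2 bits.
fixed_code = {"A": "00", "B": "01", "C": "10", "D": "11"}

# Variable-length code: the frequent symbol A gets the shortest code.
variable_code = {"A": "0", "B": "10", "C": "110", "D": "111"}


def total_bits(counts, code):
    """Total encoded size: occurrences of each symbol times its code length."""
    return sum(n * len(code[symbol]) for symbol, n in counts.items())


fixed_bits = total_bits(counts, fixed_code)
variable_bits = total_bits(counts, variable_code)
print(fixed_bits, variable_bits)  # 32 28
```

The saving grows as the symbol frequencies become more uneven; with equal frequencies the variable-length code would give no benefit.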
Represents Huffman code
Edge: code (0 or 1)
Leaf: symbol
Path to leaf: encoding
Example
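A small sketch of how a tree maps to codes, assuming a tree written as nested tuples where a leaf is a symbol string and an internal node is a (left, right) pair. The tree itself is a made-up example, not the one from the original slides, and labeling the left edge 0 and the right edge 1 is a convention chosen here.

```python
# Leaf = symbol (str); internal node = (left_subtree, right_subtree).
tree = ("A", ("B", ("C", "D")))


def codes_from_tree(node, path=""):
    """Walk from the root; each left edge appends 0, each right edge appends 1."""
    if isinstance(node, str):
        return {node: path}  # reached a leaf: the path so far is its encoding
    left, right = node
    codes = codes_from_tree(left, path + "0")
    codes.update(codes_from_tree(right, path + "1"))
    return codes


codes = codes_from_tree(tree)
print(codes)  # {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
```

Because symbols sit only at leaves, no code is a prefix of another, so an encoded bit string can be decoded without separators.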
To efficiently build binary tree
Calculate frequency of symbols in file
Create binary tree representing “best” encoding
Use binary tree to encode compressed file:
For each symbol, output path from root to leaf
Size of encoding = length of path
Save binary tree
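The steps above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: it assumes a non-empty string as input, uses a heap to repeatedly merge the two least frequent subtrees (the standard Huffman construction), and leaves out saving the tree. All function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Build a Huffman tree: leaf = symbol, internal node = (left, right)."""
    freq = Counter(text)  # step 1: symbol frequencies
    order = count()       # tie-breaker so the heap never compares tree nodes
    heap = [(n, next(order), symbol) for symbol, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:  # step 2: merge the two rarest subtrees
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(order), (left, right)))
    return heap[0][2]


def make_codes(tree, path=""):
    """Map each symbol to its root-to-leaf path (0 = left, 1 = right)."""
    if isinstance(tree, str):
        return {tree: path or "0"}  # a lone symbol still needs one bit
    left, right = tree
    codes = make_codes(left, path + "0")
    codes.update(make_codes(right, path + "1"))
    return codes


def encode(text, codes):
    """Step 3: output the path for every symbol in the input."""
    return "".join(codes[symbol] for symbol in text)


text = "aaaabbc"
codes = make_codes(build_tree(text))
bits = encode(text, codes)
print(codes, bits, len(bits))
```

The exact 0/1 assignment depends on how ties are broken, but the code lengths do not: in this input the frequent symbol "a" gets a 1-bit code and the whole message takes 10 bits instead of 56 at 8 bits per character.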