Lecture Notes on Data Compression | COP 3530, Study notes of Data Structures and Algorithms

Material Type: Notes; Class: DATA STRUC/ALGORITHMS; Subject: COMPUTER PROGRAMMING; University: University of Florida; Term: Unknown 1989;


Uploaded on 09/17/2009

koofers-user-ur5

Data Compression

  • Reduce the size of data.
    • Reduces storage space and hence storage cost.
      • Compression ratio = original data size / compressed data size
    • Reduces time to retrieve and transmit data.
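
As a quick illustration of the ratio defined above, here is a minimal sketch. The function name `compression_ratio` and the sample sizes are assumptions made for this example; they do not come from the notes.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Compression ratio = original data size / compressed data size."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


# Hypothetical sizes: a 1000-byte file that compresses to 250 bytes.
print(compression_ratio(1000, 250))  # 4.0
```

A ratio above 1 means the compressed data is smaller than the original.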

Lossless And Lossy Compression

  • compressedData = compress(originalData)
  • decompressedData = decompress(compressedData)
  • When originalData = decompressedData, the compression is lossless.
  • When originalData != decompressedData, the compression is lossy.
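
The definitions above can be checked mechanically with a round trip. The sketch below uses Python's standard `zlib` module as a stand-in lossless compressor. The helper names `is_lossless`, `quantize`, and `dequantize` are hypothetical, and the quantizer is only a toy example of a lossy scheme.

```python
import zlib


def is_lossless(compress, decompress, original):
    """True when decompress(compress(original)) reproduces the original."""
    return decompress(compress(original)) == original


# Lossless: zlib reproduces the input bytes exactly.
text = b"abababbabaabbabbaabba" * 10
print(is_lossless(zlib.compress, zlib.decompress, text))  # True


# Lossy (toy example): keep only the tens digit of each sample.
def quantize(samples):
    return [s // 10 for s in samples]


def dequantize(codes):
    return [c * 10 for c in codes]


samples = [3, 14, 15, 92, 65]
print(is_lossless(quantize, dequantize, samples))  # False
```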

Text Compression

  • Lossless compression is essential.
  • Popular text compressors such as zip and Unix's compress are based on Lempel-Ziv methods: compress uses LZW (Lempel-Ziv-Welch), while zip's usual Deflate method combines LZ77 with Huffman coding.

LZW Compression

  • Character sequences in the original text are replaced by codes that are dynamically determined.
  • The code table is not encoded into the compressed text, because it may be reconstructed from the compressed text during decompression.

LZW Compression

code  key
0     a
1     b

  • Original text = abababbabaabbabbaabba
  • Compression is done by scanning the original text from left to right.
  • Find the longest prefix p for which there is a code in the code table.
  • Represent p by its code pCode and assign the next available code number to pc, where c is the next character in the text that is to be compressed.
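
The scanning rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function name `lzw_compress`, the `alphabet` parameter, and the choice to return the codes as a list of integers are all assumptions. It also assumes every character of the text appears in `alphabet`.

```python
def lzw_compress(text, alphabet="ab"):
    """Return (codes, table) for text, starting from one code per character."""
    # Initial code table: a -> 0, b -> 1 for the default alphabet.
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    i = 0
    while i < len(text):
        # Find the longest prefix p of text[i:] that has a code.
        # Extending one character at a time is enough because every
        # key pc was entered only after p was already in the table.
        p = text[i]
        j = i + 1
        while j < len(text) and p + text[j] in table:
            p += text[j]
            j += 1
        codes.append(table[p])  # represent p by pCode
        if j < len(text):
            # Assign the next available code to pc, where c = text[j].
            table[p + text[j]] = len(table)
        i = j
    return codes, table


codes, table = lzw_compress("abababbabaabbabbaabba")
print(codes)  # [0, 1, 2, 2, 3, 3, 5, 8, 8]
```

The printed codes match the compressed text 012233588 used in these notes.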

LZW Compression

code  key
0     a
1     b
2     ab

  • Original text = abababbabaabbabbaabba
  • p = a
  • pCode = 0
  • c = b
  • Represent a by 0 and enter ab into the code table.
  • Compressed text = 0

LZW Compression

code  key
0     a
1     b
2     ab
3     ba
4     aba

  • Original text = abababbabaabbabbaabba
  • Compressed text = 01
  • p = ab
  • pCode = 2
  • c = a
  • Represent ab by 2 and enter aba into the code table.
  • Compressed text = 012

LZW Compression

code  key
0     a
1     b
2     ab
3     ba
4     aba
5     abb

  • Original text = abababbabaabbabbaabba
  • Compressed text = 012
  • p = ab
  • pCode = 2
  • c = b
  • Represent ab by 2 and enter abb into the code table.
  • Compressed text = 0122

LZW Compression

code  key
0     a
1     b
2     ab
3     ba
4     aba
5     abb
6     bab
7     baa

  • Original text = abababbabaabbabbaabba
  • Compressed text = 01223
  • p = ba
  • pCode = 3
  • c = a
  • Represent ba by 3 and enter baa into the code table.
  • Compressed text = 012233

LZW Compression

code  key
0     a
1     b
2     ab
3     ba
4     aba
5     abb
6     bab
7     baa
8     abba

  • Original text = abababbabaabbabbaabba
  • Compressed text = 012233
  • p = abb
  • pCode = 5
  • c = a
  • Represent abb by 5 and enter abba into the code table.
  • Compressed text = 0122335

LZW Compression

code  key
0     a
1     b
2     ab
3     ba
4     aba
5     abb
6     bab
7     baa
8     abba
9     abbaa

  • Original text = abababbabaabbabbaabba
  • Compressed text = 01223358
  • p = abba
  • pCode = 8
  • c = null
  • Represent abba by 8.
  • Compressed text = 012233588
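
To follow every step of the walkthrough, including steps not shown here, the sketch below yields one row per step: p, pCode, c, the key entered into the code table, and the compressed text so far. The generator name `lzw_trace` and the row layout are assumptions made for illustration. Codes are concatenated as digits, which only reads cleanly while every code is below 10, as in this example.

```python
def lzw_trace(text, alphabet="ab"):
    """Yield (p, pCode, c, new_key, compressed_so_far) for each step."""
    table = {ch: i for i, ch in enumerate(alphabet)}
    compressed = ""
    i = 0
    while i < len(text):
        # Longest prefix p of the remaining text that has a code.
        p = text[i]
        j = i + 1
        while j < len(text) and p + text[j] in table:
            p += text[j]
            j += 1
        c = text[j] if j < len(text) else None  # None plays the role of null
        compressed += str(table[p])
        new_key = None
        if c is not None:
            new_key = p + c
            table[new_key] = len(table)  # next available code
        yield p, table[p], c, new_key, compressed
        i = j


for row in lzw_trace("abababbabaabbabbaabba"):
    print(row)
# First row: ('a', 0, 'b', 'ab', '0')
# Last row:  ('abba', 8, None, None, '012233588')
```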

Code Table Representation

  • Dictionary.
    – Pairs are (key, element) = (key, code).
    – Operations are: get(key) and put(key, code).
  • Limit the number of codes to a fixed power of 2.
  • Use a hash table.
    – Convert variable length keys into fixed length keys.
    – Each key has the form pc, where the string p is a key that is already in the table.
    – Replace pc with (pCode)c.

code  key
0     a
1     b
2     ab
3     ba
4     aba
5     abb
6     bab
7     baa
8     abba
9     abbaa
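
The key conversion described above can be sketched with a Python dict, which is a hash table, keyed by the fixed-length pair (pCode, c) instead of the variable-length string pc. The function name `lzw_compress_fixed_keys` and the `max_codes` parameter are assumptions. The default of 4096 is an arbitrary power of 2 chosen for illustration, not a value taken from the notes.

```python
def lzw_compress_fixed_keys(text, alphabet="ab", max_codes=4096):
    """LZW compression with a code table keyed by (pCode, c) pairs."""
    if not text:
        return [], {}
    # Single characters are looked up directly; they get the first codes.
    base = {ch: i for i, ch in enumerate(alphabet)}
    table = {}  # (pCode, c) -> code, i.e. put((pCode, c), code)
    next_code = len(base)
    codes = []
    p_code = base[text[0]]
    for c in text[1:]:
        key = (p_code, c)
        if key in table:
            # pc already has a code, so keep extending the prefix.
            p_code = table[key]
        else:
            codes.append(p_code)
            if next_code < max_codes:  # stop adding once the limit is hit
                table[key] = next_code
                next_code += 1
            p_code = base[c]
    codes.append(p_code)
    return codes, table


codes, table = lzw_compress_fixed_keys("abababbabaabbabbaabba")
print(codes)  # [0, 1, 2, 2, 3, 3, 5, 8, 8]
print(table[(5, "a")])  # 8, the code for abba stored as (code of abb, 'a')
```

Every key is one integer plus one character, so key size stays constant no matter how long the underlying string becomes.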

LZW Decompression

code  key
0     a
1     b

  • Original text = abababbabaabbabbaabba
  • Compressed text = 012233588
  • Convert codes to text from left to right.
  • 0 represents a.
  • Decompressed text = a
  • pCode = 0 and p = a.
  • p = a followed by the next text character (c) is entered into the code table.

LZW Decompression

code  key
0     a
1     b
2     ab

  • Original text = abababbabaabbabbaabba
  • Compressed text = 012233588
  • 1 represents b.
  • Decompressed text = ab
  • pCode = 1 and p = b.
  • lastP = a followed by the first character of p is entered into the code table.
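
The decompression steps above can be sketched as follows. The function name `lzw_decompress` and the `alphabet` parameter are assumptions. The branch for a code that is not yet in the table is the standard LZW special case; it is not covered in the steps shown here, but this example needs it when the first 8 is read.

```python
def lzw_decompress(codes, alphabet="ab"):
    """Rebuild the text from LZW codes, reconstructing the code table."""
    table = {i: ch for i, ch in enumerate(alphabet)}  # code -> key
    if not codes:
        return ""
    last_p = table[codes[0]]
    pieces = [last_p]
    for code in codes[1:]:
        if code in table:
            p = table[code]
        elif code == len(table):
            # Code not in the table yet: it must stand for lastP
            # followed by the first character of lastP.
            p = last_p + last_p[0]
        else:
            raise ValueError(f"invalid code {code}")
        # Enter lastP followed by the first character of p.
        table[len(table)] = last_p + p[0]
        pieces.append(p)
        last_p = p
    return "".join(pieces)


print(lzw_decompress([0, 1, 2, 2, 3, 3, 5, 8, 8]))  # abababbabaabbabbaabba
```

Only the codes are passed in; the table is rebuilt on the fly, which is why it never has to be stored in the compressed text.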