Transaction Compression Techniques in Databases: A Comprehensive Guide, Lecture notes of Database Management Systems (DBMS)

This document gives a comprehensive overview of transaction compression techniques used in databases. It explores methods such as lossy and lossless compression, dictionary compression, run-length encoding, and delta encoding, and covers the benefits and limitations of each technique, highlighting their impact on storage space, query performance, and data consistency. It also discusses factors to consider when choosing a compression technique based on data characteristics and database requirements.

Typology: Lecture notes

2024/2025

Uploaded on 04/13/2025

akudzweishe-guri

TRANSACTION COMPRESSION

GROUP 15

MEMBERS

• PATRICIA MUPINGA R231039H

• TADIWANASHE A ZVOMUYA R233091H

• CHARMAINE HANGAZHA R213557H

• COLLETTA MAPFIRO R235217H

• COURAGE NYAMHONDORO R228061H

COMPRESSION TECHNIQUES THAT CAN BE APPLIED TO ACHIEVE TRANSACTION COMPRESSION:

Lossy compression

Lossless compression

Dictionary compression

Run-length encoding (RLE)

Delta encoding

LOSSY COMPRESSION

A method that removes some of the original data to make it more compact. It discards information regarded as less important or unnecessary; for an image, it might discard some colour detail.

Example: when a JPEG image is compressed, some of the original information is lost, but the human eye usually can't detect the difference.

MP3 is a common audio compression format. When audio is compressed to MP3, some of the original sound data is lost, but the loss is usually not noticeable to the human ear.
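The idea of discarding less important detail can be sketched with simple quantization. This is only an illustration, not how JPEG or MP3 actually work; the function names `quantize` and `dequantize`, the step size, and the sample values are all assumptions made for this example. It shows that once detail is discarded, the original values cannot be recovered exactly.

```python
def quantize(samples, step):
    """Lossy step: keep only the nearest multiple of `step` for each sample."""
    return [round(s / step) for s in samples]


def dequantize(codes, step):
    """Rebuild approximate samples from the stored codes."""
    return [c * step for c in codes]


# Hypothetical sample values chosen for illustration.
original = [101, 103, 98, 250, 252]
codes = quantize(original, 10)      # smaller numbers, fewer distinct values
restored = dequantize(codes, 10)    # close to the original, but not equal

print(codes)     # [10, 10, 10, 25, 25]
print(restored)  # [100, 100, 100, 250, 250]
```

The restored values are close to the originals but not identical, which is why lossy methods suit images and audio better than data that must be reproduced exactly.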

DICTIONARY COMPRESSION

• It involves creating a dictionary of frequently occurring values and replacing those values with shorter codes or references within the database.

• This technique is particularly effective when there are many repeated values within the transactions.

• By replacing the repeated values with shorter codes or references, the overall storage space required for the transaction is reduced. For example, the word "Hello" might be replaced with the code "h1".

• Additionally, dictionary compression can speed up database operations by reducing the amount of data that needs to be read from or written to disk.
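A minimal sketch of dictionary compression, under stated assumptions: the helper names `dict_encode` and `dict_decode` are hypothetical, and integer codes are used in place of a code like "h1". Each distinct value is stored once in the dictionary, and the data itself becomes a list of short codes.

```python
def dict_encode(values):
    """Replace each value with a small integer code from a dictionary."""
    dictionary = {}  # value -> code, assigned in order of first appearance
    codes = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        codes.append(dictionary[value])
    return dictionary, codes


def dict_decode(dictionary, codes):
    """Rebuild the original values from the codes (lossless)."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code in codes]


# Hypothetical column with many repeated values.
values = ["Hello", "World", "Hello", "Hello", "World"]
dictionary, codes = dict_encode(values)

print(dictionary)  # {'Hello': 0, 'World': 1}
print(codes)       # [0, 1, 0, 0, 1]
```

Decoding with `dict_decode(dictionary, codes)` returns the original list, so no information is lost. The saving grows with the number of repeats, because each long value is stored only once.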

RUN-LENGTH ENCODING

• Run-length encoding (RLE) is a simple compression technique that replaces consecutive repeated values with a count and a single instance of the value.

• RLE can be applied to compress sequences of repeated values within the transaction.

• For example, if a transaction contains multiple consecutive updates to the same attribute, RLE can compress these updates by storing the attribute value and the number of consecutive updates instead of storing each update individually.

• If a data set contains the sequence “aaaaaaaaa”, RLE would replace this with “9a”, where 9 is the number of times “a” is repeated.
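The “9a” example can be sketched as follows. The function names `rle_encode` and `rle_decode` are assumptions for this illustration, and runs are kept as (count, value) pairs rather than a string like “9a”, since a plain string becomes ambiguous when the data itself contains digits.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (count, char) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(runs):
    """Expand (count, char) pairs back into the original string."""
    return "".join(char * count for count, char in runs)


print(rle_encode("aaaaaaaaa"))  # [(9, 'a')]
print(rle_encode("aaabcc"))     # [(3, 'a'), (1, 'b'), (2, 'c')]
```

RLE only helps when runs are long. For data with few consecutive repeats, such as "abc", every character becomes its own pair and the encoded form is larger than the input.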

FACTORS TO CONSIDER WHEN CHOOSING A COMPRESSION TECHNIQUE:

The choice of compression technique depends on the characteristics of the data, the specific requirements of the database system, and the trade-offs between storage space, query performance and computational overhead.

Different compression techniques can be combined or used selectively based on the nature of the data and the database workload to achieve transaction compression.
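As one possible illustration of combining techniques, the sketch below first dictionary-encodes a column and then run-length encodes the resulting codes. The function names `dictionary_then_rle` and `restore`, and the status values, are hypothetical; a real database system would choose encodings based on its own statistics.

```python
from itertools import groupby


def dictionary_then_rle(values):
    """Dictionary-encode the values, then run-length encode the codes."""
    dictionary = {}  # value -> code
    codes = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        codes.append(dictionary[value])
    # Each run is stored as (code, run_length).
    runs = [(code, len(list(group))) for code, group in groupby(codes)]
    return dictionary, runs


def restore(dictionary, runs):
    """Undo both steps and return the original list of values."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code, length in runs for _ in range(length)]


# Hypothetical status column with long runs of repeated values.
statuses = ["PENDING"] * 4 + ["SHIPPED"] * 3 + ["PENDING"] * 2
dictionary, runs = dictionary_then_rle(statuses)

print(dictionary)  # {'PENDING': 0, 'SHIPPED': 1}
print(runs)        # [(0, 4), (1, 3), (0, 2)]
```

Nine strings are reduced to a two-entry dictionary and three pairs. The combination pays off here because the data has both few distinct values and long runs; on data without those characteristics, the extra encoding steps add computational overhead with little saving.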

LIMITATIONS OF TRANSACTION COMPRESSION

Transaction compression can make it more difficult to keep data consistent across multiple nodes in a distributed database system. This is because compression is done on a per-node basis, and different nodes may compress the data in different ways. As a result, the same data may be stored in different forms on different nodes, which can lead to inconsistency.
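The per-node effect described above can be sketched with two nodes that build their dictionaries independently. Everything here is an assumption for illustration: the node names, the values, and the helpers `build_dictionary`, `encode`, and `decode`. The point is that the stored codes differ between nodes even though both represent the same data.

```python
def build_dictionary(values_seen):
    """Assign codes in the order in which a node first sees each value."""
    dictionary = {}
    for value in values_seen:
        dictionary.setdefault(value, len(dictionary))
    return dictionary


def encode(dictionary, values):
    return [dictionary[value] for value in values]


def decode(dictionary, codes):
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code in codes]


# Two hypothetical nodes saw the same values in a different order.
node_a = build_dictionary(["red", "green", "blue"])
node_b = build_dictionary(["blue", "red", "green"])

row = ["red", "blue", "red"]
stored_a = encode(node_a, row)
stored_b = encode(node_b, row)

print(stored_a)  # [0, 2, 0]
print(stored_b)  # [1, 0, 1]
print(decode(node_a, stored_a) == decode(node_b, stored_b))  # True
```

Comparing the stored forms directly would wrongly suggest that the nodes disagree. In this sketch the nodes can only be compared after decoding, or if they share a single dictionary.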