









A comprehensive overview of transaction compression techniques used in databases. It explores methods such as lossy and lossless compression, dictionary compression, run-length encoding, and delta encoding. It covers the benefits and limitations of each technique, highlighting their impact on storage space, query performance, and data consistency. It also discusses factors to consider when choosing a compression technique based on data characteristics and database requirements.
Typology: Lecture notes










GROUP 15
Lossy compression
Lossless compression
Dictionary compression
Run-length encoding (RLE)
Delta encoding
Lossy compression is a method that removes some of the original data to make
the data more compact. It discards information regarded as less important or
unnecessary; in an image, for example, it might discard some colour detail.
Example: when a JPEG image is compressed, some of the original information is
lost, but the human eye usually can't detect the difference.
MP3 is a common audio compression format. When audio is compressed to MP3, some
of the original sound data is lost, but the loss is usually not noticeable to
the human ear.
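The idea of discarding less important information can be sketched with a toy example. This is not how JPEG or MP3 work internally; it is only a minimal illustration, assuming 8-bit colour channels and hypothetical helper names `quantize` and `dequantize`, of how dropping low-order bits saves space at the cost of exact recovery.

```python
def quantize(channel, bits=4):
    """Keep only the top `bits` bits of an 8-bit value (0-255)."""
    return channel >> (8 - bits)


def dequantize(code, bits=4):
    """Put the code back in the high bits; the dropped low bits are gone for good."""
    return code << (8 - bits)


pixel = (200, 123, 37)                            # original 8-bit RGB values
stored = tuple(quantize(c) for c in pixel)        # 4 bits per channel instead of 8
restored = tuple(dequantize(c) for c in stored)

print(stored)    # (12, 7, 2)
print(restored)  # (192, 112, 32)
```

The restored colour is close to the original but not identical, which is what makes the compression lossy.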
Dictionary compression involves creating a dictionary of frequently occurring
values and replacing those values with shorter codes or references within the database.
This technique is particularly effective when there are many repeated values within
the transactions.
By replacing the repeated values with shorter codes or references, the overall storage
space required for the transaction is reduced. For example, the word “Hello” might be
replaced with the code “h1”.
Additionally, dictionary compression can speed up database operations by reducing
the amount of data that needs to be read from or written to disk.
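A minimal sketch of the idea in Python follows. It is a simplification: it assigns a small integer code to every distinct value rather than only to frequent ones, and it uses integers instead of string codes such as “h1”. The function names are illustrative, not taken from any database product.

```python
def dict_encode(values):
    """Return (table, codes): table[i] is the value that code i stands for."""
    code_of = {}
    for value in values:
        if value not in code_of:
            code_of[value] = len(code_of)   # next unused code
    table = list(code_of)                   # dicts keep insertion order
    codes = [code_of[value] for value in values]
    return table, codes


def dict_decode(table, codes):
    """Rebuild the original values by looking each code up in the table."""
    return [table[code] for code in codes]


statuses = ["Hello", "World", "Hello", "Hello", "World"]
table, encoded = dict_encode(statuses)

print(table)    # ['Hello', 'World']
print(encoded)  # [0, 1, 0, 0, 1]
assert dict_decode(table, encoded) == statuses  # nothing is lost
```

Each repeated string is stored once in the table, and the rows hold only small codes, which is where the space saving comes from.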
Run-length encoding (RLE) is a simple compression technique that replaces
consecutive repeated values with a count and a single instance of the value.
RLE can be applied to compress sequences of repeated values within the
transaction.
For example, if a transaction contains multiple consecutive updates to the same
attribute, RLE can be used to compress these updates by storing the attribute
value and the number of consecutive updates instead of storing each update
individually.
If a data set contains the sequence “aaaaaaaaa”, RLE would replace this with “9a”
where 9 is the number of times “a” is repeated.
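The “9a” example can be reproduced with a short Python sketch. The helper names `rle_encode` and `rle_decode` are illustrative; runs are stored here as (count, value) pairs, which is one reasonable representation among several.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal consecutive values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out


data = "a" * 9
pairs = rle_encode(data)

print(pairs)                                  # [(9, 'a')]
print("".join(f"{n}{v}" for n, v in pairs))   # 9a
assert "".join(rle_decode(pairs)) == data     # decoding restores the input
```

RLE only helps when runs are long. A sequence with no consecutive repeats, such as “abc”, yields one pair per value and therefore grows rather than shrinks.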
The choice of compression technique depends on the characteristics of the
data, the specific requirements of the database system, and the trade-offs
between storage space, query performance and computational overhead.
Different compression techniques can be combined or used selectively
based on the nature of the data and the database workload to achieve
transaction compression.
Transaction compression can make it more difficult to keep data consistent
across multiple nodes in a distributed database system. When compression is
done on a per-node basis, different nodes may compress the same data in
different ways. As a result, the same data may be stored in different forms
on different nodes, which can lead to inconsistency.
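The effect can be illustrated with a small sketch, assuming two hypothetical nodes that each built their own dictionary for the same rows. The node tables, the helper names, and the digest-based comparison are assumptions for illustration only, not a description of how any particular distributed database handles this.

```python
import hashlib


def encode_with_table(values, table):
    """Dictionary-encode values using a node's own code table."""
    index = {value: code for code, value in enumerate(table)}
    return [index[value] for value in values]


def decode_with_table(codes, table):
    """Turn a node's codes back into the logical values."""
    return [table[code] for code in codes]


def logical_digest(values):
    """Hash the decompressed values, joined with a separator byte."""
    return hashlib.sha256("\x1f".join(values).encode("utf-8")).hexdigest()


rows = ["paid", "open", "paid", "paid"]
node_a_table = ["paid", "open"]   # node A saw "paid" first
node_b_table = ["open", "paid"]   # node B assigned codes in another order

stored_a = encode_with_table(rows, node_a_table)   # [0, 1, 0, 0]
stored_b = encode_with_table(rows, node_b_table)   # [1, 0, 1, 1]

print(stored_a == stored_b)  # False: the stored forms differ
same = (logical_digest(decode_with_table(stored_a, node_a_table))
        == logical_digest(decode_with_table(stored_b, node_b_table)))
print(same)                  # True: the logical data is identical
```

Comparing the stored codes directly would wrongly report a mismatch, while comparing the decompressed values shows that both nodes hold the same data.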