
data-compression

The document discusses data compression, highlighting its purpose to save storage space, reduce transfer times, and lower costs. It differentiates between lossy and lossless compression, explaining techniques like JPEG for images and MP3 for audio, as well as lossless methods such as Run-Length Encoding (RLE). RLE is detailed as a technique that compresses repeated data elements, demonstrating its efficiency with examples of text and image data.

Uploaded by

levi makokha
Copyright
© All Rights Reserved

Subject: Computer Science

Data Compression

Purpose of Compression
• Save storage space on devices.
• Reduce time for streaming or transferring files over networks.
• Minimize upload/download times, conserving bandwidth.
• Lower storage costs, especially with cloud services.
• ISPs may charge based on data usage, so compressed files save on data
costs.
Lossy and Lossless File Compression

Types of Compression:
• Lossless Compression: No data is lost; the original file can be perfectly
reconstructed. Used when every detail needs to be preserved.
• Lossy Compression: Some data is discarded to reduce file size. The
original file cannot be fully reconstructed, but the smaller size saves
storage and bandwidth.
Lossy Compression
• Removes less important data:
   • Image files: Lowers resolution or color depth.
   • Sound files: Reduces sampling rate or bit depth (sampling resolution).
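
The loss of detail described above can be illustrated with a minimal sketch of bit-depth reduction. The function names and the sample values are hypothetical, chosen only for illustration; real image and audio encoders are far more elaborate.

```python
def reduce_bit_depth(samples, bits_from=8, bits_to=4):
    """Keep only the top bits_to bits of each sample (the low bits are discarded)."""
    shift = bits_from - bits_to
    return [s >> shift for s in samples]


def restore_bit_depth(samples, bits_from=8, bits_to=4):
    """Scale reduced samples back to the original range; discarded bits stay lost."""
    shift = bits_from - bits_to
    return [s << shift for s in samples]


original = [200, 201, 202, 17, 18]      # hypothetical 8-bit sample values
reduced = reduce_bit_depth(original)    # each value now fits in 4 bits
restored = restore_bit_depth(reduced)

print(reduced)   # [12, 12, 12, 1, 1]
print(restored)  # [192, 192, 192, 16, 16]
```

The restored values are close to the originals but not identical, which is why this kind of reduction counts as lossy.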

Common Lossy Compression Formats


MP3 (MPEG-1 Audio Layer III) and MP4 (MPEG-4 Part 14)
• MP3: Removes sounds outside the human hearing range and sounds masked
by louder ones playing at the same time (the quieter sound is discarded).
Reduces size by about 90% compared with the uncompressed original.
• MP4: Stores audio, video, and other multimedia. Maintains acceptable
quality for streaming at a reduced size.
JPEG (Images)
• When a camera takes a photograph, it produces a raw bitmap file, which
can be very large. These files are temporary in nature. JPEG is a lossy
file compression algorithm used for bitmap images. As with MP3, once the
JPEG algorithm has been applied, a new file is formed and the original
file can no longer be reconstructed.
Key concepts include:
• Reduces file size by storing color detail less precisely than brightness
(the human eye is less sensitive to color than to brightness).
• Splits images into 8x8 blocks to allow for discarding certain data without
noticeable quality loss.
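
One common way of storing color less precisely than brightness is to keep a single color value for each small group of pixels (chroma subsampling). The sketch below averages each 2x2 group of a color channel. The function name and sample values are hypothetical, and this is not the JPEG algorithm itself, which also converts color spaces and transforms each 8x8 block.

```python
def subsample_2x2(channel):
    """Average each 2x2 group of a 2D list (even width and height assumed)."""
    out = []
    for r in range(0, len(channel), 2):
        row = []
        for c in range(0, len(channel[0]), 2):
            total = (channel[r][c] + channel[r][c + 1]
                     + channel[r + 1][c] + channel[r + 1][c + 1])
            row.append(total // 4)  # integer average of the four values
        out.append(row)
    return out


# Hypothetical color-channel values for a 2x4 patch of pixels.
chroma = [
    [100, 102, 50, 52],
    [104, 106, 54, 56],
]
small = subsample_2x2(chroma)
print(small)  # [[103, 53]]
```

Eight color values are stored as two, and the exact originals cannot be recovered from the averages.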

Lossless File Compression


Lossless compression is a method that compresses data without losing any
original information. This means the file can be perfectly reconstructed, making
it ideal for critical data where any loss would be problematic (e.g., spreadsheets
or software applications).

Run-Length Encoding (RLE)


• RLE is a type of lossless compression technique.
• It reduces the size of files by compressing repeated data elements.
• RLE is effective with data that has long sequences of identical values, like
images or text with repeated characters.

How RLE Works


• It converts a sequence of identical elements into two values:
   i. The first value represents the number of identical items (run length).
   ii. The second value represents the item itself (e.g., ASCII code for text
   or color code for images).
• For instance, a string with repeated letters like "aaaaabbbcc" could be
represented as (5, 'a'), (3, 'b'), (2, 'c') in RLE, so 10 characters are
stored as 3 pairs (6 values).
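
The two-value scheme above can be sketched in a few lines. The function names rle_encode and rle_decode are illustrative choices, not part of any standard library.

```python
from itertools import groupby


def rle_encode(text):
    """Turn a string into a list of (run_length, character) pairs."""
    return [(len(list(group)), char) for char, group in groupby(text)]


def rle_decode(pairs):
    """Rebuild the original string from (run_length, character) pairs."""
    return "".join(char * count for count, char in pairs)


pairs = rle_encode("aaaaabbbcc")
print(pairs)              # [(5, 'a'), (3, 'b'), (2, 'c')]
print(rle_decode(pairs))  # aaaaabbbcc
```

Decoding returns exactly the original string, which is what makes RLE lossless.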
Example of RLE on Text Data:
1. Given a text like "aaaaabbbbccddddd", with ASCII codes used for each
character, RLE would code it as follows:

• 05 97: a (ASCII 97), 5 times
• 04 98: b (ASCII 98), 4 times
• 02 99: c (ASCII 99), 2 times
• 05 100: d (ASCII 100), 5 times
Each pair takes 2 bytes, so this requires only 8 bytes instead of the
original 16 bytes.
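
The byte counts in this example can be checked with a small sketch that writes each run as a count byte followed by the ASCII code. The function name rle_encode_bytes is hypothetical, and capping runs at 255 is an assumption made so that each count fits in one byte.

```python
def rle_encode_bytes(data: bytes) -> bytes:
    """Encode bytes as repeated (count, value) pairs, one byte each."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte matches (max 255 per count byte).
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


text = "aaaaabbbbccddddd".encode("ascii")
encoded = rle_encode_bytes(text)
print(list(encoded))             # [5, 97, 4, 98, 2, 99, 5, 100]
print(len(text), len(encoded))   # 16 8
```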

Handling Issues in RLE:


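- For strings like "cdcdcdcd", RLE is not efficient because there are no
runs of identical characters: coding every character as a count and a value
would double the size from 8 bytes to 16 bytes.
- Solution: Use a flag byte (e.g., 255) to mark a run. The flag is followed
by the run length and the value. Any byte that is not preceded by the flag
is taken at face value, as a run of 1.

Example with Flagging


• For the string "aaaaaaaabbbbbbbbccdeeeeeee", a flag 255 can be
introduced to mark the runs:
• Without a flag, the string is converted to count and value pairs such as
08 97 (for 'a' 8 times) and 01 100 (for 'd' once).
• With the flag, long runs are coded as 255 08 97 (flag, count, value), while
the short sequence "ccd" is stored as plain ASCII codes: 99 99 100.
• This reduces the original 26 bytes to 12 bytes, a reduction of roughly 54%.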
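
A flagged scheme can be sketched as follows, assuming the convention that the flag byte 255 is followed by a count and a value, and that any byte without a flag in front of it stands for itself. The function names, and the rule of flagging only runs longer than 3 bytes, are assumptions made for this illustration. The flag value must not occur in the data, which holds for plain ASCII text.

```python
FLAG = 255  # assumed marker byte; must not appear in the data itself


def flagged_rle_encode(data: bytes) -> bytes:
    """Write long runs as FLAG, count, value; copy short runs unchanged."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == FLAG:
            raise ValueError("data contains the flag value")
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        if run > 3:
            out += bytes([FLAG, run, data[i]])  # 3 bytes replace the whole run
        else:
            out += data[i:i + run]              # cheaper to store as literals
        i += run
    return bytes(out)


def flagged_rle_decode(encoded: bytes) -> bytes:
    """Expand FLAG, count, value triples; copy every other byte as-is."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            count, value = encoded[i + 1], encoded[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


# 8 a's, 8 b's, "cc", "d", 7 e's, built by repetition to make the counts explicit.
text = b"a" * 8 + b"b" * 8 + b"cc" + b"d" + b"e" * 7
packed = flagged_rle_encode(text)
print(list(packed))            # [255, 8, 97, 255, 8, 98, 99, 99, 100, 255, 7, 101]
print(len(text), len(packed))  # 26 12
print(len(flagged_rle_encode(b"cdcdcdcd")))  # 8
```

With this convention an alternating string such as "cdcdcdcd" is stored unchanged instead of doubling in size.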
Using RLE with Images

Example 1: Black and White Image

• Consider an 8x8 grid where each cell requires 1 byte of storage: white
cells have the value 1 and black cells have the value 0.
• For an image representing a letter "F", the grid data can be compressed
by RLE. For instance, a sequence like 11110000 becomes 4W 4B.
• Result: Original image (64 bytes) reduces to 30 bytes with RLE.
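
The row encoding can be sketched as below. The function names are hypothetical, and the size estimate assumes each run is stored as one count byte plus one value byte; under that assumption, the 30 bytes quoted above would correspond to 15 runs.

```python
from itertools import groupby

LABELS = {"1": "W", "0": "B"}  # 1 = white, 0 = black, as in the notes


def rle_row(bits):
    """Return a list of (run_length, label) pairs for a string of 0s and 1s."""
    return [(len(list(group)), LABELS[bit]) for bit, group in groupby(bits)]


def format_runs(runs):
    """Render runs in the compact '4W 4B' notation."""
    return " ".join(f"{count}{label}" for count, label in runs)


runs = rle_row("11110000")
print(format_runs(runs))                # 4W 4B
print(len("11110000"), 2 * len(runs))   # 8 4
```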
Example 2: Colored Images
• Consider an 8x8 grid where each cell holds a color as three 1-byte values
(Red, Green, Blue), so the uncompressed image takes 64 x 3 = 192 bytes.
• Four colors are represented with different RGB values, e.g., 0 0 0 for one
color, 255 0 0 for another.
• The grid data, when encoded with RLE, stores each run of identical colors
as a count followed by the RGB values.
• Result: Original (192 bytes) reduces to 92 bytes using RLE, a reduction of
around 52%.
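
The same idea applies to colored pixels. In the sketch below each run is assumed to take 4 bytes (one count byte plus three color bytes); on that assumption, 92 bytes would correspond to 23 runs. The row of pixels is hypothetical, not the grid from the original figure.

```python
from itertools import groupby


def rle_pixels(pixels):
    """Collapse consecutive identical (R, G, B) tuples into (count, color) pairs."""
    return [(len(list(group)), color) for color, group in groupby(pixels)]


def sizes(pixels):
    """Return (raw_bytes, rle_bytes): 3 bytes per pixel vs 4 bytes per run."""
    raw = len(pixels) * 3
    packed = len(rle_pixels(pixels)) * 4
    return raw, packed


BLACK, RED = (0, 0, 0), (255, 0, 0)
row = [BLACK] * 3 + [RED] * 5   # hypothetical row of 8 pixels

print(rle_pixels(row))  # [(3, (0, 0, 0)), (5, (255, 0, 0))]
print(sizes(row))       # (24, 8)
```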

Note:
1. RLE is effective for repeated data: it works best with files that have long
runs of repeated elements.
2. RLE may not reduce file size, and can even increase it, if there are
frequent changes in data (e.g., alternating patterns).
3. Commonly used in image and text compression where repeated patterns are
prevalent.
