Image Compression
Image compression is the process of reducing the size of an image file without significantly degrading its visual quality. It is an essential technique used in various applications, such as
digital photography, image storage, transmission over networks, and multimedia systems. The
primary goal of image compression is to minimize the file size while preserving the essential
information and perceptual fidelity of the image.
Compressing images is an important step before processing larger images or videos. Compression is carried out by an encoder, which outputs a compressed form of the image. Mathematical transforms play a vital role in the compression process.
Need
Image compression is essential in our digital lives because large image files cause slow website loading times, make images harder to share online, and consume limited storage space. Compressing images reduces their size, making them easier to store and transmit.
1. Reduced Storage Requirements: Image compression reduces file sizes, allowing for
efficient storage of images on devices with limited storage capacity.
2. Bandwidth Efficiency: Compressed images require less bandwidth, resulting in faster
upload and download times during image transmission over networks.
3. Faster Processing: Smaller file sizes from compression lead to faster loading times and
improved performance in image processing tasks.
4. Cost Reduction: Compression reduces storage, network infrastructure, and data
transfer costs associated with images.
5. Improved User Experience: Smaller compressed images result in quicker website
loading, enhanced multimedia streaming, and better user satisfaction.
6. Compatibility: Compression adapts images to meet the size and format limitations of
different devices and platforms.
7. Archiving and Preservation: Image compression reduces storage requirements,
making it more feasible to archive and preserve large collections of images over time.
Classification
There are two main types of image compression: lossless compression and lossy compression.
Lossless Compression
● Lossless compression algorithms reduce the file size of an image without any loss of
information. The compressed image can be perfectly reconstructed to its original form.
● This method is commonly used in scenarios where preserving every detail is crucial, such as medical imaging or scientific data analysis.
● Lossless compression achieves compression by exploiting redundancy and eliminating
repetitive patterns in the image data.
● Some common lossless compression algorithms include:
○ Run-Length Encoding (RLE): This algorithm replaces consecutive repetitions of
the same pixel value with a count and the pixel value itself.
○ Huffman coding: It assigns variable-length codes to different pixel values based
on their frequency of occurrence in the image.
○ Lempel-Ziv-Welch (LZW): This algorithm replaces repetitive sequences of pixels with shorter codes, building a dictionary of commonly occurring patterns (a minimal encoder sketch appears after this list).
● Lossless compression techniques typically achieve modest compression ratios
compared to lossy compression but ensure exact data preservation.
● Common lossless image formats include:
○ RAW - these file types tend to be quite large. Additionally, there are different versions of RAW, and you may need specific software to edit the files.
○ PNG - compresses images by finding repeated patterns in the image data and encoding them compactly. The compression is reversible, so when a PNG file is opened, the image is recovered exactly.
○ BMP - a format developed by Microsoft. It is lossless but not frequently used.
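As a rough illustration of the dictionary idea behind LZW, here is a minimal encoder sketch in Python. It assumes 8-bit pixel values, omits the bit-packing and decoder that a real codec would need, and the function name is illustrative.

```python
def lzw_encode(pixels):
    """Encode a sequence of 8-bit pixel values with a basic LZW scheme."""
    # Start the dictionary with every possible single pixel value (0-255).
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for p in pixels:
        candidate = current + bytes([p])
        if candidate in dictionary:
            # Keep extending the current pattern while it is already known.
            current = candidate
        else:
            # Emit the code for the known pattern and register the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = next_code
            next_code += 1
            current = bytes([p])
    if current:
        codes.append(dictionary[current])
    return codes

# Example: a row with long runs compresses to far fewer codes than pixels.
row = [255] * 12 + [0] * 12
print(lzw_encode(row))
```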
Lossy Compression
● Lossy compression algorithms achieve higher compression ratios by discarding some
information from the image that is less perceptually significant.
● This method is widely used in applications such as digital photography, web images, and
multimedia streaming, where a small loss in quality is acceptable to achieve significant
file size reduction.
● Lossy compression techniques exploit the limitations of human visual perception and the characteristics of natural images to remove or reduce redundant or less noticeable details.
● The algorithms achieve this by performing transformations on the image data and
quantizing it to reduce the number of distinct values.
● The main steps involved in lossy compression are:
○ Transform Coding: The image is transformed from the spatial domain to a
frequency domain using techniques like Discrete Cosine Transform (DCT) or
Wavelet Transform. These transforms represent the image data in a more
compact manner by concentrating the energy in fewer coefficients.
○ Quantization: In this step, the transformed coefficients are quantized, which
involves reducing the precision or dividing the range of values into a finite set of
discrete levels. Higher levels of quantization lead to greater compression but also
more loss of information. The quantization process is typically designed to
allocate more bits to visually important coefficients and fewer bits to less
important ones.
○ Entropy Encoding: The quantized coefficients are further compressed using
entropy coding techniques like Huffman coding or Arithmetic coding. These
coding schemes assign shorter codes to more frequently occurring coefficients,
resulting in additional compression.
● The amount of compression achieved in lossy compression is customizable based on
the desired trade-off between file size reduction and visual quality. Different compression
algorithms and settings can be used to balance the compression ratio and the
perceptual impact on the image.
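The transform and quantization steps can be sketched in a few lines of Python. The sketch below assumes NumPy and SciPy are available, uses an illustrative quantization table rather than a standard JPEG one, and omits the final entropy-encoding step.

```python
import numpy as np
from scipy.fft import dct, idct

def dct2(block):
    # 2-D type-II DCT, applied separably along rows and columns.
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(block):
    # 2-D inverse DCT, the reverse of dct2.
    return idct(idct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

# Illustrative quantization table: coarser steps for higher frequencies,
# mimicking how visually less important coefficients get fewer bits.
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
Q = 16.0 + 4.0 * (u + v)

def compress_block(block):
    """Transform and quantize one 8x8 block, returning integer coefficients."""
    coeffs = dct2(block.astype(float) - 128.0)   # level-shift, then transform
    return np.round(coeffs / Q).astype(int)      # quantization (the lossy step)

def decompress_block(qcoeffs):
    """Dequantize and inverse-transform to approximate the original block."""
    return np.clip(idct2(qcoeffs * Q) + 128.0, 0, 255)

rng = np.random.default_rng(0)
block = rng.integers(100, 140, size=(8, 8))      # an 8x8 tile of pixel values
reconstructed = decompress_block(compress_block(block))
print(np.abs(block - reconstructed).mean())      # distortion from quantization
```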
Methods of compression
Run-length Coding
Run-length coding is a simple and effective technique used in image compression, especially for
scenarios where the image contains long sequences of identical or highly similar pixels. It
exploits the redundancy present in such sequences to achieve compression.
The basic idea behind run-length coding is to represent consecutive repetitions of the same
pixel value with a count and the pixel value itself, instead of explicitly storing each pixel
individually. By doing so, run-length coding reduces the amount of data required to represent
these repetitive patterns.
Say you have a picture of red and white stripes, and there are 12 white pixels and 12 red pixels.
Normally, the data for it would be written as WWWWWWWWWWWWRRRRRRRRRRRR, with
W representing the white pixel and R the red pixel. Run-length coding would store this as 12W 12R: much smaller and simpler, while still keeping the data unaltered.
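A minimal run-length encoder for the stripe example might look like the following Python sketch (the function name and the (count, value) pair representation are illustrative):

```python
def rle_encode(pixels):
    """Collapse runs of identical values into (count, value) pairs."""
    runs = []
    count = 1
    for prev, curr in zip(pixels, pixels[1:]):
        if curr == prev:
            count += 1          # extend the current run
        else:
            runs.append((count, prev))
            count = 1           # start a new run
    if pixels:
        runs.append((count, pixels[-1]))
    return runs

# The striped example from above: 12 white pixels then 12 red pixels.
print(rle_encode(list("WWWWWWWWWWWWRRRRRRRRRRRR")))
# [(12, 'W'), (12, 'R')]
```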
Run-length Decoding:
● Retrieving the Encoded Data: The encoded run-length data is retrieved from storage.
● Decoding: The decoding process involves reconstructing the original image from the
run-length data. Starting from the first <count, value> pair, the algorithm repeats the
value count times to obtain the sequence of pixels. This process is repeated for each
<count, value> pair until the entire image is reconstructed.
The decoding step essentially reverses the encoding process, reconstructing the original
image by expanding the compressed run-length representation.
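Decoding is the mirror image of encoding; a minimal sketch matching the encoder above:

```python
def rle_decode(runs):
    """Rebuild the original pixel sequence from (count, value) pairs."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)   # repeat each value `count` times
    return pixels

decoded = rle_decode([(12, 'W'), (12, 'R')])
print("".join(decoded))   # WWWWWWWWWWWWRRRRRRRRRRRR
```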
A drawback is that the original data isn't instantly accessible: everything must be decoded before any part of it can be accessed. It is also impossible to tell in advance how large the decoded data will be.
Run-length coding is particularly effective for images with areas of solid color or regions with
uniform patterns, such as line drawings, text, or simple graphics. However, it may not be as
efficient for more complex and detailed images, as they tend to have fewer long runs of identical
pixels.
Run-length coding is often used in conjunction with other compression techniques, such as
Huffman coding or arithmetic coding, to achieve higher compression ratios. By combining
run-length coding with these entropy encoding techniques, the frequency of occurrence of
different runs or pixel values can be exploited to assign shorter codes to more frequent patterns,
resulting in additional compression.
It's important to note that if two or more pixel values occur with the same probability, a tie-breaking rule is needed to ensure the prefix-free property of the resulting code; for example, the tied values can be ordered by their original position in the image or by some other fixed convention.
Shannon-Fano coding is a basic technique that assigns codes based on probabilities but does not guarantee optimal code lengths. Huffman coding, developed as an improvement on Shannon-Fano coding, provides a more efficient encoding scheme by considering the probabilities and constructing a binary tree in which shorter codes are assigned to more frequent symbols.
Huffman coding
Huffman coding is a widely used entropy encoding technique for image compression. It assigns variable-length codes to different symbols (in this case, pixel values) based on their probabilities or frequencies of occurrence. By assigning shorter codes to more frequently occurring symbols, it reduces the overall number of bits required to represent the image. This technique is used in image compression standards such as JPEG (Joint Photographic Experts Group) and is known for its simplicity and effectiveness in achieving good compression ratios while preserving visual quality.
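A minimal Huffman code construction in Python, using a heap to repeatedly merge the two least frequent symbols, might look like this (the example pixel frequencies and helper name are illustrative):

```python
import heapq
from collections import Counter

def huffman_codes(pixels):
    """Build a prefix-free code: frequent pixel values get shorter codes."""
    freq = Counter(pixels)
    # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far}).
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                      # degenerate single-symbol image
        _, _, codes = heap[0]
        return {sym: "0" for sym in codes}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

pixels = [0] * 50 + [128] * 30 + [255] * 15 + [64] * 5
codes = huffman_codes(pixels)
print(codes)   # the most frequent value (0) receives the shortest code
encoded = "".join(codes[p] for p in pixels)
print(len(encoded), "bits instead of", len(pixels) * 8)
```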
Scalar Quantization
Scalar quantization maps each pixel value independently onto one of a finite set of levels. It provides a simple and efficient means of reducing the number of bits required to represent an image, but it may introduce quantization errors and loss of fine detail, since each pixel is quantized independently.
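A minimal sketch of a uniform scalar quantizer, assuming 8-bit grayscale input and an illustrative choice of 16 levels:

```python
import numpy as np

def scalar_quantize(image, levels=16):
    """Map each 8-bit pixel independently onto `levels` uniform levels."""
    step = 256 / levels
    indices = np.floor(image / step).astype(np.uint8)               # quantizer indices
    reconstruction = (indices * step + step / 2).astype(np.uint8)   # bin mid-points
    return indices, reconstruction

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
indices, approx = scalar_quantize(img, levels=16)
print(img)
print(approx)   # each pixel snapped to the centre of its quantization bin
```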
Vector Quantization:
Vector quantization (VQ) is a technique used in image compression that extends the concept of
scalar quantization by grouping multiple pixels together into blocks or vectors. It aims to capture
the statistical dependencies and similarities among neighboring pixels, resulting in improved
compression performance and preservation of local image features.
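A minimal vector quantization sketch, assuming NumPy and scikit-learn are available: 2x2 pixel blocks are clustered with k-means, each block is stored as a small codebook index, and decompression looks the index up in the shared codebook. The block size and codebook size here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def vq_compress(image, block=2, codebook_size=16):
    """Group pixels into small blocks and represent each by a codebook index."""
    h, w = image.shape
    # Split the image into non-overlapping block x block vectors.
    vectors = (image.reshape(h // block, block, w // block, block)
                    .transpose(0, 2, 1, 3)
                    .reshape(-1, block * block))
    kmeans = KMeans(n_clusters=codebook_size, n_init=10, random_state=0).fit(vectors)
    indices = kmeans.predict(vectors)      # one small index per block
    codebook = kmeans.cluster_centers_     # shared table of block patterns
    return indices, codebook

def vq_decompress(indices, codebook, shape, block=2):
    h, w = shape
    blocks = codebook[indices].reshape(h // block, w // block, block, block)
    return blocks.transpose(0, 2, 1, 3).reshape(h, w)

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(16, 16)).astype(float)
idx, cb = vq_compress(img)
approx = vq_decompress(idx, cb, img.shape)
print(np.abs(img - approx).mean())   # average distortion introduced by VQ
```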
Video compression
Video compression is the process of reducing the size of video data while maintaining an acceptable level of visual quality. It involves applying various techniques to exploit spatial and temporal redundancies in video sequences, resulting in efficient storage, transmission, and streaming of videos.
The MPEG family of compression standards, including MPEG-1, MPEG-2, and MPEG-4, offers different levels of compression efficiency and supports different video applications, from low-quality video streaming to high-definition video storage.
Video compression techniques aim to achieve a balance between compression efficiency and
visual quality. The choice of compression algorithm, parameters, and settings depends on the
specific requirements, such as target bitrate, resolution, desired quality, and available resources.
Computer Vision
Computer vision is a field of study and research that focuses on enabling computers to gain a high-level understanding of visual information from digital images or video. It involves developing algorithms and techniques that allow computers to analyze, interpret, and make sense of visual data, mimicking human visual perception and understanding.
Object recognition can be approached using different techniques, ranging from traditional computer vision methods to more advanced deep learning-based approaches. Traditional methods often rely on handcrafted features and classifiers, while deep learning methods leverage the power of neural networks to automatically learn discriminative features and classifiers from large amounts of labeled data.
● Template Matching:
○ Template matching compares a predefined template image with sub-regions of the input image to find matching patterns. It involves calculating the similarity between the template and image patches using metrics like correlation or sum of squared differences. Template matching is straightforward but can be sensitive to variations in scale, rotation, and lighting conditions (a minimal sketch appears after this list).
● Feature-Based Methods:
○ Feature-based methods extract distinctive features from images and use them to
recognize objects. Examples of feature descriptors include Scale-Invariant
Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and Oriented
FAST and Rotated BRIEF (ORB). These methods detect keypoints in images and
compute descriptors that represent the local visual characteristics of the
keypoints. Object recognition is then performed by matching and comparing
these features across images.
● Deep Learning:
○ Deep learning, particularly convolutional neural networks (CNNs), has
revolutionized object recognition. CNNs are capable of automatically learning
hierarchical features from raw image data. Training involves feeding labeled
images to the network, and it learns to recognize objects by adjusting the weights
of its layers. Deep learning-based object recognition models, such as YOLO (You
Only Look Once), Faster R-CNN (Region-based Convolutional Neural Networks),
and SSD (Single Shot MultiBox Detector), have achieved impressive results in
terms of accuracy and real-time performance.
● Histogram-based Methods:
○ Histogram-based methods utilize color and texture information to recognize
objects. These methods analyze the distribution of color or texture features in
images and use statistical measures to compare and classify objects. Examples
include color histograms, local binary patterns (LBPs), and histogram of oriented
gradients (HOG). Histogram-based methods are effective for simple object
recognition tasks but may struggle with complex scenes or object variations.
● Ensemble Techniques:
○ Ensemble techniques combine multiple object recognition models or classifiers to
improve overall performance. This can involve techniques such as ensemble
averaging, boosting, or bagging. By combining the predictions of multiple models,
ensemble techniques can enhance robustness, accuracy, and generalization of
object recognition systems.
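As a concrete illustration of the template matching idea mentioned above, here is a minimal sum-of-squared-differences sketch in NumPy; in practice a library routine such as OpenCV's cv2.matchTemplate would normally be used, and the image and template here are synthetic.

```python
import numpy as np

def match_template_ssd(image, template):
    """Slide the template over the image and score each position by SSD."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + th, x:x + tw]
            scores[y, x] = np.sum((patch - template) ** 2)   # sum of squared differences
    # The best match is the position with the smallest difference score.
    return np.unravel_index(np.argmin(scores), scores.shape)

rng = np.random.default_rng(3)
image = rng.random((64, 64))
template = image[20:28, 35:43].copy()   # an 8x8 patch cut from the image
y, x = match_template_ssd(image, template)
print(y, x)                             # best match at row 20, column 35
```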