
Saturday, 1 March 2025

DIP Unit-1

- Image: An image may be defined as a two-dimensional function, f(x, y), where x and
y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates
(x, y) is called the intensity or gray level of the image at that point.
- Types of Image:
• Digital Image: An image represented by a grid of discrete pixels, each having a
numerical value corresponding to its brightness or color, and stored in a digital
format that can be processed by computers.
• Analog Image: A continuous image that directly represents the physical world,
where light and color vary smoothly without discrete values, typically captured by
traditional cameras or displayed on film.

- Conversion Techniques:
• Sampling: Sampling is the process in which continuous-time (analog) signals are
converted into discrete-time signals: the signal is measured at equally spaced
intervals along the time (x) axis.
• Quantization: It is the process by which we convert the magnitudes of the
continuous signal or analog signal to discrete values.
• Performing Sampling and then Quantization in order helps us in forming a digital
signal corresponding to the analog signal.
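A minimal NumPy sketch of this pipeline (the 5 Hz sine, the 50 Hz sampling rate, and the 3-bit quantizer are illustrative assumptions):

```python
import numpy as np

# "Analog" signal: a 5 Hz sine, approximated on a very fine grid.
t_cont = np.linspace(0.0, 1.0, 10_000)
x_cont = np.sin(2 * np.pi * 5 * t_cont)

# Sampling: keep values only at discrete, equally spaced instants (50 Hz here).
fs = 50
t_samp = np.arange(0.0, 1.0, 1.0 / fs)
x_samp = np.sin(2 * np.pi * 5 * t_samp)

# Quantization: map each sampled amplitude to one of 2^bits discrete levels.
bits = 3
levels = 2 ** bits
x_quant = np.round((x_samp + 1) / 2 * (levels - 1))   # integer codes 0..7
x_recon = x_quant / (levels - 1) * 2 - 1              # decoded amplitudes

print(x_samp[:5])
print(x_quant[:5], x_recon[:5])   # sampled-then-quantized digital signal
```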


- Basic Terminologies:
• Hue refers to the type of color in an image, essentially the "color" aspect (e.g., red,
blue, green). It is one of the main components in the HSL (Hue, Saturation,
Lightness) or HSV (Hue, Saturation, Value) color models.
• A pixel (short for "picture element") is the smallest unit of a digital image. It
represents a single point in the image and contains color information (usually in
terms of RGB values or another color space).
• Contrast is the difference in brightness and color between elements in an image. It
measures the variation in light between the lightest and darkest areas of an image.
• Saturation refers to the intensity or purity of a color in an image. A highly saturated
image has vivid colors, while a desaturated image appears more washed out or
closer to grayscale.
• Brightness refers to the perceived lightness or darkness of an image, usually
based on the overall intensity of light in the image.
• Intensity is a measure of the amount of light that a pixel represents, which is
typically related to the brightness or luminance of the pixel.
• Luminance refers to the perceived brightness of a color, which takes into account
the human eye’s sensitivity to different wavelengths (colors). It is often represented
by a weighted sum of the RGB channels.
• Reflectance is the proportion of light that is reflected off a surface as opposed to
being absorbed. In image processing, it is related to how the surface properties of
an object in the image affect its appearance.

- Components of Image Processing System:

• Image sensors are devices that capture visual information from the real world and
convert it into digital data that can be processed by computers. Common types
include Charge-Coupled Devices (CCD) and Complementary Metal-Oxide-
Semiconductor (CMOS) sensors.
• Specialized hardware refers to devices or processors designed to accelerate and
optimize image processing tasks. These can include Graphics Processing Units
(GPUs), Digital Signal Processors (DSPs), and Field-Programmable Gate Arrays
(FPGAs).
• Hardcopy refers to physical output, usually printed images or documents, that can
be generated from digital image data.
• Image display refers to devices used to present images visually to a user. These
include monitors, projectors, and other display technologies.
• A computer is a device that processes digital data using hardware (like the CPU
and GPU) and software (image processing programs or algorithms).
• Mass storage refers to large-capacity storage devices or systems that store vast
amounts of data. These could include hard drives (HDDs), solid-state drives
(SSDs), or cloud storage systems.

• Image processing software refers to programs or applications that provide tools to
manipulate, analyze, and transform images. Popular examples include Adobe
Photoshop, GIMP, MATLAB, OpenCV, and custom software developed for specific
tasks.
• The cloud refers to internet-based services that provide remote storage,
processing power, and data sharing. Cloud services like Amazon Web Services
(AWS), Microsoft Azure, and Google Cloud provide cloud computing resources.

- In brief:-
Image Sensors capture images and convert light into digital data.
Specialized Image Processing Hardware accelerates image processing tasks
(e.g., GPUs, DSPs).
Hardcopy is the physical representation of processed images (e.g., prints).
Image Display shows processed images to users (e.g., monitors, projectors).
Computer is the central device that runs the software and processes the
images.
Mass Storage holds large amounts of image data for easy access.
Image Processing Software provides the tools to manipulate and analyze
images.
Cloud offers remote storage, computation, and collaboration for image
processing.

- Image Representations:
• Image plotted as a surface: plot of the function, with two axes determining spatial
location and the third axis being the values of f as a function of x and y. This
representation is useful when working with grayscale sets whose elements are
expressed as triplets of the form (x, y, z), where x and y are spatial coordinates and
z is the value of f at coordinates (x, y).
• Image displayed as a visual intensity array: it shows f(x, y) as it would appear on a
computer display or photograph. Here, the intensity of each point in the display is
proportional to the value of f at that point.
• Image shown as a 2-D numerical array: an array (matrix) composed of the
numerical values of f(x, y). This is the representation used for computer
processing. In equation form, we write the representation of an M×N numerical
array as:

$$f(x, y) = \begin{bmatrix} f(0,0) & f(0,1) & \cdots & f(0,N-1) \\ f(1,0) & f(1,1) & \cdots & f(1,N-1) \\ \vdots & \vdots & & \vdots \\ f(M-1,0) & f(M-1,1) & \cdots & f(M-1,N-1) \end{bmatrix}$$

- Image file formats:
• JPEG (Joint Photographic Experts Group) - Extension: .jpg, .jpeg: A lossy format
widely used for photos with small file sizes but reduced quality at higher
compression levels. Best for web use, but unsuitable for images requiring sharp
detail or high-quality preservation.
• PNG (Portable Network Graphics) - Extension: .png: Lossless compression with
transparency support, ideal for web graphics, logos, and images needing high
quality. File sizes are larger than JPEG, but it maintains clarity and detail.

• GIF (Graphics Interchange Format) - Extension: .gif: Supports lossless
compression and animation, limited to 256 colors. Perfect for web animations and
simple graphics but not suitable for detailed images like photographs.
• TIFF (Tagged Image File Format) - Extension: .tiff, .tif: Flexible and high-quality
format with both lossless and lossy compression. Common in professional
photography and publishing, it retains excellent image quality but results in large
file sizes.
• BMP (Bitmap Image File) - Extension: .bmp: Simple raster format with no
compression, resulting in large files. Primarily used in Windows for basic images,
but lacks modern features and is less efficient.
• HEIF (High Efficiency Image Format) - Extension: .heif, .heic: A high-efficiency
format offering better compression with support for advanced features like HDR.
Provides smaller file sizes while maintaining high quality, though compatibility is
still limited.
• RAW (Raw Image Format) - Extension: .raw (varies by camera manufacturer,
e.g., .cr2, .nef, .arw): An unprocessed, high-quality image format from cameras,
offering flexibility for editing and post-processing. Large file sizes and requires
specialized software for viewing and editing.
• SVG (Scalable Vector Graphics) - Extension: .svg: A vector format supporting
lossless compression and scalability. Best for web graphics, logos, and
illustrations. Not suitable for photos or highly detailed images.
• PSD (Photoshop Document) - Extension: .psd: The native format for Adobe
Photoshop, supporting layers and non-destructive editing. Ideal for graphic design
and professional editing but requires Photoshop and produces large files.
• PDF (Portable Document Format) - Extension: .pdf: A document format that can
include both text and images. Preserves formatting across platforms but is not
ideal for web image use.
- Applications of digital image processing
• Medical Imaging
- Medical Diagnostics: Detects tumors, fractures, etc. in X-rays, CT scans, MRIs.
- Image Enhancement: Improves clarity for better diagnosis.
- Segmentation: Identifies and isolates tissues or abnormalities.
- 3D Imaging: Helps visualize organs in 3D for surgery.
• Remote Sensing
- Satellite Imaging: Monitors natural resources, urban growth, and environmental
changes.

- Land Classification: Classifies land into categories (forests, water bodies, etc.).
- Disaster Monitoring: Analyzes images of disaster zones for damage assessment.
• Computer Vision
- Object Detection & Recognition: Identifies and labels objects (e.g., facial
recognition, autonomous vehicles).
- Image Classification: Categorizes images into predefined classes.
- Motion Tracking: Tracks objects in videos for surveillance, sports analysis.
- Scene Understanding: Recognizes relationships between objects in a scene.
• Security and Surveillance
- Face Recognition: Identifies faces for security purposes.
- Motion Detection: Detects movement for monitoring purposes.
- License Plate Recognition: Identifies vehicle license plates for traffic
management.
• Autonomous Vehicles
- Object Detection: Identifies pedestrians, vehicles, obstacles.
- Lane Detection: Ensures vehicles stay within lanes.
- 3D Mapping: Creates 3D maps of surroundings for navigation.
• Agriculture
- Crop Monitoring: Detects diseases and water stress.
- Precision Agriculture: Optimizes crop yield and reduces waste.
- Weed Detection: Differentiates crops from weeds for automated control.
• Robotics
- Visual SLAM: Helps robots navigate by visual data.
- Object Grasping: Identifies objects for robotic manipulation.
- Path Planning: Guides robots in obstacle-free routes.
• Industrial Automation
- Defect Detection: Inspects products for defects in manufacturing.
- Assembly Line Automation: Guides robotic arms in assembly tasks.
- Barcode Scanning: Automates inventory management and logistics.
• Forensic Analysis

- Crime Scene Analysis: Enhances crime scene photos for evidence.
- Fingerprint Recognition: Identifies individuals based on fingerprints.
- Forgery Detection: Identifies alterations in documents or photos.
• Sports
- Motion Analysis: Tracks player movements or ball trajectory.
- Event Detection: Highlights key moments (goals, penalties) in sports.
- Goal-line Technology: Determines if a ball has crossed the goal line.

- Image Analysis:
• Geometric Transformations: operations that modify the spatial coordinates of
pixels, such as rotation, scaling, translation, and shear [illustrative figure omitted].
• Intensity Transformations: the procedure in which we modify or transform the
current pixel intensity values to achieve a desired result, such as contrast
enhancement, thresholding, or image negatives.

- Negative: Reversing the intensity levels of a digital image produces the
equivalent of a photographic negative. The mathematical formula is:
s = L − 1 − r, where s is the output intensity, r is the input intensity, and L is the
number of intensity levels.
- Log: Formula used: s = c·log(1 + r), where c is a constant. Used when the range
of values in an image is very large and linear scaling is not very helpful.
- Power Law (Gamma) transformations: $s = c\,r^{\gamma}$
- Piece-Wise Linear Transformations: we specify what the corresponding output
intensity will be for a given input intensity level by defining various linear
functions and joining them to get a complete continuous function. It is widely
used in contrast stretching. A sketch of these point transformations follows.
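A NumPy sketch applying each transform to the full 8-bit intensity ramp (the constants c and γ, and the piecewise break points, are illustrative assumptions):

```python
import numpy as np

r = np.arange(256, dtype=np.float64)       # input intensities, L = 256

# Negative: s = L - 1 - r
s_neg = 255 - r

# Log: s = c * log(1 + r), with c chosen so the output also spans [0, 255]
c = 255 / np.log(256)
s_log = c * np.log1p(r)

# Power-law (gamma): s = c * r^gamma, on normalized intensities
gamma = 0.5
s_gamma = 255 * (r / 255) ** gamma

# Piecewise-linear contrast stretch through control points (r1,s1), (r2,s2)
r1, s1, r2, s2 = 70, 20, 180, 235          # illustrative break points
s_pw = np.interp(r, [0, r1, r2, 255], [0, s1, s2, 255])
```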

• Contrast stretching: Contrast stretching is a technique used in image processing
to improve the contrast of an image by stretching the range of intensity values (or
pixel values) across the entire image. This helps make the image appear clearer,
with more defined differences between light and dark areas. Two typical uses
[illustrative figure omitted]:

- (a) Highlight a region of interest and remove focus from other areas.
- (b) Simply increase the focus on the region of interest.
- Bit Plane Slicing: Pixel values are integers composed of bits. Instead of highlighting
intensity-level ranges, we could highlight the contribution made to total image
appearance by specific bits, as the sketch below demonstrates.
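A minimal sketch of slicing and partial reconstruction (the random 4×4 stand-in image and keeping the top four planes are illustrative choices):

```python
import numpy as np

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # stand-in image

# Plane k holds the k-th bit of every pixel (k = 7 is the most significant).
planes = [(img >> k) & 1 for k in range(8)]

# Reconstruct using only the top 4 planes: coarse structure survives,
# while fine noise-like detail carried by the low planes is dropped.
approx = sum((planes[k] << k) for k in range(4, 8)).astype(np.uint8)
print(img, approx, sep="\n\n")
```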

- Correlation and Convolution


• Correlation consists of moving the center of a kernel over an image, and
computing the sum of products at each location. The mechanics of spatial
convolution are the same, except that the correlation kernel is rotated by 180°.
Thus, when the values of a kernel are symmetric about its center, correlation and
convolution yield the same result.
• Mathematical formula: $g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s, y + t)$,
where w(s, t) represents the kernel as a 2-D function.
• Padding to be applied is given by:
- (M-1)/2 when M is odd and is the size of kernel
- (M-1) when M is even and is the size of kernel
• Various types of padding:-
- In zero padding, the added pixels around the image are set to a value of 0. This
is one of the simplest and most commonly used padding techniques, especially
in convolutional operations.
- In mirror padding, the padding around the image is filled by reflecting the values
of the image at the edges. Essentially, the edge pixels are duplicated
symmetrically.


- In duplicate padding, the padding values are set to the value of the edge pixels
of the image. Essentially, the pixel at the edge of the image is repeated around
the borders.

• Numerical Example of 1-D correlation and convolution [worked figure omitted]:
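A minimal NumPy reproduction, assuming a unit impulse input and the asymmetric kernel w = [1, 2, 3] (both illustrative):

```python
import numpy as np

f = np.array([0, 0, 0, 1, 0, 0, 0])   # unit impulse
w = np.array([1, 2, 3])               # asymmetric kernel, length M = 3

pad = (len(w) - 1) // 2               # (M-1)/2 zero padding for odd M
fp = np.pad(f, pad)

corr = np.array([np.sum(w * fp[i:i + len(w)]) for i in range(len(f))])
conv = np.array([np.sum(w[::-1] * fp[i:i + len(w)]) for i in range(len(f))])

print(corr)   # [0 0 3 2 1 0 0] -> correlation leaves the kernel reversed
print(conv)   # [0 0 1 2 3 0 0] -> convolution (180° rotation) reproduces it
```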

• Numerical Example of 2-D correlation and convolution [worked figure omitted]:
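The same comparison in 2-D, sketched here with scipy.signal (the impulse image and 2×2 kernel are illustrative):

```python
import numpy as np
from scipy.signal import correlate2d, convolve2d

f = np.zeros((5, 5)); f[2, 2] = 1            # 2-D unit impulse
w = np.array([[1, 2], [3, 4]], dtype=float)  # asymmetric kernel

# With an asymmetric kernel, the two outputs differ by a 180° rotation of w.
print(correlate2d(f, w, mode="same", boundary="fill"))
print(convolve2d(f, w, mode="same", boundary="fill"))
```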

- Smoothing Filters(Lowpass Spatial Filters):


• Box Filters: The simplest separable lowpass filter kernel is the box kernel, whose
coefficients have the same value (typically 1). The name “box kernel” comes from
a constant kernel resembling a box when viewed in 3-D.
• Gaussian filters:
- Formula: $w(s, t) = G(s, t) = K e^{-\frac{s^2 + t^2}{2\sigma^2}}$, or $G(r) = K e^{-\frac{r^2}{2\sigma^2}}$ where $r = \sqrt{s^2 + t^2}$.
- They are the only circularly symmetric filters that are also separable. Hence
they are computationally efficient like box kernels while offering further
advantages.

- General formation of the kernel: sample G(s, t) at integer offsets around the
center and normalize the coefficients so they sum to 1 [figure omitted].

Comparison of Box and Gaussian Filter [figure omitted]; the sketch below computes both.
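A scipy.ndimage sketch of both smoothing filters, plus a direct construction of the Gaussian kernel from G(s, t) (kernel size 5 and σ = 1 are illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

img = np.random.rand(64, 64)             # stand-in grayscale image

box = uniform_filter(img, size=5)        # 5x5 box kernel (all coefficients equal)
gauss = gaussian_filter(img, sigma=1.0)  # circularly symmetric and separable

# Building a Gaussian kernel directly from G(s,t) = K*exp(-(s^2+t^2)/(2*sigma^2)):
s, t = np.mgrid[-2:3, -2:3]
g = np.exp(-(s**2 + t**2) / (2 * 1.0**2))
g /= g.sum()                             # normalize so the kernel sums to 1
```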

- Sharpening Filters(Highpass Spatial Filters):


• Gradient
- Works on the first-order derivative.
- Mathematical formula: the gradient magnitude is
$|\nabla f| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$, commonly approximated with Sobel kernels.
• Laplacian
- Works on the second-order derivative.
- Mathematical formula: $\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$, which discretely becomes
$\nabla^2 f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4f(x, y)$.
- Implementing this equation yields the standard Laplacian kernels
[[0, 1, 0], [1, -4, 1], [0, 1, 0]] and, with diagonal terms included,
[[1, 1, 1], [1, -8, 1], [1, 1, 1]].
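A sketch of both sharpening operators (the Sobel kernels and 4-neighbour Laplacian are the standard choices; the random test image is an illustrative assumption):

```python
import numpy as np
from scipy.ndimage import convolve

img = np.random.rand(64, 64)

# First-order: Sobel kernels approximate the partial derivatives.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
gx = convolve(img, sobel_x)              # df/dx
gy = convolve(img, sobel_x.T)            # df/dy
grad_mag = np.hypot(gx, gy)              # |grad f| = sqrt(gx^2 + gy^2)

# Second-order: 4-neighbour Laplacian kernel (negative center coefficient).
lap4 = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
sharpened = img - convolve(img, lap4)    # f - lap(f) sharpens the image
```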

- Unsharp masking and Highboost filtering

• Steps:-
- Blur the original image.
- Subtract the blurred image from the original (the resulting difference is called the
mask).
- Add the mask to the original.
• How this process improves image quality: the mask contains mostly edges and
other high-frequency detail removed by blurring, so adding it back as
g = f + k·mask emphasizes edges; k = 1 gives unsharp masking and k > 1 gives
highboost filtering [illustrative figure omitted].
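A minimal sketch of the three steps, assuming a Gaussian blur with σ = 2 and a random stand-in image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

img = np.random.rand(64, 64)

blurred = gaussian_filter(img, sigma=2.0)   # step 1: blur
mask = img - blurred                        # step 2: the unsharp mask
k = 1.0                                     # k = 1: unsharp masking; k > 1: highboost
result = img + k * mask                     # step 3: add the mask back
```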
- Image Transforms: Transforms in image processing are mathematical operations
applied to images to modify or analyze their data in a different domain (such as
frequency or spatial domain) to enhance, filter, or extract information.
• Need for transformation from spatial to frequency domain:
- Efficiency: Certain tasks are more computationally efficient in transformed
spaces.
- Visualization: Some transforms help visualize aspects of an image that are not
easily seen in the spatial domain.
- Noise Filtering: Transforms like Fourier or Wavelet can help remove noise
without losing important details.
- Feature Extraction: Essential features can be extracted, such as in object
recognition or texture analysis.
- Compression: Efficient representation of image data reduces storage or
transmission requirements.
- Data Enhancement: Transforms can help highlight features like edges or textures
that are not obvious in the original image.
• Fourier Transform
- Converts an image from the spatial domain to the frequency domain.
- Helps identify periodic structures and patterns (e.g., for image compression or
filtering).
- Formula of the Discrete Fourier Transform (1-D):
$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-i \frac{2\pi}{N} kn}, \quad k = 0, 1, 2, \ldots, N-1$$
- Formula of the Discrete Fourier Transform (2-D):
$$X[k_1, k_2] = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x[m, n]\, e^{-i \frac{2\pi}{M} k_1 m}\, e^{-i \frac{2\pi}{N} k_2 n}$$
- Formula of the Inverse Discrete Fourier Transform (1-D):
$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{i \frac{2\pi}{N} kn}, \quad n = 0, 1, 2, \ldots, N-1$$
- Formula of the Inverse Discrete Fourier Transform (2-D):
$$x[m, n] = \frac{1}{MN} \sum_{k_1=0}^{M-1} \sum_{k_2=0}^{N-1} X[k_1, k_2]\, e^{i \frac{2\pi}{M} k_1 m}\, e^{i \frac{2\pi}{N} k_2 n}$$
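A sketch verifying these formulas with NumPy's FFT (the random 8×8 array is an illustrative input):

```python
import numpy as np

x = np.random.rand(8, 8)

# 2-D DFT and its inverse via the FFT (O(N log N) per axis).
X = np.fft.fft2(x)
x_back = np.fft.ifft2(X)
assert np.allclose(x, x_back.real)

# Direct evaluation of the 1-D formula, for one row, as a cross-check.
row = x[0]
N = len(row)
n = np.arange(N)
X_row = np.array([np.sum(row * np.exp(-2j * np.pi * k * n / N)) for k in range(N)])
assert np.allclose(X_row, np.fft.fft(row))
```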

• Cosine transform
- Converts image data to a sum of cosine functions with different frequencies.
- Widely used in image compression (e.g., JPEG).
- Formula of the 2-D Discrete Cosine Transform:
$$X[k_1, k_2] = \frac{2}{\sqrt{MN}}\, \alpha(k_1)\alpha(k_2) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x[m, n] \cos\!\left[\frac{\pi}{M}\left(m + \tfrac{1}{2}\right) k_1\right] \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2}\right) k_2\right]$$
where $\alpha(k) = \frac{1}{\sqrt{2}}$ if $k = 0$ and $\alpha(k) = 1$ if $k > 0$.
- Formula of the Inverse 2-D Discrete Cosine Transform:
$$x[m, n] = \frac{2}{\sqrt{MN}} \sum_{k_1=0}^{M-1} \sum_{k_2=0}^{N-1} \alpha(k_1)\alpha(k_2)\, X[k_1, k_2] \cos\!\left[\frac{\pi}{M}\left(m + \tfrac{1}{2}\right) k_1\right] \cos\!\left[\frac{\pi}{N}\left(n + \tfrac{1}{2}\right) k_2\right]$$
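A sketch using scipy.fft; with norm="ortho" the transform matches the normalized formulas above:

```python
import numpy as np
from scipy.fft import dctn, idctn

x = np.random.rand(8, 8)

X = dctn(x, norm="ortho")        # orthonormal 2-D DCT-II
x_back = idctn(X, norm="ortho")  # perfect inverse (up to float precision)
assert np.allclose(x, x_back)
```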

• 2-D Wavelet transform


- Wavelet Transform works by breaking down an image into smaller, more
manageable components. While traditional transforms like the Fourier Transform
analyze images based on sinusoidal waves (global analysis), wavelet transforms
analyze images using short, localized waveforms (called wavelets).
- Represents an image in terms of both time and frequency.
- Useful for image compression (e.g., JPEG2000), denoising, and multi-resolution
analysis.
- Key Steps:
• Wavelet Decomposition:
- The image is convolved with a low-pass filter (approximation filter) and a
high-pass filter (detail filter) in both horizontal and vertical directions.
- The image is split into four subbands:
• LL (Low-Low): Approximation of the image (low-frequency).
• LH (Low-High): Horizontal detail (high-frequency in the horizontal
direction).
• HL (High-Low): Vertical detail (high-frequency in the vertical direction).
• HH (High-High): Diagonal detail (high-frequency in both directions).
• Decomposition at Multiple Levels:

- To further decompose the image, you can repeat the process (convolution
with the low-pass and high-pass filters) on the LL subband, obtaining
another set of four subbands (LL, LH, HL, HH) at a deeper level. This gives a
multiresolution representation of the image.
• Inverse Wavelet Transform:
- The inverse wavelet transform is used to reconstruct the image from the
decomposed subbands. This step essentially reverses the filtering and
decomposition process.
- Haar transform:
• The Haar transform uses step functions to analyze the signal or image. It
divides the signal into blocks and computes averages and differences
between adjacent elements, essentially capturing both smooth and abrupt
changes in the data.
• The Haar wavelet is based on a piecewise constant function. It is simple to
compute, making it computationally efficient for tasks like image compression
and denoising.
• Steps:
- First Step - Horizontal Decomposition:
• The image is divided into rows, and for each row, the Haar transform is
applied. This involves averaging pairs of pixels (low-pass) and subtracting
them (high-pass).
- Second Step - Vertical Decomposition:
• After the horizontal transformation, the same process is applied to each
column of the resulting image (which now contains the horizontal detail
information).
- This process generates four subbands:
• LL (Low-Low): This is the approximation part of the image, which
captures the low-frequency components.
• LH (Low-High): This contains the horizontal detail (high-frequency in the
horizontal direction).
• HL (High-Low): This contains the vertical detail (high-frequency in the
vertical direction).
• HH (High-High): This contains the diagonal detail (high-frequency in both
directions).
• Example [worked figure omitted]: a one-level Haar decomposition is sketched below.
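A sketch of one decomposition level using pairwise averages and differences (the orthonormal variant divides by √2 instead of 2; subband naming follows the convention above):

```python
import numpy as np

def haar_level(img):
    """One level of the 2-D Haar transform via pairwise averages/differences."""
    # Horizontal pass: average and difference of adjacent column pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2
    hi = (img[:, 0::2] - img[:, 1::2]) / 2
    # Vertical pass on each half: average and difference of adjacent row pairs.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2   # smooth approximation
    HL = (lo[0::2, :] - lo[1::2, :]) / 2   # vertical detail
    LH = (hi[0::2, :] + hi[1::2, :]) / 2   # horizontal detail
    HH = (hi[0::2, :] - hi[1::2, :]) / 2   # diagonal detail
    return LL, LH, HL, HH

img = np.random.rand(8, 8)
LL, LH, HL, HH = haar_level(img)   # each subband is 4x4; recurse on LL for more levels
```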

• Comparison of various transforms and their properties:

| Property | Wavelet Transform (Haar example) | Discrete Cosine Transform (DCT) | Discrete Fourier Transform (DFT) |
| --- | --- | --- | --- |
| Purpose | Multi-resolution analysis, compression, denoising | Compression, feature extraction | Signal analysis, frequency-domain filtering |
| Domain | Time/frequency (space-frequency) | Frequency domain | Frequency domain |
| Mathematical Basis | Piecewise polynomial functions (wavelets) | Cosine functions (periodic oscillations) | Sinusoids (sine and cosine functions) |
| Applications | Image compression (JPEG2000), denoising, feature extraction | Image/video compression (JPEG, MPEG), signal processing | Signal processing, image filtering, spectral analysis |
| Reconstruction | Exact reconstruction (perfect inverse) | Exact reconstruction (quantization loss in practice) | Exact reconstruction with inverse DFT |
| Computational Complexity | Efficient (Fast Wavelet Transform) | Moderate (O(N log N)) | High (O(N²) without FFT, O(N log N) with FFT) |
| Energy Compaction | High (especially for sharp transitions) | Excellent (smooth regions) | Moderate (less effective for high-frequency content) |
| Sparsity | High (especially for images with sharp transitions) | Moderate (good for smooth signals) | Low (less compact for high-frequency content) |
| Data Representation | Multi-resolution, hierarchical | Single resolution (frequency bins) | Single resolution (global frequency representation) |
| Suitability for Compression | Excellent (e.g., JPEG2000) | Very good (e.g., JPEG, MPEG) | Less efficient for image compression |
| Shift Invariance | Low shift-invariance (affected by position shifts) | Low shift-invariance (can be reduced by preprocessing) | Not shift-invariant (shifting affects the frequency components) |
| Resolution | Multi-resolution (captures both low and high frequencies at different scales) | Single resolution; transforms the image into frequency bins | Single resolution; transforms the signal into its frequency components (sinusoids) |
| Localized in Time and Frequency | Yes, localized in both time (space) and frequency; useful for non-stationary signals | No (not localized); DCT represents the entire signal/image in terms of global frequency components | No; the DFT is a global transform, so frequency components represent the entire signal/image |
| Multi-resolution Analysis | Yes, very effective for multi-scale representation of signals/images, suitable for both coarse and fine representations | No; unlike wavelets, DCT does not inherently provide a multi-resolution representation | No; the DFT provides a single global frequency representation rather than multi-resolution |
| Noise Sensitivity | Sensitive to high-frequency noise, which may affect the quality of the representation | Sensitive to noise and compression artifacts (blocking effects) | Sensitive to noise and high-frequency components; noise can create significant distortions in the frequency domain |
| Compression Efficiency | Excellent for compression, particularly for images with sparse, discontinuous data (JPEG2000) | Very efficient for compression, especially image and video (JPEG, MPEG), but performance degrades with high-frequency content | Not suitable for compression tasks; less efficient than wavelets or DCT |
| Smoothness | The Haar wavelet is not very smooth (piecewise constant), but other wavelet functions (like Daubechies) are smoother | Works well for smooth signals and for compressing images with smooth regions, but may suffer on images with sharp edges | Works well on periodic signals or images with uniform texture, but may not perform well with sharp edges or discontinuities |
| Best Use Cases | Image compression (JPEG2000), image denoising, multi-resolution analysis, feature extraction in medical imaging | Image compression (JPEG, JPEG2000), video compression (MPEG), speech recognition, audio compression (MP3) | Signal processing, image filtering, frequency analysis, pattern recognition, applications requiring a global frequency-domain representation |

Additional difference for reference:

| Aspect | Dynamic Range | Contrast |
| --- | --- | --- |
| Definition | The range between the darkest and lightest intensity values in the image. | The difference in brightness between different parts of the image. |
| Focus | Overall brightness extremes (darkest shadows to brightest highlights). | The difference between adjacent regions in an image, affecting the visual appearance. |
| Effect | Determines how much detail is preserved in bright and dark areas. | Affects how distinct or sharp the boundaries are between light and dark areas. |
| Adjustment | Often adjusted by bit depth or HDR processing. | Adjusted by contrast enhancement or level manipulation. |
| Impact on Image | Controls how much detail is visible in both dark and light regions. | Affects the sharpness and visual appeal of the image by changing the difference in brightness. |
| Example | High dynamic range is important for scenes with both bright and dark areas (e.g., sunset). | High contrast makes details stand out sharply, while low contrast makes the image appear flat. |

DIP Unit-2

- Image Compression: The term data compression refers to the process of reducing
the amount of data required to represent a given quantity of information.
- If we let b and b′ denote the number of bits (or information-carrying units) in two
representations of the same information, the relative data redundancy, R, of the
representation with b bits is:
$$R = 1 - \frac{1}{C}$$
where C, commonly called the compression ratio, is defined as:
$$C = \frac{b}{b'}$$
- Two-dimensional intensity arrays suffer from three principal types of data
redundancies that can be identified and exploited:
• Coding redundancy. A code is a system of symbols (letters, numbers, bits, and the
like) used to represent a body of information or set of events. Each piece of
information or event is assigned a sequence of code symbols, called a code word.
The number of symbols in each code word is its length. The 8-bit codes that are
used to represent the intensities in most 2-D intensity arrays contain more bits
than are needed to represent the intensities.
• Spatial and temporal redundancy. Because the pixels of most 2-D intensity arrays
are correlated spatially (i.e., each pixel is similar to or dependent upon neighboring
pixels), information is unnecessarily replicated in the representations of the
correlated pixels. In a video sequence, temporally correlated pixels (i.e., those
similar to or dependent upon pixels in nearby frames) also duplicate information.
• Irrelevant information. Most 2-D intensity arrays contain information that is ignored
by the human visual system and/or extraneous to the intended use of the image. It
is redundant in the sense that it is not used.
- Various image compression techniques:
• Huffman Coding:
- Huffman coding relies on the principle of frequency-based encoding, where
more frequent elements (e.g., pixel values in an image) are assigned shorter
codes, and less frequent ones are assigned longer codes. The goal is to
minimize the total number of bits needed to represent the data.

- Efficient for Repeated Data: It works exceptionally well when the image contains
repeated or redundant data. For example, areas of uniform color in an image will
benefit from shorter codes for frequent pixel values.
- Lossless Compression: Huffman coding is a lossless algorithm, meaning no data
is lost during the compression process, so the image can be perfectly
reconstructed.
- Widely Used: It's part of many widely used compression formats, like JPEG and
PNG.
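A compact sketch of the idea using Python's heapq (the toy pixel list is an illustrative assumption):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Assign shorter bit strings to more frequent symbols (lossless)."""
    # Heap entries: [frequency, unique tie-breaker, [(symbol, code), ...]]
    heap = [[f, i, [(s, "")]] for i, (s, f) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)                   # two least frequent subtrees
        hi = heapq.heappop(heap)
        lo[2] = [(s, "0" + c) for s, c in lo[2]]   # left branch prepends a 0
        hi[2] = [(s, "1" + c) for s, c in hi[2]]   # right branch prepends a 1
        heapq.heappush(heap, [lo[0] + hi[0], lo[1], lo[2] + hi[2]])
    return dict(heap[0][2])

pixels = [10, 10, 10, 10, 30, 30, 50, 70]          # toy "image" data
codes = huffman_codes(pixels)
encoded = "".join(codes[p] for p in pixels)
print(codes, encoded)   # the frequent value 10 receives the shortest code
```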

• Arithmetic Coding:
- Arithmetic coding is a lossless compression technique used to represent a
sequence of symbols (e.g., pixels or characters) as a single floating-point
number. Unlike Huffman coding, which assigns a separate code word to each
symbol, arithmetic coding encodes the entire message as a single number
between 0 and 1, based on the probability distribution of the symbols.
- Working:
• Probability Assignment: The first step is to assign probabilities to each symbol
based on their frequency in the input data. These probabilities define how
likely each symbol is to occur.
• Interval Division: The algorithm divides the interval [0, 1) into sub-intervals
based on the symbol probabilities. For example, if you have three symbols: A,
B, and C, with probabilities 0.5, 0.3, and 0.2, the interval [0, 1) would be
divided as follows:
Symbol A: [0, 0.5)
Symbol B: [0.5, 0.8)
Symbol C: [0.8, 1)
• Encoding Process:

- The input data (a sequence of symbols) is processed one symbol at a time.
As each symbol is encountered, the interval is narrowed down based on the
probability distribution. For example:
- If the sequence starts with 'A', the interval is narrowed to [0, 0.5).
- If the next symbol is 'B', the interval becomes [0.25, 0.4) (the sub-interval of
[0, 0.5) corresponding to B's range [0.5, 0.8)).
- This process continues until all symbols have been processed.
• Final Code:
- After processing the entire sequence, the final interval will be very small.
Any number within this interval can represent the entire message, but
typically, a single point (often the midpoint of the interval) is chosen as the
code for the sequence.
- For example, if the final interval is [0.375, 0.4375), the number 0.40625
(midpoint) can be chosen as the arithmetic code for the entire sequence.
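A sketch of the interval-narrowing encoder under the A/B/C probabilities above; encoding "AB" reproduces the [0.25, 0.4) interval:

```python
def arithmetic_encode(seq, probs):
    """Narrow [0, 1) once per symbol; any point in the final interval encodes seq."""
    starts, acc = {}, 0.0
    for sym, p in probs.items():      # cumulative start of each symbol's range
        starts[sym] = acc
        acc += p
    low, width = 0.0, 1.0
    for sym in seq:
        low = low + width * starts[sym]   # move to the symbol's sub-interval
        width = width * probs[sym]        # shrink by the symbol's probability
    return low, low + width               # final interval [low, high)

probs = {"A": 0.5, "B": 0.3, "C": 0.2}
low, high = arithmetic_encode("AB", probs)
print(low, high, (low + high) / 2)        # [0.25, 0.4); the midpoint encodes "AB"
```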

• Lempel Ziv Welch (LZW) coding
- LZW (Lempel-Ziv-Welch) is a lossless data compression algorithm from the
Lempel-Ziv family, specifically an improvement over LZ78. It is widely used for
file compression and image formats due to its ability to efficiently handle
repetitive data without losing any information.
- LZW is notable for its dictionary-based compression, which builds a dictionary
of sequences (substrings) of data as it processes the input. It replaces repetitive
sequences of data with shorter codes, thus reducing the overall size of the data.
- Steps Involved:
• Initialize the Dictionary: The algorithm begins with a dictionary that contains all
the possible individual symbols (characters or pixel values) in the input data.
Each symbol is assigned a unique code.
For example, if you're compressing a text file, the dictionary will initially
contain all the individual characters in the file, each with a unique index or
code.
• Process the Input Data: As the algorithm processes the input data (symbol by
symbol), it tries to find the longest sequence of symbols (substring) that
already exists in the dictionary.
• Dictionary Expansion:
- When the algorithm encounters a sequence of symbols that is already in the
dictionary, it continues processing until it finds a sequence that is not in the
dictionary.
- When a new sequence is encountered, the algorithm adds this new
sequence to the dictionary and assigns it a new index.
- The previous sequence (before the new sequence is added) is encoded as a
reference to the dictionary index, which is shorter than the original
sequence.
• Encode Sequences: As the algorithm scans the input, it outputs the dictionary
index for the longest matching sequence and then continues processing the
next symbol. This continues until the entire input is processed.
• Final Output: The compressed output consists of a series of dictionary
indices. These indices are typically represented in a smaller number of bits
than the original sequence, achieving compression.
- Advantages
• Efficient Compression: LZW is very effective for data with repeated
sequences. As sequences are encountered repeatedly, they are replaced with
shorter dictionary indices, reducing the overall size of the data.

• Dynamic Dictionary: Unlike Huffman coding, LZW builds a dictionary
dynamically as it processes the input, which makes it flexible for various kinds
of data without needing a fixed set of probabilities.
• Lossless Compression: LZW is a lossless compression algorithm, meaning
that no data is lost during the compression process. The original data can be
perfectly reconstructed from the compressed data.
- Applications: LZW is used in many popular formats and protocols, such as GIF
images, TIFF images, PDF files, and Unix compress.
- For better understanding, the sketch below implements the encoder.
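A compact sketch of the encoding steps above, assuming single-byte symbols and an initial 256-entry dictionary:

```python
def lzw_encode(data):
    """Emit dictionary indices; the dictionary grows as sequences repeat."""
    dictionary = {chr(i): i for i in range(256)}        # all single symbols
    current, out = "", []
    for ch in data:
        if current + ch in dictionary:                  # extend the match
            current += ch
        else:
            out.append(dictionary[current])             # emit longest known sequence
            dictionary[current + ch] = len(dictionary)  # learn the new sequence
            current = ch
    if current:
        out.append(dictionary[current])
    return out

print(lzw_encode("ABABABA"))   # [65, 66, 256, 258]: repeated "AB" collapses to new indices
```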
• K-L transform
- The K-L Transform, also known as the Karhunen-Loève Transform (KLT), is a
mathematical technique used in image compression and signal processing. It is
a linear transformation that converts a set of correlated variables (e.g., pixels in
an image) into a set of uncorrelated variables called principal components.
- The K-L Transform is closely related to Eigenvalue Decomposition and is used to
reduce the dimensionality of the data while preserving as much of the original
variance as possible. This is useful in image compression because it helps to
capture the most important features of an image and discard the less important
ones, leading to efficient data compression.
- Steps/Working:
• Centering the Data: The image is represented as a matrix where each pixel is
a data point. For KLT, the image data is typically "centered" by subtracting the
mean of the pixel values from each pixel value. This centers the data around
zero, which is important for the subsequent analysis.
• Covariance Matrix Calculation: The next step is to calculate the covariance
matrix of the image. The covariance matrix captures how much each pixel in
the image is related to the others (i.e., their correlation).
• Eigenvalue Decomposition: Once the covariance matrix is computed, we
perform eigenvalue decomposition to find the eigenvectors (directions of
maximum variance) and eigenvalues (the amount of variance captured by
each eigenvector). These eigenvectors and eigenvalues are key to the
transformation process.
• Projecting onto Principal Components: The eigenvectors are used to form a
basis set. The original image (which is correlated) is projected onto this basis,
resulting in a new set of uncorrelated components called principal
components.
The data is then represented in terms of these principal components, with the
most significant components (those corresponding to the largest eigenvalues)
being retained.

• Compression: In the compression step, the most significant principal
components are retained, while the less significant ones (those corresponding
to smaller eigenvalues) can be discarded. This reduces the dimensionality of
the image and leads to compression.
• Reconstruction: The compressed image can be reconstructed by projecting
the retained principal components back into the original space, using the
inverse of the transformation.
- Applications: It's particularly useful in compression techniques that require
high-quality results with reasonable compression; in practice, the DCT (used in
JPEG) serves as a fast, data-independent approximation of the KLT.
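A sketch of the full KLT pipeline on image blocks (the 4×4 block size, the random stand-in image, and keeping 4 components are illustrative assumptions):

```python
import numpy as np

# Treat each 4x4 block of an image as one observation vector of 16 features.
img = np.random.rand(64, 64)
blocks = img.reshape(16, 4, 16, 4).swapaxes(1, 2).reshape(-1, 16)  # (256, 16)

centered = blocks - blocks.mean(axis=0)          # 1. center the data
cov = np.cov(centered, rowvar=False)             # 2. covariance matrix (16x16)
eigvals, eigvecs = np.linalg.eigh(cov)           # 3. eigenvalue decomposition
order = np.argsort(eigvals)[::-1]                # sort by variance, descending
basis = eigvecs[:, order[:4]]                    # keep the top 4 components

coeffs = centered @ basis                        # 4. project onto principal components
approx = coeffs @ basis.T + blocks.mean(axis=0)  # 5-6. compress + reconstruct
print(np.mean((blocks - approx) ** 2))           # small reconstruction error
```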
• Discrete Cosine Transform(DCT):
- The Discrete Cosine Transform (DCT) is a widely used technique for image
compression, particularly in formats like JPEG. The DCT transforms an image (or
signal) from the spatial domain (pixel values) into the frequency domain (a set of
cosine functions), which makes it easier to compress the image by eliminating
redundant or less important information.
- Steps involved:
• Block Division: The image is divided into small non-overlapping blocks of
pixels, typically 8x8 or 16x16 pixels.
• Apply the DCT:
- Each block of pixels is transformed from the spatial domain into the
frequency domain using the Discrete Cosine Transform. The result of this
transformation is a set of DCT coefficients.
- Mathematically, the DCT of an N×N block of pixels f(x, y) is calculated as:
$$F(u, v) = \frac{2}{N}\, \alpha(u)\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\!\left[\frac{\pi}{N}\left(x + \tfrac{1}{2}\right) u\right] \cos\!\left[\frac{\pi}{N}\left(y + \tfrac{1}{2}\right) v\right]$$
where:
F(u, v) are the DCT coefficients for the frequencies u and v,
α(u) and α(v) are scaling factors for normalization (as defined for the DCT in Unit-1),
f(x, y) is the pixel intensity value at position (x, y).
• Quantization:
- After applying the DCT, you obtain a set of DCT coefficients, which are
organized into a 2-D matrix.
- The DCT coefficients represent different frequency components of the
image:

• The top-left corner (low-frequency components) corresponds to the
average color of the block, capturing the smooth areas of the image.
• The bottom-right corner (high-frequency components) represents fine
details like edges and textures.
- Quantization is the next step. During quantization, the high-frequency
components (which are less important to human perception) are reduced or
discarded. This involves dividing the DCT coefficients by a quantization
matrix and rounding the results to a limited precision.
- This step significantly reduces the amount of data; the greater the
quantization, the higher the compression, but also the greater the loss of
quality (lossy compression).
• Encoding:
- The quantized DCT coefficients are then encoded using entropy coding
techniques, such as Huffman coding or Run-Length Encoding (RLE), to
further reduce the size of the data.
- Typically, the low-frequency coefficients are kept, while the high-frequency
coefficients are often discarded or approximated. This results in a
significant reduction in file size, as most of the image's important
information is preserved in the low-frequency components.
• Compression Output: The compressed image consists of the encoded DCT
coefficients, which are much smaller in size than the original pixel values.
• Reconstruction (Decompression):
- To decompress the image, the encoded DCT coefficients are decoded, the
inverse quantization is applied, and the Inverse Discrete Cosine Transform
(IDCT) is used to transform the coefficients back into the spatial domain.
- The IDCT reverses the process and reconstructs an approximation of the
original image.
- Applications: JPEG compression
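A JPEG-style sketch for one 8×8 block; the uniform step size q is an illustrative stand-in for JPEG's perceptually tuned quantization matrix:

```python
import numpy as np
from scipy.fft import dctn, idctn

# One 8x8 block of pixel values, zero-centered as JPEG does (level shift by 128).
block = np.random.randint(0, 256, (8, 8)).astype(float) - 128

F = dctn(block, norm="ortho")          # spatial domain -> DCT coefficients

q = 16                                 # uniform quantization step (assumption)
F_quant = np.round(F / q)              # most high-frequency coefficients become 0
print(np.count_nonzero(F_quant), "of 64 coefficients survive")

# Decompression: dequantize, inverse DCT, undo the level shift.
block_back = idctn(F_quant * q, norm="ortho") + 128
```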
• BTC (Block Truncation Coding):
- Block Truncation Coding (BTC) is a relatively simple, lossy image compression
technique that divides an image into smaller blocks and compresses each block
independently based on its pixel intensity distribution.
- Working:
• Block Division: Divide the image into small blocks (e.g., 8x8 or 4x4 pixels).
• Calculate Mean & Variance: For each block, calculate the mean and variance
of the pixel values.

• Divide into Two Groups:
- Group 1: Pixels greater than or equal to the mean.
- Group 2: Pixels less than the mean.
• Generate Bitplane: Create a bitplane where:
- 1 = Pixel belongs to Group 1 (>= mean).
- 0 = Pixel belongs to Group 2 (< mean).
• Encode the Block: Store the bitplane and the mean value for each block.
• Reconstruction: During decompression —
- Set pixels with 1 in the bitplane to the mean value.
- Set pixels with 0 to a lower value (e.g., threshold or minimum in the block).
- Lossy Compression: BTC is a lossy technique.
- Simple: Easy to implement, especially for grayscale images.
- Block-based: Operates on small blocks of pixels.
- Ideal for Smooth Images: Works best for images with low variance and simple
textures.
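A sketch of BTC on one block; note that it stores one representative value per group, a common variant of the scheme described above (the hand-made test block is illustrative):

```python
import numpy as np

def btc_block(block):
    """Encode a block as (bitplane, low value, high value); decode by lookup."""
    mean = block.mean()
    bitplane = block >= mean                           # 1 = Group 1, 0 = Group 2
    hi = block[bitplane].mean()                        # level for Group 1 pixels
    lo = block[~bitplane].mean() if (~bitplane).any() else hi  # level for Group 2
    decoded = np.where(bitplane, hi, lo)               # reconstruction
    return bitplane, lo, hi, decoded

block = np.array([[12, 14, 200, 202],
                  [13, 15, 201, 203],
                  [12, 14, 200, 202],
                  [13, 15, 201, 203]], dtype=float)
_, lo, hi, decoded = btc_block(block)
print(lo, hi)        # two levels replace sixteen pixel values (lossy)
```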
- Histogram Processing:
• Histogram equalization
- Histogram Equalization is a technique in image processing used to enhance the
contrast of an image. The goal is to improve the visibility of details in areas that
may be too dark or too bright. It works by redistributing the pixel intensity values
of an image, so that the histogram (the distribution of pixel intensities) becomes
more uniform across the entire intensity range.
- Brief overview of its working:
• Calculate the Histogram: First, compute the histogram of the original image,
which shows the frequency of each pixel intensity.
• Calculate the Cumulative Distribution Function (CDF): The CDF is obtained by
summing the histogram values cumulatively. This step helps in understanding
how the pixel intensities are distributed.
• Normalize the CDF: Normalize the CDF so that it spans the entire range of
possible pixel values (e.g., from 0 to 255 for an 8-bit image).
• Map the Intensities: Each pixel intensity in the original image is mapped to a
new intensity value based on the normalized CDF. This transformation spreads
out the pixel values more evenly, thus enhancing contrast.

• Result: The output image will have a histogram that is more uniformly
distributed, leading to better contrast and visibility of details.
- Advantages:
• Improves image contrast, especially in low-contrast areas.
• Simple and effective for enhancing image quality.
- Disadvantages:
• Can lead to unnatural-looking results in some cases.
• May not be suitable for all images, especially those with already well-balanced
contrast.
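A NumPy sketch of the five steps for an 8-bit image (the narrow Gaussian-distributed test image is an illustrative assumption):

```python
import numpy as np

def equalize(img):
    """Map intensities through the normalized CDF of the image histogram."""
    hist = np.bincount(img.ravel(), minlength=256)          # 1. histogram
    cdf = hist.cumsum()                                     # 2. cumulative distribution
    cdf_norm = (cdf - cdf.min()) / (cdf.max() - cdf.min())  # 3. normalize to [0, 1]
    lut = np.round(255 * cdf_norm).astype(np.uint8)         # 4. intensity mapping
    return lut[img]                                         # 5. apply lookup table

img = np.clip(np.random.normal(100, 15, (64, 64)), 0, 255).astype(np.uint8)
eq = equalize(img)
print(img.min(), img.max(), "->", eq.min(), eq.max())       # range spreads out
```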

• Histogram Specification:
- Histogram Specification (also known as Histogram Matching) is a technique in
image processing that adjusts the histogram of an image to match a specified
histogram, rather than just equalizing it. This allows for more control over the
image’s contrast and brightness by transforming its pixel intensity distribution to
resemble a target histogram.
- How Histogram Specification Works:
• Compute the Histogram of the Original Image: The first step is to calculate the
histogram of the input image, which shows the frequency of each intensity
level.
• Compute the Cumulative Distribution Function (CDF): Like in histogram
equalization, you compute the cumulative distribution function of the original
image’s histogram.
• Compute the CDF of the Target Histogram: The next step involves computing
the CDF of the target histogram that you want the original image to match.
• Map the Intensities: For each pixel intensity in the input image, map it to a
new intensity based on the CDF of the target histogram. The idea is to find a
corresponding intensity in the target distribution that maintains the overall
shape and brightness levels of the image while adhering to the specified
histogram.
• Apply the Mapping: Once the mapping is done, the image is transformed by
replacing the original intensities with the mapped ones, resulting in the image
having a histogram that matches the target histogram.
- Advantages:
• Controlled Output: Allows precise control over how the image's histogram is
shaped, as you can specify a desired histogram.
• Improved Aesthetic Quality: By matching the histogram to a more natural or
aesthetically pleasing one, you can improve the visual quality of the image.
• Versatile: Useful when you have a reference image with an ideal histogram for
comparison.
- Disadvantages:
• Complexity: More computationally intensive than histogram equalization.
• Limited Effectiveness: The output depends on how closely the original
image's histogram can match the target one. If they are too different, the
result may not be as expected.
- Use Cases:

• Image enhancement: In medical imaging or satellite imagery, where you might
want to match histograms to known standards for better analysis.
• Image matching: In computer vision tasks, when you want to standardize
images for comparison with others.
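A sketch of histogram matching between two illustrative random images; the mapping follows the CDF-lookup idea described above:

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source intensities so their CDF follows the reference CDF."""
    s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size      # CDF of the input image
    r_cdf = np.cumsum(r_counts) / reference.size   # CDF of the target histogram
    # For each source intensity, find the reference intensity at the same CDF level.
    mapped = np.interp(s_cdf, r_cdf, r_vals)
    return mapped[np.searchsorted(s_vals, source)]

src = np.random.randint(50, 150, (64, 64))   # low-contrast input (assumption)
ref = np.random.randint(0, 256, (64, 64))    # reference with the desired histogram
out = match_histogram(src, ref)
```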

- Directional Smoothing: Directional smoothing refers to the technique of smoothing or
filtering an image to reduce noise while preserving the important directional features,
such as edges and texture. This process is particularly useful when you need to
maintain the integrity of specific features in an image (like edges or lines) while
reducing the impact of noise or unwanted variations.
- Median Filtering:
• Median filtering is a non-linear operation that replaces each pixel in an image with
the median value of the pixels in its neighborhood (often within a window).
• In directional smoothing, the median is used to remove noise, especially salt-
and-pepper noise (random black and white pixels), while preserving edges.
Since the median is less sensitive to extreme values compared to the mean, it
helps in eliminating outliers (noise) while maintaining the overall structure of the
image.
• Application: For example, if you have an image with high-frequency noise (like
random pixels that are much brighter or darker than the surroundings), applying a
median filter can smooth out these noise spikes without blurring important
features like edges.
- Harmonic Mean Filtering:
• Harmonic mean filtering is a technique where the pixel value is replaced by the
harmonic mean of the surrounding pixels. The harmonic mean gives more weight
to smaller values in the neighborhood, and it's often used in situations where you
are dealing with ratios or values that should not be excessively large.
• In directional smoothing, harmonic mean filtering can be useful for smoothing
images with inverse relationships, such as images that contain certain types of
noise or artifacts that involve bright spots (overexposed areas) or dark areas
(underexposed areas).
• Application: This method can help reduce noise in the image that is related to
smaller features or outliers by emphasizing smaller values and reducing the
influence of large ones. It's particularly effective in low-contrast regions of the
image where reducing bright noise might be necessary.
- Geometric Mean Filtering:
• Geometric mean filtering replaces each pixel with the geometric mean of its
neighborhood. This method is effective in situations where you want to preserve
the relative proportions of pixel intensities, especially when dealing with images
where brightness or color follows a multiplicative or exponential relationship (e.g.,
in images with some form of gradient or lighting changes).
• In directional smoothing, the geometric mean is especially useful when dealing
with images that exhibit uniform scaling, or in images where variations in pixel
values are multiplicative (such as in certain types of texture, lighting, or frequency
components).
• Application: Geometric mean filtering is often used when you want to smooth out
multiplicative noise or remove high-frequency artifacts without losing
important directional features, such as texture or fine detail in images with
gradual transitions.
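A sketch applying all three filters with scipy.ndimage (the 3×3 windows and the positive-valued random image are assumptions; positivity matters for the harmonic and geometric means):

```python
import numpy as np
from scipy.ndimage import generic_filter, median_filter

img = np.random.rand(64, 64) + 0.01        # keep values strictly positive

med = median_filter(img, size=3)           # robust to salt-and-pepper outliers

# Harmonic mean: n / sum(1/x) -- pulls toward smaller values, tames bright spikes.
harm = generic_filter(img, lambda w: w.size / np.sum(1.0 / w), size=3)

# Geometric mean: (prod x)^(1/n), computed stably as exp(mean(log x)).
geom = np.exp(generic_filter(np.log(img), np.mean, size=3))
```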
- Homomorphic Filtering:
• Homomorphic Filtering is an image processing technique used to enhance images
by separating and processing the illumination and reflectance components. It
improves contrast by reducing the effects of uneven lighting while preserving fine
details like edges.
• Steps Involved:

- Logarithmic Transformation: Apply a logarithm to the image
I(x, y) = L(x, y) · R(x, y), turning the multiplication of illumination and
reflectance into an additive process:
log I(x, y) = log L(x, y) + log R(x, y)
- Frequency Domain Processing: Convert the image to the frequency domain
(using the Fourier Transform). The illumination (low-frequency) and reflectance
(high-frequency) components are easier to separate there.
- High-Pass Filtering: Apply a high-pass filter to remove low-frequency
illumination (lighting variations) and enhance high-frequency reflectance (edges
and textures).
- Inverse Fourier Transform: Convert the image back to the spatial domain.
- Exponentiation: Undo the logarithmic transformation by applying the exponential
function to recover the processed image.
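A sketch of the whole pipeline using a Gaussian high-pass emphasis filter; gamma_l, gamma_h, and d0 are illustrative parameter choices:

```python
import numpy as np

img = np.random.rand(64, 64) + 0.1            # positive intensities, I = L * R

log_img = np.log(img)                          # 1. log: multiplicative -> additive
F = np.fft.fftshift(np.fft.fft2(log_img))      # 2. to the frequency domain

# 3. High-frequency emphasis: attenuate low frequencies (illumination) by
#    gamma_l < 1 and boost high frequencies (reflectance) by gamma_h > 1.
rows, cols = img.shape
u, v = np.meshgrid(np.arange(cols) - cols // 2, np.arange(rows) - rows // 2)
d2 = u**2 + v**2
gamma_l, gamma_h, d0 = 0.5, 1.5, 10.0          # assumed parameters
H = (gamma_h - gamma_l) * (1 - np.exp(-d2 / (2 * d0**2))) + gamma_l

filtered = np.fft.ifft2(np.fft.ifftshift(H * F)).real   # 4. back to spatial domain
result = np.exp(filtered)                               # 5. undo the logarithm
```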
• Purpose:
- Enhance Image Contrast: By removing low-frequency illumination, high-
frequency details (like edges) become more prominent.
- Correct Uneven Lighting: Helps in images with non-uniform lighting, improving
clarity and feature visibility.
• Applications:
- Medical Imaging: Improves details in X-rays or MRI scans.
- Satellite Imaging: Corrects lighting inconsistencies in remote sensing images.
- Computer Vision & Photography: Enhances object details while reducing lighting
variations.
