Basics of Digital Images
Digital Images
Digital images are pictures that are created, stored, and manipulated in a digital format.
Digital images use pixels—tiny, square units of colour—to represent the image.
Digital images can be created using various devices such as digital cameras, smartphones, and
scanners.
They are typically stored in formats like JPEG, PNG, GIF, and TIFF, each with its own method of
compressing and encoding the image data.
These images can be easily edited, shared, and displayed on digital devices such as computers,
tablets, and smartphones.
Image Representation
Image representation refers to how visual information is encoded and stored in a digital format.
Here’s a breakdown of some key concepts in image representation:
1. Pixels
Definition: The smallest unit of a digital image. Each pixel represents a single point in the image.
Color and Brightness: Pixels have values for color and brightness. In color images, this often includes
red, green, and blue (RGB) values, while grayscale images have varying shades of grey.
2. Color Models
RGB (Red, Green, Blue): Commonly used for digital screens. Colors are created by combining red,
green, and blue light in varying intensities.
CMYK (Cyan, Magenta, Yellow, Key/Black): Used in color printing. It represents colors by subtracting
light using cyan, magenta, yellow, and black inks.
HSV (Hue, Saturation, Value): Represents colors in terms of their hue (color type), saturation
(intensity), and value (brightness).
3. Image Formats
JPEG (Joint Photographic Experts Group): Uses lossy compression to reduce file size while
maintaining reasonable image quality. Ideal for photographs and complex images.
PNG (Portable Network Graphics): Uses lossless compression and supports transparency. Good for
images with sharp edges, such as logos or graphics.
GIF (Graphics Interchange Format): Supports a limited color palette (256 colors) and is often used for
simple graphics and animations.
TIFF (Tagged Image File Format): Supports high-quality images and various color depths. Often used
in professional photography and imaging.
4. Resolution
Definition: The amount of detail an image holds, usually measured in pixels per inch (PPI) or dots per
inch (DPI). Higher resolution means more detail and clarity.
5. Compression
Lossy Compression: Reduces file size by removing some data from the image, which can reduce
quality (e.g., JPEG).
Lossless Compression: Reduces file size without losing any data or quality (e.g., PNG).
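To see the difference in practice, here is a minimal sketch using OpenCV; the input path 'photo.png' and the quality/compression settings are placeholder assumptions.

import cv2

# read an input image ('photo.png' is a placeholder path)
img = cv2.imread('photo.png')

# JPEG: lossy compression; quality 50 discards data to shrink the file
cv2.imwrite('out_lossy.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 50])

# PNG: lossless compression; level 9 shrinks the file without discarding data
cv2.imwrite('out_lossless.png', img, [cv2.IMWRITE_PNG_COMPRESSION, 9])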
Image Acquisition
Scanners convert documents and film into digital images:
o Flatbed scanners are used for digitizing documents, photographs, and other flat objects
o Film scanners are designed specifically for scanning negative or positive film strips or slides
Microscopes and telescopes use specialized optics to magnify small or distant objects, enabling
the acquisition of images at scales beyond the capabilities of standard cameras
Satellite and aerial imaging involve capturing images of the Earth's surface from high altitudes
using sensors mounted on satellites, aircraft, or drones
o Multispectral and hyperspectral sensors are often used to gather information about
vegetation, water, and mineral resources
Medical imaging techniques, such as X-ray, computed tomography (CT), magnetic resonance
imaging (MRI), and ultrasound, use various forms of energy to visualize internal structures of the
human body for diagnostic and research purposes
Image Sampling
Image sampling is the process of converting a continuous image (analog) into a discrete image
(digital) by selecting specific points from the continuous image. This involves measuring the image
at regular intervals and recording the intensity (brightness) values at those points.
Examples of Sampling
High Sampling Rate: A digital camera with a high megapixel count captures more details because it
samples the image at more points.
Low Sampling Rate: An old VGA camera with a lower resolution captures less detail because
it samples the image at fewer points.
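A rough way to see the effect of the sampling rate is to resample an existing image; the sketch below keeps only every 8th pixel and then enlarges the result so the coarser sampling grid is visible. The input path and the factor of 8 are assumptions for illustration.

import cv2

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

# low sampling rate: keep every 8th pixel in each direction
low_res = img[::8, ::8]

# enlarge with nearest-neighbour interpolation so the blocky samples stay visible
upscaled = cv2.resize(low_res, (img.shape[1], img.shape[0]),
                      interpolation=cv2.INTER_NEAREST)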
Image Quantization
Image quantization is the process of converting the continuous range of pixel values (intensities)
into a limited set of discrete values. This step follows sampling and reduces the precision of the
sampled values to a manageable level for digital representation.
Value Range Definition: The continuous range of pixel values is divided into a finite number
of intervals or levels.
Mapping Intensities: Each sampled pixel intensity is mapped to the nearest interval value.
Assigning Discrete Values: The original continuous intensity values are replaced by the
discrete values corresponding to the intervals.
Examples of Quantization
High Quantization Levels: An image with 256 levels (8 bits per pixel) can represent shades of
gray more accurately.
Low Quantization Levels: An image with only 4 levels (2 bits per pixel) has much less detail
and appears more posterized.
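The mapping of intensities to a limited set of levels can be sketched in a few lines of NumPy; the helper below is illustrative only, and the synthetic gradient image is an assumption.

import numpy as np

def quantize(gray, levels):
    # width of each intensity interval for a uint8 image
    step = 256 // levels
    # map every pixel to the midpoint of its interval
    return (gray // step) * step + step // 2

# synthetic horizontal gradient, 0..255
img = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
posterized = quantize(img, 4)   # 4 levels (2 bits per pixel) looks posterized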
Sampling vs. Quantization
Resolution aspect: sampling affects spatial resolution (detail and clarity of the image); quantization affects color/gray-level resolution (number of shades or colors).
Example: high megapixel count in cameras for detailed images (sampling); 8-bit color depth for more color variations (quantization).
Image Sampling:
Advantages:
1. Data Reduction: Converts a continuous signal into a finite set of points, making storage and
processing more manageable.
2. Compatibility: Sampled images are easily processed by digital systems and algorithms.
3. Resolution Control: Allows for control over image resolution by adjusting the sampling rate.
Disadvantages:
1. Information Loss: Inevitably loses some information by approximating a continuous signal.
2. Aliasing: Can cause distortions and artifacts if the sampling rate is too low.
Image Quantization:
Advantages:
1. Data Compression: Reduces the amount of data by limiting the number of possible values
for each pixel.
2. Simplified Processing: Makes image processing operations simpler and faster with fewer
distinct values.
3. Noise Reduction: Helps reduce the impact of noise by mapping small variations in intensity
to the same value.
Disadvantages:
1. Loss of Detail: Reduces the range of colors or intensity levels, leading to a loss of fine detail
and potential color banding.
2. Quantization Error: Introduces differences between the original and quantized values, which
can become noticeable.
3. Reduced Image Quality: Overly aggressive quantization can significantly degrade image
quality, making the image appear blocky or posterized.
Histogram Equalization
Histogram equalization is a method of contrast adjustment in image processing that uses the image's histogram. This method
usually increases the global contrast of many images, especially when the usable data of the image is
represented by close contrast values. Through this adjustment, the intensities can be better
distributed on the histogram. This allows for areas of lower local contrast to gain a higher contrast.
Histogram equalization accomplishes this by effectively spreading out the most frequent intensity
values. The method is useful in images with backgrounds and foregrounds that are both bright or
both dark. OpenCV has a function to do this,
cv2.equalizeHist(). Its input is a grayscale image and its output is the histogram-equalized image.
Below is Python 3 code implementing histogram equalization:
# import OpenCV
import cv2
# import NumPy
import numpy as np

# read the input image as grayscale (path from the original example)
img = cv2.imread('F:\\do_nawab.png', 0)

# apply histogram equalization
equ = cv2.equalizeHist(img)

# stack the original and equalized images side by side for comparison
res = np.hstack((img, equ))

cv2.imshow('image', res)
cv2.waitKey(0)
cv2.destroyAllWindows()
Spatial domain methods
Spatial domain processing is a fundamental approach in image analysis, working directly with pixel
values to enhance and modify visual information. This technique forms the basis for many image
processing methods, allowing for intuitive and efficient operations on digital images.
Understanding spatial domain concepts is crucial for effective image manipulation and analysis. From
basic pixel operations to complex filtering techniques, these methods enable a wide range of
applications in computer vision, medical imaging, remote sensing, and other fields where image data
plays a vital role.
Refers to the image plane itself, where processing occurs directly on pixel values
Operates on the spatial coordinates (x, y) of pixels in an image
Allows for direct manipulation of pixel intensities without transforming the image to another
domain
Enables localized modifications based on pixel neighborhoods or regions of interest
Pixel-based operations
Involve manipulating individual pixel values without considering neighboring pixels
Include point processing techniques such as thresholding, brightness adjustment, and contrast
enhancement
Utilize pixel-wise mathematical operations (addition, subtraction, multiplication) to modify image
intensities
Apply lookup tables (LUTs) for efficient implementation of complex pixel transformations
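A minimal sketch of the point operations just listed, using OpenCV and NumPy; the input path, threshold, brightness offset, and gamma value are assumptions.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

# thresholding: pixels above 128 become white, others black
_, binary = cv2.threshold(img, 128, 255, cv2.THRESH_BINARY)

# brightness adjustment: add a constant, clipping to the valid range
brighter = np.clip(img.astype(np.int16) + 40, 0, 255).astype(np.uint8)

# lookup table (LUT) implementing gamma correction with gamma = 0.5
lut = np.array([((i / 255.0) ** 0.5) * 255 for i in range(256)], dtype=np.uint8)
gamma_corrected = cv2.LUT(img, lut)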
Neighborhood operations
Process pixels based on the values of surrounding pixels in a defined neighborhood
Utilize spatial filters or masks to perform local operations on pixel groups
Include smoothing filters for noise reduction and sharpening filters for edge enhancement
Implement operations like median filtering, which replaces pixel values with the median of
neighboring pixels
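A short sketch of two common neighborhood operations in OpenCV; the 5x5 neighborhood size and the input path are assumptions.

import cv2

img = cv2.imread('photo.jpg')            # placeholder path

smoothed = cv2.blur(img, (5, 5))         # mean of each 5x5 neighborhood (noise reduction)
median = cv2.medianBlur(img, 5)          # median of each 5x5 neighborhood (salt-and-pepper noise)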
Histogram equalization
Redistributes pixel intensities to enhance overall image contrast
Spreads out the most frequent intensity values across the available range
Calculates cumulative distribution function (CDF) of pixel intensities
Applies a transformation function to map original intensities to new, equalized values
Particularly effective for images with poor contrast or limited dynamic range
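The CDF-based mapping described above can be sketched directly in NumPy (the OpenCV one-liner was shown earlier); this is an illustrative implementation for a uint8 grayscale array.

import numpy as np

def equalize(gray):
    # histogram of intensities 0..255
    hist = np.bincount(gray.ravel(), minlength=256)
    # cumulative distribution function (CDF)
    cdf = hist.cumsum()
    # normalise the CDF to [0, 1] and scale it to the output range
    cdf_norm = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    lut = (cdf_norm * 255).astype(np.uint8)
    # map every original intensity to its new, equalized value
    return lut[gray]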
Contrast stretching
Expands the range of intensity values in an image to utilize the full dynamic range
Identifies minimum and maximum intensity values in the original image
Applies a linear scaling function to map original intensities to the desired range
Can be performed globally or locally to enhance specific image regions
Useful for improving visibility of details in low-contrast images (satellite imagery)
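A minimal global contrast stretch, assuming a grayscale image whose minimum and maximum differ; the input path is a placeholder.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)

lo, hi = img.min(), img.max()
# linear scaling that maps [lo, hi] onto the full [0, 255] range
stretched = ((img - lo) / (hi - lo) * 255).astype(np.uint8)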
Nonlinear filters, by contrast, use more complex operations that cannot be expressed as a linear combination of neighboring pixel values
Convolution process
Fundamental operation in spatial filtering, involving sliding a kernel over an image
Multiplies kernel values with corresponding pixel intensities in the neighborhood
Sums the products to produce the output pixel value
Can be expressed mathematically as:
g(x, y) = Σ_{s=-a}^{a} Σ_{t=-b}^{b} h(s, t) f(x − s, y − t)
where g(x, y) is the output image, f(x, y) is the input image, and h(s, t) is the kernel
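In OpenCV the sliding-kernel operation can be sketched with cv2.filter2D; note that filter2D actually computes correlation, which equals convolution for symmetric kernels such as the 3x3 box kernel used here. The input path is an assumption.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

kernel = np.ones((3, 3), np.float32) / 9.0             # 3x3 box (averaging) kernel
output = cv2.filter2D(img, -1, kernel)                 # -1 keeps the input depth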
Gradient-based approaches
Detect edges by calculating the first-order derivatives of image intensity
Compute gradient magnitude and direction to identify edge strength and orientation
Common gradient operators include:
o Sobel operator: uses 3x3 kernels to approximate horizontal and vertical gradients
o Prewitt operator: similar to Sobel but with different kernel weights
o Roberts cross operator: uses 2x2 kernels for diagonal edge detection
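A short sketch of gradient computation with the Sobel operator; the input path is a placeholder.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)

gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)   # horizontal gradient
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)   # vertical gradient

magnitude = np.sqrt(gx**2 + gy**2)               # edge strength
direction = np.arctan2(gy, gx)                   # edge orientation in radians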
Laplacian operators
Utilize second-order derivatives to detect edges in all directions simultaneously
Identify edges as zero-crossings in the second derivative of image intensity
Laplacian operator is defined as: ∇²f = ∂²f/∂x² + ∂²f/∂y²
Often combined with Gaussian smoothing (Laplacian of Gaussian) to reduce noise sensitivity
Produce thin edges but are more sensitive to noise compared to gradient-based methods
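A sketch of Laplacian edge detection with Gaussian pre-smoothing (the Laplacian-of-Gaussian idea mentioned above); the kernel sizes and input path are assumptions.

import cv2

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE)

smoothed = cv2.GaussianBlur(img, (5, 5), 0)       # reduce noise sensitivity first
laplacian = cv2.Laplacian(smoothed, cv2.CV_64F, ksize=3)
edges = cv2.convertScaleAbs(laplacian)            # convert back to 8-bit for display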
Canny edge detector
Produces clean, thin edges with good localization and a low false-positive rate
Widely used in computer vision for its robust performance across various image types
Unsharp masking
Creates a sharpened image by subtracting a blurred version from the original
Process involves: blurring the image, subtracting the blurred copy from the original to obtain the unsharp mask, and adding a scaled version of that mask back to the original (see the sketch after this list)
Enhances edges and fine details while preserving overall image structure
Amount of sharpening controlled by scaling factor applied to the unsharp mask
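A minimal unsharp-masking sketch; the blur size and the sharpening amount of 1.5 are assumed values.

import cv2

img = cv2.imread('photo.jpg')                      # placeholder path
blurred = cv2.GaussianBlur(img, (9, 9), 0)         # blurred version of the image

amount = 1.5
# sharpened = original + amount * (original - blurred)
sharpened = cv2.addWeighted(img, 1 + amount, blurred, -amount, 0)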
High-boost filtering
Extends unsharp masking by amplifying high-frequency components more aggressively
General form: g(x, y) = A · f(x, y) − f_smooth(x, y)
where A > 1 is a boost factor, f(x, y) is the original image, and f_smooth(x, y) is the smoothed version
Allows for greater control over the degree of sharpening and overall image brightness
Useful for enhancing subtle details in low-contrast images (astronomical imagery)
Laplacian sharpening
Utilizes the Laplacian operator to enhance edges in all directions
Process involves: computing the Laplacian of the image and subtracting it from the original, g(x, y) = f(x, y) − c·∇²f(x, y), so that edges and fine detail are emphasized
Morphological operations
Morphological operations are nonlinear filters based on set theory and mathematical morphology
These techniques are powerful tools for analyzing and processing the shape and structure of
objects in binary and grayscale images
Understanding morphological operations is crucial for tasks such as noise removal, object
segmentation, and feature extraction in image data analysis
Dilation and erosion
Fundamental morphological operations that form the basis for more complex transformations
Dilation: expands object boundaries by setting each pixel to the maximum value under the structuring element, filling small holes and gaps
Erosion: shrinks object boundaries by setting each pixel to the minimum value under the structuring element, removing small protrusions and noise
Closing: dilation followed by erosion, which fills small holes and gaps while largely preserving object size and shape
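A brief sketch of these operations in OpenCV, assuming a binary input image at a placeholder path and a 3x3 square structuring element.

import cv2
import numpy as np

binary = cv2.imread('shapes.png', cv2.IMREAD_GRAYSCALE)    # placeholder path
kernel = np.ones((3, 3), np.uint8)                         # 3x3 structuring element

dilated = cv2.dilate(binary, kernel)                       # grows bright regions
eroded = cv2.erode(binary, kernel)                         # shrinks bright regions
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel) # dilation followed by erosion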
Hit-or-miss transform
Advanced morphological operation used for shape detection and feature extraction
Utilizes two structuring elements: one for foreground and one for background
Identifies locations where the foreground structuring element fits the object and the background
element fits the complement
Applications include: detecting specific shapes or patterns (corners, line endpoints) and supporting thinning and pruning steps in skeletonization
Computational efficiency
Spatial domain:
o Efficiency depends on the size of the image and the complexity of the operation
o Simple point operations are very fast
o Neighborhood operations can become slow for large kernels or images
Frequency domain:
o Initial transform (FFT) has complexity of O(N² log N) for an N×N image
o Many operations become simple multiplications in frequency domain
o Efficient for global operations and large-scale filtering
Hybrid approaches: combine both domains, for example applying small kernels directly in the spatial domain while handling large kernels through FFT-based multiplication
Application scenarios
Spatial domain applications:
Implementation considerations
Implementing spatial domain processing techniques requires careful consideration of various
factors to ensure efficient and effective image analysis
Understanding these considerations is crucial for developing robust image processing systems
and optimizing performance in real-world applications
Balancing algorithm complexity, hardware capabilities, and processing requirements is essential
for successful implementation of image analysis tasks
Algorithm complexity
Impacts processing time and resource requirements for spatial domain operations
Considerations include:
Optimization techniques:
Hardware acceleration
Utilizes specialized hardware to speed up spatial domain processing tasks
Graphics Processing Units (GPUs):
Medical imaging
Enhances diagnostic capabilities and supports medical research through image processing
Applications include:
Supports computer-aided diagnosis (CAD) systems for early detection of diseases (cancer)
Remote sensing
Processes satellite and aerial imagery for environmental monitoring and mapping
Key applications:
Combines spatial domain techniques with machine learning for advanced image understanding
and analysis
The Fourier transform is key to frequency domain analysis, decomposing images into sinusoidal
components. This allows for specialized processing techniques, including filtering, noise reduction, and
edge sharpening, which can be more efficient in the frequency domain than in the spatial domain.
2D Fourier transform
Extends 1D Fourier transform to two-dimensional image data
Computes frequency components along both horizontal and vertical directions
Produces a 2D frequency domain representation with u and v frequency coordinates
Enables analysis of directional patterns and textures in images
Facilitates operations like filtering and compression in the frequency domain
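A sketch of computing and viewing the 2D frequency-domain representation with NumPy; the input path is a placeholder, and the logarithmic scaling is only for display.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)

F = np.fft.fft2(img)               # 2D Fourier transform (u, v coordinates)
F_shifted = np.fft.fftshift(F)     # move the zero frequency to the centre
magnitude = 20 * np.log(np.abs(F_shifted) + 1)   # log scale so details are visible

spectrum = cv2.normalize(magnitude, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imshow('Magnitude spectrum', spectrum)
cv2.waitKey(0)
cv2.destroyAllWindows()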
Homomorphic filtering
Addresses non-uniform illumination issues in images by separating illumination and reflectance
components
Applies the logarithm to convert multiplicative illumination effects to additive components
Utilizes high-pass filtering in the frequency domain to reduce low-frequency illumination
variations
Enhances image contrast and normalizes brightness across the image
Inverse operation reconstructs the enhanced image with improved illumination characteristics
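A rough homomorphic-filtering sketch following the steps above (log transform, frequency-domain high-frequency emphasis, inverse transform, exponentiation); the filter parameters sigma, gamma_l, and gamma_h as well as the input path are assumptions.

import cv2
import numpy as np

img = cv2.imread('photo.jpg', cv2.IMREAD_GRAYSCALE).astype(np.float64)

# log transform turns multiplicative illumination x reflectance into a sum
log_img = np.log1p(img)

# Gaussian high-frequency emphasis filter built in the (centred) frequency domain
rows, cols = img.shape
u = np.arange(rows) - rows / 2
v = np.arange(cols) - cols / 2
V, U = np.meshgrid(v, u)
D2 = U**2 + V**2
sigma, gamma_l, gamma_h = 30.0, 0.5, 1.5        # assumed tuning parameters
H = (gamma_h - gamma_l) * (1 - np.exp(-D2 / (2 * sigma**2))) + gamma_l

# filter in the frequency domain and transform back
F = np.fft.fftshift(np.fft.fft2(log_img))
filtered = np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

# exponentiate to undo the log and rescale for display
result = np.expm1(filtered)
result = cv2.normalize(result, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)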
Convolution theorem
States that convolution in the spatial domain equals multiplication in the frequency domain
Expressed mathematically as: F{ f(x, y) ∗ h(x, y) } = F(u, v) H(u, v)
Simplifies filtering operations by replacing spatial convolution with frequency domain
multiplication
Enables efficient implementation of large convolution kernels
Particularly useful for operations involving large filters or repeated convolutions
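The theorem can be checked numerically on a small example: circular convolution computed in the spatial domain matches the inverse transform of the product of the two spectra. The array sizes and kernel below are arbitrary choices for illustration.

import numpy as np

f = np.random.rand(8, 8)              # a small test "image"
h = np.ones((3, 3)) / 9.0             # 3x3 averaging kernel

# zero-pad the kernel to the image size so both spectra have the same shape
h_pad = np.zeros_like(f)
h_pad[:3, :3] = h

# frequency domain: multiply the two spectra, then transform back
G = np.fft.fft2(f) * np.fft.fft2(h_pad)
g_freq = np.real(np.fft.ifft2(G))

# spatial domain: the equivalent circular convolution computed directly
g_spatial = np.zeros_like(f)
for s in range(3):
    for t in range(3):
        g_spatial += h[s, t] * np.roll(np.roll(f, s, axis=0), t, axis=1)

print(np.allclose(g_freq, g_spatial))   # True: both routes give the same result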
Implementation considerations
Practical implementation of frequency domain techniques requires careful consideration of
computational aspects
Efficient algorithms and software tools enable real-time processing of large image datasets
Understanding implementation details crucial for optimizing performance in image processing
applications
Ringing artifacts
Gibbs phenomenon causes oscillations near sharp discontinuities in frequency domain filtering
Results from truncation of high-frequency components in the Fourier series representation
Manifests as ripple-like patterns around edges in the processed image
Mitigated by using smooth transition filters (Gaussian) instead of ideal filters
Windowing techniques applied to reduce ringing artifacts in certain applications
Boundary effects
Periodic nature of DFT assumes image content repeats infinitely in all directions
Leads to artifacts at image boundaries when applying frequency domain operations
Discontinuities at image edges introduce high-frequency components in the spectrum
Mitigated by techniques like image padding, symmetric extension, or windowing
Careful handling of boundary conditions required for accurate frequency domain analysis
2. Image Restoration
Image degradation model:
Image restoration is the process of recovering an image that has been degraded, using some knowledge of the degradation function H and the additive noise term η(x, y). In restoration, the degradation is modelled and its inverse process is applied to recover the original image.
In the spatial domain the degradation model is:
g(x, y) = h(x, y) ∗ f(x, y) + η(x, y)
Terminology:
g(x, y) = degraded image
f(x, y) = input or original image
h(x, y) = degradation function
η(x, y) = additive noise
In the frequency domain, after taking the Fourier transform of the above equation:
G(u, v) = H(u, v) F(u, v) + N(u, v)
For restoration (inverse filtering):
F̂(u, v) = G(u, v) / H(u, v)
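A toy sketch of this model and of naive inverse filtering, using a synthetic image and a known Gaussian blur as the degradation function; all sizes, the blur width, the noise level, and the small epsilon that guards the division are assumptions.

import numpy as np

# synthetic original image: a bright square on a dark background
f = np.zeros((64, 64))
f[24:40, 24:40] = 1.0

# Gaussian blur kernel (point spread function) as the degradation h(x, y)
x = np.arange(64) - 32
X, Y = np.meshgrid(x, x)
h = np.exp(-(X**2 + Y**2) / (2 * 2.0**2))
h /= h.sum()

# degradation in the frequency domain: G = H F + N
F = np.fft.fft2(f)
H = np.fft.fft2(np.fft.ifftshift(h))
N = np.fft.fft2(0.001 * np.random.randn(64, 64))
G = H * F + N

# naive inverse filter: F_hat = G / H (epsilon avoids division by values near zero)
F_hat = G / (H + 1e-3)
f_hat = np.real(np.fft.ifft2(F_hat))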
A digital image often contains noise. Noise introduces erroneous pixel values. Image
filtering is the process of removing these errors. Convolving a noisy image with an
appropriate kernel can substantially suppress the noise.
Let us have a look at the different image filtering methods in the subsequent paragraphs. For all the filters below, let the kernel size be K_height × K_width.
G(x) = (1 / (σ√(2π))) · e^(−x² / (2σ²)) ————————————— eq(1)
In this Gaussian distribution, the values near the reference point are the most significant. The same Gaussian distribution is achieved in a 2-dimensional kernel with the reference point at the matrix center. As opposed to the Normalized Box filter,
which gives equal weight to all neighboring pixels, a Gaussian kernel gives more
weight to pixels near the current pixel and much lesser weight to distant pixels when
calculating sum.
Bilateral Filter:
While Gaussian filter gives us more control and accurate results than box filter, both
the filters err when taking weighted sum of edge pixels. Both of the filters smoothen
the edge pixels, thereby diminishing the intensity variant. Bilateral filter overcomes
this shortcoming.
While the Gaussian filter assigns weight to neighbors based only on their distance from the current pixel, the bilateral filter brings intensity into the picture. The shape of the Gaussian function is determined solely by σ, the standard deviation. Let us assume the following symbols:
α = the current (reference) pixel, with intensity I(α)
αn = a neighbor pixel, with intensity I(αn)
σd = the standard deviation in space (distance)
σr = the standard deviation in intensity (range)
w(αn) = e^(−‖αn − α‖² / (2σd²)) ————————————— eq(2)
We can again observe from the above equation that the weight given to I(αn), the intensity at a neighbor pixel αn, simply depends on its Euclidean distance from the current pixel α. The closer a neighbor pixel is to the reference (current) pixel, the more weight it gets.
Let's see what happens when we change the above equation to the one below:
w(αn) = e^(−‖αn − α‖² / (2σd²)) · e^(−(I(αn) − I(α))² / (2σr²)) ————————————— eq(3)
Now the weight given to I(αn) – intensity at a neighbor pixel αn also depends on its
intensity difference when compared with the intensity at pixel α – I(α). The closer a
neighbor pixel is to reference (current) pixel, the more weight it gets. But
additionally, it also needs to be close in terms of intensity to reference pixel. Even if
a neighbor pixel is close but differs a lot when intensities are compared to reference
pixel, it will be given much less weight.
Now let's see what happens when, during convolution, we are at an edge pixel:
Eq(2) gives the same weight to pixel value at the immediate left of edge pixel as the
weight given to immediate right. This weight distribution will blur the difference
made apparent by this edge pixel.
Eq(3) however, observes that there is a considerable difference in intensities for the
pixel at immediate left of edge pixel. As we know the behavior of Gaussian, the
larger the difference from reference point, the smaller the weight it generates. Here
the reference point is the intensity at edge pixel. Hence it gives more weight to pixel
at right and much lesser weight to pixel at left in comparison. Hence the edge is
preserved.
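In OpenCV the bilateral filter is available directly; the neighborhood diameter of 9 and the two sigma values of 75 are commonly quoted defaults, used here as assumptions.

import cv2

img = cv2.imread('photo.jpg')                     # placeholder path
# d = 9, sigmaColor = 75 (intensity), sigmaSpace = 75 (distance)
filtered = cv2.bilateralFilter(img, 9, 75, 75)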
Analysis
Let us analyze each of the above three algorithms as image filtering operations via a
test case.
Problem Statement:
When we think of filtering in images, we always define image features that we want
to modify and those that we want to preserve. Let us observe the following original
image with no noise and the surface plot of one of its patches.
Following are the observations:
1. edge pixels that form the boundary between bars and background can easily
be labeled by detecting pixels whose intensity difference with immediate
neighboring pixel exceeds δ.
2. peak pixels are ensured if we strictly detect that consecutive two adjacent
pixels on either side have a decreasing trend in intensity. Namely if p1, p2, p3,
p4, p5 are consecutive pixels horizontally, ensuring I(p2) < I(p3) & I(p1) <
I(p2) & I(p4) < I(p3) & I(p5) < I(p4) should strictly ensure p3 is a peak pixel.
Applying the above two constraints on the given image, the following feature image
is generated.
Red pixels are edge pixels with horizontal intensity difference > δ
Green pixels are edge pixel with vertical intensity difference > δ
Blue pixels are peak pixels
Entrance of the Enemy:
Let us introduce some reality into our ideal world till now – noise. Let us model the
real world noise as normally distributed in accordance with the Central Limit
Theorem. We are using the OpenCV function cv2.randn to generate this noise (a short sketch follows the observations below).
When added to the above original image, it has the following effect:
The surface is not smooth anymore and has numerous distorted spikes
Peaks seem to be the most affected by noise
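A minimal sketch of generating this noise, assuming a grayscale uint8 image at a placeholder path and a standard deviation of 20.

import cv2
import numpy as np

img = cv2.imread('original.png', cv2.IMREAD_GRAYSCALE)   # placeholder path

# fill a float array with normally distributed values (mean 0, sigma 20)
noise = np.zeros(img.shape, np.float64)
cv2.randn(noise, 0, 20)

# add the noise and clip back to the valid 8-bit range
noisy = np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)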
Performance Measure:
We want to measure how closely we have retrieved the original image features after
applying a noise removal algorithm. If after applying the same strict constraints on a
filtered image, red, blue or green pixels are detected, we want to measure how many
of them have preserved their original position respectively. True positives and false
positives are indicators of that. Equally, we also want to measure how much of our
expectations are met. Below are the four performance metrics for peak pixels (blue):
Positive indicators:
true positives: in percentage of all the blue pixels detected, how many were
at their original (correct) location
expectations met: in percentage from all the blue pixels from original
image, how many were detected in the current image
Negative indicators:
false positives: complementary of true positives, in percentage of all the
blue pixels detected, how many were at different (false) location
expectations failed: complementary of expectations met, in percentage
from all the blue pixels from original image, how many were not detected in
the current image
Similar metrics apply for red and green pixels.
The box filter algorithm only exposes one parameter that we can control – the kernel
dimension which can only be odd numbered. Keeping the kernel size as 3*3 has very
little effect on the image. We try with kernel size of 5, which most literature
recommends. We will find out the reason soon.
Peak pixels are now truly positive. 99.57% of all peaks detected are indeed true
peaks. But the reality is only 29.80% of expected true peaks are detected. More
drastic is the effect on edge pixels. Not a single red or green pixel is detected, which means an expectation failure of 100%.
Figure 4 clearly explains the reason. Though the surface has smoothed and noise has
been virtually eliminated, taking the average of neighbor pixels has resulted in peaks
becoming flatter and sharp edges demonstrating slanting behavior.
Increasing the kernel size increases the computational cost while further flattening
out peaks and diminishing edges.
Overcoming the shortcoming of the box filter, the Gaussian filter distributes weight among its neighbor pixels depending on the parameter σd, the standard deviation in space. Keeping the kernel size the same at 5×5 and varying σd, we achieve the best result with a standard deviation of 1.
Gaussian Filtering
Gaussian filtering is a linear smoothing technique based on applying a Gaussian function to
the image. This technique is widely used in image processing to blur an image and reduce
the effect of noise. In OpenCV, we can apply Gaussian filtering using
the cv2.GaussianBlur() function.
import cv2

image = cv2.imread('image.jpg')                    # placeholder path
blurred = cv2.GaussianBlur(image, (5, 5), 0)       # 5x5 kernel, sigma derived automatically
In the code above, we read an image using the cv2.imread() function. Then we apply
Gaussian filtering with a kernel size of (5, 5), which determines the amount of smoothing.
The larger kernel size results in more blurring. The last argument is the standard deviation,
which controls the spread of the Gaussian distribution. A value of 0 indicates that OpenCV
should automatically calculate it based on the kernel size.
Median Filtering
Median filtering is a non-linear noise reduction technique that replaces each pixel's value
with the median value of its neighborhood. This technique is particularly useful for removing
salt-and-pepper type of noise. OpenCV provides the cv2.medianBlur() function for applying
median filtering.
import cv2

image = cv2.imread('image.jpg')          # placeholder path
median = cv2.medianBlur(image, 5)        # 5x5 median filter
In the code snippet above, we read an image using the cv2.imread() function and then apply
median filtering with a kernel size of 5. Similar to Gaussian filtering, higher kernel sizes
result in stronger noise reduction but may also blur the image's details.