
Image Processing

Clara Gonçalves
March 2024

Abstract

1 What is an image?
An image can be represented by a matrix whose values represent the intensity of the color at a certain point (also called a pixel).
If we think of a grayscale image as a function, then f(x, y) is the intensity of the pixel at a given point in the image space.

2 Image Transformation and Filtering


This essay will look into some image transformations and ways of filtering an image: point processing, linear filtering, sampling and aliasing, image derivatives, and edge detection.

2.1 Point Processing


Point processing is one of the simplest kinds of processing in images because it only requires one point.
Seeing an image as a function helps to understand this part: point processing is the change of each point in the image function (or image matrix) by a given transformation.
That allows us to change the image brightness by adding or subtracting a value to the image function, to invert the colors of the image by inverting the function, and also to change the contrast of the image by multiplying or dividing the function by a given value.

An image processing operator is a function that takes one or more input images and returns an image with those operations applied. Note that usually each function has two inputs: the pixel value and its location.

g(x) = h(f(x)) → g(i, j) = h(f(i, j)), with x = (i, j)    (1)

A common function used in point processing is g(x) = a f(x) + b; it allows changing the contrast (a f(x)) and the brightness (b).
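As a minimal sketch (assuming an 8-bit grayscale image stored as a NumPy array; the function names are illustrative), the point operations above can be written as:

import numpy as np

def adjust_contrast_brightness(image, a=1.2, b=20):
    # g(x) = a * f(x) + b : a scales the contrast, b shifts the brightness
    out = a * image.astype(np.float32) + b
    return np.clip(out, 0, 255).astype(np.uint8)

def invert(image):
    # Color inversion of an 8-bit image: g(x) = 255 - f(x)
    return 255 - image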

2.2 Gamma Correction


Gamma correction is a non-linear function that is used to remove the non-linear mapping between input radiance and quantized pixel values.

V_{out} = A V_{in}^{\gamma}, with a gamma value usually of γ ≈ 2.2    (2)


This equation is an approximation that should work for most tasks.

Increasing the gamma value leads to a darker image, and decreasing the gamma value leads to a lighter image.
The frequency spectrum is symmetric about the amplitude (vertical) axis, so it averages to zero.
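A minimal sketch of the correction above, assuming an 8-bit image that is normalized to [0, 1] before the power law is applied:

import numpy as np

def gamma_correct(image, gamma=2.2, A=1.0):
    # V_out = A * V_in ** gamma, applied to a [0, 1] normalized image
    v = image.astype(np.float32) / 255.0
    out = A * np.power(v, gamma)
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)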

2.3 Histogram Equalization


Histogram equalization is essentially a process that improves the contrast of an image by adapting its original histogram, which allows the image to show more detail.
The equalized histogram spreads the values over the whole intensity range of the image, and so the cumulative histogram becomes approximately linear.

The intensity histogram represents the pixel intensities throughout an image. The distribution of pixel intensity in an image, in other words the probability that a value r is present in a given pixel, is the following:

p(r) = \frac{\text{number of pixels with intensity } r}{\text{total number of pixels}}    (3)

This is important because when you calculate p(r) for the different values of r and plot them, you get the normalized histogram or probability density function (PDF) of the image intensity values. This histogram provides information about the distribution of intensity in the image.
A high-contrast image is one whose histogram is uniformly distributed over the whole intensity range. With this distribution, the image is more detailed.
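A minimal sketch of global histogram equalization built directly on p(r) and its cumulative sum (assuming an 8-bit grayscale NumPy array; the function name is illustrative):

import numpy as np

def equalize_histogram(gray):
    hist = np.bincount(gray.ravel(), minlength=256)  # number of pixels with intensity r
    p = hist / gray.size                             # normalized histogram p(r)
    cdf = np.cumsum(p)                               # cumulative histogram
    lut = np.round(255 * cdf).astype(np.uint8)       # map old intensities to new ones
    return lut[gray]                                 # apply the lookup table pixel-wise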

2.3.1 Local Histogram processing


Local Histogram Processing, also known as Local Histogram Equalization, divides an image into small regions (windows or neighborhoods) and then applies histogram equalization to each of those neighborhoods.
1. Dividing the image — The image is divided into overlapping or non-overlapping regions.
2. Local histogram calculation — For each region, a local histogram is calculated based on the pixel intensities within that region.
3. Histogram Equalization — Histogram equalization is then applied to each local histogram. This redistributes the intensity values in each region, enhancing the contrast.
4. Combining Results — The result obtained in each region is then joined to form the final enhanced image.
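In practice, a widely used variant of this idea is OpenCV's CLAHE (Contrast Limited Adaptive Histogram Equalization), which adds a clip limit to the per-tile histograms to avoid amplifying noise. A minimal sketch (the file names are placeholders):

import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)         # placeholder input image
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))  # 8x8 grid of local regions
local_eq = clahe.apply(gray)                                 # equalize each tile, blend borders
cv2.imwrite("local_eq.png", local_eq)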

3 Filtering of an Image
Filtering an image starts with an image and a filter. We apply the filter to the given image, and the filter performs mathematical operations on the image, modifying the pixel values. The main goal of this process is to emphasize or suppress some features of the image. In the end, this allows retrieving information from the original image.
In the context of image processing, we have different types of filters that retrieve different information:

• Noise Reduction Filters : Also called smoothing filters (blurring), they are used, as the name implies, to reduce the noise of an image and create a smoother version of it. Examples of filters that do this are the Gaussian filter and the median filter.
• Sharpening Filters : These filters are used to enhance the details of an image, for example by enhancing the edges. To get this result, we can use Laplacian or Sobel filters.

• Edge detection filters : The edge detection filters, as the name suggests, apply a filter to an image in such a way that only the edges are visible. One example is the Canny edge detector.
• Frequency Domain Filters : Filters applied in the frequency domain, for example via the Fourier transform, can be used to apply a low-pass or high-pass filter to an image.

Filtering an image involves manipulating the pixel values of the original image in a way that enhances or suppresses some features of the image. The manipulation of the pixel values is based on some function (which depends on the filter being applied) evaluated on a local neighborhood of each pixel.

3.1 Canonical Image Processing Problems


When processing an image we can encounter various types of problems. The most fundamental problems in image processing are the following:
1. Image Restoration

Denoising : The process of removing noise from an image, where noise refers to unwanted random variations in brightness or color.
Deblurring : Addressing blurred images, which can result from factors like motion during image capture or limitations of the imaging system.

2. Image Compression
This involves reducing the file size of an image while maintaining the quality of the image. The
listed standards (JPEG, HEIF, MPEG) are examples of widely used image compression algo-
rithms.
3. Computing Field Properties
Optical Flow : Analysing the motion of objects across a sequence of images. It helps understand how pixels move from one frame to another.
Disparity : Disparity in the context of image processing is all about figuring out how far away different objects are in a pair of images. Imagine you take two pictures of the same scene from slightly different angles, like your left and right eye might see a scene. Disparity helps you understand the differences between these two images.
So, when you’re talking about disparity in image processing, you’re essentially looking at the variations in the apparent position of objects between those two images. It’s like understanding the ”shift” or ”offset” of objects as seen from different viewpoints. This information is useful in tasks like creating 3D models of a scene or understanding the depth in a stereoscopic image.

4. Local Structural Features

Corners : Identifying points in an image where the intensity values change drastically in multiple directions. Corners are used for tracking and detecting objects.
Edges : Detecting boundaries or abrupt transitions in intensity within an image. Edge detection is crucial for multiple applications in computer vision.

3.2 Noise reduction
Noise reduction, or denoising, as mentioned above, is the capability of reducing noise in an image.
One of the ways this can be done is temporal averaging: taking many images of the same scene and averaging the intensity values across them.
Another way is to use linear filtering, in which you modify each pixel of an image based on a given function of the local neighborhood of that pixel.

There are two simple examples of linear filtering:

Cross-correlation and convolution. In each of these, we replace each pixel with a linear combination (weighted sum) of the intensity values of its neighboring pixels.
This matrix (or tensor) with the weights of the linear combination can be called a kernel, mask, or filter.

3.3 Cross-Correlation
Cross-correlation in image processing is a technique used to measure the similarity between two signals
or images by sliding one over the other and computing the sum of products at each position, providing
information about spatial relationships and pattern matching.
With F representing the image, H the kernel of size (2k + 1) × (2k + 1), and G the output image, we can describe this operation by the following equation:

G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v] F[i + u, j + v]    (4)

H[u, v] is the prescription for the weights in the linear combination. The operation can also be denoted as

G = H ⊗ F    (5)

Cross-correlation is neither associative nor commutative.
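A minimal sketch of equation (4), assuming a grayscale NumPy array and an odd-sized kernel, with the border handled by replicating edge values:

import numpy as np

def cross_correlate(F, H):
    # G[i, j] = sum over u, v of H[u, v] * F[i + u, j + v]
    k = H.shape[0] // 2
    Fp = np.pad(F.astype(np.float32), k, mode="edge")   # replicate border pixels
    G = np.zeros(F.shape, dtype=np.float32)
    for u in range(-k, k + 1):
        for v in range(-k, k + 1):
            G += H[u + k, v + k] * Fp[k + u:k + u + F.shape[0], k + v:k + v + F.shape[1]]
    return G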

3.4 Convolution
Convolution is very similar to cross-correlation; the difference lies in the handling of the kernel during the operation. In convolution the kernel is ”flipped” both vertically and horizontally.
Taking that into account, we can describe the convolution equation as:

G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} H[u, v] F[i − u, j − v]    (6)

Unlike cross-correlation, convolution is associative and commutative, which in a way turns convolution into a multiplication-like operation with all of its properties:

• commutative
• associative
• distributes over addition
• scalars factor out
• identity: use with unit impulse
This also means that we can apply multiple layers of convolution calculations and not worry about the
order in which they are applied.
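A small check of the relation between the two operations (convolution equals cross-correlation with the kernel flipped in both axes), using SciPy and an arbitrary asymmetric kernel:

import numpy as np
from scipy import ndimage

F = np.random.rand(64, 64).astype(np.float32)                   # placeholder image
H = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], np.float32)  # asymmetric (Sobel-like) kernel

conv = ndimage.convolve(F, H, mode="nearest")
corr = ndimage.correlate(F, H[::-1, ::-1], mode="nearest")      # correlate with flipped kernel
assert np.allclose(conv, corr)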

3.5 Convolutional Networks


Convolutional Neural Networks (CNNs) are a class of deep learning models designed for processing structured grid data, such as images. CNNs employ convolutional layers to automatically learn and extract hierarchical features from input data. These networks are particularly effective in image recognition tasks, capturing spatial relationships and patterns through convolutional filters, and often include pooling layers1 to reduce dimensionality.
In these networks, different types of features are learnt in different layers, for example:

• Low-level feature : lines, oriented-edges


• Mid-level features : Combined edges: curves and shapes
• High-level features : Combined shapes: objects, scenes
• Predictor : Process features and predict output

With these, CNNs have proven successful in various computer vision applications, including image classification, object detection, and semantic segmentation.

3.6 Handling boundaries


In practical image processing, when applying a filter (like convolution or correlation) near the edges of the image, padding methods are needed to handle the filter window extending beyond the image boundaries. Common padding methods include clipping with black (zero) pixels, wrapping around the image, copying the edge values, or reflecting values across the edge. Additionally, the size of the output depends on the chosen convolution mode:
• full results in an output size equal to the sum of the sizes of the image and the filter, minus one
• same preserves the input size
• valid produces an output size equal to the difference between the sizes of the image and the filter, plus one.
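The three modes can be checked with SciPy's 2D convolution (the sizes below assume a 100 × 100 image and a 5 × 5 box kernel):

import numpy as np
from scipy.signal import convolve2d

image = np.ones((100, 100), np.float32)          # placeholder image
kernel = np.ones((5, 5), np.float32) / 25.0      # 5x5 box kernel

print(convolve2d(image, kernel, mode="full").shape)   # (104, 104): M + L - 1
print(convolve2d(image, kernel, mode="same").shape)   # (100, 100): input size preserved
print(convolve2d(image, kernel, mode="valid").shape)  # (96, 96):  M - L + 1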

In Convolutional Neural Networks (CNNs), padding and stride are two important concepts associated with the convolutional layers:
Padding :
1 Pooling layers in neural networks, often used in conjunction with convolutional layers, downsample the spatial dimensions
of feature maps, reducing their resolution and retaining essential information for efficient processing.

• Definition: Padding refers to the addition of extra pixels (usually zero-valued) around the input image
or feature map before applying the convolution operation.
• Purpose: Padding helps to retain spatial information at the edges of the image and mitigates the
problem of shrinking feature maps as they pass through convolutional layers.

• Types: Common padding types include zero-padding (adding zeros around the image), valid padding
(no padding), and reflective or symmetric padding.
Stride:
• Definition: Stride is the step size with which the convolutional filter moves across the input image or
feature map.

• Purpose: A larger stride reduces the spatial dimensions of the output feature map, effectively downsampling it. This can be useful for reducing computational complexity and memory requirements.
• Impact: Smaller strides provide higher spatial resolution in the output, but may increase computational cost. Larger strides result in more aggressive downsampling.

In summary, padding and stride are parameters that influence the spatial dimensions of the feature maps produced by convolutional layers in CNNs. Padding helps maintain information at the edges, while stride controls the step size of the filter, affecting the downsampling or upsampling of the feature maps. These concepts play a crucial role in determining the architecture and performance of a CNN in various computer vision tasks.

3.7 Mean Filtering


Mean filtering, also known as moving average or box filtering, is a simple image processing technique used
for smoothing or blurring an image. The fundamental idea behind mean filtering is to replace the value of
each pixel with the average of the pixel values in its neighborhood.
Local Neighborhood:
For each pixel in the image, a local neighborhood is defined. The size of this neighborhood is determined
by a parameter often referred to as the filter size or kernel size.
Averaging Operation:
The pixel value at the center of the neighborhood is then replaced by the average (mean) of all the pixel
values within that neighborhood.
Smoothing Effect:
This process has a blurring or smoothing effect on the image because high-frequency variations (like
noise) tend to get averaged out, and the overall image becomes more uniform.
Application:
Mean filtering is commonly used for noise reduction, especially salt-and-pepper noise or other random
variations in pixel intensity.
Mathematically, for a pixel at position (i, j) in the image, the new value after mean filtering (F′) can be calculated as follows:

F'(i, j) = \frac{1}{(2k+1)^2} \sum_{u=-k}^{k} \sum_{v=-k}^{k} F(i + u, j + v)

Here, F(i, j) represents the original pixel value, and the filter/kernel has size (2k + 1) × (2k + 1).
Mean filtering is a straightforward and computationally efficient method for basic image smoothing, but
it may not be suitable for preserving fine details or edges in the image. More advanced filtering techniques,
such as Gaussian filtering, are often employed when a smoother yet more visually appealing result is desired.
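A minimal sketch of the box filter using SciPy's uniform filter, which implements exactly this neighborhood averaging (the image is a placeholder):

import numpy as np
from scipy import ndimage

image = np.random.rand(128, 128).astype(np.float32)   # placeholder image
smoothed = ndimage.uniform_filter(image, size=5)      # 5x5 mean / box filter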

3.8 Separable filters
Separable filters are a type of filter in image processing that can be decomposed into two one-dimensional
filters, often applied in succession along different axes (horizontal and vertical). This decomposition simplifies
the computation and makes the overall filtering process more efficient.

1. Filter Decomposition: A separable filter can be expressed as the outer product of two vectors: one for the row and one for the column. Mathematically, if F is a 2D filter, it can be expressed as F = A B^T, where A is a vector representing the horizontal filter, and B is a vector representing the vertical filter.

2. Separable Convolution: Instead of applying the 2D filter directly to the image, the separable
approach involves applying the horizontal filter along the rows and then applying the vertical filter
along the columns (or vice versa). This is computationally more efficient than applying the full 2D
filter in a single step.
3. Efficiency Benefits: The main advantage of separable filters lies in the reduced computational
complexity. Convolution with a separable filter takes fewer operations compared to the equivalent
non-separable filter. This can lead to significant speed improvements in image processing algorithms,
especially for large images and filters.
4. Examples: Common examples of separable filters include Gaussian filters and Sobel filters. These
filters can be decomposed into horizontal and vertical components, making them separable.

5. Implementation: When implementing separable filters, the filter kernel is often factored into two 1D
vectors. The image is convolved first with the horizontal vector, and then the result is convolved with
the vertical vector. This reduces the overall computational cost, especially for larger filter sizes.

Separable filters are widely used in real-time image processing applications, computer vision, and other
domains where efficiency is crucial. They offer a balance between computational savings and filter expres-
siveness, making them a valuable tool in image filtering algorithms.
If the image has M × N pixels and the filter kernel has size L × L:
• What is the cost of convolution with a non-separable filter? L^2 × M × N
• What is the cost of convolution with a separable filter? 2 × (L × M × N)
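A minimal sketch checking the equivalence, with a 1D binomial (Gaussian-like) kernel whose outer product gives the corresponding 2D kernel (the image and kernel are placeholders):

import numpy as np
from scipy import ndimage

image = np.random.rand(256, 256).astype(np.float32)   # placeholder image
g = np.array([1, 4, 6, 4, 1], np.float32) / 16.0      # 1D kernel (L multiplications per pixel)

# Separable path: filter the rows, then the columns  -> 2 * L * M * N multiplications
tmp = ndimage.convolve1d(image, g, axis=1, mode="nearest")
sep = ndimage.convolve1d(tmp, g, axis=0, mode="nearest")

# Non-separable path: the full 2D kernel A B^T       -> L^2 * M * N multiplications
full = ndimage.convolve(image, np.outer(g, g), mode="nearest")
assert np.allclose(sep, full, atol=1e-5)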

3.9 Gaussian filter


The Gaussian filter applies to an image a kernel based on the Gaussian function. When this kernel is convolved with the image, each pixel in the image is replaced with a weighted average of its neighboring pixels, with the weights determined by the Gaussian distribution.
The equation of the Gaussian filter is the following:

h[m, n] = \sum_{k,l} g[k, l] f[m + k, n + l]    (7)

This process acts as a low-pass filter: it effectively reduces high-frequency noise and smooths transitions between different image regions. The result of applying the Gaussian filter to the image is a smoothed or blurred version of it.
Selecting an appropriate value for σ is crucial for achieving the desired level of smoothing without overly blurring the image or losing important details.
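A minimal sketch, where σ is the only parameter that has to be tuned (the image is a placeholder):

import numpy as np
from scipy import ndimage

image = np.random.rand(256, 256).astype(np.float32)    # placeholder image
blurred = ndimage.gaussian_filter(image, sigma=2.0)    # larger sigma => stronger smoothing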
The Gaussian filter, while highly effective for smoothing and blurring images, has some limitations:

• Loss of detail - Since the Gaussian filter applies blur to the image, it can result in a loss of fine details (sharp edges or small features).
• Border effects - When applying the Gaussian filter near the edges of an image, the pixels outside the image boundary are typically ignored or handled in a special way (e.g., by zero-padding or by mirroring). This can lead to artifacts or border effects in the filtered image, such as halos or ringing around edges.
• Computationally Intensive - It can be computationally demanding to convolve with a Gaussian kernel with a large standard deviation. However, the separability of the Gaussian kernel can help with this.
• Parameter Sensitivity - The performance of the Gaussian filter can be sensitive to the choice of
parameters, particularly the standard deviation (σ).
• Linear Nature - Gaussian filtering is linear, meaning that it treats all image regions equally regardless
of their content.
• Trade-off Between Noise Reduction and Detail Preservation - There is often a trade-off be-
tween noise reduction and detail preservation when applying Gaussian filtering. Increasing the standard
deviation (σ) leads to stronger smoothing and noise reduction but may also result in more loss of detail.

3.10 Mean (or box filtering) vs Gaussian filtering


Mean filter and Gaussian filter are both commonly used techniques for image smoothing in image processing,
but they have different characteristics and applications.

• Filter Principle
Mean Filter: Replaces each pixel value with the average of the values in the pixel's neighborhood. Computes a simple average, giving every pixel the same weight.
Gaussian Filter: Replaces each pixel value with a weighted2 average of its neighborhood using the Gaussian kernel.
• Smoothing Effect
Mean Filter: Performs uniform smoothing and blurring throughout the image. The intensity of the blurring is determined by the size of the box kernel.
Gaussian Filter: Performs weighted smoothing and blurring. The amount of smoothing can be controlled by adjusting the standard deviation (σ) of the Gaussian kernel.
• Edge Preservation
Mean Filter: Blurs edges
Gaussian Filter: Edges are not as blurred
• Computational Complexity
Mean Filter: simple computation
Gaussian Filter: more complex computation
2 The weights are determined by the Gaussian distribution, with more weight given to nearby pixels and less weight to distant
pixels.

• Parameter Sensitivity
Mean Filter: few parameters to adjust
Gaussian Filter: a small change in the parameters can result in a lot more blurring.

While both Mean and Gaussian filter are used for image blurring, the Gaussian filter tends to preserve
more details in an image and is also better at removing noise.

3.11 Sharpen filter


The sharpen filter is a filter that enhances the image by sharpening the edges and enhancing the details. It does so by emphasizing the high frequencies (it acts as a high-pass filter).
There are several methods for sharpening images, but one common approach is to use a technique called
”unsharp masking” (USM). The basic idea behind unsharp masking is to create a mask that highlights the
differences between the original image and a blurred version of the image. This mask is then added back to
the original image to enhance its sharpness.
Here’s a simplified explanation of how unsharp masking works:
1. Blur the image - First, we generate a blurred image using a smoothing filter (e.g. a Gaussian filter), which represents the low-frequency components of the image.

2. Subtract the blurred image - To get the high-frequency component of the image, we subtract the blurred image from the original image.
3. Enhance the original image - Finally, we take the original image and the high-frequency image and add them together.
Mathematically, the process can be represented as follows:

Sharpened Image = Original Image + Amount × (Original Image − Blurred Image)    (8)

Where:
• ”Original Image” is the input image.
• ”Blurred Image” is the result of applying a smoothing filter to the input image.
• ”Amount” is a parameter that controls the strength of the sharpening effect.
It’s worth noting that while sharpening filters can enhance image details, they can also amplify noise and
artifacts in the image.
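A minimal sketch of equation (8), assuming an 8-bit grayscale NumPy array and a Gaussian blur as the smoothing step:

import numpy as np
from scipy import ndimage

def unsharp_mask(image, sigma=2.0, amount=1.0):
    # Sharpened = Original + Amount * (Original - Blurred)
    img = image.astype(np.float32)
    blurred = ndimage.gaussian_filter(img, sigma=sigma)
    sharpened = img + amount * (img - blurred)
    return np.clip(sharpened, 0, 255).astype(np.uint8)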

3.12 Filters: Thresholding


Thresholding is a fundamental technique in image processing used to segment an image into regions based on
pixel intensity values. It involves comparing each pixel’s intensity value to a threshold value and classifying
the pixel as belonging to one of two categories:
g(m, n) = \begin{cases} 255, & \text{if } f(m, n) > A \\ 0, & \text{otherwise} \end{cases}, \quad \text{where } A \text{ is the threshold value}    (9)

There are two main thresholding types:
Global Thresholding: A single threshold value is applied uniformly to the entire image.
Local or Adaptive Thresholding: Different threshold values may be applied to different regions of the image, allowing for more flexibility, especially in images with non-uniform illumination or varying contrast.
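A minimal sketch of both types, using NumPy for the global case and OpenCV's adaptive threshold for the local case (the file name and parameter values are placeholders):

import numpy as np
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # placeholder input image

# Global threshold: g = 255 where f > A, 0 otherwise
A = 128
binary = np.where(gray > A, 255, 0).astype(np.uint8)

# Adaptive (local) threshold: the threshold follows the local neighborhood mean
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                 cv2.THRESH_BINARY, 11, 2)   # 11x11 blocks, offset 2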

3.13 Non-Linear filters


Non-linear filters are a class of image processing filters that do not rely on linear operations like convolution.
Unlike linear filters such as Gaussian or mean filters, which apply a weighted average to neighboring pixel
values, non-linear filters use more complex operations that consider the local pixel neighborhood in a non-
linear manner.

3.14 The bilateral filter
Bilateral filtering is a non-linear, edge-preserving, and smoothing filter used in image processing to reduce
noise while preserving edges in the image. It differs from linear smoothing filters like the Gaussian filter in
that it considers both spatial proximity and intensity similarity when performing the filtering operation.
The bilateral filter operates by applying a weighted average to each pixel in the image, where the weights
are determined by both spatial distance and intensity difference between neighboring pixels. Pixels with
similar intensities and spatial proximity are given higher weights, while pixels with larger intensity differences
or greater spatial distances are given lower weights.
The formula for bilateral filtering can be expressed as:
h[m, n] = \frac{1}{W_{mn}} \sum_{k,l} g[k, l] \, r_{mn}[k, l] \, f[m + k, n + l]    (10)

Where:
• Normalization factor: 1 / W_{mn}
• Spatial weighting: g[k, l]
• Intensity range weighting: r_{mn}[k, l]
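OpenCV provides this filter directly; a minimal sketch (the file name is a placeholder, and the parameter values are just typical starting points):

import cv2

image = cv2.imread("input.png")                  # placeholder input image
# 9: neighborhood diameter; 75: sigma for the intensity range weighting; 75: sigma for the spatial weighting
smoothed = cv2.bilateralFilter(image, 9, 75, 75)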

4 Sampling and Aliasing
4.1 Sampling an image
Sampling an image involves capturing or representing it in a digital form. This process can include resizing,
sub-sampling, undersampling, and upsampling.
Image Resizing Image resizing refers to changing the dimensions of an image. This can involve either
increasing (upsampling) or decreasing (downsampling) its size while trying to preserve its visual quality.
Undersampling Undersampling is a specific form of sub-sampling where the image is reduced in size
by capturing fewer samples than necessary. It can lead to aliasing artifacts if not properly handled.
Upsampling Upsampling, also known as interpolation, is the process of increasing the resolution of
an image by adding new pixels between existing ones. Various interpolation techniques, such as nearest-
neighbor, bilinear, or bicubic interpolation, can be used to estimate the values of the new pixels.
Image Sub-sampling Image sub-sampling involves reducing the resolution of an image by keeping only
a subset of its original pixels. This is typically done by discarding alternate rows and columns of pixels,
resulting in a smaller image.
These techniques are fundamental in image processing and are used in various applications such as image
resizing for display or printing, compression, and digital image analysis.

4.2 Aliasing
Aliasing refers to the distortion that occurs when continuous signals, such as images, are sampled at too low a rate or when high-frequency information is not adequately represented. In other words, aliasing occurs when the sampled image does not allow us to recreate the original image.
In computer vision, aliasing can occur during image acquisition, processing, or display. It can manifest
as distortion or loss of detail in images, particularly when resizing, rotating, or transforming images. Proper
anti-aliasing techniques, such as low-pass filtering or supersampling, are often employed to mitigate aliasing
effects in computer vision applications.
Aliasing can also affect neural networks, particularly in tasks involving image recognition or classifica-
tion. If not properly addressed, aliasing in training data or during the image preprocessing stage can degrade
the performance of neural networks. Techniques such as data augmentation, filtering, or higher-resolution
input images can help reduce aliasing effects in neural network training and inference.
Understanding and mitigating aliasing is essential to ensure accurate representation and analysis of
images.

5 Gaussian Pyramid
A Gaussian pyramid is a type of image pyramid used in image processing and computer vision tasks. It is
constructed by iteratively applying Gaussian smoothing and downsampling operations to an input image,
resulting in a series of images at different scales or resolutions. Each level of the pyramid represents a
smoothed and downsampled version of the original image, with progressively lower resolutions.

5.1 Construction Process


1. Level 0 (Base Level):
• The original input image serves as the base level of the pyramid.
2. Gaussian Smoothing:

• The input image is convolved with a Gaussian kernel to blur or smooth the image, reducing
high-frequency noise.
• The amount of smoothing applied is determined by the standard deviation (σ) of the Gaussian
kernel. A larger σ results in more smoothing.

3. Downsampling:
• The smoothed image is then down-sampled or subsampled to reduce its size, typically by a factor
of 2 in each dimension (halving the width and height).
• Down-sampling is achieved by discarding every other row and column of pixels in the image,
effectively reducing its resolution.

4. Repeat Steps 2 and 3:


• Steps 2 and 3 are repeated iteratively to generate additional levels of the pyramid.
• At each level, the smoothed and downsampled image becomes the input for the next iteration.
5. Stopping Criteria:

• The process continues until a stopping criterion is reached, such as reaching a predefined number
of levels or reaching a minimum image size.

The resulting Gaussian pyramid consists of a series of images arranged in a hierarchical structure, with
the base level containing the original input image and subsequent levels containing progressively smoothed
and down-sampled versions of the input. The pyramid enables multi-scale analysis of the image, allowing
algorithms to operate at different resolutions for tasks such as image blending, image alignment, and scale-
invariant feature detection.
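A minimal sketch of the construction using OpenCV's pyrDown, which performs the Gaussian smoothing and the factor-of-2 downsampling in one call (the file name and the minimum size of 32 pixels are placeholders):

import cv2

image = cv2.imread("input.png")               # level 0: the original image
pyramid = [image]
while min(pyramid[-1].shape[:2]) >= 32:       # stopping criterion: minimum image size
    pyramid.append(cv2.pyrDown(pyramid[-1]))  # Gaussian blur + drop every other row/column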

6 Bilinear and Bicubic Interpolation


Bilinear and bicubic interpolation are commonly used techniques in image processing and computer graphics
to estimate the values of pixels at non-grid positions within an image. These methods are used to improve
the visual quality of resized or transformed images by generating smoother transitions between neighboring
pixels.

6.1 Bilinear Interpolation
Bilinear interpolation is a simple and efficient method that estimates the value of a pixel by interpolating
between its four nearest neighbors in a 2x2 pixel grid. The interpolated value is calculated as a weighted
average of the values of these neighboring pixels, where the weights are determined by the distances between
the target position and the neighboring pixels.
The formula for bilinear interpolation is given by:

I(x, y) = (1 − α)(1 − β) I_{00} + α(1 − β) I_{10} + (1 − α)β I_{01} + αβ I_{11}


where:
• I(x, y) is the interpolated value at position (x, y),
• I_{ij} are the values of the four nearest neighboring pixels,

• α and β are the fractional parts of the target position (x, y) within the grid.
Bilinear interpolation provides a good balance between computational efficiency and visual quality, mak-
ing it suitable for real-time applications such as image resizing and texture mapping.
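A minimal sketch of the formula above for a single grayscale sample (array indexing is assumed to be img[row, column], i.e. img[y, x]):

import numpy as np

def bilinear_sample(img, x, y):
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1 = min(x0 + 1, img.shape[1] - 1)
    y1 = min(y0 + 1, img.shape[0] - 1)
    a, b = x - x0, y - y0                      # fractional parts alpha and beta
    I00, I10 = img[y0, x0], img[y0, x1]        # the four nearest neighbors
    I01, I11 = img[y1, x0], img[y1, x1]
    return ((1 - a) * (1 - b) * I00 + a * (1 - b) * I10
            + (1 - a) * b * I01 + a * b * I11)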

6.2 Bicubic Interpolation


Bicubic interpolation is a more sophisticated method that estimates the value of a pixel by fitting a cubic
polynomial to a 4x4 pixel grid surrounding the target position. The interpolated value is computed using
the coefficients of the cubic polynomial, which are determined by solving a system of linear equations based
on the values of the neighboring pixels.
The formula for bicubic interpolation involves a cubic polynomial function, typically represented in the
form:

f(t) = a t^3 + b t^2 + c t + d


where t represents the distance from the target position to the neighboring pixels. Bicubic interpolation
requires solving a system of 16 linear equations to determine the coefficients of the cubic polynomial.
Bicubic interpolation produces smoother and more accurate results compared to bilinear interpolation,
especially for larger image transformations or resizing operations. However, it is computationally more
expensive and may not be suitable for real-time applications.

6.3 Comparison
Bilinear interpolation is simpler and faster, making it suitable for real-time applications where computational
efficiency is critical. On the other hand, bicubic interpolation produces higher-quality results at the cost of
increased computational complexity, making it more suitable for offline image processing tasks where visual
quality is paramount.

7 Super-resolution with multiple images


Super-resolution with multiple images is a technique used in image processing and computer vision to enhance
the resolution and quality of an image by combining information from multiple low-resolution images of the
same scene. This approach leverages the redundant information present in different images to generate a
single high-resolution image with improved visual quality.
Several methods have been proposed for super-resolution with multiple images, including:

• Averaging: Simple averaging of pixel values across multiple images to reduce noise and enhance image
quality. This method assumes that the low-resolution images are aligned and have similar content.

• Interpolation: Interpolation techniques such as bilinear or bicubic interpolation can be used to
estimate high-resolution details between pixels in the low-resolution images. These methods provide
smoother transitions between neighboring pixels and can generate visually appealing results.
• Learning-based Approaches: Deep learning techniques, such as convolutional neural networks
(CNNs), can be trained to learn the mapping between low-resolution and high-resolution images using
a large dataset of paired images. These models can capture complex relationships in the data and
generate high-quality super-resolved images.

8 Image Derivatives
8.1 Partial Derivatives with Convolution
Partial derivatives are a fundamental concept in calculus that describe how a function changes with respect to
its input variables. In this context we see images as functions and in image processing, partial derivatives are
used to quantify the rate of change of pixel values in an image along different directions, typically horizontal
and vertical.

Gradient
The gradient of an image represents the rate of change of pixel values in both the horizontal and vertical
directions. It is computed using partial derivatives with convolution operations. The gradient of an image
can be described as follows:

∇f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]

where the term on the left represents the x derivative and the term on the right represents the y derivative of the image.

Horizontal Derivative
The horizontal derivative, also known as the x-derivative or the derivative with respect to the x-axis, is
computed by convolving the image with a derivative filter such as the Sobel filter:

G_x = I ∗ \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}

where I is the input image and G_x is the horizontal derivative.

Vertical Derivative
The vertical derivative, also known as the y-derivative or the derivative with respect to the y-axis, is computed
by convolving the image with a derivative filter similar to the horizontal derivative but rotated by 90 degrees:

G_y = I ∗ \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}

where I is the input image and G_y is the vertical derivative.

Magnitude and Direction
The magnitude of the gradient, often denoted as |∇I|, represents the overall rate of change of pixel values in the image and is computed as:

|∇I| = \sqrt{G_x^2 + G_y^2}

The direction of the gradient, often denoted as θ, represents the orientation of the edges in the image and is computed as:

θ = \arctan\left( \frac{G_y}{G_x} \right)
In summary, partial derivatives with convolution are used to compute the gradient of an image, which
provides information about the rate of change of pixel values in different directions. This information is
useful for various image processing tasks such as edge detection, feature extraction, and image enhancement.
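A minimal sketch computing G_x, G_y, the gradient magnitude, and the gradient direction with OpenCV's Sobel operator (the file name is a placeholder):

import numpy as np
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)  # placeholder input image
Gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)    # horizontal (x) derivative
Gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)    # vertical (y) derivative
magnitude = np.sqrt(Gx**2 + Gy**2)                 # |∇I|
theta = np.arctan2(Gy, Gx)                         # gradient direction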

9 Edge Detection
Edges can be characterized by sudden changes or discontinuities in the intensity or color values of neighboring
pixels. These changes can occur across different directions, such as horizontal, vertical, or diagonal.

Images as Functions
In image processing, an image can be viewed as a function f (x, y), where x and y are spatial coordinates,
and f (x, y) represents the intensity or color value at each point in the image.

Image Gradient and Edges


The image gradient represents the rate of change of pixel values in different directions. High gradient values
indicate the presence of edges in the image. The gradient magnitude and direction provide information about
the strength and orientation of edges, respectively.

Effects of Noise
Noise in images can interfere with edge detection algorithms by introducing false edges or reducing the
contrast between true edges and background regions.
To mitigate the effects of noise on edge detection, a common approach is to apply a smoothing or blurring
filter to the image before performing edge detection. Smoothing filters help to reduce high-frequency noise
while preserving the overall structure of the image.

Steps in Edge Detection


The process of edge detection typically involves multiple steps, including preprocessing (e.g., noise reduction),
computing gradients, detecting edges based on gradient information, and post-processing (e.g., edge thinning
or linking). The exact number of steps may vary depending on the specific edge detection algorithm and the
characteristics of the input image.

10 Derivative Theorem of Convolution


Derivative Theorem
The derivative theorem of convolution states that the derivative of a convolution operation is equivalent to
the convolution of the original function with the derivative of the convolution kernel. Mathematically, if
f (x) is a function and g(x) is a kernel function, then the derivative of their convolution is given by:
\frac{d}{dx}(f ∗ g) = f ∗ \frac{d}{dx} g

This theorem is useful in image processing and signal processing for computing the derivative of a
smoothed image or signal efficiently.

Application
In image processing, the derivative theorem of convolution is often applied in edge detection algorithms. By
convolving an image with a Gaussian kernel and then taking the derivative of the resulting smoothed image,
the edges in the image can be detected more effectively.

11 Derivative of Gaussian Filter


Definition
The Gaussian filter is a commonly used kernel for smoothing or blurring images. It is defined by the Gaussian
function:
g(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{x^2}{2\sigma^2}}
where x is the distance from the center of the kernel and σ is the standard deviation, which controls the
width of the filter.

Derivative
The derivative of the Gaussian filter with respect to x can be computed analytically. The derivative of the
Gaussian function is given by:
g'(x) = -\frac{x}{\sigma^2} \cdot \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{x^2}{2\sigma^2}}
This derivative represents the rate of change of the Gaussian function with respect to distance x from
the center of the kernel.

Application
The derivative of the Gaussian filter is often used in edge detection algorithms, such as the Canny edge
detector. By convolving an image with the derivative of the Gaussian filter, the gradient magnitude and
direction at each pixel can be computed, which are used to detect edges in the image effectively.

12 Laplace Filter
The Laplace filter, also known as the Laplacian filter, is a kernel used for edge detection and image enhance-
ment in image processing. It computes the second derivative of the image intensity function, highlighting
regions of rapid intensity change.
The Laplace filter is based on the Laplacian operator, defined as:

∇^2 f(x, y) = \frac{\partial^2 f(x, y)}{\partial x^2} + \frac{\partial^2 f(x, y)}{\partial y^2}
where ∇2 represents the Laplacian operator and f (x, y) is the intensity function of the image.
In edge detection, the Laplace filter is convolved with the image to highlight regions of rapid intensity
change, which typically correspond to edges in the image. The Laplace filter is particularly sensitive to noise,
so it is often applied after smoothing the image with a Gaussian filter to reduce noise.

13 Laplacian of Gaussian (LoG) Filter
The Laplacian of Gaussian (LoG) filter is a combination of the Laplace filter and the Gaussian filter. It first
applies Gaussian smoothing to the image to reduce noise and then computes the Laplacian of the smoothed
image to detect edges more effectively.
The LoG filter kernel is obtained by convolving the Laplacian filter kernel with the Gaussian filter kernel:

LoG(x, y) = ∇2 (G ∗ f (x, y))


where G is the Gaussian filter kernel and f (x, y) is the intensity function of the image.
The LoG filter is widely used in edge detection algorithms, such as the Marr-Hildreth edge detector.
By combining Gaussian smoothing with Laplacian edge detection, the LoG filter provides improved edge
detection performance, particularly in the presence of noise.

14 Zero-Crossing
In image processing, a zero-crossing refers to the point in an image where the intensity changes sign along a
particular direction. Zero-crossings often occur at edges in the image, where the intensity transitions from
dark to light or vice versa.
Zero-crossings can be detected by examining the signs of neighboring pixel intensities. A zero-crossing is
identified when the intensity values on opposite sides of a pixel transition from negative to positive or from
positive to negative.
It is commonly used in edge detection algorithms to identify the location and orientation of edges in
an image. By detecting zero-crossings, edge pixels can be accurately located, allowing for precise edge
localization and segmentation.
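A minimal sketch combining the LoG filter from the previous section with a simple zero-crossing test (a pixel is flagged when the LoG response changes sign against its right or bottom neighbor; the input image is a placeholder):

import numpy as np
from scipy import ndimage

gray = np.random.rand(128, 128).astype(np.float32)      # placeholder image
log = ndimage.gaussian_laplace(gray, sigma=2.0)         # Gaussian smoothing + Laplacian

zero_cross = np.zeros(log.shape, dtype=bool)
zero_cross[:-1, :] |= np.sign(log[:-1, :]) != np.sign(log[1:, :])   # sign change downwards
zero_cross[:, :-1] |= np.sign(log[:, :-1]) != np.sign(log[:, 1:])   # sign change to the right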

15 Derivative Filters
Derivative filters are convolution kernels used to compute the derivative of an image function with respect
to spatial coordinates. They are commonly used in edge detection and feature extraction tasks in image
processing.

Types
There are several types of derivative filters, including the Sobel filter, Prewitt filter, and Roberts filter.
These filters compute the first-order derivative of the image intensity function along horizontal and vertical
directions.

Application
Derivative filters are widely used in edge detection algorithms to compute the gradient of the image intensity
function. By convolving the image with derivative filters, the rate of change of intensity in different directions
can be measured, allowing for the detection of edges and other features in the image.

16 The Sobel Filter


The Sobel filter consists of two 3x3 kernels: one for computing the gradient along the horizontal direction
(Sobel-X) and one for computing the gradient along the vertical direction (Sobel-Y).

16.1 The Sobel Operator


The Sobel operator is a derivative filter used for edge detection in image processing. It computes the gradient
of the image intensity function using convolution with a pair of 3x3 kernels.

16.2 Reconstruction from 2D Derivatives
The gradient magnitude and direction can be reconstructed from the derivatives computed using the Sobel
filter. The gradient magnitude represents the strength of the edge, while the gradient direction indicates the
orientation of the edge in the image.

17 Edge Detector
An edge detector is an image processing algorithm used to identify and localize the boundaries of objects or
regions within an image. It works by detecting sharp changes in intensity, which often correspond to edges
or boundaries between different objects or textures in the image.

17.1 Finding Edges


The process of finding edges in an image typically involves several steps, including gradient computation,
non-maximum suppression, thresholding, and edge tracking.

17.2 Get Orientation at Each Pixel


At each pixel in the image, the edge detector computes the gradient magnitude and orientation. The gradient
magnitude represents the strength of the edge, while the orientation indicates the direction of the edge.

17.3 Non-Maximum Suppression


Non-maximum suppression is a technique used to thin the edges detected by the gradient computation step.
It works by suppressing all gradient values except for the local maxima along the direction of the edge.

17.4 Thresholding Edges


After non-maximum suppression, the edges are thresholded to retain only those pixels with gradient mag-
nitudes above a certain threshold. This helps remove noise and retain only the most significant edges in the
image.

17.5 Hysteresis Thresholding


Hysteresis thresholding is a more advanced thresholding technique used in edge detection. It involves applying two thresholds: a high threshold to identify strong edges and a low threshold to identify weak edges. If an edge pixel is between the two thresholds, it is considered an edge only if it is connected to a strong edge.

17.6 Edge Tracking by Hysteresis


Edge tracking by hysteresis involves connecting the strong edges identified by the high threshold with neigh-
boring weak edges above the low threshold. This helps ensure that edges are continuous and not fragmented.

17.7 Canny Edge Detector


The Canny edge detector is a popular edge detection algorithm that combines all of the above steps into
a single, coherent framework. It is known for its high accuracy and robustness in detecting edges in noisy
images.
To run the Canny edge detector, you typically follow these steps:

1. Load the Image


2. Preprocess the Image (Optional) - Common preprocessing steps include noise reduction and
smoothing using techniques like Gaussian blur.

3. Compute Gradient Magnitude and Orientation at each pixel. This is usually done using derivative filters like Sobel or Prewitt.
4. Non-Maximum Suppression to thin the edges detected in the previous step. This involves sup-
pressing all gradient values except for the local maxima along the direction of the edge.

5. Apply Hysteresis Thresholding to classify the edges as strong, weak, or non-edges based on their
gradient magnitudes. This involves using two thresholds: a high threshold to identify strong edges and
a low threshold to identify weak edges.
6. Edge Tracking by Hysteresis Connect the strong edges identified by the high threshold with neigh-
boring weak edges above the low threshold. This helps ensure that edges are continuous and not
fragmented.
7. Output the Detected Edges - Finally, output the detected edges as a binary image, where edge
pixels are marked as white and non-edge pixels as black.
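OpenCV bundles steps 3–6 into a single call; a minimal sketch of the full pipeline (the file names and threshold values are placeholders):

import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)   # step 1: load the image
blurred = cv2.GaussianBlur(gray, (5, 5), 1.4)          # step 2: optional Gaussian smoothing
edges = cv2.Canny(blurred, 50, 150)                    # steps 3-6: gradients, NMS, hysteresis (low=50, high=150)
cv2.imwrite("edges.png", edges)                        # step 7: binary edge map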

