A. Image Pre-Processing
Image data pre-processing converts image data into a form that machine learning algorithms can work with. It is often used to increase a model's accuracy as well as to reduce its complexity. The techniques used to pre-process image data include image resizing, conversion to grayscale, and image augmentation. When images exist in different formats (natural, synthetic, grayscale, etc.), we need to standardize them before feeding them into a neural network.
The important steps in image pre-processing are:
● Grayscale conversion
● Normalization
● Data Augmentation
● Image standardization
i) Grayscale conversion simply converts images from color to black and white, as shown in Fig.5.f. It is normally used to reduce the amount of pixel data that needs to be processed and so reduce the computational complexity of machine learning algorithms.
It can be a poor choice for applications that depend on color information, because that information is lost in the conversion. Since most pictures do not need color to be recognized, it is often wise to use grayscale: it reduces the data stored per pixel and, therefore, the computation required.
Fig.5.f. Color image transformation to grayscale
Converting images to grayscale is not always practical for some problems. Examples where grayscale would be impractical include traffic lights, healthcare diagnosis, autonomous vehicles, agriculture, etc. A simple test of whether to use it is to ask whether a human could still identify the object without its color.
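A minimal sketch of grayscale conversion using OpenCV (the library choice and the file names are illustrative assumptions, not part of these notes):

import cv2

# Load a color image (OpenCV reads it as a 3-channel BGR array);
# "color_image.jpg" is a placeholder file name.
img = cv2.imread("color_image.jpg")

# Convert the 3-channel color image to a single-channel grayscale image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

print(img.shape)   # (height, width, 3) - three values per pixel
print(gray.shape)  # (height, width)    - one value per pixel, less to compute

cv2.imwrite("gray_image.jpg", gray)

The shape comparison shows where the saving comes from: one value per pixel instead of three.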
ii) Normalization
Image normalization is a typical process in image processing that changes the range of pixel intensity values.
For example, normalizing an input image (grayscale or RGB) rescales its pixel values so that they span the full range of the scale, 0 to 255; as a result, very dark images become clearer.
Linear normalization of a digital image adjusts the pixel intensity values to a common scale, typically to improve contrast or to prepare the image for further processing. The linear (min-max) normalization used in digital image processing to rescale pixel intensity values is given by the formula:

I_N = (I - min) × (newMax - newMin) / (max - min) + newMin

where min and max are the lowest and highest intensities in the input image I, and [newMin, newMax] is the target range (for example [0, 255]).
For a grayscale image, normalization is applied to the single channel; for a color image, it is applied to each of the three RGB channels.
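A minimal NumPy sketch of this min-max rescaling (the function name and the toy array are illustrative assumptions; it presumes the image is not constant, i.e. max > min):

import numpy as np

def normalize(img, new_min=0, new_max=255):
    # Linearly rescale pixel intensities to [new_min, new_max] (assumes max > min)
    img = img.astype(np.float32)
    old_min, old_max = img.min(), img.max()
    out = (img - old_min) * (new_max - new_min) / (old_max - old_min) + new_min
    return out.astype(np.uint8)

# A "dark" image whose values only span 10..59 is stretched to 0..255
dark = np.random.randint(10, 60, size=(4, 4), dtype=np.uint8)
print(normalize(dark))

# For an RGB image, the same call works on the full (H, W, 3) array,
# or the function can be applied to each channel separately.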
Examples:
Example 1: the left image shows the original, which is too dark; the result is clear after the normalization process.
Example 2: the original image is very bright; the result has better contrast after the normalization process.
iii) Data augmentation
Data augmentation helps prevent a neural network from learning irrelevant features, which results in better model performance.
Data augmentation is the process of making minor alterations to existing data to increase its diversity without collecting new data. It is a technique used for enlarging a dataset.
There are two types of augmentation: offline augmentation, typically used for small datasets, where the augmented copies are generated and stored before training; and online augmentation, used for large datasets, where the alterations are normally applied in real time during training.
Standard data augmentation techniques include horizontal & vertical flipping, rotation, cropping, shearing, etc., as shown in Fig.5.h and sketched in the code example after the figure.
Shifting is the process of moving image pixels horizontally or vertically.
Flipping reverses the rows or columns of pixels, giving a vertical or horizontal flip respectively.
Rotation involves rotating an image by a specified degree.
Changing brightness is the process of increasing or decreasing the brightness of the image.
Cropping is the process of creating a random subset of an original image, which is then resized to the size of the original image.
Scaling: an image can be scaled either inward or outward. When scaling an image outward, the image becomes larger than the original, and vice versa.
Fig.5.h. Data Augmentation
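A minimal sketch of an online augmentation pipeline using torchvision (the library choice, the parameter values, and the file name are illustrative assumptions; any comparable library would do):

from PIL import Image
from torchvision import transforms

# Each transform is applied with fresh randomness every time an image is drawn,
# so the network rarely sees exactly the same input twice (online augmentation).
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),      # flipping
    transforms.RandomRotation(degrees=15),       # rotation by up to ±15 degrees
    transforms.RandomResizedCrop(size=224),      # random crop, resized back to 224x224
    transforms.ColorJitter(brightness=0.3),      # brightness change
    transforms.ToTensor(),
])

img = Image.open("sample.jpg")    # placeholder file name
augmented = augment(img)          # a new random variant on every call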
C. Feature extraction
Features are parts or patterns of an object in an image that help to identify it. The entire DL model works around the idea of extracting useful features that clearly define the objects in the image.
Raw data (an image) is transformed into a feature vector by a learning algorithm, which can learn the characteristics of the object.
For example, a square has 4 corners and 4 edges; these can be called features of the square, and they help us humans identify it as a square. Features include properties like corners, edges, regions of interest, ridges, etc.
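A minimal OpenCV sketch of extracting two such low-level features, edges and corners (the thresholds and the file name are illustrative assumptions):

import cv2
import numpy as np

gray = cv2.imread("square.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder file name

# Edges: Canny edge detector with illustrative thresholds
edges = cv2.Canny(gray, 100, 200)

# Corners: Harris corner response (expects a float32 image);
# arguments are block size, Sobel aperture, and the Harris k parameter
corners = cv2.cornerHarris(np.float32(gray), 2, 3, 0.04)

print("edge pixels:", int((edges > 0).sum()))
print("strong corners:", int((corners > 0.01 * corners.max()).sum()))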
Example: when we feed the raw input image of a motorcycle into a feature extraction algorithm, the algorithm produces a vector that contains a list of features, as shown in the figure below. This feature vector is a 1D array that makes a robust representation of the object.
The process relies on domain knowledge (or on partnering with domain experts) to extract features that make ML algorithms work better. The produced features are then fed to a classifier such as a support vector machine (SVM) or AdaBoost to predict the output (Fig 5.l); a code sketch follows the figure.
Fig 5.l: Traditional machine learning algorithms require handcrafted feature extraction.
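A minimal sketch of this handcrafted-features-plus-classifier pipeline, using HOG features from scikit-image and an SVM from scikit-learn (the digits dataset and the HOG parameters are illustrative assumptions, not the example from the figure):

from skimage.feature import hog
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small illustrative dataset of 8x8 grayscale digit images
digits = datasets.load_digits()

# Step 1: handcrafted feature extraction - each image becomes a 1D HOG feature vector
features = [hog(img, pixels_per_cell=(4, 4), cells_per_block=(1, 1))
            for img in digits.images]

# Step 2: feed the feature vectors to a classical classifier (here an SVM)
X_train, X_test, y_train, y_test = train_test_split(features, digits.target, random_state=0)
clf = SVC().fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))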
When a raw image is fed to the network, the layers it passes through identify features/patterns within the image, as shown in Fig 5.m. Neural networks can be thought of as feature extractors plus classifiers that are end-to-end trainable, as opposed to traditional ML models that use handcrafted features; a minimal sketch follows the figure.
Fig.5.m: A DNN passes the input image through its layers to automatically extract features
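A minimal PyTorch sketch of this idea (the framework, layer sizes, and the 32x32 input size are illustrative assumptions, not the notes' architecture): the convolutional layers act as the learned feature extractor and a final fully connected layer is the classifier, and both are trained end-to-end.

import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Feature extractor: convolution + pooling layers learn edges, corners, textures, ...
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Classifier: a fully connected layer on top of the flattened feature maps
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)   # assumes 32x32 RGB inputs

    def forward(self, x):
        x = self.features(x)       # features are learned from data, not handcrafted
        x = torch.flatten(x, 1)
        return self.classifier(x)

# One forward pass on a batch of four random 32x32 RGB "images"
out = TinyCNN()(torch.randn(4, 3, 32, 32))
print(out.shape)   # torch.Size([4, 10])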