Image Processing Technology Based on Machine Learning
Qiong Qiao
Communication University of China
2162-2248 © 2022 IEEE. Published by the IEEE Consumer Technology Society. IEEE Consumer Electronics Magazine.
Extracting useful information from images has become vital, and image processing technology has therefore been widely used in many fields, including video surveillance, automatic vehicle driving, industrial defect detection, agriculture, transportation, medicine, and the military.1 With the growth of science and technology, machine learning techniques have rapidly returned to the forefront, and machine learning now provides convenience for many aspects of modern society. Digital image processing is widely used, and because of its significance there has been great advancement in image processing technology. Zhu et al.2 suggested a new multimodal approach to image fusion based on image factorization and sparse representation, which can effectively fuse images. Mauryaa et al.3 and Zou et al.4 proposed a social-spider-optimized image fusion method, which can increase contrast while maintaining brightness during fusion. Li et al.5 used multimodal feature fusion for geographic image annotation, which can effectively improve labeling accuracy. Montesinos et al.6 used a Bayesian network to classify images, with classification accuracy above 90%. Singh et al.7 used a genetic algorithm to train a radial basis function neural network and then used the trained model for satellite image classification, which solved the problem of inaccurate satellite image categorization. Maa et al.8 analyzed an object-based supervised land cover image classification algorithm. Liu et al.9 used multiscale deep features to classify scenes from high-resolution satellite images, and the simulation accuracy was significantly improved. Meraa et al.10 used feature selection methods to detect image targets, achieving real-time detection and a high detection rate.

This article summarizes image processing technology, compares various image technologies in detail, and explains the limitations of each image processing method. In addition, it introduces machine learning algorithms in image processing technology and applies a convolutional neural network (CNN) to feature extraction in image processing, so as to effectively improve the accuracy of image segmentation, image classification, and target detection, which demonstrates the superiority of image processing technology based on machine learning.

MACHINE LEARNING AND IMAGE PROCESSING

Image Processing

Image Enhancement. Image enhancement technology adjusts various attributes of the image to make it clearer, for example adjusting the brightness, contrast, saturation, and hue of the image to increase its clarity and reduce noise. The method of image enhancement is to add information to, or transform the data of, the original image by certain means, selectively highlighting the features of interest in the image or suppressing unwanted features, so that the image matches the visual response characteristics. Sometimes the acquired image is dark, has low contrast, and is noisy. Image enhancement methods can be divided into two categories: 1) frequency domain methods and 2) spatial domain methods. The former regards the image as a 2D signal and performs signal enhancement based on the 2D Fourier transform. Representative algorithms in the latter, spatial domain category include the local averaging method and the median filtering method. Histogram equalization is an important image enhancement technique that can be applied to the entire image or to extracted parts of it. Both the genetic algorithm and the particle swarm algorithm have the limitation of falling into local minima. The image can also be transformed directly to achieve enhancement, for example by Laplacian transformation using the Laplacian operator, by log transformation, or by gamma transformation. Table 1 compares several image enhancement algorithms. In more than 40 years, digital image processing has rapidly developed into an independent subject with strong vitality. Image enhancement technology has gradually reached all aspects of human life and social production, and it plays a role in aerospace, biomedicine, industrial production, and public security.
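As an illustration of the two representative spatial-domain methods named above, the following is a minimal numpy sketch of local averaging and median filtering over a 3×3 neighborhood. The function names and the edge-padding choice are ours, not the article's:

```python
import numpy as np

def local_average(img, k=3):
    """Smooth a grayscale image with a k x k box (local averaging) filter."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(k):          # accumulate the k*k shifted views
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k neighborhood."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    stack = [padded[dy:dy + h, dx:dx + w]
             for dy in range(k) for dx in range(k)]
    return np.median(np.stack(stack), axis=0)
```

The median filter removes isolated impulse ("salt") noise without blurring edges as much as the box filter does, which is why both are listed as standard spatial-domain enhancers.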
July/August 2024
Special Section on Security, Trust, and Services for AI-Empowered Sensing in the Next-Generation IoT
TABLE 1. Several image enhancement algorithms.

Author               Work                                  Algorithm
Chang [11]           Contrast and brightness enhancement   Histogram transplantation
Suresh and Lal [12]  Contrast and brightness enhancement   Modified differential evolution

In recent years, the combination of various optimization techniques has also been of great interest. One method is the combination of cuckoo search (CS) with a particle swarm optimization algorithm. The cuckoo search is a population-based global search algorithm. Combining the particle swarm algorithm and the genetic algorithm gives better results in finding the best solution than either the particle swarm algorithm or the genetic algorithm alone.13

Next, we select several of the most common and simple image enhancement algorithms for detailed description.

Image Enhancement Based on Histogram Equalization. The main principle of the histogram equalization-based image enhancement algorithm is to redistribute the pixel values of the image. Its typical application scenario is to increase the local contrast of the image; the algorithm requires similar contrast among the local regions of the part of interest. For example, histogram equalization can be used to make the contrast of overexposed and underexposed images more prominent, and to enhance images with an obvious difference between foreground and background. The algorithm still has certain defects: for some images with high histogram peaks, the contrast will not be enhanced naturally after processing, the number of gray levels in the transformed image is reduced, and some details are lost.

The calculation process of the histogram equalization algorithm is as follows.

The first step is the equalization process. Histogram equalization ensures that the original ordering relationship remains unchanged in the process of image pixel mapping; that is, the brighter area is still brighter and the darker area is still darker, but the contrast is increased and the brightness ordering cannot be reversed. The value range of the pixel mapping function is between 0 and 255. The cumulative distribution function is a monotonically increasing function whose range is 0–1.

The second step is to realize the cumulative distribution function. Comparing the probability distribution function with the cumulative distribution function, the 2D graph of the former is uneven, while the latter is monotonically increasing. In the process of histogram equalization, the mapping is calculated as follows:

s_k = \sum_{j=0}^{k} \frac{n_j}{n}, \qquad k = 1, 2, 3, \ldots, L - 1.  (1)

Image Enhancement Based on the Laplacian Operator. The image enhancement algorithm based on the Laplacian operator uses the Laplacian operator to strengthen the image. The main idea is to process the image using its second derivative: in the image domain, differentiation sharpens and integration blurs. Using second-order differentiation to process the image exploits neighboring pixels to improve contrast. OpenCV also provides a Laplacian function that performs the Laplace transform on an image; since the input is a grayscale image, this is equivalent to extracting more edge information from the image. The Laplace transform of a digital image generally convolves the image with a 3×3 convolution kernel, after which an enhanced image can be obtained. The main media for human transmission of information are language and images; according to statistics, visual information accounts for 80% of the various information received by humans, so images are a very important medium of information transmission. The convolution kernel is self-defined according to experimental needs; the convolution kernel used in this article is shown in Figure 1.
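The two-step histogram equalization procedure described above can be sketched in a few lines of numpy. The function name and the rescaling of the cumulative sums s_k to the 0–255 output range are our own illustrative choices, not details fixed by the article:

```python
import numpy as np

def histogram_equalize(img, levels=256):
    """Histogram equalization: map gray level k to s_k = sum_{j<=k} n_j / n."""
    hist = np.bincount(img.ravel(), minlength=levels)  # n_j for each level j
    cdf = np.cumsum(hist) / img.size                   # s_k in [0, 1], monotonic
    mapping = np.round(cdf * (levels - 1)).astype(np.uint8)  # back to 0..255
    return mapping[img]
```

Because the cumulative distribution function is monotonically increasing, the mapping preserves the brightness ordering of pixels, exactly as the first step requires.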
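Laplacian-based enhancement as described above can be sketched as follows. The exact 3×3 kernel of Figure 1 is defined by the author and is not recoverable from the text, so the common 4-neighbor Laplacian used here is an illustrative assumption:

```python
import numpy as np

# A common 3x3 Laplacian kernel; the article's own kernel (Figure 1) may differ.
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian_sharpen(img):
    """Sharpen a grayscale image by subtracting its Laplacian response."""
    padded = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    lap = np.zeros((h, w))
    for dy in range(3):          # 3x3 convolution as a sum of shifted views
        for dx in range(3):
            lap += LAPLACIAN[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(img - lap, 0, 255)
```

On a flat region the Laplacian response is zero and the image is unchanged; near an edge the second derivative changes sign, so subtracting it exaggerates the local contrast.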
FIGURE 1. Laplacian convolution kernel.
Image Enhancement Based on Gamma Transformation. The gamma transform mainly rectifies pictures with very high or very low grayscale values to strengthen the contrast. It is defined as

s = C r^{\gamma}, \qquad r \in [0, 1]  (2)

where C is a constant, γ is the gamma coefficient, and s is the pixel value after transformation. Choosing different γ gives different gamma curves, as shown in Figure 2. From Figure 2 we can see some rules: γ = 1 is the dividing line. When γ < 1, the small gray values of the image are strongly expanded, and the smaller the value of γ, the stronger the effect. When γ > 1, the expansion effect on large gray values of the image is enhanced, and the larger the value, the stronger the effect. In this way, we can change the value of gamma to enhance low gray levels.

Feature Extraction. Image features contain the basic information of the image, including geometric features, shape features, texture features, and so on. Image features are very important in the process of image analysis and processing. According to previous research experience, extracting more image features is not always better: more features will, on the contrary, increase the time of feature extraction in the process of recognition and detection, thus reducing the efficiency of detection. Many factors affect the clarity of image quality. Uneven outdoor illumination will cause the image grayscale to be too concentrated; the image obtained by a camera undergoes digital/analog conversion, and noise pollution occurs during line transmission, so the image quality inevitably decreases. In milder cases, the image is accompanied by noise and it is difficult to see the details of the image; in more severe cases, the image is blurred and even the outline of the object is difficult to see clearly. The features of images should have the following properties.

1) Scale invariance.
2) Rotation invariance.
3) Strong antinoise ability and stable robustness to illumination.
4) At the same time, a lower feature dimension.

Table 2 tabulates a comparison of some feature extraction algorithms. These are artificially designed features; with the popularity of deep learning, the most popular feature extraction method is now feature extraction based on a CNN. Next, we explain the feature extraction of HOG and LBP.

HOG Features. The HOG feature is a local feature obtained by computing and counting the gradient histogram of an image region. The HOG feature is invariant to geometric and optical changes of the image because it operates on a partial image. The procedure of HOG feature extraction is as follows.

In the first step, the image is converted into a gray image using the conversion formula

Gray = 0.3R + 0.59G + 0.11B.  (3)

In the second step, gamma correction is used to normalize the color information of the image to eliminate color interference. The calculation formula is

I_2(x, y) = I_1(x, y)^{Gamma}  (4)
TABLE 2. Comparison of some feature extraction algorithms.

Algorithm                        Feature extraction                     Limitation
Multiimage saliency analysis     Extract ROI                            Unwanted background information appears
Unified capability feature       Rotation- and scale-invariant          Different matching results
extraction                       local features                         for different data
Digital surface model            Pixel- and feature-level               Multifeature classification
                                 extraction of urban scenes             is difficult
Reversible jump Markov chain     Extract features, such as rivers,      Too sensitive to
Monte Carlo sampler              channels, and roads                    experimental settings

FIGURE 3. LBP feature extraction.

where I_1(x, y) is the value before correction, I_2(x, y) is the value after correction, and Gamma is the correction coefficient.

Then, the results of formulas (5) and (6) are used to calculate the amplitude and phase. The calculation formula is

\alpha(x, y) = \tan^{-1} \frac{G_y(x, y)}{G_x(x, y)}  (5)

where G_x(x, y) and G_y(x, y) are the gradients at (x, y), and G(x, y) and α(x, y) are the amplitude and phase, respectively.

In the fourth step, the image is divided into image blocks, and the image blocks are then divided into cell units. The shape of the cell unit can be set freely; for example, an image block of size 16×16 can be divided into four 8×8 cell units.

The fifth step is to count the gradient histogram of each cell to form the feature descriptor of each cell. Suppose the histogram is divided into nine bins from 0° to 360° to count the gradient information of the 8×8 pixels, forming a 9D feature vector.

In the sixth step, four cells are formed into an image block, and the HOG feature descriptors of all cells in an image block are put together to get the HOG feature descriptor of the image block.

LBP Features. The LBP feature is a local feature of an image. Its core idea is to use the pixel value of the center pixel of an image block as a threshold and then compare the surrounding pixel values with that threshold. If a value reaches the threshold, it is recorded as 1; otherwise, it is recorded as 0. These 1s and 0s are assembled into a binary number that represents the texture information of the image. If the image block size is 3×3, an eight-bit binary number is generated. Taking a 3×3 image block as an example, the LBP extraction process is shown in Figure 3, and is calculated as follows:

LBP = \sum_{p=0}^{7} 2^p f(g_p - g_c).  (6)

Here, g_c and g_p are the pixel value of the central pixel and the pixel value of the pth neighborhood pixel, respectively, and f(x) is a step function expressed as

f(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0. \end{cases}  (7)

SIFT Features. SIFT is a scale-invariant feature detection method. To achieve scale invariance, SIFT constructs an image scale space that is invariant to the scaling, rotation, and affine transformation of the image.

First, the image scale space is generated; that is, the original image is sampled at different frequencies to obtain multiple zoomed images. Then the local extremum points in the scale space are detected; these may include edge response points and some points with low contrast, which need to be excluded so as to leave the local extremum points that reflect the image features more accurately.
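Equations (6) and (7) for a single 3×3 block can be sketched as follows. The article does not fix which neighbor corresponds to which bit p, so the clockwise-from-top-left ordering here is an illustrative convention:

```python
import numpy as np

def lbp_code(block):
    """LBP code of a 3x3 block: threshold the 8 neighbors against the center.

    Bit ordering (clockwise from the top-left neighbor) is a convention
    chosen here for illustration; equation (6) leaves it unspecified.
    """
    gc = block[1, 1]
    # neighbors g_0..g_7, clockwise starting at the top-left corner
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = [1 if block[y, x] >= gc else 0 for (y, x) in coords]  # f(g_p - g_c)
    return sum(b << p for p, b in enumerate(bits))               # sum 2^p * bit_p
```

A uniform block yields all-ones (code 255), while a center pixel brighter than all its neighbors yields code 0, so the code captures the local texture pattern around each pixel.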
The histogram of gradient directions in the region centered on each extremum point is counted, and the maximum direction is taken as the main direction to generate the feature descriptor. The feature descriptor is calculated by taking a 16×16 window around the feature point, using Gaussian weighting to draw a histogram of the gradient direction in eight directions on each 4×4 image block, and counting the accumulated value of each gradient direction to form a seed point. The gradient histogram of each seed point contains eight values, giving a total of 128 values, which are combined into a 128D SIFT feature vector.

The matching of feature points is calculated by the nearest neighbor algorithm; that is, the Euclidean distance between feature description vectors is calculated, and the point with the smallest distance is selected for matching.

TABLE 3. Comparison of image segmentation algorithms.

Image segmentation                 Advantage
Cuckoo search, McCulloch method    High computational efficiency and good convergence
Markov random field algorithm      Use fewer features to achieve higher accuracy
Deep CNN                           Efficient boundary extraction improves classification
Zhenghang firefly algorithm        Multilevel subdivision and less calculation time

…which merits attention for researchers using this model, and the export limit is very vague.

Image Target Detection. The target detection task can be divided into two parts: 1) target localization and 2) target identification; that is, precisely determining the position and the category of the target in a given image. Under normal circumstances, the target in the image is uncertain: its length, width, height, angle, and so on are random, the target may not be uniform, and an image may contain multiple categories, all of which bring a certain degree of complexity to recognizing the target and determining its position.

Image Filtering. Filtering is a common method to eliminate interference in image preprocessing. It can not only suppress image noise but also ensure that the edge information of the target in the image is not destroyed. There are many methods of image filtering, including linear filters such as Gaussian filtering and mean filtering, as well as nonlinear filters such as median filtering and bilateral filtering.

Gaussian Filtering. Gaussian filtering is mainly used to eliminate Gaussian noise. Its filtering process is a weighted average of the gray values of the image; that is, a 2D Gaussian kernel is convolved with the pixels in the image to remove the noise.
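The Gaussian weighted-average process just described can be sketched as follows; the kernel size and sigma defaults are illustrative choices, not values taken from the article:

```python
import numpy as np

def gaussian_kernel(k=3, sigma=1.0):
    """2D Gaussian kernel, normalized so its weights sum to 1."""
    ax = np.arange(k) - k // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return g / g.sum()

def gaussian_filter(img, k=3, sigma=1.0):
    """Weighted average of each pixel's neighborhood with Gaussian weights."""
    kern = gaussian_kernel(k, sigma)
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w))
    for dy in range(k):          # convolution as a weighted sum of shifted views
        for dx in range(k):
            out += kern[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Because the kernel weights sum to 1, flat regions are preserved exactly while isolated noise spikes are spread out and attenuated.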
FIGURE 4. Convolution operation process.
FIGURE 5. Maximum pooling process.
2) Excitation layer: Performs a nonlinear mapping of the output of the convolution layer. The activation function is shown as follows:

f(x) = \max(0, x).  (8)

3) Pooling layer: Reduces the dimensionality of the output of the excitation layer. Since the input has passed through the convolutional layer, if the convolution kernel is relatively small, the pooling layer reduces the dimensionality while maintaining the image depth. The pooling layer generally uses maximum pooling, and the calculation process is shown in Figure 5. The pooling window is also moved over the input map; the difference among the three pooling methods lies in how the value in the window is calculated. Maximum pooling takes the maximum value; average pooling takes the average value of the subsampled block elements; and random pooling takes a random value according to its probability. Pooling is also a kind of special convolution kernel, but the difference is that pooling acts on nonoverlapping regions of the image.

4) Fully connected layer: Refits features to reduce the loss of feature information, turning the feature map output by the pooling layer into a 1D feature vector; this 1D feature vector is the feature that can be used for subsequent processing.

Performance Evaluation Method and Analysis

Performance Evaluation Method
We know that quality evaluation is very subjective, so many parameters must be used for quality evaluation, and no single parameter is decisive on its own. Quality evaluation should also be compared with existing technologies to highlight the superiority of the algorithm in this article. The calculation parameters selected in this article include the FSIM, the SSIM, the accuracy, and the recall rate.

FSIM. The FSIM measures the similarity of features between the input image and the final image, and is calculated as follows:

FSIM = \frac{\sum_{x \in \Omega} S_L(x) \, PC_m(x)}{\sum_{x \in \Omega} PC_m(x)}.  (9)

SSIM. The SSIM measures the structural similarity between the final image and the original image. The calculation formula is as follows:

SSIM = \frac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}.  (10)

Accuracy. The accuracy (precision) is the proportion of detected pixels that actually belong to the region of interest. The calculation formula is as follows:

precision = \frac{\sum_{i=1}^{N} GT_i \, BW_i}{\sum_{i=1}^{N} BW_i}  (11)

where BW_i is the binary detection value of the ith pixel, GT_i = 1 means that the ith pixel belongs to the region of interest (ROI), and GT_i = 0 means that the ith pixel does not belong to the ROI.

Recall Rate. The recall rate is the proportion of ground-truth ROI pixels that are correctly detected. The calculation formula is as follows:

recall = \frac{\sum_{i=1}^{N} GT_i \, BW_i}{\sum_{i=1}^{N} GT_i}.  (12)

Unlike the precision, this metric takes false negatives into account.

Performance Evaluation Analysis
The ROC curve, also known as the receiver operating characteristic curve, is also a performance evaluation method.
5. K. Li, C. Zou, S. Bu, Y. Liang, J. Zhang, and M. Gong, "Multimodal feature fusion for geographic image annotation," Pattern Recognit., vol. 1, no. 73, pp. 1–14, 2017, doi: 10.1016/j.patcog.2017.06.036.
6. J. A. Montesinos, M. Martínez-Durbán, J. del Sagrado, I. M. del Águila, and F. J. Batlles, "The application of Bayesian network classifiers to cloud classification in satellite images," Renewable Energy, vol. 10, no. 97, pp. 155–161, 2016, doi: 10.1016/j.renene.2016.05.066.
7. A. Singh and K. K. Singh, "Satellite image classification using genetic algorithm trained radial basis function neural network, application to the detection of flooded areas," J. Vis. Commun. Image Representation, vol. 2, no. 42, pp. 173–182, 2016, doi: 10.1016/j.jvcir.2016.11.017.
8. L. Maa, M. Li, X. Mac, L. Cheng, P. Dua, and Y. Liu, "A review of supervised object-based land-cover image classification," ISPRS J. Photogram. Remote Sens., vol. 10, no. 130, pp. 277–293, 2017, doi: 10.1016/j.isprsjprs.2017.06.001.
9. Q. Liu, R. Hang, H. Song, and Z. Li, "Learning multiscale deep features for high-resolution satellite image scene classification," IEEE Trans. Geosci. Remote Sens., vol. 1, no. 56, pp. 117–126, Jan. 2018, doi: 10.1109/TGRS.2017.2743243.
10. D. Meraa, V. Bolon-Canedo, J. M. Cotosa, and A. Alonso-Betanzos, "On the use of feature selection to improve the detection of sea oil spills in SAR images," Comput. Geosci., vol. 1, no. 100, pp. 166–178, 2016, doi: 10.1016/j.cageo.2016.12.013.
11. Y. C. Chang, "A flexible contrast enhancement method with visual effects and brightness preservation: Histogram planting," Comput. Elect. Eng., vol. 10, no. 12, pp. 796–807, 2017, doi: 10.1016/j.compeleceng.2017.11.004.
12. S. Suresh and S. Lal, "Modified differential evolution algorithm for contrast and brightness enhancement of satellite images," Appl. Soft Comput., vol. 2, no. 61, pp. 622–641, 2017, doi: 10.1016/j.asoc.2017.08.019.
13. H. Singh, A. Kumar, L. K. Balyan, and G. K. Singh, "A novel optimally weighted framework of piecewise gamma corrected fractional order masking for satellite image enhancement," Comput. Elect. Eng., vol. 7, no. 10, pp. 245–261, 2017, doi: 10.1016/j.compeleceng.2017.11.014.

Qiong Qiao was born in Jining, Shandong, China, in 1989. She is currently studying with the School of Information and Communication Engineering, Communication University of China, Beijing, China. Her research interests include picture processing, artificial intelligence, and audio signal processing. Qiao received the master's degree in communication and information systems from the Communication University of China, in 2013. Contact her at [email protected].