Underwater Image Processing and Target Detection From Particle Swarm Optimization Algorithm

This paper presents a novel approach for underwater image processing and target detection using a combination of visual saliency analysis (VSA) and particle swarm optimization (PSO) within an intelligent blockchain framework. The proposed method enhances image quality, reduces distortion, and improves target detection accuracy compared to existing algorithms, demonstrating significant improvements in image entropy and reduced relative errors. The findings offer valuable insights for future research in underwater image processing and target detection technologies.


Signal, Image and Video Processing (2025) 19:132

https://doi.org/10.1007/s11760-024-03638-8

ORIGINAL PAPER

Underwater image processing and target detection from particle swarm optimization algorithm
Yangmei Zhang1 · Yang Bi1 · Junfang Li1

Received: 6 September 2023 / Revised: 2 September 2024 / Accepted: 11 October 2024 / Published online: 14 December 2024
© The Author(s) 2024

Abstract
Underwater images often fail to satisfy human visual perception because visible light is scattered by particles and absorbed by water as it propagates. Light absorption in particular distorts underwater images and reduces their contrast and brightness. This work therefore aims to improve the quality of underwater image processing, reduce the distortion rate of underwater images, and improve the efficiency of underwater image extraction, processing, and tracking. It combines intelligent blockchain technology from emerging multimedia industries with existing image processing technology to improve the target detection capability of image processing algorithms. Firstly, the theory of visual saliency analysis (VSA) is studied and the steps of image processing using VSA are analyzed; based on the original Itti model, the visual saliency detection step is optimized. Then, the theoretical basis and operation steps of the particle swarm optimization (PSO) algorithm in intelligent blockchain technology are studied. VSA theory is combined with PSO to design underwater image processing algorithms and target detection optimization algorithms for underwater images. The experimental results show that: (1) the method has a higher F value and a lower mean absolute error. (2) Compared with the original image, the entropy of the image restored by this method is greatly improved, so the image carries more information. The method also performs well in image definition, color, and brightness, and the quality of its restored images is better than that of other algorithms. (3) Compared with similar algorithms, the relative errors of this method are reduced by 2.56%, 3.24% and 3.89%, respectively, which indicates high accuracy. The research results can provide a reference for future underwater image processing and target detection research. In addition, the designed underwater image processing and target detection and tracking algorithms can improve the detection efficiency and accuracy of underwater targets and help to obtain underwater target images accurately.

Keywords Visual saliency analysis · Emerging multimedia · Underwater image processing · Particle swarm optimization
algorithm · Itti model · Target tracking

Corresponding author: Yangmei Zhang, [email protected]

1 School of Electronic Engineering, Xi’an Aeronautical Institute, Xi’an 710077, China

1 Introduction

The ocean is the origin of life, a crucial space for human survival, and a precious treasure for sustainable development. With the progress of society, the understanding of the strategic position and value of the ocean is deepening. The mineral resources in the ocean mainly include seabed oil, combustible ice, and natural gas [1]. Among these resources, oil and natural gas production has reached one-fourth of global production. Also, the ocean is rich in nickel, cobalt, and other mineral resources, with much higher content than land [2]. However, humans have only exploited about 5% of the ocean. Hence, it is of great significance to further explore and exploit the ocean. Nowadays, faced with the continuous shortage of land resources, human beings focus more on the development and rational application of marine resources. Because of the particularity of the marine environment, the exploration and development of the ocean are often beyond human capacity. To replace people in underwater operations, various underwater robots have come into being [3]. To improve people’s understanding of underwater images and expand the exploration of the marine field, the underwater target detection and tracking algorithm
is studied and optimized based on existing research to help improve the quality and efficiency of underwater target detection.

Underwater robot technology has gradually become a research hotspot. Underwater robots can replace humans in underwater operations such as marine fishery exploration, seawater sampling, and resource exploration. They can also overcome the limitations of human underwater operations, working at great depths and in harsh environments for long periods. For example, China’s manned submersible "Striver" has carried out deep-sea operations many times, with a maximum diving depth of more than 10,000 m [4]. As ocean exploration deepens, the role of underwater robots will become more important, and they have broad application prospects. At present, the most widely used unmanned underwater vehicles are divided into cabled remote-control vehicles and uncabled autonomous underwater vehicles. A cable-operated submersible is a device controlled from the surface and equipped with operating tools such as thrusters. The surface also powers its underwater televisions and underwater manipulators. On the other hand, a cable-free Autonomous Underwater Vehicle (AUV) has its own energy supply and a degree of intelligence, and can automatically complete navigation planning, obstacle avoidance, and operation implementation according to the underwater environment and operation tasks [5]. The environment sensing layer of the AUV is the basis of intelligent underwater vehicles. Therefore, AUV-based underwater target detection has important research significance. Accurate identification of underwater targets is a difficulty in AUV-based underwater target detection and is also an important research topic in computer vision [6]. Underwater target detection mainly relies on optical, acoustic and magnetic detection technologies. A single sensor is affected by various factors, which reduce the system’s reliability and identification accuracy and lead to misjudgment and misidentification of underwater targets. Multi-sensor information fusion can effectively improve the system’s robustness, expand the observation range, enhance the data credibility, and improve the system’s recognition ability [7]. The underwater target detection system based on the particle swarm optimization (PSO) algorithm works on images. It adopts image preprocessing, segmentation, morphological processing, feature extraction and other technologies to complete target recognition and realize machine vision. However, the detection objects in an underwater scene usually differ from the background, such as obstacles on the route, companions among cooperating underwater robots, and targets in underwater fishing operations [8]. Unlike general feature extraction methods, the target detection method based on visual saliency analysis (VSA) compares the target with the image background. Hence, it is more suitable for underwater scenes with relatively simple backgrounds [9]. Jian et al. [10] surveyed and summarized the relatively mature and representative underwater image processing models and divided them into seven categories: enhancement, defogging, noise reduction, segmentation, salient target detection, color constancy and restoration. They then objectively evaluated the current and future development trends of underwater image processing [10]. Qian et al. [11] converted the original low-illumination image from the red-green-blue color space to the hue-saturation-intensity color space. The image’s overall brightness was then adaptively improved using a bilateral gamma correction function and the cuckoo search algorithm. In addition, a brightness-preserving double histogram construction based on a visual saliency algorithm was proposed to perform brightness conservation and contrast analysis for low-illumination color images. Finally, the processed color space was converted to obtain the enhanced image [11]. Kannan [12] tried to identify objects in underwater images using an adaptive Gaussian mixture model. The Gaussian mixture model performs accurate object segmentation with a predefined number of clusters. The initialization of the parameter set by optimization techniques such as the genetic algorithm, PSO and differential evolution was analyzed. Differential evolution is known for accurate decision-making in fewer iterations and proved more suitable for initializing the number of clusters of the Gaussian mixture model. It was further used for object recognition together with the internal distance shape-matching technology [12]. To sum up, previous studies show that in underwater target detection and tracking, most scholars use visual saliency theory alone, or smart blockchain technology or PSO alone, and few researchers combine the two technologies. Besides, visual saliency analysis alone does not apply to more complex images, which greatly restricts its use. Therefore, this work combines the two methods and theories. While ensuring the quality of underwater image extraction, it expands the theory and application scope of the research methods.

Underwater image target detection and tracking technology is an important part of underwater optical vision technology. Image segmentation has always been a classic problem in image processing, especially for underwater images. Tracking and recognizing the target image is the key step of the whole target recognition system. According to the specific content of underwater image target detection and tracking, this work takes the underwater image target fusion detection and tracking method in the navigation process as its starting point. Based on visual saliency theory and smart blockchain technology, the image target detection algorithm and autonomous tracking are studied in combination with PSO. The research results expand the applicable field of the underwater target detection algorithm,
improve the quality and efficiency of underwater target detection, and provide further technical support and suggestions for subsequent autonomous underwater detection by robots. Meanwhile, they provide technical support for subsequent marine exploration and marine resource collection.

2 Experimental methods and procedures

2.1 Visual saliency analysis

VSA simulates the human visual attention mechanism, which ignores irrelevant areas in the image and focuses attention on the object of interest. Visual saliency detection highlights the salient object in the visual scene and obtains the binarized mask image of the object after segmentation [13]. Visual saliency detection is shown in Fig. 1:

Fig. 1 Visual saliency detection map: (a) underwater target map; (b) visual saliency processing; (c) target recognition result map

Underwater scenes are usually simple; the background is mostly water or seabed sand. In underwater target detection, the target generally differs from the background. Therefore, the VSA method can be used to detect objects in simple underwater scene images [14].

According to the information processing mechanism, there are two kinds of visual saliency detection. One is the bottom-up approach, which is data-driven: an area of the visual field that contrasts strongly with its surroundings will be noticed [15]. This approach has no task guidance and is usually guided by low-level visual features such as color, intensity, orientation, and texture. The other is the top-down approach, which is task-driven. It is related to preset targets, such as expected information, color, and object features [16].

The classic Itti model is a bottom-up approach, and the extracted features are color, brightness, and orientation. This work analyzes the visual saliency detection process based on the Itti model [17]. The Itti visual saliency model is a visual attention model designed after the early visual system of primates. The model first uses Gaussian sampling to construct Gaussian pyramids of image color, brightness and direction. Then, it uses the Gaussian pyramids to calculate the brightness, color and direction feature maps of the image. Finally, the brightness, color and direction saliency maps are obtained by combining the feature maps of different scales, and the final visual saliency map is obtained by adding them. The specific structure of the Itti model is shown in Fig. 2:

In Fig. 2, the model extracts the primary visual features of an input image: color, brightness, and orientation. It uses center-periphery operations at multiple scales to produce feature maps that embody saliency measures. After these feature maps are combined, the most salient spatial position in the image is obtained using the biological winner-take-all competition mechanism, and the position of attention is thereby selected. Finally, the inhibition-of-return method completes the focus shift [18]. The model mainly includes feature extraction, saliency map generation and attention focus shift.

(1) Feature extraction: in the feature extraction module, the map of primary visual features is extracted from the external input image information. The primary visual features mainly include color, motion, orientation, and brightness.

(2) Saliency map generation: images in natural environments contain many redundancies, and the brain can effectively remove these redundancies and focus on useful information. When the visual attention calculation model processes an image, it also needs to eliminate the redundant information in the natural image to prepare for the extraction of salient information. The information contained in a still image can be divided into two types: salient information and background information. Salient information can be obtained by removing the redundant information in the image. After the feature saliency maps of brightness, color and direction are obtained, they are fused to obtain the final comprehensive saliency map.

(3) Attention focus shift: after the final saliency map is generated, each target displayed in the saliency map competes for the focus of attention. This competition is realized by the Winner-Take-All (WTA) mechanism. The winner detected by the WTA mechanism is the focus of attention with the highest saliency. Since the target to be noticed is always the most salient among all the targets participating in the competition, it will always win. Without a specific control mechanism, the focus will always point to the same target, other targets will never get the chance to be noticed, and the focus of attention cannot be shifted. Attention and transfer of focus can be achieved through the inhibition-of-return mechanism, the principle of shifting to nearby targets, and the determination of the size of the attention area [19].

Firstly, feature extraction is carried out. A nine-layer Gaussian pyramid is constructed with a 5 × 5 Gaussian filter. The original image is used as the 0th layer, and each subsequent layer is obtained by filtering the previous one. Features are then extracted for each layer [20]. The specific calculation is shown in Eqs. (1)–(5):

$$I = \frac{r + g + b}{3} \tag{1}$$

$$R = r - \frac{g + b}{2} \tag{2}$$

$$G = g - \frac{r + b}{2} \tag{3}$$

$$B = b - \frac{r + g}{2} \tag{4}$$

$$Y = r + g - 2\left(|r - g| + b\right) \tag{5}$$

R, G, B and Y are the image’s red, green, blue and yellow color features. r, g, and b are the intensity values of the image’s red, green and blue channels. I is the brightness feature. Then, the feature maps are generated. According to the "center-periphery difference", the difference between the images of different pyramid layers is calculated to obtain the brightness, color and direction feature maps. Before the difference is calculated, image interpolation is used to unify the size of the images [21]. The calculation method is as follows.

$$I(c, s) = \left| I(c) \ominus I(s) \right| \tag{6}$$

$$RG(c, s) = \left| (R(c) - G(c)) \ominus (G(s) - R(s)) \right| \tag{7}$$

$$BY(c, s) = \left| (B(c) - Y(c)) \ominus (Y(s) - B(s)) \right| \tag{8}$$

$$O(c, s, \theta) = \left| O(c, \theta) \ominus O(s, \theta) \right| \tag{9}$$

$\ominus$ is the center-periphery difference operation, which unifies the image sizes by interpolating the smaller image and then subtracts [22]. RG(c, s) and BY(c, s) are the red-green color feature map and the blue-yellow color feature map. I(c, s) is the brightness feature map. O(c, s, θ) is the direction feature map for orientation θ. c is the index of the central layer, and s is the index of the surrounding layer.

Finally, a saliency map is generated. First, the feature maps of each feature are normalized to obtain the luminance, color, and orientation saliency maps. Then, the saliency map of each color feature is normalized to obtain the comprehensive saliency map.

In the preliminary detection, the saliency map mainly considers brightness, color, direction, and other factors. Depth also affects the saliency of underwater vision. Next, the preliminary detection results are combined with depth information to generate a comprehensive saliency map. The depth features are used to refine and optimize the saliency map and improve its quality [23].

Fig. 2 Itti model: (a) modules contained in Itti; (b) structure diagram of the Itti model. Panel (a): feature extraction (the extraction map of primary visual features is obtained in the feature extraction module); saliency generation (the redundant information in the image is removed and significant information is obtained); focus shift (the various goals shown in the saliency chart compete for attention through a mechanism). Panel (b): input object; linear filtering; color, brightness and direction; central-peripheral difference and normalization (12, 6 and 24 maps); cross-scale merging and normalization; linear merge; winner takes all.

2.2 Image recognition system based on smart blockchain technology

Blockchain technology is a new distributed infrastructure and computing paradigm. It uses the block-and-chain data structure to verify and store data, distributed node consensus algorithms to generate and update data, cryptography to secure data transmission and access, and smart contracts composed of automated script code to program and operate on data. In short, in a blockchain system, the transaction data generated by each participant is packaged into a data block after some time. The data blocks are arranged in chronological order to form a chain of data blocks. All participants hold the same data chain, which cannot be tampered with unilaterally [24]. Any modification of information requires the consent of an agreed proportion of participants. Besides, only new information can be added; old information cannot be deleted or modified. Blockchain thus achieves information sharing and consistent decision-making among multiple participants, and ensures that the identity of each participant and the transaction information between participants are tamper-proof, open and transparent. Compared with traditional networks, blockchain has two core characteristics: data are hard to tamper with, and the system is decentralized. Because of these two characteristics, the information recorded by the blockchain is more authentic and reliable [25].

The proposed image recognition system based on smart blockchain technology is a Web application with separated front end and back end. The front end is implemented using the Vue framework. Vue’s core is the view layer, which is easy to use and to integrate with third-party libraries or existing projects. Vue’s componentization also makes it easy to create single-page applications. The back end includes blockchain services and tamper detection services. A block is equivalent to a storage unit. Taking Bitcoin as an example, Bitcoin transaction records are stored in blocks. The Bitcoin system generates a block about every 10 min, and each data block generally includes two parts: a block header and a block body. The block header stores the hash values of the previous and current blocks, which link the blocks to each other, similar to a "linked list" in data structures [26].

The blockchain service of this system adopts the Hyperledger Fabric blockchain architecture. Due to hardware limitations, the nodes of the blockchain service are built in five Docker containers on one Linux server to simulate multiple blockchain server nodes. These nodes include two Peer nodes, a Certificate Authority (CA) node, an Orderer node, and a CouchDB node [27].

2.3 Particle swarm optimization

PSO is based on Swarm Intelligence (SI), like the ant colony algorithm. Its idea comes from artificial life and evolutionary computing theory. Kennedy and Eberhart first proposed PSO, which optimizes a problem by simulating the social behavior of birds [28]. In a sense, PSO, like evolutionary algorithms, is population-based, and each individual has an adaptation function value. The adjustment of each individual (called a particle) in PSO is also similar to the crossover operator in evolutionary algorithms, but it stems from the simulation of social behavior rather than the idea of survival of the fittest [29]. Unlike in evolutionary algorithms, each particle in PSO benefits from its previous motion history; no such mechanism exists in evolutionary algorithms. Moreover, PSO is simple to implement and has few parameters to adjust [30].

In a PSO system, a population containing a certain number of individuals (often referred to as particles) moves in the search space, where each particle represents a potential solution to a particular optimization problem. The position of each particle is influenced by the optimal position found in its own movement (individual experience) and by the optimal position found by the particles in its neighborhood (neighborhood experience) [31]. When the neighborhood of a particle is the whole particle population, the optimal position of the neighborhood corresponds to the global optimal particle, and the algorithm is called the global PSO algorithm. Accordingly, if a small neighborhood is used, it is usually called the local PSO algorithm. The global PSO converges fast but easily falls into a local minimum. By comparison, the local PSO can usually find a better solution, but it is slower. Additionally, a problem-related adaptation function is needed to evaluate the performance of each particle in different optimization problems [32].

Let f denote the adaptation function. The individual optimal position $y_i$ of particle i is then adjusted according to Eq. (10):

$$y_i(t+1) = \begin{cases} y_i(t), & \text{if } f(x_i(t+1)) \ge f(y_i(t)) \\ x_i(t+1), & \text{if } f(x_i(t+1)) < f(y_i(t)) \end{cases} \tag{10}$$

$x_i$ is the current position of particle i. $y_i$ is the individual optimal position of particle i [33].

Let the particle neighborhood size be l and the particle population size be s. When l < s, the algorithm is the local version of PSO. When l = s, that is, when the neighborhood of a particle is the whole population, the algorithm is the global version of PSO, and the optimal position of the population $\hat{y}$ is obtained from the following equation.

$$\hat{y}(t) \in \{y_0, \ldots, y_s\}, \quad f(\hat{y}(t)) = \min\{f(y_0), \ldots, f(y_s)\} \tag{11}$$

Neighborhood topologies influence the performance of the PSO algorithm. Each structure has its advantages, and different structures need to be selected for different practical problems. Star and ring are two commonly used topological structures [34]. The specific steps of the PSO algorithm are shown in Fig. 3:

Figure 3 shows the steps of the PSO algorithm.

(1) Initialize the particle swarm’s velocities and positions, the inertia factor, the acceleration constants, the maximum number of iterations, and the minimum error for terminating the algorithm.
(2) Evaluate the initial fitness value of each particle by substituting its position into the objective function.
(3) Take each particle’s initial adaptation value as its current local optimum (dependent variable) and its position as the position of the current local optimum (independent variable).
(4) Take the best local optimum (initial adaptation value) among all particles as the current global optimum.
(5) Update each particle’s flying speed with the velocity update equation, and clamp any speed that exceeds the maximum particle flying speed.
(6) Update the position of each particle with the displacement update equation.
(7) Compare each particle’s current adaptation value with its historical local optimum. If the current value is better, it becomes the particle’s local optimum, and the corresponding position becomes its local optimal position.
(8) Find the global optimum in the current particle swarm, and take the corresponding position as the global optimal position.
(9) Repeat steps (5)–(8) until the set minimum error is met or the maximum number of iterations is reached.
(10) Output the global optimum and its position, together with the local optima and local positions of the other particles [35].
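
As an illustration, steps (1)–(10) can be sketched in Python for the global version of PSO. This is a minimal sketch under stated assumptions, not the implementation used in this work. The function name `pso_minimize`, the `sphere` test objective, the random factors `r1` and `r2` of the canonical two-term velocity update, the 20% velocity clamp, and the stopping test against a target value of zero are all illustrative choices. The inertia weight 0.729 and the acceleration constants 1.49445 are the values quoted in Sect. 2.3.

```python
import random


def pso_minimize(f, bounds, n_particles=30, max_iter=200,
                 w=0.729, c1=1.49445, c2=1.49445, tol=1e-10, seed=0):
    """Global-best PSO sketch that minimizes f over box bounds (illustrative only)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Assumption: maximum speed is 20% of each dimension's search range.
    v_max = [0.2 * (hi - lo) for lo, hi in bounds]

    # Step 1: random initial positions and velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]

    # Steps 2-3: initial fitness values become the personal (local) optima.
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]

    # Step 4: the best personal optimum is the initial global optimum.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(max_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 5: velocity update, clamped to [-v_max, v_max].
                vel = (w * v[i][d]
                       + c1 * r1 * (pbest[i][d] - x[i][d])
                       + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-v_max[d], min(v_max[d], vel))
                # Step 6: position update, kept inside the search bounds.
                lo, hi = bounds[d]
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            # Step 7: keep the better of the current and personal-best positions.
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
        # Step 8: refresh the global optimum from the personal optima.
        g = min(range(n_particles), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
        # Step 9: stop early once the assumed target error is reached.
        if gbest_val <= tol:
            break

    # Step 10: return the global optimum position and its value.
    return gbest, gbest_val


def sphere(p):
    """Test objective whose minimum value 0 lies at the origin."""
    return sum(t * t for t in p)


best_pos, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
print(best_pos, best_val)
```

The sketch returns only the global optimum; returning the personal optima as well, as step (10) describes, would only require adding `pbest` and `pbest_val` to the return value.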


Fig. 3 Steps of the PSO algorithm (flowchart nodes: Start; Initialization parameters; Initialize particles; Calculate fitness; Find the global optimal particle; Update particle velocity and position; Recalculate particle fitness; Whether the number of iterations is reached; Final)
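
The local and global versions of PSO described above differ only in which individual optima a particle may consult when it selects its neighborhood best. The following minimal Python sketch contrasts the two, assuming minimization and a ring in which each particle sees `radius` neighbors on each side. The function names and the fitness values are hypothetical and serve only as an illustration.

```python
def global_best_index(pbest_val):
    """Global version (l = s): the neighborhood is the whole population."""
    return min(range(len(pbest_val)), key=lambda j: pbest_val[j])


def ring_best_index(pbest_val, i, radius=1):
    """Local version (l < s): best among particle i and its ring neighbors."""
    n = len(pbest_val)
    # Indices wrap around so that the first and last particles are adjacent.
    neighbors = [(i + k) % n for k in range(-radius, radius + 1)]
    return min(neighbors, key=lambda j: pbest_val[j])


pbest_val = [5.0, 3.0, 9.0, 1.0, 7.0]  # hypothetical individual-best fitness values
print(global_best_index(pbest_val))                          # 3
print([ring_best_index(pbest_val, i) for i in range(5)])     # [1, 1, 3, 3, 3]
```

With a ring, particles 0 and 1 are guided by particle 1 rather than by the global best particle 3, which is why information spreads more slowly in the local version.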

In the star structure, only one particle is selected as the reaches a certain threshold, the particle population is consid-
center and is connected to the other particles in the group, ered to have converged at this time. The iterative equation of
while all the other particles are connected to the center only. particle velocity is as follows:
For the ring topology, the particles are circularly distributed.
Each particle is connected to one particle on the left and    
k+1  w • v k + k y k − x k + k 
vid k k k
another on the right. In this work, von Neumann’s topology id 1 id id 2 yd − xid + k3 (y jd − xid )

establishes a new PSO model. In a von Neumann structure, (13)


each particle is connected to its neighborhood [36].
Suppose the current position of moving particles happens v is the velocity vector, and yidk is the optimal historical posi-
to be the optimal global position. In that case, the velocity tion of the i-th particle to the k-th time. w is the inertia factor,
iteration of particles will only depend on the iteration speed. which is non-negative. The larger the value is, the stronger
This will lead to "premature." Therefore, some researchers the global optimization ability is. The smaller the value is,
proposed an improved method to ensure PSO convergence the stronger the local optimization ability is.
to the local optimum. The strategy is to update the optimal k1 , k2 and k3 are self-set parameters. Equation (14) is the
global particle in a new way and reset the particle’s position calculation equation of y kjd .
to the global extremum point. The other particles are still iter-
atively updated according to the original equation. Compared  
f xik − f (y kj )
with the original PSO algorithm, the convergence speed of y kjd    (14)
 k k 
this method is greatly improved. The algorithm principle is y jd − xid 
as follows:
The parameter values of PSO are analyzed. (1) Popula-
(1) Randomly initialize the whole particle population.
tion size N: population size N affects the algorithm’s search
(2) Run the algorithm until it converges to a local optimum,
ability and calculation amount. PSO has low requirements for
and save the position of this point.
population size, and it can achieve good solution results when
(3) Repeat steps (1) and (2) until stopping criteria are met
taking a size of 20–40. However, for more difficult problems
[37].
or specific categories of problems, the number of particles
can be 100 or 200. (2) Particle length D: the particle length
In step (2), the convergence speed of the algorithm is con-
D is determined by the optimization problem itself, which
trolled according to the change rate of the objective function,
is the length of the solution. The optimization problem itself
and the calculation of the change rate reads:
determines the particle range R, and each dimension can be
y(t)) − f (
f ( y(t − 1)) set with different ranges. (3) Maximum speed: each dimen-
fratio  (12) sion of the maximum speed can generally take 10%–20% of
f (
y(t))
the search space of the corresponding dimension. (4) Inertia
If fratio is less than a self-defined threshold, then the weight: inertia weight controls the influence of the previ-
counter is incremented by one bit. When the counter finally ous speed on the current speed and is adopted to balance the

123
132 Page 8 of 14 Signal, Image and Video Processing (2025) 19:132

exploration and exploitation ability of the algorithm. Generally, it is set to decrease linearly from 0.9 to 0.4, and nonlinear decreasing schemes also exist. It can also be set by fuzzy control or drawn at random from [0.5, 1.0]. Setting it to 0.729, with k1, k2, and k3 set to 1.49445, is conducive to the algorithm's convergence. (5) Termination condition: the termination condition determines when the algorithm stops, and it depends on the specific application and the problem itself. The maximum number of cycles can be set to 500, 1000, or 5000, or a maximum number of function evaluations can be used. The algorithm can also terminate once an acceptable solution is obtained, or when it has not improved over a long run of iterations.
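The parameter choices above can be sketched in code. The following is a minimal global-best PSO and not the authors' implementation: it assumes a minimization problem, two acceleration coefficients `c1` and `c2` (the text lists three, k1 to k3), a speed limit of 15% of each dimension's range, and a fixed iteration budget as the only termination condition. The names `pso` and `sphere` and all parameter names are illustrative.

```python
import random


def pso(objective, bounds, n_particles=30, max_iter=200,
        w_start=0.9, w_end=0.4, c1=1.49445, c2=1.49445,
        vmax_frac=0.15, seed=0):
    """Minimize `objective` over the box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)  # particle length D
    # Maximum speed per dimension: a fraction (10%-20%) of that dimension's range.
    vmax = [vmax_frac * (hi - lo) for lo, hi in bounds]
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[rng.uniform(-vm, vm) for vm in vmax] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(max_iter):
        # Inertia weight decreases linearly from w_start to w_end.
        w = w_start - (w_start - w_end) * t / max(1, max_iter - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                v = max(-vmax[d], min(vmax[d], v))  # clamp to the maximum speed
                vel[i][d] = v
                lo, hi = bounds[d]
                pos[i][d] = max(lo, min(hi, pos[i][d] + v))  # stay inside range R
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso(sphere, [(-5.0, 5.0)] * 2)
```

A swarm of 30 particles sits inside the 20–40 range mentioned above. Swapping the iteration budget for a stagnation test or an acceptable-solution test would cover the other termination conditions the text describes.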

2.4 Underwater target image detection and analysis using PSO algorithm

Image segmentation is a key step from image processing to image analysis. Because of its simple principle, thresholding has become the most commonly used technique in image segmentation. In image research and applications, people are usually interested in only some parts or regions of the image. These parts are often referred to as the target or foreground (the other parts are referred to as the background). Generally, they correspond to specific, unique areas in the image. Detection objects must be separated and extracted from the image before they can be used further. In a broad sense, image segmentation groups and clusters image pixels according to similarity criteria on some image features or feature sets (including pixel gray level, color, and texture). The image plane is divided into several non-overlapping regions, each with some internal consistency: the features of pixels in the same region are similar, whereas pixel features change abruptly between different regions.
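As an illustration of threshold-based separation of target and background, here is a minimal sketch. It assumes an 8-bit grayscale image stored as nested lists and uses Otsu's between-class variance as the threshold criterion; that criterion is an assumption made for the example, since the text does not name one here. The names `otsu_threshold` and `segment` are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with gray level <= t
    sum0 = 0  # sum of the gray levels of those pixels
    for t in range(levels - 1):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the criterion is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def segment(image, t):
    """Label a pixel 1 (target) if it is brighter than t, else 0 (background)."""
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy image: a dark background with a bright region on the right.
image = [
    [10, 12, 11, 200],
    [9, 13, 210, 205],
    [11, 10, 198, 202],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
mask = segment(image, t)
```

The exhaustive scan over all gray levels is cheap for a single threshold. The search space grows quickly once several thresholds are chosen jointly, which is the usual motivation for a heuristic search such as PSO.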
The key to threshold segmentation is selecting the optimal threshold quickly and effectively. The proposed method utilizes the PSO algorithm to optimize the two threshold segmentation methods. The specific segmentation steps are shown in Fig. 4:

Fig. 4 Step diagram of the threshold segmentation algorithm

In Fig. 4, the threshold segmentation steps using PSO are as follows. (1) Set the iteration number t to 0. The size of the population S is defined as m, and the population is randomly initialized such that the position $p_{0}^{t}$ of each particle satisfies some predefined conditions. (2) According to the specific optimization problem, an appropriate objective function F() is established. The fitness value $F(p_{i}^{t})$ is estimated for each particle. (3) Set the position of the particle with the best fitness value in the population as the global best, gBest. (4) Adjust the movement speed of each particle. (5) Adjust the position of each particle. (6) Let t = t + 1 to carry out a new round of iteration. (7) Go back to step (2) and repeat the operation until the stopping criterion is met.

At the same time, in image recognition, the feature extraction of the target and the design of the classifier are the key to the whole recognition process. An important property of a computer system for image recognition is invariance to translation, rotation, and scaling of the image. Thus, invariance recognition is an important task in image recognition: the recognition result should be insensitive to the target's position, orientation, size, and deformation within a reasonable range. There are two ways to realize invariance recognition. (1) The invariance recognition ability of the classifier. (2)


The extracted features are invariant. Generally, both methods focus on the invariance of the extracted features, namely feature invariants. Besides, the classifier will also affect the recognition rate. The Artificial Neural Network (ANN), also known as Parallel Distributed Processing, is a network formed by the interconnection of many artificial neurons similar to natural nerve cells. ANN solves problems in a way completely different from traditional statistical methods. It simulates the human brain's thinking and connects many neurons into a complex network. The network is trained with known samples, similar to human brain learning. The ANN stores nonlinear relationships between variables, similar to the memory function of the human brain. The stored network information classifies or predicts unknown samples, similar to the associative function of the human brain. It is an intelligent data processing method. Its ability to deal with nonlinear relational data is unmatched by other methods.

ANN comprises many neurons with nonlinear mapping connected by weight coefficients. The information of the network is distributed and stored in the connection weight coefficients, which gives the network good parallel processing ability, nonlinear processing ability, and robustness. The basic processing unit of ANN, the neuron model, is shown in Fig. 5:

Fig. 5 ANN structure diagram a Neuron structure diagram; b Structure diagram of Backpropagation (BP) neural network

In Fig. 5a, the scalar input P is multiplied by the weight W to give WP. This product is fed into the accumulator and added to the bias value B to obtain the value N, often referred to as the net input. The net input is passed to the transfer function F, which produces the scalar output A = F(WP + B). Fig. 5b shows the BP neural network, where i denotes an input layer node, j a hidden layer node, and k an output layer node. This work targets image recognition of underwater targets. Thus, each input node of the network represents one component of the image feature vector, and the output node represents the category number. The BP algorithm is divided into two stages. In the first stage (the forward process), the input information passes from the input layer to the output layer, and the output value of each unit is calculated layer by layer. In the second stage (the backpropagation process), the output error of each unit in the hidden layers is calculated layer by layer from the output layer back toward the input layer, and this error is used to correct the weights of the previous layer. The activation function of the network nodes is the hyperbolic tangent function, so the input–output relation of the network in the forward process is defined through this activation function. The PSO algorithm is combined with the BP neural network, and the specific flow chart is shown in Fig. 6:

Figure 6 shows the recognition process combining the PSO algorithm and the BP neural network. Firstly, the invariant matrix features of the input image are extracted, and the one-dimensional invariant matrix features of the target image are taken as the recognition features. After feature extraction and vector standardization, the input image is sent to the neural network optimized by PSO for classification and discrimination. After classification and discrimination, the recognition results are output.

3 Results and discussion

3.1 Saliency target outcome analysis

The proposed method is tested on the Water-Net dataset. Four visual saliency algorithms, Context-Aware (CA), Histogram-based Contrast (HC), Graph-Regularized (GR), and Spectral Residual (SR), are selected for comparison. The F-measure value (an index of the degree of fit between the predicted saliency map and the ground-truth saliency map) and the Mean Absolute Error (MAE) of each algorithm are calculated and compared in Fig. 7:

Figure 7 reveals that the F value is 0.599 for CA, 0.769 for HC, 0.806 for GR, 0.777 for SR, and 0.843 for the proposed algorithm. The MAE value is 0.223 for CA, 0.23 for HC, 0.217 for GR, 0.271 for SR, and 0.166 for the proposed algorithm. Therefore, both GR and the proposed method have a large F value and a small MAE. Compared with other methods,


they have greater advantages, and the results are consistent with subjective evaluation. Compared with the GR method, the proposed algorithm has a higher F value and a lower MAE value. Therefore, the proposed algorithm is superior to the GR method.

Fig. 6 Image recognition flow chart

Fig. 7 Saliency target result graph a F value; b MAE

3.2 Analysis of underwater image sharpening

The underwater images used in the experiment come from a publicly available underwater image dataset containing 950 images. The proposed underwater image target detection algorithm is compared with the Dark Channel Prior (DCP) and Underwater DCP (UDCP) algorithms. For each algorithm, the two-dimensional image entropy, underwater color image quality, and image naturalness are evaluated against the original image in Fig. 8:

Fig. 8 Underwater image sharpening results a two-dimensional entropy comparison; b color quality result map; c naturalness result map

Figure 8 compares the restoration results in terms of the two-dimensional entropy of underwater images. For the six test samples, the DCP method yields entropies of 9.4434, 9.2938, 9.5743, 9.9875, 9.4773, and 10.3662. The image entropy of the image restored by the DCP method is slightly larger than that of the original image, so the method has some effect, but the effect is not obvious. The test sample results of the UDCP method are 9.789, 10.7861, 9.5438, 10.002, 10.7817, and 10.924, respectively. The test results of the designed algorithm are 10.885, 10.6195, 9.8634, 10.3524, 11.4143, and 11.0356. The designed algorithm therefore attains the highest entropy on five of the six samples (UDCP is slightly higher on the second sample), which indicates that the amount of information in the restored image has increased and that the designed method has an advantage. In the evaluation of underwater color image quality, the average score of the original image is 11.95, the average score of the DCP method is 15.13, the average score of the UDCP method is 21.76, and the average score of the designed underwater target detection algorithm is 27.82. Therefore, the quality of the image recovered by the DCP method is slightly higher than that of the original image, and the definition of the image recovered by the UDCP method is greatly improved. The UDCP method is better than the DCP method. Meanwhile, the designed method is superior to both the DCP method and the UDCP method in terms of clarity, color, and brightness. Image naturalness is also taken as an indicator. The average naturalness score of the original image is 11.97, the average score of the DCP method is 12.47, the average score of the UDCP method is 11.41, and the average score of the proposed method is 11.27. Hence,


UDCP and the proposed method have high image quality. In general, the effect of the proposed method is the best.

3.3 Image target recognition and tracking analysis

Different experimental samples are selected to study the classification error rate of the proposed underwater image object detection algorithm. At the same time, the gap and relative error between the proposed method and the common positioning method are compared at different positioning distances in Fig. 9:

Fig. 9 Positioning error result diagram a Comparison diagram of measurement results; b Error result

The experimental results in Fig. 9 show that the relative errors of the common underwater image target detection and tracking method at 800 mm, 1000 mm and 1200 mm are 5.41%, 6.10% and 6.88%, respectively. By comparison, the relative errors of the proposed method are 2.56%, 3.24% and 3.89%, respectively. The results show that the method has high accuracy. This is because it considers the influence of multiple underwater refraction and obtains complete and more accurate image parameters after calibration, which leads to more accurate positioning results.

4 Discussion

First, the test is conducted on the Water-Net dataset. Four visual saliency algorithms, CA, HC, GR and SR, are selected for comparison. The F-measure value and MAE of each algorithm are calculated. The results show that the proposed algorithm has an F value of 0.843 and an MAE value of 0.166, so its F value is higher and its MAE is lower than those of the other algorithms. Next, the designed underwater image target detection algorithm is compared with the DCP and UDCP algorithms, and the two-dimensional entropy, the underwater color image quality, and the image naturalness are evaluated for each algorithm. The average image quality score obtained by the designed underwater target extraction and tracking algorithm is 27.82, and the average image naturalness score is 11.27. Therefore, the overall effect of the designed method is better. Finally, the classification error rate of target recognition is analyzed. The results show that the relative errors of the designed method are 2.56%, 3.24% and 3.89%, respectively, suggesting a high accuracy. To sum up, the designed underwater target detection and tracking algorithm has better extraction accuracy and preserves image quality to a certain degree.

5 Conclusion

With the rapid progress of multimedia information technology and network technology, research on image processing and pattern recognition technology is increasingly extensive. Image processing and pattern recognition is a frontier subject with important theoretical research and practical application value, especially in the field of underwater vision. The development of smart blockchain technology can also help image recognition algorithms to further improve recognition accuracy. Combined with the PSO algorithm, this work designs an underwater image target detection and tracking algorithm based on VSA and smart blockchain technology. Meanwhile, the PSO algorithm is employed to optimize the underwater image target detection to a greater extent, and the existing underwater image database is used for simulation experiments. The results show that the designed algorithm has a higher F value, lower MAE and better performance


than other VSA algorithms. The average score of image quality extracted by the designed underwater target extraction and tracking algorithm is 27.82, and the average score of image naturalness is 11.27. Compared with other algorithms, the designed algorithm performs better in the overall extraction of underwater images and has better positioning accuracy in the target location of underwater images. The research innovation is that the research results can provide references and suggestions for target detection and tracking in subsequent underwater images. By combining the visual saliency algorithm with PSO, the quality and efficiency of underwater target extraction are optimized. The research limitation is that the research time is short and the sample size is limited. There are still some deficiencies in the scope and depth of the investigation, which need to be further expanded. Besides, underwater image target recognition technology is also keeping pace with the times, and new technologies will be constantly updated and used. Later, the theory and practice will be deeply combined for further research.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11760-024-03638-8.

Author contributions Zhang and Bi made substantial contributions to the conception and design of the work. Li was mainly responsible for data acquisition, analysis and interpretation. Bi wrote the main manuscript text. Zhang revised it critically for important intellectual content. All authors reviewed the manuscript, approved the version to be published and agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Funding National Natural Science Foundation of China Youth Fund under Grant 12004293. Aeronautical Science Foundation under Grant 2019ZH0T7001. Key Research and Development Program of Shaanxi under Grant No. 2023-YBGY-019 and 2024GX-YBXM-262.

Data and materials availability The data used to support the findings of this study are included within the article and Supplemental Files.

Declarations

Conflict of interest The authors declare no competing interests.

Ethical approval Not applicable.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

References

1. Zeng, L., Sun, B., Zhu, D.: Underwater target detection based on Faster R-CNN and adversarial occlusion network. Eng. Appl. Artif. Intel. 100, 104190 (2021)
2. Qi, J., Gong, Z., Xue, W., Liu, X., Yao, A., Zhong, P.: An unmixing-based network for underwater target detection from hyperspectral imagery. IEEE 14, 5470–5487 (2021)
3. Wei, X., Yu, L., Tian, S., Feng, P., Ning, X.: Underwater target detection with an attention mechanism and improved scale. Multimed. Tools Appl. 80(25), 33747–33761 (2021)
4. Qi, J., Gong, Z., Yao, A., Liu, X., Li, Y., Zhang, Y., Zhong, P.: Bathymetric-based band selection method for hyperspectral underwater target detection. Remote Sens. 13(19), 3798 (2021)
5. Wang, X., Zhu, Y., Li, D., Zhang, G.: Underwater target detection based on reinforcement learning and ant colony optimization. J. Ocean Univ. China 21(2), 323–330 (2022)
6. Zhang, D., Gao, L., Teng, T., Jia, Z.: Underwater moving target detection using track-before-detect method with low power and high refresh rate signal. Appl. Acoust. 174, 107750 (2021)
7. Shi, J., Zhuo, X., Zhang, C., Bian, Y.X., Shen, H.: Research on key technologies of underwater target detection. NPTA 11763, 1128–1137 (2021)
8. Zheng, Y., Yu, M., Liu, R., Liu, Y.: Underwater target detection based on deep neural network and image enhancement. J. Phys. Conf. Ser. 2029(1), 012145 (2021)
9. Zhang, L., Li, C., Sun, H.: Object detection/tracking toward underwater photographs by remotely operated vehicles (ROVs). Futur. Gener. Comput. Syst. 126, 163–168 (2022)
10. Jian, M., Liu, X., Luo, H., Lu, X., Yu, H., Dong, J.: Underwater image processing and analysis: a review. Signal Process. Image Commun. 91, 116088 (2021)
11. Qian, S., Shi, Y., Wu, H., Liu, J., Zhang, W.: An adaptive enhancement algorithm based on visual saliency for low illumination images. Appl. Intell. 52(2), 1770–1792 (2022)
12. Kannan, S.: Intelligent object recognition in underwater images using evolutionary-based Gaussian mixture model and shape matching. Signal Image Video Proc. 14(5), 877–885 (2020)
13. Li, X., Camerer, C.F.: Predictable effects of visual salience in experimental decisions and games. Q. J. Econ. 137(3), 1849–1900 (2022)
14. Krüger, A., Scharlau, I.: The time course of salience: not entirely caused by salience. Jpn. Psychol. Res. 86(1), 234–251 (2022)
15. Tay, D., Jannati, A., Green, J.J., McDonald, J.J.: Dynamic inhibitory control prevents salience-driven capture of visual attention. J. Exp. Psychol. Hum. Percept. Perform. 48(1), 37 (2022)
16. Rust, N.C., Cohen, M.R.: Priority coding in the visual system. Nature 23(6), 376–388 (2022)
17. Yutong, G., Khishe, M., Mohammadi, M., Rashidi, S., Nateri, M.S.: Evolving deep convolutional neural networks by extreme learning machine and fuzzy slime mould optimizer for real-time sonar image recognition. Int. J. Fuzzy Syst. 24(3), 1371–1389 (2022)
18. Beffara, B., Hadj-Bouziane, F., Hamed, S.B., Boehler, C.N., Chelazzi, L., Santandrea, E., Macaluso, E.: Dynamic causal interactions between occipital and parietal cortex explain how endogenous spatial attention and stimulus-driven salience jointly shape the distribution of processing priorities in 2D visual space. Neuroimage 255, 119206 (2022)
19. Lawrence, R.K., Pratt, J.: Salience matters: distractors may, or may not, speed target-absent searches. Atten. Percept. Psychophys. 84(1), 89–100 (2022)


20. Zhao, L., Bo, Q., Zhang, Z., Chen, Z., Wang, Y., Zhang, D.: Altered dynamic functional connectivity in early psychosis between the salience network and visual network. Neuroscience 491, 166–175 (2022)
21. Verma, G., Kumar, M.: Systematic review and analysis on underwater image enhancement methods, datasets, and evaluation metrics. J. Electron. Imaging 31(6), 060901 (2022)
22. Pahnehkolaei, S.M.A., Alfi, A., Machado, J.T.: Analytical stability analysis of the fractional-order particle swarm optimization algorithm. Chaos Solitons Fractals 155, 111658 (2022)
23. Cui, Y., Meng, X., Qiao, J.: A multi-objective particle swarm optimization algorithm based on two-archive mechanism. Appl. Soft Comput. 119, 108532 (2022)
24. Afroz, Z., Shafiullah, G.M., Urmee, T., Shoeb, M.A., Higgins, G.: Predictive modelling and optimization of HVAC systems using neural network and particle swarm optimization algorithm. Build. Environ. 209, 108681 (2022)
25. Zhang, J.: Processing and compression of underwater image based on deep learning. Optik 271, 170168 (2022)
26. Huang, H., Zuo, Z., Sun, B., Wu, P., Zhang, J.: DSA-SOLO: double split attention SOLO for side-scan sonar target segmentation. SN Appl. Sci. 12(18), 9365 (2022)
27. Hu, P., Pan, J.S., Chu, S.C., Sun, C.: Multi-surrogate assisted binary particle swarm optimization algorithm and its application for feature selection. Appl. Soft Comput. 121, 108736 (2022)
28. Li, L., Zhang, Y., Fung, J.C., Qu, H., Lau, A.K.: A coupled computational fluid dynamics and backpropagation neural network-based particle swarm optimizer algorithm for predicting and optimizing indoor air quality. Build. Environ. 207, 108533 (2022)
29. Hu, K., Weng, C., Zhang, Y., Jin, J., Xia, Q.: An overview of underwater vision enhancement: from traditional methods to recent deep learning. J. Mar. Sci. Eng. 10(2), 241 (2022)
30. Zhang, X., Wang, Z., Lu, Z.: Multi-objective load dispatch for microgrid with electric vehicles using modified gravitational search and particle swarm optimization algorithm. Appl. Energy 306, 118018 (2022)
31. Fernandes, P.B., Oliveira, R.C.L., Neto, J.F.: Trajectory planning of autonomous mobile robots applying a particle swarm optimization algorithm with peaks of diversity. Appl. Soft Comput. 116, 108108 (2022)
32. Li, N., Hou, G., Liu, Y., Pan, Z., Tan, L.: Single underwater image enhancement using integrated variational model. Digit. Signal Process. 129, 103660 (2022)
33. Supreeth, S., Patil, K.: Hybrid genetic algorithm and modified-particle swarm optimization algorithm (GA-MPSO) for predicting scheduling virtual machines in educational cloud platforms. Int. J. Emerg. Technol. 17(7), 208 (2022)
34. Han, F., Zheng, M., Ling, Q.: An improved multiobjective particle swarm optimization algorithm based on tripartite competition mechanism. Appl. Intell. 52(5), 5784–5816 (2022)
35. Gao, Q., Xu, H., Li, A.: The analysis of commodity demand predication in supply chain network based on particle swarm optimization algorithm. J. Comput. Appl. Math. 400, 113760 (2022)
36. Venker, C.E., Neumann, D., Aladé, F.: Visual perceptual salience and novel referent selection in children with and without autism spectrum disorder. Autism Dev. Lang. Impair. 7, 23969415221085476 (2022)
37. Gaspar, A., Oliva, D., Hinojosa, S., Aranguren, I., Zaldivar, D.: An optimized Kernel extreme learning machine for the classification of the autism spectrum disorder by using gaze tracking images. Appl. Soft Comput. 120, 108654 (2022)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
