YOLOv8n-FAWL Object Detection For Autonomous Driving Using YOLOv8 Network On Edge Devices
ABSTRACT In the field of autonomous driving, common challenges include difficulties in detecting
small vehicles and pedestrians on the road, high computational demands of algorithms, and low accuracy
of detection algorithms. This paper proposes a YOLOv8n-FAWL object detection algorithm tailored for
edge computing, incorporating the following three improvements: (1) The Faster-C2f-EMA module is
created, designed through the synergy of the FasterNet architecture and the concept of EMA modules,
effectively addressing the challenge of suboptimal feature extraction for small objects. (2) The WIOU loss
function is adopted to resolve the issue of imbalanced training samples. (3) The LAMP pruning technique
is applied to reduce the model parameters and complexity, thereby enhancing the overall model accuracy.
The experimental results show that compared to the baseline model, the proposed algorithm achieves
improvements of 6.2% and 4.5% in the [email protected], and 3.8% and 2.7% in the [email protected]:0.95, on the
Udacity and BDD100K-tiny datasets, respectively. In addition, the model parameters were reduced by 49.2%
and 46%. The model achieved real-time performance at 54 FPS, thereby advancing the development of
autonomous driving technology.
object detection models while ensuring detection accuracy. This not only helps improve the real-time response capability of autonomous driving systems but also facilitates the deployment and application of models on embedded devices with limited computational power [5].

Therefore, to address the aforementioned challenges, this study proposes a YOLOv8n-Faster-C2f-EMA-WIOU-LAMP (FAWL) model. The primary contributions of this model are as follows:

1. This study devised a module named Faster-C2f-EMA, which, upon comparison with the original model across two datasets, significantly enhanced the accuracy of the model in detecting small objects while maintaining its inference speed.

2. We adopted the WIoU [6] loss function, which reduces the harmful gradients generated by suboptimal training samples and enhances the overall detection performance of the model.

3. We propose employing the LAMP [7] pruning algorithm to reduce the model parameters and computational complexity while simultaneously enhancing the overall accuracy of the model, making it suitable for deployment on embedded devices. Compared with existing algorithms, the pruned model, when evaluated on two datasets, demonstrates significant improvements in both detection accuracy and speed for the YOLOv8n-FAWL algorithm.

4. We conducted tests on a high-performance Jetson Orin Nano edge computing device, achieving an improved model inference speed of 54 FPS, which meets the requirements for real-time detection. This validates the feasibility and effectiveness of the lightweight framework of the YOLOv8n-FAWL model in practical applications.

The rest of this paper is organized as follows: Section II introduces the evolution of YOLO series algorithms and lightweight model techniques. Section III details the YOLOv8n-FAWL algorithm. In Section IV, we present the experimental datasets, parameter settings, and evaluation metrics, along with the results of the ablation studies and comparative experiments. Section V summarizes the proposed algorithm and provides an outlook for future work.

II. RELATED WORK
A. YOLO SERIES ALGORITHMS
The YOLO series began with YOLOv1 [8], whose single-stage design yielded fast inference speed. Although YOLOv1 initially lagged behind Faster R-CNN [9] in terms of accuracy, its outstanding real-time performance makes it highly attractive for practical applications.

Subsequent YOLO versions, such as YOLOv4 [10], further improved detection accuracy by incorporating advanced deep learning techniques. YOLOv4 combines a cross-stage partial network (CSPNet) [11] and Spatial Pyramid Pooling (SPP) [12] to enhance the feature extraction and object detection capabilities. These enhancements contributed to the continuous improvement of the YOLO series in the field of autonomous driving.

In terms of current usage, YOLOv5 [13] and YOLOv7 [14] are the two most widely adopted algorithms. Compared to YOLOv4, YOLOv5 introduced improvements in model structure, training strategy, and performance, effectively reducing redundant computations and improving computational efficiency. However, YOLOv5 has certain drawbacks: it still shows deficiencies in small object detection, and its performance on dense objects also needs improvement.

YOLOv7 proposes a new training strategy called the Trainable Bag of Freebies (TOF) to enhance the performance of real-time object detectors. The TOF method encompasses a series of trainable techniques, such as data augmentation and MixUp. Applying TOF to three different types of object detectors (SSD, RetinaNet, and YOLOv3) can significantly improve their accuracy and generalization ability. However, YOLOv7 is also constrained by low-quality annotated data, model structures, and hyperparameters, which can lead to performance degradation in certain scenarios.

The latest YOLOv8 [15] further refined its network architecture and computational flow to enhance response speed and accuracy in high-speed driving environments. Depending on the depth and width of the network, YOLOv8 can be categorized as YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x.

The backbone section of YOLOv8 is largely similar to that of YOLOv5, leveraging the CSP (Cross Stage Partial) concept but substituting the C3 modules with C2f modules. The C2f module, featuring a dense residual structure, enables YOLOv8 to obtain richer gradient flow information while maintaining a lightweight design. At the end of the backbone, the widely used SPPF (Spatial Pyramid Pooling Fast) module is employed. The SPPF layer enlarges the receptive field and captures feature information across different levels.

The neck part employs the PAN-FPN feature fusion method to strengthen the representation of the entire feature hierarchy, as well as the fusion and utilization of information, through a bottom-up path augmentation approach.

Finally, the Head section obtains the target class information by decoupling the classification and detection processes through three prediction branches.

In summary, despite the significant progress made by the YOLO series in the field of autonomous driving, challenges remain in areas such as adapting to low-quality annotated data, improving small-object detection performance, and streamlining deployment on embedded platforms. Therefore, this study focuses on lightweighting YOLOv8n. By integrating novel techniques and methods, the model achieves real-time performance while further enhancing detection accuracy and robustness, providing a more reliable perception capability for autonomous driving systems.

B. MODEL LIGHTWEIGHTING TECHNIQUES
In the realm of model lightweighting, the common approaches include replacing the backbone and pruning.

For backbone replacement, lightweight networks such as ShuffleNet [16], MobileNet [17], and GhostNet [18] are commonly used. They effectively reduce floating-point operations (FLOPs) but may increase memory access latency or data manipulation overhead. To address this issue, this study draws inspiration from the design philosophy of the FasterNet [19] network and creates a Faster-C2f module. However, the experimental results showed subpar accuracy, which we speculated was due to insufficient feature scale extraction by Faster-C2f. To improve this, we integrated the EMA attention mechanism [20] into Faster-C2f and redesigned it as Faster-C2f-EMA. As detailed in Section 4.4.5, the experimental results demonstrate that Faster-C2f-EMA achieves significant improvements in accuracy and robustness for small object detection.

Pruning reduces the size of a network by eliminating redundant connections or neurons, and can be performed in a structured or fine-grained manner. Techniques such as DeepLayer [21] and TropNNC [22] are examples of structured pruning: DeepLayer reduces complexity by progressively discarding entire layers, whereas TropNNC, based on the principles of tropical geometry, aims to decrease model complexity by reducing redundant connections or neurons in the network. Slimming [23], on the other hand, is an example of fine-grained pruning that trims parameters by removing unimportant weights or neurons. Zhang et al. [24] proposed an adaptive pruning method for optimizing lightweight transformers (PAOLTransformer), which uses norm information to assess the contribution of each element in the model to the output and automates the pruning process through reinforcement learning to achieve the optimal compression ratio. PSE-Net [25] is a new approach for channel pruning in convolutional neural networks that accelerates supernetwork training using a parallel subnet training algorithm and employs a prior distribution sampling strategy to identify optimal substructures subject to resource constraints. The LAMP algorithm adopted in this study integrates hierarchical automatic pruning with adaptive pruning rate adjustments customized to the characteristics of each layer, thereby effectively eliminating redundant data in the model.

Finally, after being accelerated with TensorRT [26], the improved model was successfully deployed on the embedded device Jetson Orin Nano, meeting the standards for real-time detection and laying a solid foundation for the future development and application of lightweight autonomous driving algorithms.

III. ENHANCED LIGHTWEIGHT YOLOV8N ALGORITHM
A. SYSTEM OVERVIEW
To address the challenges of low accuracy and limited portability inherent in traditional autonomous driving detection networks, this study presents a streamlined YOLOv8n-FAWL model. The model integrates enhancements across three key areas of the YOLOv8n network, and its core architecture is depicted in Figure 2. Initially, the faster-block-EMA module was designed to replace the original C2f module in the backbone. Following this, the Wise-IoU loss function was introduced as a replacement for the original loss function in the YOLOv8 detection mechanism. In the final stage, we employ the LAMP pruning algorithm to compress the model and reduce its complexity.

B. FASTER-C2F-EMA LIGHTWEIGHT MODULE
To address the issue of large model size affecting detection speed in autonomous driving detection tasks, this study employs a self-designed faster-block-EMA module to replace the bottleneck part of the original C2f module in YOLOv8n. This results in a new module named Faster-C2f-EMA, as illustrated in Figure 3. The design emphasis of this module is to accelerate inference and enhance the detection of small objects. Integrating the FasterBlock module effectively improved the inference speed; however, it is limited in terms of feature extraction capability. To address this issue and strengthen the detection of small objects, this study incorporates the EMA attention mechanism into the module.
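To make the design concrete, the following is a minimal PyTorch sketch of the two ingredients combined in this module: a FasterNet-style partial convolution (PConv), which convolves only a fraction of the channels to cut FLOPs, followed by a residual bottleneck with a lightweight channel-attention gate standing in for EMA. The class names, partial ratio, and simplified gate are illustrative assumptions, not the authors' exact implementation (EMA itself additionally performs grouped, cross-spatial attention).

```python
# Illustrative sketch of a FasterNet-style block with an attention gate.
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """FasterNet-style PConv: a 3x3 conv over the first `ratio` of the
    channels; the remaining channels pass through untouched."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = max(1, int(channels * ratio))
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        head, tail = x[:, :self.conv_ch], x[:, self.conv_ch:]
        return torch.cat((self.conv(head), tail), dim=1)

class FasterBlockEMA(nn.Module):
    """Hypothetical faster-block-EMA: PConv, pointwise expand/reduce,
    then a squeeze-and-excite-style gate as a stand-in for EMA."""
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PartialConv(channels)
        self.pw = nn.Sequential(
            nn.Conv2d(channels, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1, bias=False),
        )
        self.gate = nn.Sequential(  # channel attention (simplified vs. EMA)
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.pw(self.pconv(x))
        return x + y * self.gate(y)  # residual connection

if __name__ == "__main__":
    block = FasterBlockEMA(64)
    print(block(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 64, 80, 80])
```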
For pruning, LAMP assigns each weight a layer-adaptive importance score, defined as

$$\mathrm{score}(u; W) = \frac{(W[u])^2}{\sum_{v \ge u} (W[v])^2} \tag{4}$$
In this equation, the variables $v$ and $u$ represent different parameters within the network, and $W$ denotes the weight parameters of the neural network. $W[u]$ is the weight of the $u$-th parameter. The term $\sum_{v \ge u} (W[v])^2$ represents the sum of squares of all weights starting from the $u$-th weight (inclusive) up to the last weight in the current layer. This formula rescales the magnitude of each weight by computing its ratio to the sum of the squares of all subsequent weights in the same layer, resulting in a layer-adaptive pruning importance score. This score aims to approximate the distortion in the model output caused by pruning, enabling efficient neural network pruning through the automatic selection of sparsity levels for each layer without the need for hyperparameter tuning.
The pruning process based on the importance of the weights is illustrated in Figure 5. After the model input, LAMP calculates the importance of each weight, resulting in LAMP scores. The pruning algorithm then removes the weights with lower scores, leading to a pruned model. Finally, fine-tuning is performed as needed to recover some of the performance losses.

In this study, we controlled the extent of pruning by introducing an additional parameter, Speed_up, which is numerically equivalent to the model's computational cost before pruning divided by the target computational cost after pruning. For instance, if the computational cost of the model before pruning is 8.2 GFLOPs and the target computational cost after pruning is 5.47 GFLOPs, Speed_up can be set to 1.5. This allows the model to be pruned automatically until it reaches the target computational cost, at which point the pruning process stops. Similarly, if the target computational cost after pruning is 4.1 GFLOPs, Speed_up can be set to 2.
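As a concrete illustration of Eq. (4), the per-layer score and a one-shot global magnitude pruning pass can be sketched in PyTorch as below. This is a simplified sketch of the published LAMP procedure [7], not the exact pipeline used in this study: the Speed_up-driven stopping criterion is replaced by a plain global sparsity target, and weights are zeroed rather than structurally removed.

```python
# Simplified sketch of LAMP scoring (Eq. 4) and one-shot global pruning.
import torch
import torch.nn as nn

def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """score(u; W) = W[u]^2 / sum_{v >= u} W[v]^2, where v >= u ranges
    over the weights in the layer with magnitude at least that of W[u]."""
    w2 = weight.detach().flatten() ** 2
    sorted_w2, order = torch.sort(w2)  # ascending by magnitude
    # Suffix sums: for each sorted position, the sum of squares of all
    # weights from that position (inclusive) to the end of the layer.
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_w2, [0]), 0), [0])
    scores = torch.empty_like(w2)
    scores[order] = sorted_w2 / suffix  # scatter back to original order
    return scores.view_as(weight)

def prune_global(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the fraction `sparsity` of conv weights with the lowest
    LAMP scores, compared across all layers on a common scale."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    all_scores = torch.cat([lamp_scores(m.weight).flatten() for m in convs])
    threshold = torch.quantile(all_scores, sparsity)
    for m in convs:
        mask = (lamp_scores(m.weight) > threshold).float()
        m.weight.data.mul_(mask)

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
prune_global(model, sparsity=0.5)  # keeps the 50% highest-scored weights
```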
IV. EXPERIMENTS AND ANALYSIS
A. TRAINING ENVIRONMENT AND METHODOLOGY
The research in this study involved training the model on the server side and conducting an initial model evaluation, followed by deployment to the embedded device Jetson Orin Nano with acceleration through TensorRT for model validation. The server-side experimental environment configurations used in this study are presented in Table 1, and the training parameters are shown in Table 2.

For this experiment, two classic datasets in the field of autonomous driving, the Udacity Self-Driving Car Dataset [30] and BDD100K-tiny, were used for testing. First, we experimented with the model improvement methods on the Udacity dataset; the final improvements were then validated on the BDD100K-tiny dataset.

The Udacity Self-Driving Car Dataset, originally designed for autonomous vehicle algorithm competitions, contains 15,000 urban road images. In this experiment, the entire dataset was randomly divided into three distinct subsets, training, validation, and testing, in a 6:2:2 distribution ratio. As depicted in Figure 6, the targets in this dataset were predominantly concentrated around the central point, and small targets constituted a significant proportion of the dataset.

The BDD100K [31] dataset, released by the University of California, Berkeley, is a large-scale and diverse dataset for research in the field of autonomous driving. From the 70,000 images in the dataset, 20,000 containing at least two types of objects were randomly selected. These images were also randomly divided into training, validation, and testing subsets, following the same 6:2:2 ratio, to form a new dataset named BDD100K-tiny, as shown in Figure 7. Compared with the Udacity dataset, the target distribution in this dataset was more uniform, and the proportion of small targets was relatively smaller.

Subsequently, the enhanced model was deployed on a Jetson Orin Nano platform to evaluate its performance, with the specific test environment parameters for the Jetson Orin Nano delineated in Table 3.
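As a small aside on data preparation, a 6:2:2 random split such as the one described above can be produced with a few lines of Python; the directory layout, file extension, and seed below are assumptions for illustration, not the study's actual tooling.

```python
# Minimal sketch of a reproducible 6:2:2 train/val/test split.
import random
from pathlib import Path

def split_dataset(image_dir: str, seed: int = 0):
    images = sorted(Path(image_dir).glob("*.jpg"))  # deterministic order
    random.Random(seed).shuffle(images)             # seeded shuffle
    n = len(images)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return (images[:n_train],                       # 60% training
            images[n_train:n_train + n_val],        # 20% validation
            images[n_train + n_val:])               # 20% testing

train, val, test = split_dataset("udacity/images")  # hypothetical path
```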
B. EVALUATION METRICS
This study primarily focused on several evaluation metrics. (1) Precision (P) indicates the proportion of correctly predicted positive samples among all predicted positive samples, calculated as $P = TP/(TP + FP)$, where $TP$ and $FP$ denote the numbers of true and false positives, respectively. (4) This study also used Giga Floating-Point Operations (GFLOPs) and the number of parameters to measure the complexity of the models.
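For metric (4), the parameter count can be read directly from the model, while GFLOPs are usually obtained with a profiler. The sketch below uses the third-party thop package as one possible tool (an assumption; the study does not name its profiler), and thop counts multiply-accumulate operations, which detection papers commonly quote as FLOPs.

```python
# Sketch of metric (4): parameter count and approximate GFLOPs.
import torch
import torch.nn as nn
from thop import profile  # pip install thop (assumed tooling)

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 32, 3, padding=1))
params = sum(p.numel() for p in model.parameters())
macs, _ = profile(model, inputs=(torch.randn(1, 3, 640, 640),), verbose=False)
print(f"params: {params / 1e6:.3f} M, complexity: {macs / 1e9:.2f} GFLOPs")
```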
TABLE 4. Udacity dataset comparison experiment.

Upon applying LAMP pruning to YOLOv8n-FAW, as indicated in Table 5, the model attained peak accuracy when the speed_up factor was set to 1.2. Consequently, we designated this optimized model YOLOv8n-FAWL. Figure 9 shows a comparison of the model channel configurations before and after pruning.

Numerical results alone may not fully convey the improvement in detection performance. To visually demonstrate the improved detection capabilities resulting from these modifications, this study first presents a heatmap visualization, as shown in Figure 10. This figure compares the heatmaps generated by the YOLOv8n-FAWL and YOLOv8n algorithms. As shown in the heatmaps, the YOLOv8n-FAWL model exhibits a higher degree of focus on small objects than the YOLOv8n model. The results indicate that the YOLOv8n-FAWL algorithm pays more attention to feature information, resulting in higher sensitivity to target detection and better performance.

Figure 11 shows a comparison of the detection results between the YOLOv8n-FAWL and YOLOv8n algorithms on the Udacity and BDD100K-tiny datasets, showing the actual detection outcomes under various conditions, including daytime, nighttime, significant changes in lighting conditions, and environments with numerous small targets. It is evident that the YOLOv8n-FAWL algorithm exhibits superior generalization and applicability in different scenarios. It can detect small targets in various environments. By contrast, the YOLOv8n network struggles to detect smaller targets. Additionally, compared to the YOLOv8n algorithm, the YOLOv8n-FAWL algorithm achieves similar confidence levels when detecting large and medium-sized targets, demonstrating its robustness. This underscores the potential of YOLOv8n-FAWL for applications in diverse autonomous driving scenarios with high accuracy.

In summary, this study enhances the YOLOv8n model by introducing the Faster-C2f-EMA module to improve its backbone C2f module. This modification reduces the model's parameter count and computational load while enhancing detection accuracy. Additionally, replacing the loss function with Wise-IoU effectively handles low-quality samples during training, thereby improving the model's overall performance and generalization capabilities. To further lighten the model, the LAMP pruning algorithm is utilized, which compresses the model while maintaining the original network structure and image feature representation, thereby eliminating the redundant weights in each layer. This results in improved robustness of the model and an additional increase in detection precision. Therefore, the improved YOLOv8n-FAWL model, which is characterized by fewer parameters and computations yet achieves the highest recognition accuracy, is suitable for deployment on embedded devices.

The final improved model, YOLOv8n-FAWL, reduces the number of parameters and computational requirements to 49.2% and 74.4% of their original values, respectively. Compared with the original model, the model's average precision metrics, [email protected] and [email protected]:0.95, increased by 6.2% and 3.8%, respectively.

We also conducted ablation experiments on the BDD100K-tiny dataset, as shown in Table 7. The final improved model, YOLOv8n-FAWL, reduces the number of parameters and computational requirements to 49.2% and 74.4% of their original values, respectively. Compared with the original model, the average precision metrics of the model, [email protected] and [email protected]:0.95, increased by 4.5% and 2.7%, respectively. In summary, the improved model has made significant progress in reducing the parameter count and computational intensity, while also achieving considerable enhancement in model precision. It exhibits good portability and is suitable for autonomous driving detection tasks in complex road environments.

Table 9 presents the ablation results for the module replacement locations, where the original C2f in the backbone was replaced with Faster-C2f, as well as with Faster-C2f fused with an attention mechanism. It can be observed that the Faster-C2f-EMA designed in this study achieved the best results in all aspects.

TABLE 9. Ablation experiment results of module replacement locations.
4) EMBEDDED DEVICE EXPERIMENT ANALYSIS
To ensure real-time inference of the model on edge devices, the following acceleration measures were taken for the model proposed in this paper. First, the improved model was converted into the universal ONNX model file format. Subsequently, TensorRT parses the ONNX model to create an FP32 engine model file. This approach maintains floating-point precision, resulting in a minimal impact on the accuracy of the detection model, thereby demonstrating its broad applicability. In this study, an external USB camera was used to measure real-time FPS data while deploying the model on a Jetson Orin Nano for live detection. Concurrently, the converted model was employed to perform inference on the validation set images, and the results were saved as JSON files for testing the average precision at [email protected] and [email protected]:0.95.
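The deployment flow above can be sketched as follows, assuming the Ultralytics toolchain and TensorRT's trtexec on the Jetson; the weight and engine file names are placeholders, and this is an illustrative outline rather than the authors' exact scripts.

```python
# Hypothetical sketch of the ONNX export, FP32 engine build, and FPS test.
import time
import cv2
from ultralytics import YOLO

# 1) Export the trained model to the universal ONNX format.
model = YOLO("yolov8n-fawl.pt")     # hypothetical weights file
model.export(format="onnx")         # writes yolov8n-fawl.onnx

# 2) Build an FP32 TensorRT engine from the ONNX file, e.g. with:
#    trtexec --onnx=yolov8n-fawl.onnx --saveEngine=yolov8n-fawl.engine
#    (omitting --fp16/--int8 keeps full floating-point precision)

# 3) Measure end-to-end FPS from a USB camera with the built engine.
engine = YOLO("yolov8n-fawl.engine")  # Ultralytics can run .engine files
cap = cv2.VideoCapture(0)
frames, start = 0, time.perf_counter()
while frames < 300:
    ok, frame = cap.read()
    if not ok:
        break
    engine(frame, verbose=False)      # one inference per camera frame
    frames += 1
print(f"FPS: {frames / (time.perf_counter() - start):.1f}")
cap.release()
```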
V. CONCLUSION
For the autonomous driving recognition task studied in this paper, considering the significant variations in lighting conditions and the uneven distribution of target scales across the two datasets, as well as the need to ensure the model's portability and real-time performance, we made improvements to the high-performance YOLOv8n model. This led to the creation of a lightweight detection model called YOLOv8n-FAWL. Through a series of ablation experiments and comparative evaluations against other object detection models, including experimental tests on embedded boards, the following conclusions were drawn:

(1) By integrating the Faster-Block from the FasterNet network with an EMA attention mechanism, the bottleneck module in the C2f segment of the YOLOv8n backbone was replaced, creating a lightweight Faster-C2f-EMA module. In addition, the Wise-IoU loss function was introduced as a replacement for the CIoU loss function. The model was then compressed using the LAMP pruning algorithm. The improved lightweight model reduced the computational complexity and parameter size to 74.3% and 49.2% of the original model, respectively. In server-side evaluations, the enhanced model shows a 6.1% increase in [email protected] and a 3.7% increase in [email protected]:0.95 over the original YOLOv8n, indicating that the enhanced model not only reduced its volume but also significantly enhanced recognition accuracy while remaining lightweight.

(2) Compared with current mainstream autonomous driving detection methods, the lightweight detection model improved from YOLOv8n in this study, after being accelerated by TensorRT, can be deployed on embedded devices to achieve rapid and precise recognition while meeting the requirements for real-time detection. This validates the feasibility and portability of the YOLOv8n-FAWL detection model and offers valuable information for the potential deployment of future autonomous driving detection models on mobile devices.

(3) A limitation of this study is that the improved model does not demonstrate a significant increase in inference speed. In subsequent research, we plan to perform model distillation while maintaining accuracy to further reduce the number of model parameters and alleviate the computational burden on hardware systems in autonomous driving applications.

ACKNOWLEDGMENT
The authors would like to thank Prof. Rongrong Chen and Prof. Wuyang Xue for their guidance in the experiments and for revising the article during the research period. This article solely utilized AI to assist in translating sentences.

REFERENCES
[1] O. Tuzel, F. Porikli, and P. Meer, "Pedestrian detection via classification on Riemannian manifolds," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 10, pp. 1713–1727, Oct. 2008.
[2] S. Ren, K. He, and R. Girshick, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Proc. Adv. Neural Inf. Process. Syst., vol. 28, 2015.
[3] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal loss for dense object detection," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2999–3007.
[4] R. Huang, J. Pedoeem, and C. Chen, "YOLO-LITE: A real-time object detection algorithm optimized for non-GPU computers," in Proc. IEEE Int. Conf. Big Data, Dec. 2018, pp. 2503–2510.
[5] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," 2018, arXiv:1804.02767.
[6] Z. Tong, Y. Chen, Z. Xu, and R. Yu, "Wise-IoU: Bounding box regression loss with dynamic focusing mechanism," 2023, arXiv:2301.10051.
[7] J. Lee, S. Park, S. Mo, S. Ahn, and J. Shin, "Layer-adaptive sparsity for the magnitude-based pruning," 2020, arXiv:2010.07611.
[8] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788.
[9] R. Girshick, "Fast R-CNN," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 1440–1448.
[10] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal speed and accuracy of object detection," 2020, arXiv:2004.10934.
[11] C.-Y. Wang, H.-Y. M. Liao, Y.-H. Wu, P.-Y. Chen, J.-W. Hsieh, and I.-H. Yeh, "CSPNet: A new backbone that can enhance learning capability of CNN," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2020, pp. 1571–1580.
[12] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial pyramid pooling in deep convolutional networks for visual recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 37, no. 9, pp. 1904–1916, Sep. 2015.
[13] X. Zhu, S. Lyu, X. Wang, and Q. Zhao, "TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios," in Proc. IEEE/CVF Int. Conf. Comput. Vis. Workshops (ICCVW), Oct. 2021, pp. 2778–2788.
[14] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 7464–7475.
[15] P. Yan, W. Wang, G. Li, Y. Zhao, J. Wang, and Z. Wen, "A lightweight coal gangue detection method based on multispectral imaging and enhanced YOLOv8n," Microchemical J., vol. 199, Apr. 2024, Art. no. 110142.
[16] X. Zhang, X. Zhou, M. Lin, and J. Sun, "ShuffleNet: An extremely efficient convolutional neural network for mobile devices," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2018, pp. 6848–6856.
[17] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," 2017, arXiv:1704.04861.
[18] K. Han, Y. Wang, Q. Tian, J. Guo, C. Xu, and C. Xu, "GhostNet: More features from cheap operations," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 1577–1586.
[19] J. Chen, S.-H. Kao, H. He, W. Zhuo, S. Wen, C.-H. Lee, and S.-H. G. Chan, "Run, don't walk: Chasing higher FLOPS for faster neural networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 12021–12031.
[20] D. Ouyang, S. He, G. Zhang, M. Luo, H. Guo, J. Zhan, and Z. Huang, "Efficient multi-scale attention module with cross-spatial learning," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Jun. 2023, pp. 1–5.
[21] S. Gao, F. Huang, W. Cai, and H. Huang, "Network pruning via performance maximization," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2021, pp. 9266–9276.
[22] K. Fotopoulos, P. Maragos, and P. Misiakos, "TropNNC: Structured neural network compression using tropical geometry," 2024, arXiv:2409.03945.
[23] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang, "Learning efficient convolutional networks through network slimming," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 2755–2763.
[24] X. Zhang, J. Sun, J. Wang, Y. Jin, L. Wang, and Z. Liu, "PAOLTransformer: Pruning-adaptive optimal lightweight transformer model for aero-engine remaining useful life prediction," Rel. Eng. Syst. Saf., vol. 240, Dec. 2023, Art. no. 109605.
[25] S. Wang, T. Xie, H. Liu, X. Zhang, and J. Cheng, "PSE-Net: Channel pruning for convolutional neural networks with parallel-subnets estimator," Neural Netw., vol. 174, Jun. 2024, Art. no. 106263.
[26] G. Jocher, A. Chaurasia, A. Stoken, J. Borovec, Y. Kwon, J. Fang, K. Michael, D. Montes, J. Nadar, P. Skalski, and Z. Wang, "Ultralytics/YOLOv5: v6.1 - TensorRT, TensorFlow Edge TPU and OpenVINO export and inference," Zenodo, Tech. Rep., 2022.
[27] Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, "ECA-Net: Efficient channel attention for deep convolutional neural networks," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 11531–11539.
[28] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional block attention module," in Proc. Eur. Conf. Comput. Vis. (ECCV), 2018, pp. 3–19.
[29] Q. Hou, D. Zhou, and J. Feng, "Coordinate attention for efficient mobile network design," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2021, pp. 13708–13717.
[30] A. Buyval, A. Gabdullin, R. Mustafin, and I. Shimchik, "Realtime vehicle and pedestrian tracking for DiDi Udacity self-driving car challenge," in Proc. IEEE Int. Conf. Robot. Autom. (ICRA), May 2018, pp. 2064–2069.
[31] F. Yu, H. Chen, X. Wang, W. Xian, Y. Chen, F. Liu, V. Madhavan, and T. Darrell, "BDD100K: A diverse driving dataset for heterogeneous multitask learning," 2018, arXiv:1805.04687.
[32] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single shot multibox detector," in Proc. 14th Eur. Conf. Comput. Vis. (ECCV), Amsterdam, The Netherlands, Oct. 2016, pp. 21–37.
[33] L. Fu, Y. Feng, J. Wu, Z. Liu, F. Gao, Y. Majeed, A. Al-Mallahi, Q. Zhang, R. Li, and Y. Cui, "Fast and accurate detection of kiwifruit in orchard using improved YOLOv3-tiny model," Precis. Agricult., vol. 22, no. 3, pp. 754–776, Jun. 2021.
[34] Y. Zhao, W. Lv, S. Xu, J. Wei, G. Wang, Q. Dang, Y. Liu, and J. Chen, "DETRs beat YOLOs on real-time object detection," 2023, arXiv:2304.08069.
[35] A. Wang, H. Chen, L. Liu, K. Chen, Z. Lin, J. Han, and G. Ding, "YOLOv10: Real-time end-to-end object detection," 2024, arXiv:2405.14458.

ZIBIN CAI was born in Shaoguan, Guangdong, China, in 2001. He is currently pursuing the bachelor's degree with the School of Electronic and Electrical Engineering, Zhaoqing University.

RONGRONG CHEN received the Bachelor of Engineering degree from Beijing Institute of Technology, in 2006, and the Master of Science degree from Blekinge Institute of Technology, in 2008. Currently, she is an Associate Professor and the Dean of the Department of Electronics and Communication Engineering, School of Electronics and Electrical Engineering, Zhaoqing University. Her research interests include digital signal processing, fundamentals of computer application and C++ programming, signals and systems, and related fields.

ZIYI WU was born in Yunfu, Guangdong, China, in 2004. He is currently pursuing the bachelor's degree with the School of Electronic and Electrical Engineering, Zhaoqing University.

WUYANG XUE received the B.S. and M.S. degrees in electrical engineering and the Ph.D. degree in information and communication engineering from Shanghai Jiao Tong University, Shanghai, China, in 2014, 2017, and 2022, respectively. His research interests include robotic navigation, path planning, obstacle avoidance, and deep learning.