Design and Implementation of Embedded PCB Defect Detection System Based On FPGA
Abstract—Convolutional neural networks commonly used for PCB defect detection are difficult to deploy on embedded devices with limited resources. To address this, an FPGA-based embedded system is designed that deploys the YOLOv3 neural network in hardware for PCB defect detection. The design has two parts. On the hardware side, a deep learning processing unit (DPU) is designed to accelerate the convolution calculations of the neural network, and the system software is configured for it. On the algorithm side, the model is compressed by quantization to reduce its computational complexity and to generate a DPU-deployable model. The experimental results show that the designed system maintains an accuracy of 0.789 in PCB defect detection. At the same time, it achieves a convolution throughput of 2.44 TOPS and a detection latency of 97.59 ms per frame at low power consumption, which makes it suitable for industrial PCB defect detection.

Keywords—FPGA, YOLOv3, embedded systems, PCB defect detection, hardware deployment

I. INTRODUCTION

As the most basic component of electronic products, the PCB (Printed Circuit Board) directly affects how well those products work. Although PCB production passes through multiple inspection stages, some defective boards are inevitably produced. PCB defects have the following main characteristics: (1) the defect area is relatively small; (2) the types of defects are complex; (3) the PCB layout is easily confused with some defect features. At present, the defect detection methods commonly used in industrial PCB production include manual inspection, electrical testing, and methods based on deep learning. Traditional detection relies mainly on manual inspection. Although it offers some flexibility, long working hours cause visual fatigue, which degrades PCB quality control and leads to false or missed detections. Electrical testing may cause irreversible damage to the circuit through unstable current, and as a contact method it may also introduce defects in the PCB through improper operation. By contrast, PCB defect detection based on deep learning can run for long periods without fatigue, with fast detection and stable quality control, and has therefore attracted wide attention.

Among deep learning methods, the convolutional neural network[1] identifies the types of defects by extracting image features, and the YOLO[2] (You Only Look Once) algorithm can mark the location of defects. Many improved YOLO detectors[3] have been developed for surface defect detection in areas such as steel, transportation, construction, and fabric production. Many researchers and practitioners have also applied this approach to PCB defect detection. Adibhatla[4] improved the efficiency of PCB defect detection by optimizing the YOLO network structure. Although the performance of the algorithms keeps improving, most programs run on GPU platforms, and their fast operation depends on the high power consumption of the GPU. How to deploy deep learning methods on embedded devices, so that PCB defects can be detected quickly at low power, is therefore an urgent problem.

Based on this, this paper designs an embedded system based on an FPGA (Field Programmable Gate Array) and deploys the YOLOv3 network on it in hardware. From the underlying hardware up to the application, the hardware platform and the software environment are each configured according to the characteristics of the Linux operating system and the YOLOv3 network structure, forming a complete system. Following a hardware-software co-design approach, the algorithm is compressed on the software side[5], the hardware platform is built on the hardware side, and the system finally passes the on-board test. The results show that the system has advantages in speed and power consumption.

II. TARGET DETECTION SYSTEM COMPOSITION

A. YOLOv3 target detection algorithm

The YOLOv3[6] convolutional neural network is the third generation of the YOLO series of one-stage object detection algorithms. Its detection performance is well ahead of the two previous versions in both speed and accuracy, and it is widely used in industry and academia. The main architecture of YOLOv3 is based on Darknet-53. Taking an input image resolution of 416*416 as an example, the 52 convolutional layers in Darknet-53 contain more than 40M parameters in total and require a computation amount of about 24.5G operations, as shown in Table 1.
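The parameter and computation totals quoted for Darknet-53 are sums of per-layer costs. The following is a minimal sketch of how the cost of one convolutional layer can be counted. The function name `conv_layer_cost` is hypothetical, the layer is assumed to have no bias term, and the example layer (3 to 32 channels, 3*3 kernel, 416*416 output) is taken from the published Darknet-53 architecture rather than from Table 1 of this paper.

```python
def conv_layer_cost(c_in, c_out, kernel, out_h, out_w):
    """Return (weights, macs) for one bias-free convolutional layer.

    weights: number of kernel weights in the layer.
    macs:    multiply-accumulate operations needed to produce
             every output pixel of every output channel.
    """
    weights = kernel * kernel * c_in * c_out
    macs = weights * out_h * out_w
    return weights, macs


# Assumed example: first Darknet-53 layer for a 416*416 RGB input
# (3 -> 32 channels, 3*3 kernel, stride 1, so the output stays 416*416).
first_weights, first_macs = conv_layer_cost(3, 32, 3, 416, 416)
print(first_weights, first_macs)
```

Summing this count over all convolutional layers gives network totals of the kind reported in Table 1. Whether a multiply-accumulate is counted as one operation or two is a convention that varies between reports.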
$$\mathrm{mAP} = \frac{1}{C}\sum_{i=1}^{C} AP_i \tag{6}$$
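As a minimal sketch of formula (6), the mean average precision is the arithmetic mean of the per-class average precision values. The function name `mean_average_precision` is hypothetical, and the AP values below are placeholders chosen for illustration only; they are not results from this paper.

```python
def mean_average_precision(ap_per_class):
    """Average the per-class AP values over the C classes, as in formula (6)."""
    if not ap_per_class:
        raise ValueError("at least one class is required")
    return sum(ap_per_class.values()) / len(ap_per_class)


# Placeholder AP values (NOT measured results), one per defect class.
example_ap = {
    "open": 0.50,
    "short": 0.75,
    "mouse-bite": 1.00,
    "spur": 0.25,
    "copper": 0.50,
    "pin-hole": 0.75,
}
example_map = mean_average_precision(example_ap)
print(f"mAP = {example_map:.3f}")
```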
Authorized licensed use limited to: COLLEGE OF ENGINEERING - Pune. Downloaded on August 05,2024 at 10:10:40 UTC from IEEE Xplore. Restrictions apply.
system kernel and the files necessary for system operation, and the root file system is at this layer. The middleware layer is a common service layer with standard program interfaces and protocols, which enables interconnection between systems and improves the operating efficiency of software applications. The top layer is the application layer. Users write code to generate applications according to their requirements, and set up dynamic link libraries to extend application functions.

[Figure: software layer diagram; recoverable labels: Application Layer, Middleware Layer]

the complexity of model calculation without losing its detection accuracy, thus making hardware deployment easier[11].

Model quantization for hardware deployment, as shown in Fig.3, takes the floating-point model, the calibration data set, and its input function as inputs. The floating-point model provides the YOLOv3 inference network structure, excluding data preprocessing and loss function calculation. The calibration data and its input function standardize the data conversion rules of the quantization process and carry out the data preprocessing. Quantization converts the floating-point model into a fixed-point model. After that, the
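The role of calibration data in converting floating-point values to fixed-point values can be illustrated with a minimal sketch. It assumes a generic symmetric 8-bit scheme in which the scale is derived from the largest magnitude seen in the calibration data. This is an assumption for illustration, not the quantizer used by the authors, and the function names are hypothetical.

```python
def calibrate_scale(calibration_values, num_bits=8):
    """Derive a scale so the largest calibration magnitude maps to the top integer."""
    max_abs = max(abs(v) for v in calibration_values)
    q_max = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    return max_abs / q_max if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to signed integers, clamping values outside the calibrated range."""
    q_max = 2 ** (num_bits - 1) - 1
    q_min = -q_max - 1
    return [max(q_min, min(q_max, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in q_values]


# Hypothetical calibration samples and values to quantize.
scale = calibrate_scale([-1.27, 0.5, 1.0])
q = quantize([1.27, -1.27, 0.0, 0.5, 5.0], scale)
print(q, dequantize(q, scale))
```

In this sketch the value 5.0 lies outside the calibrated range and is clamped, which shows why the calibration set should be representative of the data seen at inference time.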
is used to cache the most recently calculated data, providing a hardware basis for data reuse. Externally, the DPU interacts with the ARM processor and the external storage unit through the high-speed AXI data channel, and processes the instructions sent by the front-end application through the application processing unit.

After the DPU configuration is completed, its parameter information is fixed into the hardware description file, and its resource utilization is shown in

Usually, the parallel computing of the PL accelerates tasks that involve repeated computation or data reuse, which benefits the calculation of the convolutional and fully connected layers of a convolutional neural network. In addition, the CPU and the DPU are connected by the AXI bus to accelerate data transmission.

[Figure: block diagram; recoverable labels: PS, PL, CPU, AXI]
To verify the effectiveness and practicability of the designed system, the DeepPCB data set is used as the experimental data set, and 200 images are randomly selected from it as the test set. The data set defines a six-class target detection task: open, short, mouse-bite, spur, copper, and pin-hole. Each image is 640*640 and contains about 3 to 12 defects. Because the data set is commonly used to validate target detection algorithms, it gives convincing results for a detection system aimed at industrial PCB defect detection.
B. Data analysis
The overall detection effect of the system is shown in Fig.7. As the figure shows, the system accurately detects the six types of PCB defects, frames the location of each defect, and marks its type.

Fig. 8. System power consumption

Speed calculation: The DPU performs well in the detection process. Fig.9 shows the total processing time for a single image and the share taken by convolution processing. The average latency per image is 97.59 ms, of which the convolution calculation takes 10.03 ms on average, or 10.28%. Based on the Darknet computation amount in Table 1, the throughput of the DPU during detection, given by formula (7), is 2.44 TOPS.

$$\mathrm{throughput}_{DPU} = \frac{\mathrm{computation}}{\mathrm{time}_{conv}} \tag{7}$$
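The arithmetic behind formula (7) can be checked with a minimal sketch using the figures reported in the text: a computation amount of about 24.5G, a convolution latency of 10.03 ms, and a per-image latency of 97.59 ms. The function name `dpu_throughput_tops` is hypothetical. The frame rate at the end is a derived quantity that the paper does not report, and it assumes images are processed strictly one after another.

```python
def dpu_throughput_tops(computation_ops, conv_time_s):
    """Formula (7): computation divided by convolution time, expressed in TOPS."""
    return computation_ops / conv_time_s / 1e12


computation_ops = 24.5e9   # Darknet-53 computation amount quoted from Table 1
conv_time_s = 10.03e-3     # average convolution latency per image
frame_time_s = 97.59e-3    # average total latency per image

throughput = dpu_throughput_tops(computation_ops, conv_time_s)
conv_share = conv_time_s / frame_time_s
frames_per_second = 1.0 / frame_time_s  # derived here, not stated in the paper

print(f"{throughput:.2f} TOPS, {conv_share:.2%} of frame time, "
      f"{frames_per_second:.2f} frames/s")
```

Run as written, this reproduces the reported 2.44 TOPS and the 10.28% convolution share, and gives roughly 10.25 frames per second under the sequential-processing assumption.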
C. Experimental conclusion

The data above show that, when detecting PCB defects, the system consumes 7.108 W overall and achieves a single-frame detection latency of 97.59 ms while maintaining an accuracy of 0.789. Notably, the DPU performs very well on convolution operations, with a throughput of 2.44 TOPS and a convolution processing latency of 10.03 ms.

V. CONCLUSION

To realize low-power PCB defect detection equipment and to address the hardware deployment problem of convolutional neural networks, this paper designs an embedded system based on FPGA and successfully deploys the YOLOv3 neural network on it for industrial PCB defect detection. First, industrial PCB defect detection methods and the state of academic research are reviewed to motivate the design. Second, the operation of the YOLOv3 algorithm and the architecture of the FPGA embedded system are introduced. Third, the hardware deployment of the YOLOv3 neural network is completed through the software design, hardware design, and hardware-software co-design of the FPGA embedded system. Finally, experimental tests show that the system performs well in speed and power consumption when used for industrial PCB defect detection, and basically meets industrial needs. The design therefore not only helps promote the deployment and application of deep learning methods in industrial PCB defect detection, but also serves as a reference for the subsequent hardware deployment of other, higher-performance deep learning algorithms.

REFERENCES

[1] Zeiler, M. D., & Fergus, R. "Visualizing and understanding convolutional networks." ECCV, 2014.
[2] Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[3] Kou, X., Liu, S., Cheng, K., & Qian, Y. "Development of a YOLO-V3-based model for detecting defects on steel strip surface." Measurement 182 (2021): 109454.
[4] Adibhatla, V. A., Chih, H. C., Hsu, C. C., Cheng, J., Abbod, M. F., & Shieh, J. S. "Applying deep learning to defect detection in printed circuit boards via a newest model of you-only-look-once." (2021).
[5] Cai, Y., Yao, Z., Dong, Z., Gholami, A., Mahoney, M. W., & Keutzer, K. "ZeroQ: A novel zero shot quantization framework." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
[6] Redmon, J., & Farhadi, A. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
[7] Zhang, H., Wang, J., Sun, Z., Zurada, J. M., & Pal, N. R. "Feature selection for neural networks using group lasso regularization." IEEE Transactions on Knowledge and Data Engineering 32.4 (2019): 659-673.
[8] He, K., Zhang, X., Ren, S., & Sun, J. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[9] Nurvitadhi, E., Venkatesh, G., Sim, J., Marr, D., Huang, R., Ong Gee Hock, J., ... & Boudoukh, G. "Can FPGAs beat GPUs in accelerating next-generation deep neural networks?" Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays. 2017.
[10] Wang, J., Ye, Z., Gao, W., & Zurada, J. M. "Boundedness and convergence analysis of weight elimination for cyclic training of neural networks." Neural Networks 82 (2016): 49-61.
[11] Xie, X., Zhang, H., Wang, J., Chang, Q., Wang, J., & Pal, N. R. "Learning optimized structure of neural networks by hidden node pruning with $L_1$ regularization." IEEE Transactions on Cybernetics 50.3 (2019): 1333-1346.
[12] Yu, Q., Wang, C., Ma, X., Li, X., & Zhou, X. "A deep learning prediction process accelerator based FPGA." 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. IEEE, 2015.