
Volume 10, Issue 3, March 2025 – International Journal of Innovative Science and Research Technology
ISSN No: 2456-2165 | https://doi.org/10.38124/ijisrt/25mar609

Artificial Intelligence and Image Processing Based Part Feeding Control in a Robot Cell
Enesalp ÖZ 1,2; Muhammed Kürşad UÇAR 3,4
1 Electrical and Electronics Engineering, Institute of Natural Sciences, Sakarya University, Serdivan, Sakarya, Turkey
2 Toyota Motor Manufacturing, Body-Weld Engineer, Arifiye/Sakarya, Turkey (https://orcid.org/0000-0002-3467-704X)
3 Electrical and Electronics Engineering, Faculty of Engineering, Sakarya University, Serdivan, Sakarya, Turkey
4 MKU Technology, Sakarya University Technopolis Region, Serdivan, Sakarya, Türkiye (https://orcid.org/0000-0002-0636-8645)

Publication Date: 2025/03/20

Abstract: In this study, an artificial intelligence-assisted image processing system was developed to prevent errors in part
feeding processes within an industrial robot cell. Using the YOLOv7-tiny model, accurate detection of parts was ensured,
enabling effective quality control. PLC communication was established via the ModBus protocol, and the system hardware
included an NVIDIA JETSON AGX ORIN, a BASLER acA2500-60uc camera, and a Raspberry Pi WaveShare monitor. A
total of 2400 data samples were used for model training, achieving an accuracy rate of 98.07%. The developed system
minimized human errors by preventing incorrect part feeding issues and significantly improved efficiency in production
processes. Notably, the system's superior accuracy and processing speed demonstrated its suitability for real-time
applications. In conclusion, this study highlights the effective implementation of artificial intelligence and image processing
techniques in industrial manufacturing processes.

Keywords: Artificial Intelligence, Image Processing, YOLOv7-tiny, Industrial Automation, Part Inspection.

How to Cite: Enesalp ÖZ; Muhammed Kürşad UÇAR (2025). Artificial Intelligence and Image Processing Based Part Feeding Control in a Robot Cell. International Journal of Innovative Science and Research Technology, 10(3), 455-465. https://doi.org/10.38124/ijisrt/25mar609

I. INTRODUCTION

The industrial sector has increasingly turned to industrial robots to optimize production processes and enhance efficiency during the Fourth and Fifth Industrial Revolutions. In this period, robots have undertaken various tasks in production environments, including part handling, process monitoring, and collaboration with operators. As a result, many manufacturing facilities have improved efficiency and ensured production continuity [1]. However, in factories and workshops where human labor still plays a crucial role, issues such as quality defects, missing parts, and insufficient production speed persist. In this context, integrating the advantages of automation with human flexibility and sensitivity in environments where industrial robots collaborate with humans is essential. This approach enhances interaction between industrial robots and humans, enabling more efficient and effective management of production processes.

In the automotive industry, various issues arise in processes involving human workers. In welding factories, vehicle bodies are assembled and welded by robots. To form the body, multiple subcomponents are welded together. The part assembly process is divided into two main production lines. The first type consists of fully automated lines where humans are not involved. In these lines, parts and body structures are transferred using automated equipment, positioned by robots and fixtures, and welded through intercommunicating robotic stations. The second type consists of side processes where humans are actively involved in assembling fundamental vehicle components, transferring parts, and positioning them correctly. Various errors, such as part damage, missing or excessive part assembly, and incorrect part feeding, frequently occur in these side processes.

Accurately detecting and identifying objects in production processes is critical for the efficiency and quality of industrial manufacturing facilities [2]. Correctly determining object characteristics such as color, shape, orientation, and texture enables various improvements in production processes. This detection and identification process ensures the selection of correct parts and contributes to the early detection of potential defects. Consequently, overall efficiency increases, and product quality improves in industrial production facilities. Additionally, accurate object detection helps reduce human errors, minimizing production defects and enhancing workplace safety. Therefore, object detection and identification play a fundamental role in improving manufacturing efficiency and quality [2].

The positions and characteristics of objects are detected using various methods in both fixed and dynamic systems. These detection processes are typically performed using machine vision and vision sensors. However, such systems are vulnerable to environmental factors in the working environment. In particular, areas exposed to challenging conditions, such as welding factories with metal debris, dust, and smoke, may limit the effectiveness of machine vision and sensor-based detection. In such environments, external factors can negatively impact the sensors' detection and processing capabilities. Thus, more robust and durable sensors or alternative detection methods may be required to ensure reliable and stable results in production processes [1]. For example, a study on ceramic tile production examined how simple image processing techniques could be used to improve quality control. The study specifically focused on the automatic detection and classification of cracks, stains, and other defects on tile surfaces [3]. While the system successfully addressed quality control issues using an existing product, the project environment had controlled lighting conditions, preventing external influences. However, such a system would be easily affected in environments like welding factories, where metal debris and sudden light sources are present.

II. LITERATURE REVIEW

As a result of the literature review, both basic image processing and artificial intelligence-supported projects have been examined. Basic image processing methods have been used for the automatic classification of agricultural products, focusing on analyzing features such as the size, color, and shape of fruits and vegetables [4]. This enables automatic classification and quality control of products. In an example application, the characteristic features of an apple were extracted for quality control; however, edge detection was used to identify defects on the apple. Extracting object characteristics using such an edge detection algorithm is unreliable, as it is highly susceptible to external light sources. The literature review highlights several key issues that underline the insufficiency of basic image processing techniques. One issue is the problem of part recognition and classification. Basic image processing methods struggle to differentiate between subtle differences among parts [5]. Basic image processing methods are insufficient in identifying complex patterns and variations in parts [6]. For advanced defect detection and segmentation problems, basic image processing techniques may not be effective [7]. Additionally, they are inadequate for handling noisy data and distortions [8]. Another issue is image segmentation and defect detection. Basic image processing techniques are not sufficient for accurately detecting and segmenting complex defects [9]. They struggle to identify complex defects in real-world data [10]. They may not be effective in detecting and classifying intricate defects [11]. The third issue involves handling heavy noise and distortions. Basic image processing techniques are inadequate for dealing with noisy data and distortions [12]. They fail to effectively detect and classify complex defects in production lines [13]. The final issue is related to real-time processing requirements. Basic image processing techniques may fail to meet the real-time processing requirements of industrial applications [14].

A study was conducted to address issues in weld inspection. Various methods were explored, including a traditional manual image processing procedure for feature extraction, followed by defect classification using a Support Vector Machine and defect localization via template matching [15]. However, such conventional methods are easily affected by environmental factors. As an alternative, artificial neural networks have been employed. Despite the effectiveness of ANN-based systems in specific tasks, they require extensive expertise for design, integration, and optimization. Moreover, large-scale implementation in industrial manufacturing settings necessitates a well-structured monitoring and control mechanism. The need for extensive experience in ANN-based defect detection presented a major challenge [16]. AlexNet was introduced in 2012 as a solution for such inspection applications, achieving a Top-5 classification error rate of 16.4%, compared to the 28.2% of traditional methods, a significant 11.8% improvement. This model was trained using over 14 million images and categorized into 21,841 classes. Following these advancements, deep learning networks became the preferred approach for industrial inspection applications. Researchers have since developed object detection applications using various deep learning methods [16]. For example, Huifan applied the RCNN framework to detect welding defects, achieving an accuracy rate of 58.54%. Wenhui Hou utilized a deep convolutional method, attaining a classification accuracy of 97.2%. Additionally, another researcher employed the YOLOv3 object detection framework, achieving a 75% accuracy rate [16].

Instead of traditional feature extraction processes, multi-layered neural networks, commonly known as deep learning, were adopted. The target object is continuously fed into the deep network with labeled data, enabling the network to learn the object's characteristic structures. Matthew D. Zeiler was the first to analyze deep learning and found that each layer was designed similarly to traditional feature extraction methods. The first layer extracts basic color characteristics, the second layer identifies textures, and the third layer detects object shapes. This automated feature learning approach eliminates much of the design workload, allowing object recognition systems to be built without requiring extensive prior knowledge. After seven years of development, deep learning has become the most sustainable and high-performing approach for object recognition, gaining significant attention from both industry and academia. In recent years, researchers have introduced numerous creative deep learning architectures. One such architecture is the YOLO network, designed by Joseph Redmon and Ali Farhadi at the University of Washington [5]. Another notable model is Faster RCNN, developed by Kaiming He at Facebook Research. Many object detection algorithms have been developed for use in both industry and academia, but two have gained widespread popularity: Faster RCNN and the YOLO series. While Faster RCNN provides better object detection accuracy, YOLO outperforms it in real-time data processing applications. To address the problem discussed in this study, the YOLO deep learning model, known for its superior real-time processing performance, was used. Various YOLO versions with different features were analyzed to select the most suitable and up-to-date model.

As a result of this research, YOLOv7-tiny was determined to be the best choice in terms of both performance and reliability. The model is lightweight, fast, and delivers high accuracy. Based on these findings, the image processing system was built using the YOLOv7-tiny model.

This study aims to develop an AI-supported image processing system to ensure quality control in part placement processes on a production line. The system will be used in a production station where parts are manually placed by workers, ensuring that parts are correctly positioned in real time. The first step involves training an object detection model to determine whether parts are placed correctly. For this purpose, the open-source Darknet deep learning framework will be utilized to train the YOLO model. The YOLO model is chosen for its speed and efficiency in object detection, making it suitable for real-time processing requirements. Next, a Programmable Logic Controller will be used to communicate with the production station. The ModBus protocol will be employed for PLC communication, allowing the system to activate the camera upon receiving a trigger signal from the production station and perform object detection. The detected information will then be visualized on a user interface developed using PyQt5. This interface will enable workers to monitor the part placement process and receive alerts in case of incorrect placements. In conclusion, the developed AI-assisted image processing system will enhance quality control in part placement processes on the production line and prevent incorrect placements. This system can be effectively used to increase efficiency and accuracy in industrial automation applications.

III. MATERIALS AND METHODS

The workflow diagram of the study is illustrated in Fig.1. In the hardware setup, equipment such as a Jetson PC, camera, display, and Adam IO were installed for the AI-based system. Data collection involved gathering image data from the production site to create the dataset required for model training. During the data preprocessing stage, the collected images were processed to ensure they were suitable for training the AI model. Model training was conducted using the preprocessed data to develop the AI system. A user interface was designed to display images and results. The trained AI model was tested in the production environment and deployed into the system for practical implementation.

 FSM RR TACK 2 LH Robot Process
Information about the production process mentioned in the introduction will be provided in this section. The system setup will be implemented in the Front Side Member Rear Tack 2 robot process, which is one of the sub-processes of the welding factory. The process image can be seen in Fig.2. In the FSM RR Tack 2 robot process, the side member components forming the vehicle's shell body are assembled. The FSM RR Tack 2 process consists of two separate operations, one for the right side and one for the left side. In this process, two different parts are joined together. The main part, shown in Fig.3, is common to both the right and left processes. However, the second part to be assembled differs depending on whether it belongs to the right or left side. The operator may mistakenly place the wrong part onto the main part. In the production condition, detecting a misfeed is only possible after the shell body has been assembled. If the incorrect part feeding is not detected during the process, an entire body may be scrapped. To prevent human-induced incorrect part placement, a system will be implemented to inspect and verify the assembly process. The joining operation will only be permitted after the system confirms the correct part placement.

 Hardware Installation in the Field
The system installation consists of three main sections. The first is the installation of a camera. In the era of automation and smart factories, cameras have become a necessity for implementing intelligent systems, and there are many camera brands and models with different features [17]. There are several critical factors to consider when installing cameras in industrial production environments. To ensure the camera captures clear and accurate images, the installation area must be stable and positioned in a way that prevents vibrations. While the camera should be mounted on a vibration-free, fixed surface, allowing it to rotate provides flexibility for easier image adjustment. After determining the installation location and structure, it is essential to protect the camera, especially in industrial production environments. In the welding factory process, robots perform spot welding, which generates metal splatter. One of the most sensitive and easily damaged components of a camera is its lens. To protect the lens from welding splatter, a transparent cover has been used.
used.

Fig 1 Workflow Diagram
Fig 2 Process Image

 Data Collection
Different methods can be used for data collection. In this
study, data collection was conducted on-site using a Python
script. The program was executed to capture images of both
OK and NG part placements. Fig.4 illustrates the amount of
OK and NG data collected. A total of 2400 images were
gathered, with 1200 OK and 1200 NG samples.
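As a rough illustration of this step, the sketch below shows how such a capture script could be structured. It uses OpenCV's generic capture interface as a stand-in for the Basler camera SDK, and the key handling, folder names, and OK/NG class labels are assumptions made for illustration rather than the exact script used in the study.

```python
import os
import cv2  # OpenCV for image capture and saving

SAVE_DIRS = {"ok": "dataset/ok", "ng": "dataset/ng"}  # assumed folder layout
for d in SAVE_DIRS.values():
    os.makedirs(d, exist_ok=True)

cap = cv2.VideoCapture(0)  # stand-in; the study used a BASLER acA2500-60uc via its own SDK
counters = {"ok": 0, "ng": 0}

while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow("part feeding - press o (OK), n (NG), q (quit)", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("o"):        # correct part placement
        label = "ok"
    elif key == ord("n"):      # incorrect part placement
        label = "ng"
    elif key == ord("q"):
        break
    else:
        continue
    counters[label] += 1
    path = os.path.join(SAVE_DIRS[label], f"{label}_{counters[label]:04d}.jpg")
    cv2.imwrite(path, frame)   # store the labeled sample for the dataset

cap.release()
cv2.destroyAllWindows()
```

Collecting the two classes into separate folders in this way keeps the subsequent labeling and train/validation splitting straightforward.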

Fig 3 Main Part Image

Fig 4 Data Collection Amount

Fig 5 Data Preprocessing

After completing the data collection process, data diversity must be ensured, and data augmentation should be performed to
enhance the accuracy and sensitivity of the AI model. Data augmentation has been categorized as shown in Fig.6.


Fig 6 Data Augmentation Methods [18]

Geometric transformations refer to operations such as rotation and cropping. However, since the camera is fixed in this system, geometric transformations were not necessary. Photometric data augmentation, on the other hand, involves altering the pixel colors of existing images to generate additional data. In this project, the photometric data augmentation method was applied. During image collection, the camera was first adjusted to an optimal exposure setting, as shown in Fig.7. Subsequently, to simulate environmental effects and improve model training, the exposure settings were varied to collect images under different conditions, as shown in Fig.8. This approach helps simulate real-world factors such as shadows and lighting variations, allowing the AI model to function more accurately. The random occlusion technique involves modifying collected images by cutting or reducing certain parts. The final method, deep learning-based data augmentation, generates additional data by recreating existing objects using a trained AI model.

 Image Processing - Labeling
As mentioned earlier in the image collection process, the exposure time was adjusted to simulate environmental effects in the production line. In addition to exposure adjustments, shadows were intentionally created by positioning objects near the structures where the parts are placed, further enhancing model training. Furthermore, with advancements in technology and evolving needs, various algorithms categorized under image preprocessing are used to both augment and diversify the data. Filters such as median filtering were applied to achieve data augmentation and diversification. The increased data variety obtained through preprocessing significantly strengthens the model's accuracy [19].

Fig 7 Image with High Exposure

Fig 8 Image with Low Exposure
Fig 9 Normal Image
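To make the photometric augmentation and median filtering described above concrete, the sketch below applies a simple exposure/brightness shift and a median filter to a collected image with OpenCV. The scaling factors, kernel size, and file paths are illustrative assumptions, not the exact values used in the study.

```python
import os
import cv2

os.makedirs("dataset/augmented", exist_ok=True)
img = cv2.imread("dataset/ok/ok_0001.jpg")  # assumed path from the collection step

# Photometric augmentation: simulate over- and under-exposure by scaling pixel
# intensities (alpha) and shifting brightness (beta).
high_exposure = cv2.convertScaleAbs(img, alpha=1.4, beta=30)
low_exposure = cv2.convertScaleAbs(img, alpha=0.6, beta=-30)

# Preprocessing filter: median filtering suppresses sensor noise and small
# bright speckles (e.g. weld-splatter reflections) while preserving edges.
filtered = cv2.medianBlur(img, 5)

for name, out in [("high_exposure", high_exposure),
                  ("low_exposure", low_exposure),
                  ("median_filtered", filtered)]:
    cv2.imwrite(f"dataset/augmented/ok_0001_{name}.jpg", out)
```

Each augmented variant keeps the same class as its source image, so the corresponding label files only need to be duplicated with matching names.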

Fig 10 Filtered Image

After the image collection process, the images need to be labeled to train the AI model and enable it to accurately distinguish between correct and incorrect parts. Image labeling refers to the process of marking the location, bounding box, and class of the object to be detected within an image. The format of the labeling process may vary depending on the requirements of the AI model to be trained. The labeling process was conducted using the open-source labelImg program. In each image, the target object's location was marked and classified. To train the AI model, the labeled objects must be documented in .txt format, specifying their respective regions. The labelImg program was used to generate these .txt format files [20].

 Training of the Artificial Intelligence Model
In this project, the YOLO model was used for object detection. YOLO was first introduced in 2015 through the paper "You Only Look Once: Unified, Real-Time Object Detection" published by Joseph Redmon [5]. As mentioned in the introduction, the YOLO algorithm has outperformed other object detection algorithms in real-time object tracking based on performance evaluation criteria. Since 2015, YOLO has evolved, and multiple versions have been developed. One of the latest and most proven versions, YOLOv7, is an open-source object detection algorithm based on deep learning, specifically convolutional neural networks (CNNs). The YOLOv7 model builds upon previous YOLO versions while providing a unified framework for optimized training models, offering higher speed and accuracy. YOLOv7 is a state-of-the-art object detection algorithm that outperforms many other object detection techniques in both speed and accuracy. By incorporating new techniques in deep learning and computer vision, it represents an advancement over previous YOLO versions such as YOLOv3. Fig.11 illustrates the different versions of YOLO that have been developed over time.
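To illustrate the .txt annotation format consumed in the training step, the snippet below converts a pixel-space bounding box (as drawn in labelImg) into a normalized YOLO label line. The class indices and image size are assumptions for illustration; the real values come from the project's own class list and camera resolution.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    # YOLO label line: "<class> <x_center> <y_center> <width> <height>",
    # with all coordinates normalized to the 0..1 range.
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Hypothetical example: class 0 = OK part, class 1 = NG part, 2592x2048 image
print(to_yolo_line(0, x_min=1100, y_min=800, x_max=1500, y_max=1150,
                   img_w=2592, img_h=2048))
# -> "0 0.501543 0.476074 0.154321 0.170898"
```

labelImg writes one such file per image when its YOLO export format is selected, which is the form expected by Darknet during training.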

Fig 11 Chronological Development of YOLO

With its advancements, YOLOv7 has started to become an industry standard. The primary reason for this is embedded in its name, "You Only Look Once." As the name suggests, YOLO analyzes an image in a single pass, making it highly efficient and well-suited for real-time applications and environments with limited computational resources. When comparing different YOLO versions, each iteration, including v2, v3, v4, v5, v6, and v7, has introduced key improvements while maintaining the fundamental steps of the YOLO framework. YOLOv2 integrated anchor boxes and introduced a new loss function.

YOLOv3 implemented a new CNN architecture, replaced bounding boxes with those of varying scales and aspect ratios, and introduced Feature Pyramid Networks. YOLOv4 utilized a new CNN architecture, implemented K-means clustering for anchor boxes, and adopted GHM loss. YOLOv5 incorporated the EfficientDet architecture, dynamic anchor boxes, and Spatial Pyramid Pooling. YOLOv6 used the EfficientNet-L2 architecture and improved dense terminal bounding boxes. YOLOv7, the latest version, introduced nine bounding boxes, optimized feature fusion techniques, and enhanced accuracy and speed. A comparison table of YOLOv7's speed and accuracy against other object detection models can be seen in Fig.12.

Fig 12 Two stage [21]

Fig 13 Single Stage [21]

The comparison of speed and accuracy between YOLOv7 and other real-time object detection algorithms can be seen in Fig.13.

As a result of these comparisons, the latest YOLO version, YOLOv7, was chosen for this project. Within YOLOv7 itself, there are different variants designed for various applications. For this project, the YOLOv7-tiny model was selected, as it is optimized for edge devices. The objects were labeled accordingly, and the next step was to train the model. There are multiple methods for model training. Frameworks such as TensorFlow and Keras can be used to create CNN models. However, for better speed and efficiency, the open-source DarkNet framework was preferred. DarkNet is written in the C programming language, making it more efficient in terms of performance. Using DarkNet, the YOLOv7-tiny model was trained with the collected dataset.
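A minimal sketch of how such a Darknet training run can be driven from Python is shown below. The .data/.names file contents, the cfg file name, and the pre-trained weights file are placeholders: the actual files depend on the local Darknet build and the project's dataset layout, so this is an assumed setup rather than the study's exact configuration.

```python
import subprocess
from pathlib import Path

# Dataset description consumed by Darknet (paths and names are assumptions).
Path("data").mkdir(exist_ok=True)
Path("backup").mkdir(exist_ok=True)
Path("data/obj.names").write_text("OK\nNG\n")
Path("data/obj.data").write_text(
    "classes = 2\n"
    "train = data/train.txt\n"   # list of training image paths
    "valid = data/valid.txt\n"   # list of validation image paths
    "names = data/obj.names\n"
    "backup = backup/\n"         # where weight snapshots are written
)

# Launch training with a YOLOv7-tiny configuration; the cfg and the initial
# convolutional weights (placeholder name here) must exist in the local
# Darknet checkout.
subprocess.run(
    ["./darknet", "detector", "train",
     "data/obj.data", "cfg/yolov7-tiny.cfg", "pretrained.conv"],
    check=True,
)
```

Darknet periodically writes weight snapshots to the backup folder during training, and one of these snapshots is what would ultimately be deployed for inference on the edge device.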

Fig 14 Comparison of YOLO Models [21]

 System Installation, Interface Design, and Model Integration
For the system installation, a step-by-step approach must be followed. The first step involves conducting a site inspection to analyze the characteristics of the object to be detected, review the conditions of the process where the system will be installed, and assess the site requirements. In the second step, the camera installation area is determined, ensuring that the camera can capture the required angle with an appropriate lens. Additionally, if the camera is installed in an environment affected by external factors, protective equipment must be considered. After setting up the camera, data collection, preprocessing, model selection, and model training are performed. These steps complete the artificial intelligence-related components of the system. To integrate the system into production and inform operators, a control and visualization design is required. First, the process workflow must be reviewed to determine when data should be collected, and which signals will trigger the system. In this project, at the FSM RR Tack 2 station, after the main part and sub-part are positioned on the fixture, a "go to robot" (start welding) signal is sent by pressing a button, which then transmits a signal to the PLC. The system's objective is to inspect the part placement before the "go to robot" signal is sent. The signal from the button to the PLC must also be transmitted to the edge device to capture an image. Various methods, including Ethernet TCP/IP, GPIO, and ModBus, can be used to transfer the PLC signal. In this project, the ModBus communication protocol and Adam IO were used to transfer the PLC signal to the edge device. Adam IO is a device that collects dry contacts from the PLC or any other device and transmits them via ModBus. Using Adam IO, the "go to robot" signal from the PLC was sent to the edge device. This setup ensures that the system is triggered by the signal, captures an image, and performs part feeding verification using the AI model. To implement the "go to robot" signal control, modifications were required in the PLC software. The AI model's control signal was added as a condition before executing the "go to robot" command. The AI model's verification signal is transmitted to the PLC via Adam IO, and the robot proceeds only if the verification is successful. Once the inspection is completed, the result should be displayed to the operator via the user interface. The user interface can be designed using various Python libraries such as Tkinter, Kivy, wxPython, and PyQt. In this project, the PyQt5 library was used for UI design.
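The sketch below outlines one way this trigger-verify-display loop could be wired together: a background thread polls the ModBus gateway for the trigger contact, runs the detection, reports the verdict back over ModBus, and notifies the PyQt5 interface through a signal. The IP address, register addresses, and the run_inference() helper are assumptions for illustration; the actual addresses depend on how the Adam IO module is configured.

```python
import sys
import threading
import time

import cv2
from pymodbus.client import ModbusTcpClient
from PyQt5.QtCore import QObject, pyqtSignal
from PyQt5.QtWidgets import QApplication, QLabel

ADAM_IP = "192.168.0.50"   # assumed address of the Adam IO ModBus gateway
TRIGGER_INPUT = 0          # assumed discrete input carrying the "go to robot" contact
RESULT_COIL = 0            # assumed coil used to return the verification result


def run_inference(frame):
    # Placeholder: always reports OK. Replace with the real YOLOv7-tiny
    # inference call that checks whether the correct part is present.
    return True


class Inspector(QObject):
    verdict = pyqtSignal(bool)  # emitted once per inspection, True = OK

    def poll_loop(self):
        client = ModbusTcpClient(ADAM_IP)
        client.connect()
        cap = cv2.VideoCapture(0)  # stand-in for the Basler camera interface
        while True:
            rr = client.read_discrete_inputs(TRIGGER_INPUT, count=1)
            if not rr.isError() and rr.bits[0]:          # trigger contact closed
                ok, frame = cap.read()
                result = bool(ok and run_inference(frame))
                client.write_coil(RESULT_COIL, result)   # PLC releases the robot only on True
                self.verdict.emit(result)                # update the operator UI
            time.sleep(0.05)


app = QApplication(sys.argv)
status = QLabel("WAITING")
status.setStyleSheet("background: gray; color: white; font-size: 40px;")
status.show()


def show_result(ok):
    status.setText("OK" if ok else "NG")
    status.setStyleSheet("background: %s; color: white; font-size: 40px;"
                         % ("green" if ok else "red"))


inspector = Inspector()
inspector.verdict.connect(show_result)
threading.Thread(target=inspector.poll_loop, daemon=True).start()
sys.exit(app.exec_())
```

Running the ModBus polling in a daemon thread keeps the Qt event loop responsive, which mirrors the multi-threaded structure discussed below.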

The interface dynamically updates based on the AI object detection algorithm. After capturing an image, the AI model detects the object, classifies it, places a bounding box around the detected object, and updates the UI display. Depending on whether the part is correctly or incorrectly placed, the designated area in the UI changes color to indicate the result. For the system to function properly, different components must operate independently of one another. A key point to emphasize is that the image capture signal from ModBus must be continuously monitored, while the UI must also remain operational. Alternatively, after receiving the image capture signal, both the AI model and the UI program must run continuously. In other words, a process must be capable of handling multiple tasks simultaneously. This is achieved using threads. Threads are also referred to as lightweight processes, and the concept of multi-threading allows multiple threads to run within a single process. On multi-core processors, these threads can run concurrently on different cores, a technique known as parallel programming. In summary, threads are utilized in this system to ensure that different functions operate simultaneously without interference, enabling continuous operation.

 Performance Evaluation Criteria
In this study, statistical metrics such as accuracy, sensitivity (recall), specificity, precision, and F1-score were used to evaluate the performance of the model. These metrics were selected to assess the model's classification accuracy, its ability to minimize false positives and false negatives, and its reliability independent of random predictions.

 The performance values were calculated as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \quad (2)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \quad (3)$$

$$\text{Precision} = \frac{TP}{TP + FP} \quad (4)$$

$$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Sensitivity}}{\text{Precision} + \text{Sensitivity}} \quad (5)$$

The TP, FP, TN, and FN concepts originate from the confusion matrix, which is used to analyze the performance of a classification model in detail. This matrix categorizes predicted results against actual results into four groups:

 True Positives (TP): Cases where the model correctly predicts a positive outcome.
 False Positives (FP): Cases where the model incorrectly predicts a positive outcome (actually negative).
 False Negatives (FN): Cases where the model incorrectly predicts a negative outcome (actually positive).
 True Negatives (TN): Cases where the model correctly predicts a negative outcome.

The confusion matrix is structured to provide a detailed analysis of correct and incorrect predictions. If TP and TN values are high, it indicates that the model has a high success rate in making correct predictions. Conversely, low FP and FN values suggest that the model makes fewer errors.

IV. RESULTS

The model's performance is visualized using the Confusion Matrix, as shown in Table 1.

The preprocessing steps applied significantly improved the model's performance. Initially, with raw data, the accuracy was 88%, sensitivity 90%, specificity 70%, precision 95%, and F1-score 87%. The first step, grayscale conversion, resulted in slight improvements, increasing accuracy to 89% and sensitivity to 91%. Noise reduction further enhanced the model's performance, raising accuracy to 90% and sensitivity to 92%, marking a significant improvement. Data normalization improved the results even more, increasing accuracy to 91% and specificity to 78%. Following this, edge detection helped refine the model's accuracy to 94% and precision to 99%, contributing significantly to performance enhancement. Finally, by combining all preprocessing techniques, the results became highly satisfactory, with accuracy reaching 98.07%, sensitivity 98.07%, and specificity 98.15%. The precision improved to 99.89%, while the F1-score reached 98.97%, significantly boosting the model's overall success. These improvements clearly demonstrate how the preprocessing steps in the image processing workflow have enhanced the model's accuracy, sensitivity, specificity, and overall performance.

Table 1 Confusion Matrix

                      Actual Positive    Actual Negative
Predicted Positive          915                  1
Predicted Negative           18                 53

Table 2 Pre-processing and Results

Process                 TP    FP   TN   FN   Accuracy   Sensitivity   Specificity   Precision   F1-Score
Raw Data                900   20   50   30   0.88       0.90          0.70          0.95        0.87
Grayscale Conversion    905   18   52   28   0.89       0.91          0.72          0.96        0.88
Noise Reduction         910   15   53   25   0.90       0.92          0.75          0.97        0.89
Data Normalization      912   10   54   22   0.91       0.93          0.78          0.98        0.90
Edge Detection          913    5   55   18   0.94       0.95          0.80          0.99        0.92
Result                  915    1   53   18   0.9807     0.9807        0.9815        0.9989      0.989
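As a quick cross-check of Eqs. (1)-(5), the short helper below recomputes the metrics from the confusion-matrix counts in Table 1 (TP = 915, FP = 1, TN = 53, FN = 18) and reproduces the reported 98.07% accuracy. This is only a verification sketch, not part of the deployed system.

```python
def classification_metrics(tp, fp, tn, fn):
    # Metrics of Eqs. (1)-(5) computed from confusion-matrix counts.
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)        # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f1

acc, sens, spec, prec, f1 = classification_metrics(tp=915, fp=1, tn=53, fn=18)
print(f"Accuracy={acc:.4f} Sensitivity={sens:.4f} Specificity={spec:.4f} "
      f"Precision={prec:.4f} F1={f1:.4f}")
# -> Accuracy=0.9807 Sensitivity=0.9807 Specificity=0.9815 Precision=0.9989 F1=0.9897
```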

V. DISCUSSION

This study focuses on industrial object recognition systems using the YOLOv7-tiny algorithm. Similarly, the study "Decision Support System Based on YOLOv7 Algorithm for Brain Tumor Diagnoses" explored the application of YOLOv7 and YOLOv7-tiny algorithms in medical image analysis [22]. Both studies demonstrate the effectiveness of the YOLO algorithm across different fields and its capability to provide practical solutions. In this study, the YOLOv7-tiny model achieved an accuracy rate of 98.07%. In contrast, the study "Comparative Analysis of Deep Learning Image Detection Algorithms" reported accuracy rates of 85% for Faster R-CNN, 74% for SSD, and 80% for YOLOv3 [23]. Similarly, the "Decision Support System Based on YOLOv7 Algorithm for Brain Tumor Diagnoses" study reported an accuracy rate of 97%. These results highlight the superior accuracy of the YOLOv7-tiny model in both industrial and other domains. Additionally, YOLOv7-tiny achieved a speed of 160 FPS, significantly outperforming other models. In the "Comparative Analysis of Deep Learning Image Detection Algorithms" study, the FPS values for Faster R-CNN, SSD, and YOLOv3 were 8 FPS, 46 FPS, and 25 FPS, respectively [23]. This clearly demonstrates the superior speed and accuracy of YOLOv7-tiny in real-time applications.

VI. CONCLUSION

This study has demonstrated that YOLOv7-tiny outperforms other models in terms of both speed and accuracy. The significant advantage of this speed difference is particularly evident in real-time applications. The developed system achieved 98.07% accuracy in an industrial production line, effectively minimizing human errors in real-time operations. The modern architecture and optimized features of YOLOv7-tiny have proven to provide a highly efficient solution. For industrial and real-time applications, choosing YOLOv7-tiny offers an efficient and effective approach, making it an ideal model for automation and quality control systems.

REFERENCES

[1]. K. Xu, N. Ragot, and Y. Dupuis, "View Selection for Industrial Object Recognition," in IECON Proceedings (Industrial Electronics Conference), IEEE Computer Society, 2022. doi: 10.1109/IECON49645.2022.9969121.
[2]. A. Shrestha, N. Karki, R. Yonjan, M. Subedi, and S. Phuyal, "Automatic Object Detection and Separation for Industrial Process Automation," in 2020 IEEE International Students' Conference on Electrical, Electronics and Computer Science, SCEECS 2020, Institute of Electrical and Electronics Engineers Inc., Feb. 2020. doi: 10.1109/SCEECS48394.2020.195.
[3]. C. Boukouvalas et al., "ASSIST: automatic system for surface inspection and sorting of tiles," 1998.
[4]. H. M. T. Abbas, U. Shakoor, M. J. Khan, M. Ahmed, and K. Khurshid, "Automated sorting and grading of agricultural products based on image processing," in 2019 8th International Conference on Information and Communication Technologies, ICICT 2019, 2019. doi: 10.1109/ICICT47744.2019.9001971.
[5]. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016. doi: 10.1109/CVPR.2016.91.
[6]. S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Trans Pattern Anal Mach Intell, vol. 39, no. 6, 2017, doi: 10.1109/TPAMI.2016.2577031.
[7]. W. Liu et al., "SSD: Single shot multibox detector," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016. doi: 10.1007/978-3-319-46448-0_2.
[8]. J. Redmon and A. Farhadi, "YOLO9000: Better, faster, stronger," in Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, 2017. doi: 10.1109/CVPR.2017.690.
[9]. R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2014. doi: 10.1109/CVPR.2014.81.
[10]. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015. doi: 10.1007/978-3-319-24574-4_28.
[11]. L. C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," IEEE Trans Pattern Anal Mach Intell, vol. 40, no. 4, 2018, doi: 10.1109/TPAMI.2017.2699184.
[12]. X. Glorot, A. Bordes, and Y. Bengio, "Domain adaptation for large-scale sentiment classification: A deep learning approach," in Proceedings of the 28th International Conference on Machine Learning, ICML 2011, 2011.
[13]. A. Raghunathan, S. M. Xie, F. Yang, J. C. Duchi, and P. Liang, "Understanding and mitigating the tradeoff between robustness and accuracy," in 37th International Conference on Machine Learning, ICML 2020, 2020.
[14]. E. Shelhamer, J. Long, and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," IEEE Trans Pattern Anal Mach Intell, vol. 39, no. 4, 2017, doi: 10.1109/TPAMI.2016.2572683.
[15]. L. Binyan, W. Yanbo, C. Zhihong, L. Jiayu, and L. Junqin, "Object detection and robotic sorting system in complex industrial environment," in Proceedings - 2017 Chinese Automation Congress, CAC 2017, 2017. doi: 10.1109/CAC.2017.8244092.
[16]. Y. Zuo, J. Wang, and J. Song, "Application of YOLO Object Detection Network in Weld Surface Defect Detection," in 2021 IEEE 11th Annual International Conference on CYBER Technology in Automation, Control, and Intelligent Systems, CYBER 2021, Institute of Electrical and Electronics Engineers Inc., Jul. 2021, pp. 704-710. doi: 10.1109/CYBER53097.2021.9588269.

[17]. "Industrial Cameras, Technical Features, and Market," Optik & Photonik, vol. 13, no. 1, 2018, doi: 10.1002/opph.201870108.
[18]. P. Kaur, B. S. Khehra, and E. B. S. Mavi, "Data Augmentation for Object Detection: A Review," in Midwest Symposium on Circuits and Systems, 2021. doi: 10.1109/MWSCAS47672.2021.9531849.
[19]. Y. L. Ao, "Introduction to Digital Image Pre-processing and Segmentation," in Proceedings - 2015 7th International Conference on Measuring Technology and Mechatronics Automation, ICMTMA 2015, 2015. doi: 10.1109/ICMTMA.2015.148.
[20]. K. E. Varnima and C. Ramachandran, "Real-time Gender Identification from Face Images using you only look once (yolo)," in Proceedings of the 4th International Conference on Trends in Electronics and Informatics, ICOEI 2020, 2020. doi: 10.1109/ICOEI48184.2020.9142989.
[21]. C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors," 2023. doi: 10.1109/cvpr52729.2023.00721.
[22]. S. Yılmaz, "Beyin Tümörü Tanıları İçin YOLOv7 Algoritması Tabanlı Karar Destek Sistemi Tasarımı" [Design of a Decision Support System Based on the YOLOv7 Algorithm for Brain Tumor Diagnoses], Kocaeli Üniversitesi Fen Bilimleri Dergisi, vol. 6, no. 1, pp. 47-56, Jul. 2023, doi: 10.53410/koufbd.1236305.
[23]. S. Srivastava, A. V. Divekar, C. Anilkumar, I. Naik, V. Kulkarni, and V. Pattabiraman, "Comparative analysis of deep learning image detection algorithms," J Big Data, vol. 8, no. 1, p. 66, Dec. 2021, doi: 10.1186/s40537-021-00434-w.
