
2024 IEEE International Conference on Industrial Technology (ICIT), 25-27 March 2024, Bristol, UK

DOI: 10.1109/ICIT58233.2024.10540817

Reading and understanding house numbers for delivery robots using the "SVHN Dataset"

1st Omkar Pradhan, SATM, Cranfield University, Cranfield, UK, [email protected]
2nd Dr. Gilbert Tang, SATM, Cranfield University, Cranfield, UK, [email protected]
3rd Christos Makris, Ocado Technology, Hatfield, UK, [email protected]
4th Radhika Gudipati, Ocado Technology, Hatfield, UK, [email protected]

Abstract— Detecting street house numbers in complex environments is a challenging robotics and computer vision task that could be valuable in enhancing the accuracy of delivery robots' localisation. The development of this technology also has positive implications for address parsing and postal services. This project focuses on building a robust and efficient system that deals with the complexities associated with detecting house numbers in street scenes. The models in this system are trained on Stanford University's SVHN (Street View House Numbers) dataset. Fine-tuning YOLO's (You Only Look Once) nano model yielded an effective detection range from 1.02 meters to 4.5 meters. The optimum allowance for angle of tilt was ±15°. The inference resolution was obtained to be 2160 × 1620, with an inference delay of 35 milliseconds.

Index Terms—Artificial Intelligence, Character Recognition, Computer Vision, Object Detection, YOLO, SVHN.

1. Introduction

In the dynamic field of artificial intelligence (AI) development, rapid advancements in computational capabilities and sophisticated algorithms propel the creation of AI and machine learning (ML) models, meticulously designed for precise object categorization to emulate human cognitive abilities. Across industries, the widespread adoption of AI is driven by its unparalleled accuracy and minimal downtime, contributing transformative benefits in fields such as disease diagnosis, fraud detection, and autonomous vehicles. This project explores multiple facets of machine learning and machine vision, with applications extending to various domains, notably the optimization of delivery robots. The fusion of AI and computer vision equips these robots with the capability to navigate complex environments, efficiently recognize and handle objects, and seamlessly interact with their surroundings. This integration enhances the precision of object and number detection, fostering the evolution of smart, efficient, and adaptive delivery systems. The project objectives are as follows:

1) Literature review: this part develops the baseline of the work by reviewing existing research on the topic.
2) Pre-processing: the image data is prepared before any machine learning processing.
3) Deciding the optimum AI model to train for the best results: this part compares multiple candidate AI models against the following factors:
   a) inference time;
   b) efficiency gains from tuning the hyper-parameters;
   c) real-time object detection analysis.
4) Hardware implementation: the model is deployed with a camera and performs inference on a live camera feed.

1.1. Related works

State-of-the-art computational techniques can match human-level accuracy in pattern recognition and object detection when tested in a controlled environment, but this gap widens as we move towards complex scenarios. Dealing with number identification in real-world environments requires robust systems. [1] used a convolutional network architecture to deal with the issue: instead of max pooling, Lp pooling was implemented, multi-stage features were used, and training was performed with stochastic gradient descent (SGD). With this architecture, pattern recognition accuracy improved to 94.97%.

To determine the most suitable machine learning model for a specific application, researchers referred to various papers that summarized findings. In a study by [2], five machine learning models (Neural Network, K-Nearest Neighbour, Random Forest, Decision Tree, and bagging with gradient boost) were compared using the MNIST dataset. They applied multiple pre-processing techniques and found that Neural Networks achieved the highest accuracy at 95.73%, but struggled with poorly written digits. [3] also used the MNIST dataset, comparing Linear SVM, Multilayer Perceptron, and Convolutional Neural Networks (CNN).

They considered execution time, complexity, accuracy rate, epochs, and hidden layers. SVM provided the fastest results for simple data, while CNN excelled in accuracy and efficiency for more complex data. [4] and [5] also trained on MNIST, using ensemble systems, TFE-SVM, C-NN, and Large C-NN+. Their results reinforced that neural networks, particularly CNN, consistently outperform the other models in terms of efficiency and accuracy.

[6] conducted a classification analysis using 18 Deep Neural Network (D-NN) models, including ResNets, DenseNets, MobileNets, NASNets, VGG Nets, and AlexNets. They trained these models on the ImageNet dataset, generating adversarial images from a random sample of 1000 images. Evaluation criteria included attack success rate, distortion using the l2 and l∞ norms, CLEVER scores, and transferability, providing a comprehensive understanding of model performance. [7], in a separate study, compared VGG16, VGG19, and ResNet50 models trained on a custom dataset of 6000 images with five classes. The resulting accuracies were 0.9667, 0.9707, and 0.9733 respectively.

Object detection combines localization and classification and is often implemented using CNN architectures. In [8], Faster-RCNN, YOLOv5, and SSD were compared using an automobile training dataset. Faster-RCNN demonstrated better accuracy but was unsuitable for real-time applications due to its two-stage nature, making YOLO the top performer. In another comparison by [9] in 2021, YOLOv6 was pitted against SSD, with SSD outperforming YOLO in terms of Frames Per Second (FPS) and achieving a higher mean Average Precision (mAP) score.

[10] delved into the SVHN (Street View House Numbers) dataset from Stanford, consisting of 73,257 images in the training set and 26,032 in the test set. An additional SVHN extra dataset, with 531,131 less complex images, introduces potential model bias. Feature learning, using methods such as Histogram of Oriented Gradients (HOG) and Sauvola binarization, is employed to detect features in these complex images. Post-processing involves comparing results from algorithms such as HOG binary features, K-means, and Stacked Sparse Auto-Encoders. The findings favor the K-means-based system, although a notable challenge is the continuing failure of the binarization algorithm to separate characters from their surrounding backgrounds.

1.2. Research Methodology

The study followed a general methodology that involved collecting relevant papers in the AI field, laying the foundation for the research direction. Once the basics were established, specific models were chosen for the project. Data analysis and preparation were conducted to facilitate model training. During the training and testing phases, the selected models were individually trained and fine-tuned for improved performance. Challenges encountered at each stage were addressed through the study of GitHub libraries and literature. This iterative process continued until the desired results were achieved.

2. Implementation

To successfully implement the training, the data was prepared in the format required for training the specific model (also known as pre-processing), as required for using YOLOv8. The processes of data pre-processing, augmentation, and parameter tuning are defined in subsections 2.1 through 2.5.

2.1. Bounding Box Labels

As stated in Section 1.1, the study converged on YOLO as the optimum model to begin with. According to Ultralytics, a YOLO model is trained with a training and a validation dataset. Each image in the bifurcated dataset needs to be accompanied by a text file defining the classes and the bounding box information (e.g., if an image is named 'xyz.jpg', the text file should be 'xyz.txt'). The information in the text file must be in exactly the form required to train the YOLO model, as shown in Table 1.

TABLE 1. TEXT FILE FORMAT

C1   CBB1_X_norm   CBB1_Y_norm   W1_norm   H1_norm
C2   CBB2_X_norm   CBB2_Y_norm   W2_norm   H2_norm
...
Cn   CBBn_X_norm   CBBn_Y_norm   Wn_norm   Hn_norm

where:
Cn          = nth class number
CBBn_X_norm = normalised centre x-coordinate of the nth bounding box
CBBn_Y_norm = normalised centre y-coordinate of the nth bounding box
Wn_norm     = normalised width of the nth bounding box
Hn_norm     = normalised height of the nth bounding box

Stanford University's SVHN dataset comes in two formats: one with all images formatted to 32x32 pixels, heavily cropped to show only a single number, and another with images in a general form including parts of the environment. The 'digitStruct.mat' file provides data on bounding boxes and classes. To create individual text files for each image, a MATLAB code was developed. This code reads the image dimensions, extracts information from the mat file, and normalizes and formats the data according to the required specifications.
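The conversion was implemented in MATLAB in this work; purely as an illustration, the normalisation it performs can be sketched in Python as below, assuming the bounding boxes have already been extracted from 'digitStruct.mat' (e.g., with h5py) into plain dictionaries. The function name and dictionary layout are assumptions, not the authors' code.

# Sketch: write one YOLO label file (Table 1 format) per SVHN image.
# Assumes boxes were already read out of digitStruct.mat, e.g. via h5py,
# into dicts like {"label": 2, "left": 43, "top": 7, "width": 19, "height": 32}.
from pathlib import Path
from PIL import Image

def write_yolo_label(image_path: str, boxes: list[dict], out_dir: str) -> None:
    img_w, img_h = Image.open(image_path).size          # image dimensions
    lines = []
    for b in boxes:
        cls = int(b["label"]) % 10                      # SVHN stores '0' as label 10
        x_c = (b["left"] + b["width"] / 2) / img_w      # normalised centre x
        y_c = (b["top"] + b["height"] / 2) / img_h      # normalised centre y
        w, h = b["width"] / img_w, b["height"] / img_h  # normalised size
        lines.append(f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    out = Path(out_dir) / (Path(image_path).stem + ".txt")  # xyz.jpg -> xyz.txt
    out.write_text("\n".join(lines))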
To determine the total number of instances of each class appearing across the training and validation sets, a graph is plotted.

Figure 1. Total number of instances of classes in training and validation

The graph shows that instance counts are highest for the number '1' and decrease steadily through the number '9', whose count is roughly equal to that of the number '0'.
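The counting behind Figure 1 can be sketched in a few lines of Python over the label files described in Section 2.1; the folder path below is an assumed placeholder.

# Sketch: count how often each class appears across the label files,
# i.e. the data behind Figure 1.
from collections import Counter
from pathlib import Path

counts = Counter()
for label_file in Path("train/labels").glob("*.txt"):
    for line in label_file.read_text().splitlines():
        counts[int(line.split()[0])] += 1      # first field is the class id
print(dict(sorted(counts.items())))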

2.2. YAML file

The YAML file contains crucial details about the dataset, specifying the locations of the training and testing datasets and defining the classes. Path indicates the absolute location of the dataset folder. Train and Val denote the relative locations of the training and validation datasets. Names assigns an encoding to each class (e.g., 7 represents the encoded value for 'H'). These encodings are reflected in the Cn column of the Bounding Box Labels file, detailed in Section 2.1.
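For illustration, a dataset YAML in the usual Ultralytics form might look like the sketch below; the path and split names are assumptions consistent with Section 2.3, and the class list shown is the plain digit mapping rather than the paper's exact encoding.

# svhn.yaml -- assumed dataset configuration (illustrative sketch)
path: /home/user/datasets/svhn   # Path: absolute location of the dataset folder
train: train/images              # Train: relative location of training images
val: val/images                  # Val: relative location of validation images
names: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']   # Names: class encodings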
2.3. Structure of the Files

To train the model, the dataset with its labels and images needs to be organised in a specific structure, shown in Figure 2. The absolute path of the file directory is to be given in the YAML file, as explained in Section 2.2. File names matter greatly when training YOLO models with the Ultralytics library: the training and validation splits must each contain folders named 'Images' and 'Labels', because the Ultralytics library searches for folders with exactly those names. The names of the train and validation folders themselves can be user-defined, provided the exact names and locations are given in the YAML file.

Figure 2. Structure of the files
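The layout depicted in Figure 2 follows the usual Ultralytics convention; one layout satisfying these constraints, with assumed user-defined split names, is:

svhn_dataset/
├── train/
│   ├── images/   (xyz.jpg, ...)
│   └── labels/   (xyz.txt, ...)
└── val/
    ├── images/
    └── labels/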

2.4. Hyper-parameters

During the training of a model, multiple hyper-parameters can be varied to examine the impact of their variation on inference. The hyper-parameters studied and tuned in this work are shown in Table 2. The models were tuned and analysed using different combinations of these values.

TABLE 2. TUNED HYPER-PARAMETERS

Hyper-parameter    Values
Optimiser          Auto, SGD, RAdam, Adamax
Pretrained         True / False
Degrees            ±5, ±15, ±45
Imgsz              240, 320, 640, 720, 1080
Model              YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8x

3. Analysis and Discussion

Before hyper-parameter tuning, the initial step involves identifying the best-performing models for the SVHN dataset using their default configurations. This preliminary performance analysis aims to streamline the process by excluding slower or less efficient models. The training and testing procedures are carried out on the hardware and software specifications detailed in Table 3.

TABLE 3. HARDWARE SPECIFICATIONS

Hardware             Specification
CPU                  Intel i5-11400H, 6 cores / 12 threads
GPU                  Nvidia RTX 3050 Laptop GPU, 4 GB
RAM                  32 GB DDR4 3200 MHz
Max Power            TDP 180 W

Software             Specification
OS                   Ubuntu 22.04.2 Jammy Jellyfish
IDE                  VS Code
Python version       3.10
CUDA version         11.8.0
cuDNN version        8.6.0.163
TensorFlow version   2.12.0

3.1. Default Model Performance Analysis

To decide which model works best, multiple aspects need to be considered. To gauge the overall behaviour of the different versions of the YOLOv8 models, parameters such as training time, precision vs. epochs, mAP vs. epochs, and recall vs. epochs were compared.
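A minimal sketch of this default-configuration comparison, written against the Ultralytics Python API, is shown below; 'svhn.yaml' is the assumed dataset file from Section 2.2, and the printed metrics come from the library's validation results object.

# Sketch: train each YOLOv8 variant with default settings and compare
# validation precision, recall, and mAP.
from ultralytics import YOLO

for variant in ("yolov8n.pt", "yolov8s.pt", "yolov8m.pt", "yolov8x.pt"):
    model = YOLO(variant)                     # pretrained checkpoint
    model.train(data="svhn.yaml", epochs=20)  # default hyper-parameters
    m = model.val()                           # metrics on the validation split
    print(variant, f"P={m.box.mp:.3f}  R={m.box.mr:.3f}  mAP50={m.box.map50:.3f}")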

Comparing the models, YOLOv8x emerges as the top performer, but for tuning purposes YOLOv8n is selected as the best model. This decision is based on YOLOv8n having precision levels almost equivalent to YOLOv8x while being significantly lighter. This choice reduces computational expense and enables implementation on mobile systems.

Figure 3. Precision vs Epochs

Precision = TP / (TP + FP)    (1)

The top three performers, YOLOv8x, YOLOv8n, and YOLOv8m, were examined further. The study delved deeper into these models, analyzing Precision-Confidence and Recall-Confidence curves to gain a comprehensive understanding of their overall performance.

Figure 4. Precision-Confidence Curve of YOLOv8x, YOLOv8n, YOLOv8m

Precision-Confidence curves for the three models revealed a consistent lag in precision for the digit '1', with the widest gap in YOLOv8x, narrowing in YOLOv8n, and reducing further in YOLOv8m. As confidence levels increased, the gap diminished, and all models peaked at approximately 82% confidence. Despite '1' having the highest number of training instances, its lower precision is puzzling. To address this, a detailed analysis of the confusion matrix is recommended for insight into class predictions, including true and false predictions across all classes.

Figure 5. Confusion Matrix for YOLOv8x

In the case of YOLOv8m, it was observed that 20% of the time it identified the number '1' in the background. This indicates that the misclassification problem relates more to errors in the dataset than to how the models are trained.

3.2. Hyper-Parameters Tuning Analysis

YOLOv8n was trained with a fixed epoch count of 20 and a batch size of -1, optimizing CUDA memory usage. A fixed positive batch size can either slow down training or exceed memory capacity; setting it to -1 lets the system dynamically determine how many images are trained on simultaneously, optimizing memory usage without trial and error.
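A sketch of how this sweep can be expressed with the Ultralytics API is given below, varying one hyper-parameter family at a time as in the analysis that follows; 'svhn.yaml' is again the assumed dataset file.

# Sketch: tune one hyper-parameter family at a time, always with
# batch=-1 so the batch size is fitted to CUDA memory automatically.
from ultralytics import YOLO

def run(**overrides):
    model = YOLO("yolov8n.pt")
    model.train(data="svhn.yaml", epochs=20, batch=-1, **overrides)
    return model.val()

for opt in ("auto", "SGD", "RAdam", "Adamax"):  # optimiser sweep (Table 2)
    run(optimizer=opt)
for deg in (5, 15, 45):                         # tilt-angle sweep
    run(degrees=deg)
for size in (240, 320, 640, 720, 1080):         # training-resolution sweep
    run(imgsz=size)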
Figure 6. Precision-recall-mAP comparison of Nano models for different Optimisers

When plotting Precision, Recall, and mAP, the curve enclosing the largest area indicates the best results. In this scenario, the Default configuration consistently delivers the best results: an optimiser with collectively high Precision, Recall, and mAP outperforms the alternatives and incurs fewer overall losses. Therefore, the Default (Auto) configuration is the most suitable choice in this case.

Enabling tilt for the models revealed that as the tilt angle increased, losses also increased. The model with no rotation had the lowest loss, the model with a 45° tilt had the highest, and the 15° tilt fell in between.

Figure 7. Precision-recall-mAP comparison of Nano models for different tilt angles

Training models at various training resolutions, ranging from 240 to 1080 pixels in width, led to an unexpected outcome: the model trained at a 240-pixel resolution performed the best. This contradicts the assumption that a higher training resolution leads to better results, which typically holds only when the actual image resolution is sufficiently high; it is crucial for the training resolution to closely match the actual image resolution.

Figure 8. Precision-recall-mAP comparison of Nano models for different training resolutions

To investigate this, a code was developed to identify the highest and lowest image resolutions in the training dataset, as well as the average resolution across all images. The analysis revealed that the average image resolution is 128 pixels wide and 50 pixels high. This indicates that training the model at a resolution higher than the actual image resolution introduces significant noise, leading to increased losses and reduced accuracy. Therefore, taking resolution into account, the optimal model is the one trained at a 240-pixel width.
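The survey code itself is not reproduced in the paper; a minimal sketch of what it computes, with an assumed dataset path, could be:

# Sketch: find the smallest, largest, and average image resolution
# across the training images.
from pathlib import Path
from PIL import Image

sizes = [Image.open(p).size for p in Path("train/images").glob("*.png")]
widths, heights = zip(*sizes)
print("min :", min(widths), "x", min(heights))    # per-axis minima
print("max :", max(widths), "x", max(heights))    # per-axis maxima
print("mean:", sum(widths) // len(sizes), "x", sum(heights) // len(sizes))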
3.3. Inference Timings and Min/Max Distance

To evaluate the real-time performance of the models, all models trained with the various hyper-parameters discussed in Section 3.2 were configured for inference. The recorded metrics include inference time and the closest and farthest distances at which classes are successfully classified. These measurements were collected while altering the inference resolution across 240, 320, 640, 720, 1080, 2160, and 3840-pixel widths.

The models were assessed for effective classification and localization distances through lab experiments. A complex-font image with the numbers 0 to 9 was affixed to a movable structure, and a webcam, paired with a laser-guided distance meter, ensured accurate measurements. Multiple experiments with varying hyper-parameters and inference resolutions were conducted; results were tabulated, and final outcomes were derived by averaging data from three repetitions of each experiment.
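As an illustration, the per-resolution timing measurement can be sketched with the speed fields that Ultralytics attaches to each result; the checkpoint and source paths below are assumptions.

# Sketch: record inference delay while sweeping the inference resolution.
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # assumed checkpoint path
for width in (240, 320, 640, 720, 1080, 2160, 3840):
    # Ultralytics attaches per-stage timings (ms) to every result.
    result = model.predict("frame.jpg", imgsz=width, verbose=False)[0]
    print(width, f"inference: {result.speed['inference']:.1f} ms")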
RAdam demonstrated standout performance in the optimiser evaluations, achieving a maximum inference distance of 7.389 meters and a lower bound of 1.396 meters at an inference resolution of 3840 pixels, albeit with a noticeable 100-millisecond inference lag. Re-evaluation at an inference resolution of 2160 pixels yielded a range from 0.967 to 5.483 meters. Optimal results were obtained with a tilt angle of 15°, aligning with the previous comparisons, offering flexibility within ±15° and an inference distance spanning 1.11 to 3.171 meters. Similarly, the 240-pixel training resolution echoed the earlier findings, providing the best performance with a distance range of 1.1 to 4.9 meters.

Figure 9. Inference distance for YOLOv8n with RAdam Optimiser

3.4. Compound Hyper-parameters

The analysis in Section 3.3 indicates that a training resolution of 240 pixels width and the RAdam optimizer produce optimal results within their respective segments. A tilt angle of 15° is chosen as a balance between robustness and the reduction in inference distance. Subsequently, a model is trained with these optimal hyper-parameter settings: a training resolution of 240 pixels width, the RAdam optimizer, and a ±15° tilt angle.
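With those settings fixed, the compound training run reduces to a single call, sketched here with the assumed dataset file:

# Sketch: the compound configuration from Section 3.4 in a single run.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.train(data="svhn.yaml", epochs=20, batch=-1,
            imgsz=240, optimizer="RAdam", degrees=15)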
Figure 10. Confusion matrix for YOLOv8 with compound tuning

Figure 11. Precision-Confidence curve and Recall-Confidence curve for YOLOv8 with compound tuning

Figure 12. Precision-recall-mAP comparison of the compound-tuned model with the individually tuned models

Figure 13. Inference distance for YOLOv8n with Compound Model Tuning

4. Results and challenges

4.1. Results

After the successful completion of tuning, YOLOv8n was selected with the hyper-parameters defined in Table 4.

TABLE 4. FINALISED HYPER-PARAMETERS

Hyper-parameter    Value
Imgsz (Training)   240
Optimiser          RAdam
Degrees            15

This model gave the optimum results during the analysis, with an effective inference distance of 1.02 to 4.5 meters and an average latency of 32 milliseconds.

4.2. Challenges

4.2.1. Configuration errors. Setting up the environment demands careful configuration. Neglecting to ensure compatibility among the versions of PyTorch, cuDNN, CUDA, and TensorFlow can lead to a chain of failures and a potentially malfunctioning setup. Installing the latest version of each package with the pip command is not recommended, as it can result in compatibility issues or failure to recognize specific packages. To address this, it is advisable to follow the comprehensive compatibility list provided by TensorFlow to verify the versions.
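As a hedge against such mismatches, the installed stack can be sanity-checked against the Table 3 versions at start-up; the sketch below assumes the PyTorch build was installed for CUDA 11.8.

# Sketch: verify the environment matches the Table 3 versions before
# training, rather than trusting whatever `pip install` resolved.
import tensorflow as tf
import torch

assert tf.__version__.startswith("2.12"), tf.__version__
assert torch.version.cuda == "11.8", torch.version.cuda  # CUDA build used by PyTorch
assert torch.cuda.is_available(), "GPU not visible to PyTorch"
print("cuDNN version:", torch.backends.cudnn.version())  # expect 8.6.x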
4.2.2. Classification of Number '1'. In Section 3.2, an average image resolution of 128 x 50 pixels was determined. Training involves convolution and pooling layers that preserve features while reducing dimensions. The unclassified portion of each image represents the background class, and at such low resolutions the narrow digit '1' can be reduced to a background-like feature, causing misclassification and lower precision. As a result, '1' is detected later, or only at closer proximity, than the other classes, impacting overall performance. A proposed solution is outlined in Section 6.

5. Conclusion

The successful implementation of house number recognition in a complex environment was achieved using the SVHN (Street View House Numbers) dataset. A robust system was developed that uses a webcam to detect and localize numbers with 90% accuracy over an inference distance of up to 4.5 meters.

6. Future work

6.0.1. Recognition. To address the inference issue related to the number '1', a potential solution is to build a custom image dataset by gathering images of the number '1' from various open sources and incorporating them, with their corresponding bounding box information, into the training dataset. This would make it possible to assess whether adding more number '1' images resolves the issue, and whether doing so has any unintended consequences for the results of the other classes.

6.0.2. Model Fine Tuning. Much finer tuning of the models can be done by changing the parameters more gradually, resulting in a more extensive analysis.

6.0.3. Delivery Robots. After further fine-tuning, the trained algorithm can be implemented on actual robots, such as delivery robots, to test the system in practice. The fusion of GPS and number recognition can then be implemented and analysed for the improvement it brings to the localisation accuracy of the systems currently in use.

References

[1] P. Sermanet, S. Chintala, and Y. LeCun, "Convolutional neural networks applied to house numbers digit classification," Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012), pp. 3288–3291, 2012.

[2] S. Chen, R. Almamlook, Y. Gu, and L. Wells, "Offline handwritten digits recognition using machine learning," European Conference on Computer Vision (ECCV), 2018.

[3] R. Dixit, R. Kushwah, and S. Pashine, "Handwritten digit recognition using machine and deep learning algorithms," International Journal of Computer Applications, vol. 176, pp. 27–33, Jul. 2020.

[4] A. Shrivastava, I. Jaggi, S. Gupta, and D. Gupta, "Handwritten digit recognition using machine learning: A review," 2019 2nd International Conference on Power Energy, Environment and Intelligent Control (PEEIC), pp. 322–326, 2019.

[5] R. Karakaya and S. Çakar, "Handwritten digit recognition using machine learning," Sakarya University Journal of Science, vol. 25, Oct. 2020.

[6] D. Su, H. Zhang, H. Chen, J. Yi, P.-Y. Chen, and Y. Gao, "Is robustness the cost of accuracy? – A comprehensive study on the robustness of 18 deep image classification models," Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XII, pp. 644–661, Sep. 2018.

[7] S. Mascarenhas and M. Agarwal, "A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for image classification," 2021 International Conference on Disruptive Technologies for Multi-Disciplinary Research and Applications (CENTCON), vol. 1, pp. 96–99, Nov. 2021.

[8] J.-a. Kim, J.-Y. Sung, and S.-H. Park, "Comparison of Faster-RCNN, YOLO, and SSD for real-time vehicle type recognition," 2020 IEEE International Conference on Consumer Electronics – Asia (ICCE-Asia), pp. 1–4, 2020.

[9] M. Shetty, "A review on deep learning object detection: YOLO vs SSD," International Journal of Advanced Research in Science, Communication and Technology (IJARSCT), vol. 5, 2021.

[10] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng, "Reading digits in natural images with unsupervised feature learning," NIPS Workshop on Deep Learning and Unsupervised Feature Learning 2011, 2011.