Thermal Vision - Night Object Detection With PyTorch and YOLOv5 (Real Project) - PyImageSearch
Table of Contents
Thermal Vision: Night Object Detection with PyTorch and YOLOv5 (real project)
Object Detection with Deep Learning Through PyTorch and YOLOv5
Project Structure
Pre-Training
Training
Testing
Summary
Citation Information
This tutorial is the last of our 4-part course on Infrared Vision Basics:
2. Thermal Vision: Measuring your First Temperature from an Image with Python
and OpenCV (https://pyimg.co/mns3e)
3. Thermal Vision: Fever Detector with Python and OpenCV (starter project)
(https://pyimg.co/6nxs0)
4. Thermal Vision: Night Object Detection with PyTorch and YOLOv5 (real project)
(https://pyimg.co/p2zsm) (today’s tutorial)
By the end of this lesson, you’ll be able to detect different objects in thermal images
with Deep Learning in a quick, easy, and up-to-date way, using only four pieces of
code!
To learn how to utilize YOLOv5 using your custom thermal imaging dataset, just keep
reading.
Thermal Vision: Night Object Detection with PyTorch
and YOLOv5 (real project)
Building on all the content covered during this course, the PyImageSearch team
encourages you to use your imagination and excel in any thermal imaging situation,
but not before providing you with another powerful, real-life example of this
incredible combination: Computer Vision + Thermal Imaging.
In this case, we will learn how computers can see in the dark, distinguishing different
object classes in real time.
Before starting this tutorial, for better comprehension, we encourage you to take the
Torch Hub Series course at PyImageSearch University or gain some experience with
PyTorch and Deep Learning. As in all PyImageSearch University courses, we will cover all
aspects step by step.
As explained in Torch Hub Series #3: YOLOv5 and SSD - Models on Object Detection
(https://pyimagesearch.com/2022/01/03/torch-hub-series-3-yolov5-and-ssd-models-on-object-detection/),
YOLOv5 is the fifth version of You Only Look Once
(https://arxiv.org/abs/1506.02640) (Figure 1, 2015), one of the most powerful
state-of-the-art Convolutional Neural Network models. This fast object detector is
usually trained on the COCO dataset (https://cocodataset.org/#home), an open-access
Microsoft RGB imaging database consisting of 330K images, 91 object classes, and
2.5 million labeled instances.
Figure 1: Original YOLO logo (source (https://pjreddie.com/darknet/yolo/)).
This strong combination makes YOLOv5 a very good fit for detecting objects, even in our
custom imaging datasets. To obtain a thermal object detector, we will use Transfer
Learning (i.e., we will train the COCO-pre-trained YOLOv5 model on a real thermal imaging
dataset collected specifically for self-driving car solutions).
This database consists of 14,452 thermal images in gray8 and gray16 format, which, as we
have learned, allows us to measure the temperature of any pixel. All 14,452 gray8 images,
acquired on streets in California with a car-mounted thermal camera, are hand-labeled
with bounding boxes, as Figure 2 shows. We will use these annotations (labels +
bounding boxes) to detect the four object classes predefined in this dataset:
car, person, bicycle, and dog.
(https://pyimagesearch.com/wp-content/uploads/2022/10/figure-2-dataset.png)
Figure 2: Example of a gray8 thermal image (left) and a gray8 thermal image hand-labeled with bounding boxes
(right). The hand-labeled image (right) shows the object detection of the 4 defined classes: car (yellow), person
(pink), bicycle (purple), and dog (red).
A JSON file with the COCO format annotations is provided. To simplify this tutorial, we
give you the annotations in the YOLOv5 PyTorch format. You can find a labels folder
with individual annotations for each gray8 image.
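For readers who want to convert their own annotations, the relationship between the two formats can be sketched as follows. COCO stores each box as [x_min, y_min, width, height] in pixels, whereas a YOLOv5 label line is `class x_center y_center width height` with coordinates normalized by the image size. The function names and the sample box are illustrative assumptions, not part of the tutorial's downloads; the class index 1 corresponds to car in the class list used later in this post.

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels
    to YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return x_center, y_center, w / img_w, h / img_h


def yolo_label_line(class_id, bbox, img_w=640, img_h=512):
    """Format one annotation as a line of a YOLOv5 label (.txt) file."""
    values = coco_to_yolo(bbox, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# hypothetical box on a 640 x 512 thermal frame, class 1 (car)
print(yolo_label_line(1, [160, 128, 320, 256]))
# 1 0.500000 0.500000 0.500000 0.500000
```

Each image gets one text file with the same base name, containing one such line per bounding box.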
We have also reduced the dataset to 1,772 images: 1,000 to train our pre-trained YOLOv5
model and 772 to validate it (i.e., approximately a 56-44% training-validation split). These
images have been selected from the training portion of the original dataset.
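A fixed-size split like this one can be reproduced with a seeded shuffle. The sketch below is only an illustration: the function name, the seed, and the placeholder file names are assumptions, and the tutorial's actual image selection may have been made differently.

```python
import random


def split_dataset(filenames, n_train, seed=42):
    """Shuffle file names reproducibly and split them into train/val lists."""
    shuffled = sorted(filenames)           # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # local RNG, leaves the global random state untouched
    return shuffled[:n_train], shuffled[n_train:]


# placeholder names standing in for the 1,772 gray8 images
names = [f"frame_{i:05d}.jpeg" for i in range(1772)]
train, val = split_dataset(names, n_train=1000)
print(len(train), len(val))                       # 1000 772
print(f"{len(train) / len(names):.1%} training")  # 56.4% training
```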
To follow this tutorial, you need the OpenCV library installed on your system. It is
pip-installable:
$ pip install opencv-contrib-python
If you need help configuring your development environment for OpenCV, we highly
recommend that you read our pip install OpenCV guide
(https://pyimagesearch.com/2018/09/19/pip-install-opencv/), which will have you up and
running in a matter of minutes.
Project Structure
We first need to review our project directory structure.
Start by accessing this tutorial’s “Downloads” section to retrieve the source code and
example images.
$ tree --dirsfirst
.
└── yolov5
    ├── data
    ├── models
    ├── utils
    ├── CONTRIBUTING.md
    ├── Dockerfile
    ├── LICENSE
    ├── ...
    └── val.py

1 directory, XX files
1. # clone the yolov5 repository from GitHub and install some necessary packages (requirements.txt file)
2. !git clone https://github.com/ultralytics/yolov5
3. %cd yolov5
4. %pip install -qr requirements.txt
Notice that we also installed the required libraries indicated in the requirements.txt
file (Line 4): Matplotlib, NumPy, OpenCV, PyTorch, etc.
In the yolov5 folder, we can find all the necessary files to use YOLOv5 in any of our
projects:
utils : includes some necessary Python files to manage the training, the dataset, the
information visualization, and general project utilities.
The rest of the files in the yolov5 folder are required, but we will only run two of them:
train.py
(https://github.com/ultralytics/yolov5/blob/master/train.py): trains our model;
it is part of the repository we cloned above
detect.py
(https://github.com/ultralytics/yolov5/blob/master/detect.py): tests our model by
running inference to detect objects; it is also part of the repository we cloned above
The thermal_imaging_dataset folder includes our 1,772 gray8 thermal images. This
folder contains the images ( thermal_imaging_dataset/images ) and the labels (
thermal_imaging_dataset/labels ) split into the training and validation sets,
respectively, train and val folders.
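Before training, it can be worth confirming that the images and labels folders mirror each other. The following is a minimal sketch, assuming each label file shares its image's base name with a .txt extension (the usual YOLOv5 convention); the helper name `find_unlabeled_images` is hypothetical.

```python
from pathlib import Path


def find_unlabeled_images(root, split):
    """Return image files in images/<split> without a same-named .txt in labels/<split>."""
    image_dir = Path(root) / "images" / split
    label_dir = Path(root) / "labels" / split
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if not image_path.is_file():
            continue
        if not (label_dir / f"{image_path.stem}.txt").exists():
            missing.append(image_path.name)
    return missing


if __name__ == "__main__":
    root = Path("thermal_imaging_dataset")
    for split in ("train", "val"):
        # only report on splits that are actually present on disk
        if (root / "images" / split).is_dir():
            print(split, "images without labels:", find_unlabeled_images(root, split))
```

An image without a label file is not necessarily an error, since it may simply contain none of the four classes, but an unexpectedly long list usually points to a path or naming problem.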
The thermal_imaging_video_test.mp4 is the video file on which we will test our thermal
object detection model. It contains 4,224 thermal frames acquired at 30 fps with scenes
of streets and highways.
6. # import PyTorch and check versions
7. import torch
8. from yolov5 import utils
9. display = utils.notebook_init()
Open your yolov5.py file and import the required packages (Lines 7 and 8), checking
your notebook features (Line 9) if you are working with Jupyter Notebooks on Google
Colab.
Check that your environment includes a GPU (Figure 3) so that the upcoming training
process completes in a reasonable time.
(https://pyimagesearch.com/wp-content/uploads/2022/10/figure-3-jupyter-set-up-4-2.png)
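The GPU check can also be done programmatically. The sketch below uses PyTorch's `torch.cuda.is_available()` and `torch.cuda.get_device_name()`; the wrapper function `describe_device` is a hypothetical helper added here for illustration, and it degrades gracefully when PyTorch is not installed.

```python
def describe_device():
    """Report whether PyTorch can see a CUDA GPU in the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if torch.cuda.is_available():
        # index 0 is the first visible CUDA device
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    return "No GPU detected; training would fall back to the CPU"


print(describe_device())
```

On Google Colab, a missing GPU usually means the runtime type still needs to be switched to a GPU runtime.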
Pre-Training
As already mentioned, we’ll use Transfer Learning to train our object detector
on our thermal imaging dataset, using the YOLOv5 CNN architecture pre-trained on
the COCO dataset as a starting point.
For this purpose, we select the YOLOv5s (small) version of the pre-trained model because
of its good trade-off between speed and accuracy.
Training
After setting up the environment and fulfilling all the requirements, let’s train our pre-
trained model!
11. # train pretrained YOLOv5s model on the custom thermal imaging dataset,
12. # basic parameters:
13. # - image size (img): image size of the thermal dataset is 640 x 512, 640 passed
14. # - batch size (batch): 16 by default, 16 passed
15. # - epochs (epochs): number of epochs, 30 passed
16. # - dataset (data): dataset in .yaml file format, custom thermal image dataset passed
17. # - pre-trained YOLOv5 model (weights): YOLOv5 model version, YOLOv5s (small version) passed
18. !python train.py --img 640 --batch 16 --epochs 30 --data thermal_image_dataset.yaml --weights yolov5s.pt
On Line 18, after importing PyTorch and the YOLOv5 utils (Lines 7-9), we run the
train.py file by specifying the following parameters:
img : size of the training images passed through our model. In our case,
thermal images have a 640x512 resolution, so we indicate the longest side, 640
pixels.
data : YAML dataset file. Figure 4 shows our dataset file. It points to the YOLOv5
dataset structure explained previously:
thermal_imaging_dataset/images/train and
thermal_imaging_dataset/labels/train for training, and
thermal_imaging_dataset/images/val and
thermal_imaging_dataset/labels/val for validation.
It also indicates the number of classes, nc: 4 , and the class names,
names: ['bicycle', 'car', 'dog', 'person'] .
weights : weights of the model pre-trained on the COCO dataset, in our case,
YOLOv5s. The yolov5s.pt file contains these weights; if it is not found locally,
the YOLOv5 scripts download it automatically.
(https://pyimagesearch.com/wp-content/uploads/2022/10/figure-4-yaml-dataset-file.png)
Figure 4: YAML dataset file: data thermal_image_dataset.yaml. It contains the thermal imaging dataset path, the
number of classes, and the class names.
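Since Figure 4 is an image, here is a minimal sketch of how such a dataset file could be generated. The keys train, val, nc, and names follow the YOLOv5 dataset format; the relative root `../thermal_imaging_dataset` and the helper name `build_dataset_yaml` are assumptions, so adjust the paths to wherever your dataset lives relative to the yolov5 folder.

```python
CLASS_NAMES = ["bicycle", "car", "dog", "person"]


def build_dataset_yaml(dataset_root, class_names):
    """Return the text of a YOLOv5 dataset file for the given root folder."""
    lines = [
        f"train: {dataset_root}/images/train",
        f"val: {dataset_root}/images/val",
        f"nc: {len(class_names)}",
        f"names: {class_names}",  # Python's list repr is also a valid YAML flow sequence
    ]
    return "\n".join(lines) + "\n"


# "../thermal_imaging_dataset" is an assumed location, not taken from Figure 4
yaml_text = build_dataset_yaml("../thermal_imaging_dataset", CLASS_NAMES)
print(yaml_text)
```

Only the image folders are listed; YOLOv5 derives each label path from the image path by swapping the images folder for labels, which is why the two trees must mirror each other.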
After 30 epochs, completed in 0.279 hours (about 17 minutes) on an NVIDIA Tesla T4 GPU,
our model has learned to detect the classes person, car, bicycle, and dog, achieving a
mean Average Precision of 50.7%, mAP (IoU = 0.5) = 0.507, as Figure 5 shows. This means
that the Average Precision, averaged over all our classes, is 50.7% when a prediction
counts as correct if its Intersection over Union (IoU, Figure 6) with the hand-labeled
box is 0.5 or higher.
(https://pyimagesearch.com/wp-content/uploads/2022/10/figure-5-results.png)
Figure 5: Results for our YOLOv5 model trained on the thermal imaging dataset. Inside the green box, the mean Average
Precision for all classes is shown, mAP (IoU = 0.5) = 0.507. The Average Precision for each class is also shown: bicycle
(red), car (pink), dog (blue), and person (yellow). As you can see, our bicycle and dog classes are
underrepresented, with mAP bicycle (IoU = 0.5) = 0.456 and mAP dog (IoU = 0.5) = 0.004, respectively.
Figure 6: Intersection over Union (IoU) definition. Thermal image example from the dataset (left) with the hand-labeled bo
percentage calculated by dividing the Ov
As shown in Figure 6, the Intersection over Union (IoU) measures how well the predicted
bounding box overlaps the original hand-labeled one: the area of their intersection
divided by the area of their union.
So, for our person class, our model properly detects, on average, 77.7% of the cases,
where a prediction is considered correct when its bounding box has an IoU of 50% or
higher with the hand-labeled one.
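To make the IoU criterion concrete, here is a minimal sketch of the computation for two axis-aligned boxes. The (x_min, y_min, x_max, y_max) box layout and the sample coordinates are assumptions chosen for readability; they are not values from the dataset.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # width/height are clamped at zero when the boxes do not overlap
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (100, 100, 200, 200)  # hypothetical hand-labeled box
prediction = (150, 100, 250, 200)    # hypothetical predicted box, shifted 50 px right
score = iou(ground_truth, prediction)
print(f"IoU = {score:.3f}, counted as correct at IoU >= 0.5: {score >= 0.5}")
# IoU = 0.333, counted as correct at IoU >= 0.5: False
```

In this example half of each box overlaps the other, yet the IoU is only one third, which shows that the 0.5 threshold is stricter than it may first appear.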
Figure 7 compares two original images, their hand-labeled bounding boxes, and their
predicted results.
Figure 7: Two image results of our trained model. Original images (left), hand-labeled images (middle), and predicted images
show the object detection of 3 out of the 4 defined classes: car (pink), per
Although it is beyond the scope of this tutorial, it is important to note that our dataset is
highly imbalanced, with only 280 and 31 labels for our bicycle and dog classes,
respectively. This largely explains why we obtain mAP bicycle (IoU = 0.5) = 0.456 and
mAP dog (IoU = 0.5) = 0.004, respectively.
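Class imbalance like this can be spotted before training by counting the boxes per class in the label files. The sketch below assumes the YOLOv5 label format (class index as the first field of each line) and the class order given in the dataset file; `count_labels` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path

CLASS_NAMES = ["bicycle", "car", "dog", "person"]


def count_labels(label_dir, class_names=CLASS_NAMES):
    """Count bounding boxes per class across all YOLO .txt files in label_dir."""
    counts = Counter({name: 0 for name in class_names})
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[class_names[int(fields[0])]] += 1
    return dict(counts)
```

Calling `count_labels("thermal_imaging_dataset/labels/train")` returns a dictionary mapping each class name to its number of labeled boxes, which makes underrepresented classes easy to see.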
Finally, to verify our results, Figure 8 shows the Classification Loss during the training
(top-left) and the validation (bottom-left) processes, and the mean Average Precision at
IoU 50% (middle-right), mAP (IoU = 0.5) for all the classes through the 30 epochs.
Figure 8: Classification training loss (top-left), classification validation loss (bottom-left), and mean Average Precision at
IoU 50% (middle-right), mAP (IoU = 0.5).
Testing
21. # test the trained model (night_object_detector.pt) on a thermal imaging video,
22. # parameters:
23. # - trained model (weights): model trained in the previous step, night_object_detector.pt passed
24. # - image size (img): frame size of the thermal video is 640 x 512, 640 passed
25. # - confidence (conf): confidence threshold, only the inferences higher than this value will be shown, 0.35 passed
26. # - video file (source): thermal imaging video, thermal_imaging_video.mp4 passed
27. !python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.35 --source ../thermal_imaging_video.mp4
weights : points to our trained model, that is, the weights stored in the best.pt file (
runs/train/exp/weights/best.pt ).
img : size of the testing images passed through our model. In our case, the thermal
frames of our video have a 640x512 resolution, so we indicate the longest side,
640 pixels.
conf : confidence threshold. Only detections whose confidence is higher than this
value are kept and shown. We set the threshold to 35%.
source : images to test the model on, in our case, the video file
thermal_imaging_video.mp4 .
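The effect of the conf parameter can be illustrated with a small sketch. This is a simplification with made-up detections and a hypothetical helper name; in detect.py the thresholding happens internally as part of the post-processing, not through a function like this one.

```python
def filter_by_confidence(detections, conf_threshold=0.35):
    """Keep only detections whose confidence is higher than the threshold."""
    return [d for d in detections if d["confidence"] > conf_threshold]


# made-up detections for a single frame
detections = [
    {"label": "car", "confidence": 0.91},
    {"label": "person", "confidence": 0.42},
    {"label": "dog", "confidence": 0.12},
]

kept = filter_by_confidence(detections)
print([d["label"] for d in kept])  # ['car', 'person']
```

Raising the threshold removes more uncertain detections (fewer false positives, more missed objects), while lowering it does the opposite.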
As we have indicated, the night object detection of this video has been obtained with
a 35% confidence threshold. Before changing this value, we should check the curve in
Figure 10, where Precision is plotted against Confidence.
(https://pyimagesearch.com/wp-content/uploads/2022/10/figure-10-p-curve.png)
Figure 10: Precision vs. Confidence curve for our tested model.
Summary
We would like to acknowledge the great work of Ultralytics
(https://github.com/ultralytics). We found their train.py
(https://github.com/ultralytics/yolov5/blob/master/train.py) and
detect.py
(https://github.com/ultralytics/yolov5/blob/master/detect.py) files
so great that we included them in this post.
In this tutorial, we have learned how to detect different objects under any light condition,
combining Thermal Vision and Deep Learning, using the CNN YOLOv5 architecture and
our custom thermal imaging dataset.
For this purpose, we have discovered how to train the state-of-the-art YOLOv5 model,
previously trained using the Microsoft COCO dataset, on the FLIR Thermal Starter
Dataset.
Even though the thermal images are completely different from common RGB images of
the COCO dataset, the great performance and results obtained show how powerful the
YOLOv5 model is.
We can conclude that Artificial Intelligence is currently producing incredible and
useful results.
This tutorial shows you how to apply Thermal Vision and Deep Learning in real
applications (e.g., Self-Driving Cars). If you would like to learn about this awesome topic,
check out the Autonomous Car courses at PyImageSearch University
(https://pyimagesearch.com/pyimagesearch-university/).
The PyImageSearch team hopes that you have enjoyed and internalized all the concepts
taught during this Infrared Vision Basics course.
Citation Information
Garcia-Martin, R. “Thermal Vision: Night Object Detection with PyTorch and YOLOv5 (real
project),” PyImageSearch, P. Chugh, A. R. Gosthipaty, S. Huot, K. Kidriavsteva, and R.
Raha, eds., 2022, https://pyimg.co/p2zsm
@incollection{RGM_2022_PYTYv5,
author = {Raul Garcia-Martin},
title = {Thermal Vision: Night Object Detection with {PyTorch} and {YOLOv5}
(real project)},
booktitle = {PyImageSearch},
editor = {Puneet Chugh and Aritra Roy Gosthipaty and Susan Huot and Kseniia
Kidriavsteva and Ritwik Raha},
year = {2022},
note = {https://pyimg.co/p2zsm},
}