Computer Vision
Electrical Engineering
June 2021
Preetham Notla
Ganta Saaketh Reddy
Sandeep Jyothula
Contact Information:
Author(s):
Preetham Notla
E-mail: [email protected]
Ganta Saaketh Reddy
E-mail: [email protected]
Sandeep Jyothula
E-mail: [email protected]
University advisor:
Prof. Wlodek J. Kulesza
Dept. of Mathematics and Natural Sciences
Company advisor:
Dr Damian Dziak
Bioseco Sp. z o.o.
E-mail: [email protected]
Abstract
Context. Wind, being a free resource, is used commercially in many ways. In earlier days, windmills were used to pump water and to generate power and electricity. The excessive establishment of windmills for commercial purposes has affected avifauna: many birds have lost their lives in collisions with windmills. Turbines generating power near airports are also among the causes of the decline of birdlife. According to a 2011 survey in Canada, wind turbines caused a total of 23,300 bird deaths, and it is estimated that the number of deaths will rise to 233,000 over the following 10-15 years.
Objectives. The main objective of this thesis is to find a suitable software solution to detect birds in a series of grayscale images in real-time, at a minimum of full HD resolution and at a rate of at least 15 FPS. The User-Driven Design methodology is used for development; the tools are Python and OpenCV.
Methods. In this research, a system is designed to detect birds in an HD video. Possible methods include convolutional neural networks (CNN), vision-based detection with background subtraction, contour detection, and confusion matrix classification. These methods detect birds in raw images and, with the help of a classifier, make it possible to see the bird in the desired pixels at full resolution. We investigate a bird classification method consisting of two steps: background subtraction followed by object classification. Background subtraction is a fundamental method for extracting moving objects from a fixed background. For object classification, we use the YOLO v3 model.
Results. The project is expected to result in a system design and prototype for bird identification in a grayscale video stream at a minimum of full HD resolution and 15 FPS. The bird should be distinguished from other moving objects like wind turbine blades, trees, or clouds. The proposed solution should identify up to 5 birds simultaneously.
Conclusions. After completing each step and arriving at the classification stage, the methods we first tried, such as Haar Cascades and MobileNet-SSD, did not provide the desired results. We therefore opted for YOLO v3, which had the best accuracy in classifying the different objects. Using the YOLO v3 classifier, we have detected birds with 95% accuracy, blades with 90% accuracy, clouds with 80% accuracy, and trees with 70% accuracy. Moreover, we conclude that the models need further empirical validation in full-scale industry trials.
Keywords: Background Subtraction, Bird Detection, Classification, Contour Detection, Convolutional Neural Networks (CNN), Python, OpenCV.
Acknowledgments
We would like to express our special gratitude to the Bioseco company for assigning the three of us to this project. We are grateful to Damian Dziak, who provided all information regarding the project and helped us with the software and with completing this project. Thank you for guiding us and teaching us many new things during this project.
We would also like to extend our special thanks to Prof. Wlodek J. Kulesza for providing this unique project opportunity and the required facilities. We thank him for his support and motivation during the project. We are inspired by his commitment to his work.
Also, we would like to thank our family and friends for supporting us during
this project.
This research was funded by the grant "The completion of R&D works leading to the implementation of MULTIREJESTRATOR PLUS, a new solution for monitoring and controlling the power system to increase the operating efficiency, extend the service life and optimise the environmental impact of wind farms" (No. POIR.01.02.00-00-0247/17) from The National Centre for Research and Development of Poland.
Contents
Abstract
Acknowledgments
List of Figures
Nomenclature
1 Introduction
5 Implementation
  5.1 Software Implementation
    5.1.1 Flowchart and its Working
    5.1.2 Implementing Background Subtraction
    5.1.3 Implementing Contour Detection
  5.2 Classification of Objects with YOLO V3
6 Testing and Validation
  6.1 Training and Testing the Dataset (Weights)
  6.2 Validation of Confusion Matrix
References
Nomenclature
ACC - Accuracy
ERR - Error
FN - False Negatives
FP - False Positives
HD - High Definition
REC - Recall
RGB - Red-Green-Blue
SEN - Sensitivity
SPC - Specificity
TN - True Negatives
TP - True Positives
Chapter 1
Introduction
Bird collisions with airplanes and structures cause damage to both. Observations of migrating birds with a tracking radar [3] have shown that birds search for prey by looking down and do not see what is coming ahead of them. Many aircraft components, such as the windshield, engine, and wings, are susceptible to bird collisions; windshields in particular are prone to severe damage [4]. According to data obtained from the Federal Aviation Administration, the number of bird strikes per year has increased six-fold, from 1,795 to 10,856 cases, over the past 15 years. Between 1912 and 2008, bird strike accidents destroyed 103 aircraft and claimed 262 lives.
These methods work on detecting birds in raw images and apply filters to make it possible to see the bird in a full-resolution image. The image pre-processing can work with convolutional or artificial neural networks (ANN).
This chapter surveys the related work done by other researchers.
It has been calculated that this vision-based detection can detect objects up to 45 meters away; this was analyzed by relating the total number of pixels within the target to the camera. Roberto Albertani [15] presents a method of determining bird presence and likelihood of collision based on object recognition using cascading classifiers and a backup tracking system. In addition to removing repeated false positives, the program also strengthens the detection system in the process, resulting in a powerful avian detection system using a blade-mounted camera. Hardware validation was conducted to ensure that the selected components would operate correctly, and a 3D-printed on-blade enclosure was built as a housing for the camera, transmitter, and power supply.
On the other hand, there is a method [16] that identifies birds through artificial intelligence algorithms. Naturalists examined the system and concluded that it is robust enough to identify birds. Furthermore, it is able to categorize the birds according to their measurements.
According to Uma D. Nadimpalli [17], their system combined several methods for identifying birds and was tested on three different images. The first method is image morphology, the second is artificial neural networks (ANN), and the last is template matching, which gave the expected outcome for the three images. The ANN approach, with image morphology incorporated into it, proved the most advanced, and the output depended on the complexity of the images.
A few methods [18] used to detect mammals can also be applied to birds, but aerial thermal-infrared imaging, often used for larger mammals, is of very limited use for detecting birds that occupy only a few pixels. Thanks to the continued development of camera and drone technology [19], researchers can reduce the time and resources spent monitoring bird populations by using automated bird detection and counting in aerial images.
One more bird detection technology we have studied is based on radar [21]. Birds reflect radio waves, and with their help a bird can be detected by radar [22]. According to the authors, the system they built observes birds and gathers details of their direction of movement. Moreover, they introduced a tool, bioRad [23], which plays a key role in collecting the details and studying the data. Furthermore, radar helps detect birds from a far distance, and detection can be performed at any time, independent of light conditions [24].
Chapter 3
Problem Statement, Objective,
Hypothesis, and Main Contribution
A considerable number of methods exist for bird protection and preservation. However, some of the methods and systems mentioned in the survey of related work are expensive and require many regulations to implement. Every bird detection system detects the bird and monitors the required data according to the system's needs. The detection system must be stable and should, if possible, detect a bird occupying a minimal number of pixels in an image. This project addresses the main challenges involved in the detection of birds. It also deals with the analysis of possible technologies, selecting the most relevant method for vision-based bird detection that allows detecting birds in video streams, and choosing the best method that works in real-time at full HD resolution [27] and in grayscale. The selected method should be capable of distinguishing birds from other objects like insects, turbines, etc.
3.2 Objectives
Four research questions define the objective of the project.
There are many methods to detect moving objects in grayscale images or video sequences, such as shadow detection, the frame difference method, background subtraction, etc. However, to detect moving objects at the required resolution and at the minimum FPS rate [28], the background subtraction technique is effective, as it can accurately detect moving objects in a series of grayscale images.
To classify moving objects in a video, the convolutional neural network (CNN) technique is used. Unlike other neural networks, CNNs are based on pattern recognition. This method can classify moving objects by extracting them from a raw video to detect wind turbines, trees, and clouds. The Haar Cascade classifier is also considered, as it applies filters representing the bird, blades, trees, and clouds to the background-subtracted sequence of images at the required resolution and a rate of at least 15 FPS.
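As a rough illustration of how such a cascade could be applied, the following minimal sketch loads a cascade and scans a grayscale frame; the cascade file "bird_cascade.xml" and the frame name are hypothetical placeholders, not files from this project:

import cv2

# hypothetical trained cascade file; a bird cascade would have to be trained separately
cascade = cv2.CascadeClassifier("bird_cascade.xml")
frame = cv2.imread("frame_0001.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical frame

# detectMultiScale scans the frame at several scales and returns bounding boxes
objects = cascade.detectMultiScale(frame, scaleFactor=1.1, minNeighbors=3)
for (x, y, w, h) in objects:
    cv2.rectangle(frame, (x, y), (x + w, y + h), 255, 2)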
3.3 Hypothesis
The hypothesis of our project is that the system consists of only two functionalities: bird detection and object classification. For bird detection, we use a background subtraction model called the Gaussian mixture model. It is a fundamental method for extracting moving objects from a fixed background. For object classification, we use convolutional neural networks (CNN) [29] and the Haar Cascade classifier [30]. These classifiers filter the bird, blades, trees, and clouds. Bird detection can be done in many ways, but it is commonly performed with threshold methods such as gray, RGB, and size thresholds. These methods can separate the birds from the background moving objects at their required threshold levels, combined with neural network techniques to detect the bird.
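A minimal sketch of such a gray-level threshold, with an illustrative frame name and threshold value chosen only for demonstration:

import cv2

gray = cv2.imread("frame_0001.jpg", cv2.IMREAD_GRAYSCALE)  # illustrative frame
# pixels brighter than the threshold 127 become foreground (255), the rest background (0)
_, mask = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

A size threshold can then be applied to the resulting mask, for example by discarding connected regions whose pixel area is too small to be a bird.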
Many deep learning methods have been used for the detection of birds, among them convolutional neural networks, image pre-processing, and motion detection techniques. These methods can be trained on a dataset created from the obtained images and videos, and then used to identify the birds and other objects in the provided data and classify them, making it possible to see a bird of the desired size in pixels at full resolution. We can use morphological image pre-processing to separate the background moving objects from the bird. The image pre-processing can work with convolutional or artificial neural networks [31]. Template matching is done on pre-processed images, where templates stored in a database are accessed one by one to correlate each template image with an input image. We propose a method to detect birds of minimal size, less than 100x100 pixels. The proposed detection algorithm needs to work in real-time on grayscale video at full HD resolution. Moreover, the system has to distinguish birds from the background and from other moving objects like wind turbine blades, trees, or clouds.
The code is developed so that it can detect four classes of objects, namely trees, clouds, windmill blades, and birds, using grayscale background subtraction [32] to extract them from the background. The detected moving objects are shown in the background-subtracted frame. The bird can therefore be detected by subtracting the other moving objects, like trees and clouds, during background subtraction.
Chapter 4
Modeling and Design
As shown in Figure 4.1, the User-Driven Design methodology helps to achieve the objectives of the thesis. It helps in identifying the specific stages of the thesis and provides its overall flow. The initial step involves identifying the existing methods from the literature. The existing methods are analyzed to find the state-of-the-art method that suits the proposed problem statement. An algorithm is then composed from the selected algorithms to solve the constraints of the problem statement and is evaluated on the data provided by Bioseco. The User-Driven Design methodology shown in Figure 4.1 is based on the design methodology approach of [33].
Product development [35] involves the technologies and algorithms with which a vision-based bird detection system can be developed. Product development consists of two stages: technology and algorithm selection is one stage, and modelling and prototyping is the other. The technologies and algorithms we consider using in the system are OpenCV [36], YOLOv3 [37], and the background subtraction technique [38]. The prototype shows a preliminary version of the system.
• The table is divided into three main sections: functionalities, particular constraints, and possible technologies and algorithms.
• The functionalities section is divided into two subsections, i.e., general and itemized. These functionalities represent the main parameters of the project.
This method is mainly used to computationally subtract the moving (foreground) objects from the background in a series of frames or a video sequence. The idea of the background subtraction method is to subtract the current image from a reference image. We use a Gaussian mixture-based background subtraction [41] and a morphological filter that depends on the moving bounded objects.
We chose the MOG2 method [42] because of its low memory consumption and low complexity. In MOG2 the background is modelled parametrically, and each pixel is represented by a particular number of Gaussian functions. The equation [43] is given as
P(X_t) = \sum_{i=1}^{k} \omega_{i,t} \cdot \eta(X_t, \mu_{i,t}, \sigma_{i,t})    (4.1)

where
X_t = the observation (pixel value) at time t,
\eta = the ith Gaussian component,
\omega_{i,t} = the weight associated with the ith component at time t,
\mu_{i,t} = the mean intensity,
\sigma_{i,t} = the standard deviation,
t = time.
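To make equation (4.1) concrete, the following small sketch evaluates the mixture density for a single pixel; the weights, means, and standard deviations are purely illustrative, not values estimated from our data:

import numpy as np

def eta(x, mu, sigma):
    # one-dimensional Gaussian density η(x, μ, σ)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_probability(x, weights, means, sigmas):
    # P(X_t) = Σ_i ω_{i,t} · η(X_t, μ_{i,t}, σ_{i,t}), as in equation (4.1)
    return sum(w * eta(x, m, s) for w, m, s in zip(weights, means, sigmas))

# two-component model for one pixel: a stable dark background mode and a
# brighter, noisier mode (illustrative values)
print(mixture_probability(120, weights=[0.7, 0.3], means=[118, 200], sigmas=[5, 20]))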
                               Manual data (y_true)
                               Detected         Not detected
Predicted data   Detected      True Positive    False Positive
(y_pred)         Not detected  False Negative   True Negative
where TPR, also known as recall or sensitivity, measures the recovery rate of positives, and specificity measures the recovery rate of negatives.
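From the four counts in the table, the metrics listed in the Nomenclature follow in the standard way:

SEN = TPR = TP / (TP + FN)
SPC = TN / (TN + FP)
ACC = (TP + TN) / (TP + TN + FP + FN)
ERR = 1 - ACC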
In this chapter, we specify the technical details of the project and describe in detail how the prototype is built, with all the detection methods and algorithms.
The videos provided by the Bioseco company are given as input. Each video is converted into frames, and every frame is stored in a folder in the same directory as the video. The frames are then processed by the background subtraction algorithm, which removes the constant background from the grayscale frames and lets us perform contour detection.
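A condensed sketch of this stage is given below; the video file name and the MOG2 parameters are taken from the program listing in the appendix, and a folder named "Frames" is assumed to exist:

import cv2

cap = cv2.VideoCapture("birdcloud1.mp4")  # input video provided by Bioseco
backSub = cv2.createBackgroundSubtractorMOG2(history=2, varThreshold=10,
                                             detectShadows=False)
frame_id = 0
while True:
    ret, frame = cap.read()
    if not ret:  # no more frames in the video
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # grayscale frame
    fgMask = backSub.apply(gray)  # remove the constant background
    cv2.imwrite('./Frames/frame_%d.jpg' % frame_id, fgMask)  # store the frame
    frame_id += 1
cap.release()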
The classification begins with reading the objects in every frame. The detected objects are classified from the initial frame to the last frame. In the processed frames, the detected objects are further marked with labels. These labels are stored in data created by the program, called the predicted (y_pred) data. The obtained object data is stored in the image folder.
This processing of frames continues until the last frame is completed. After obtaining the data from the last frame, the actual data and the obtained data are compared using the confusion matrix. The matrix compares the actual data values (y_true) with the obtained data values (y_pred) and creates four counts: TP, TN, FP, and FN. The confusion matrix is then obtained, and the result can be seen in the program output.
Figure 5.4 shows the grayscale image extracted from the given input video frame. As one can see, the three small dots together in the middle of the image are birds, which can be found after applying background subtraction to the given input.
We have applied contour detection and created a list, "detections = []", to store the detections. After storing the detections, we go through each detected object using a "for" loop over the contours. To calculate the area and remove small elements, we use "cv2.contourArea(cnt)" [50], which counts the area of an object in pixels. We then use an "if" statement, "if (cont_ar > 30 and cont_ar < 20000):" [51], to keep only objects whose area is greater than 30 pixels and store them in the detection list. Using "cv2.boundingRect(cnt)", we obtain the four parameters needed to create a bounding box around the object. Figure 5.5 shows the output of the contour detection stage, where the detected bird and blade are indicated by bounding boxes.
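Condensed from the appendix listing, a sketch of this step; fgMask stands for the background-subtracted frame produced in the previous stage:

import cv2

# fgMask: background-subtracted frame from the previous stage
contours, _ = cv2.findContours(fgMask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
detections = []
for cnt in contours:  # go through each detected object
    cont_ar = cv2.contourArea(cnt)  # area in pixels of the object
    if (cont_ar > 30 and cont_ar < 20000):  # drop tiny noise and very large regions
        x, y, w, h = cv2.boundingRect(cnt)  # four parameters of the bounding box
        detections.append((x, y, w, h))
        cv2.rectangle(fgMask, (x, y), (x + w, y + h), 255, 2)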
For classification we have used the YOLO v3 algorithm, trained on a dataset before being used to classify the objects of interest in our work. We have done the classification for four detected object classes: bird, blade, trees, and clouds. The accuracy obtained is 95% for the bird, 90% for the blade, up to 80% for the clouds, and 70% for the trees. We first extracted the grayscale image from the provided raw input video and then subtracted the background in the grayscale image. As shown in Figure 4.3, in the background-subtracted output we added a bounding box around the moving objects. To perform this detection, we took the enlarged part of the moving bounded objects, where the cropped image of the detected object is displayed. This cropped output is given as the input to the classifier.
We used the data provided by the Bioseco company to train the classifier weights. We first converted the provided data into frames. To apply the trained model to the frames, we need the model weights and the model configuration. We need to specify the required classes for training and filter them. The frames are then fed into the trainer, and the weights file is obtained after training for a few hours. We use this weights file, the configuration file, and the objects file in the classification code, and the obtained output can classify bird, blade, tree, and cloud.
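Loading these trained files follows the appendix listing; the file names below are the project's own, and the output-layer indexing matches the OpenCV version used there (newer OpenCV releases return flat indices, where layer_names[i - 1] is needed instead):

import cv2

# weights, configuration, and class-name files produced by the training step
net = cv2.dnn.readNet("yolov3_custom2_last.weights", "yolov3_custom2.cfg")
with open("obj.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]  # bird, blade, tree, cloud

layer_names = net.getLayerNames()
outputlayers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]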
The classification is done on an image in which the bird is detected and marked with a rectangular box [53] around it, which yields the total true and false positives and negatives, and a threshold accuracy of 97.67% as obtained on the trained frame.
After training the weights file and the configuration file, we use these files in the development of the object classification. The libraries used in this software implementation are OpenCV, NumPy, time, and Scikit-learn [37]. We developed this software using the Python programming language. The software has been implemented in four phases.
In the first phase, the given input video is taken and converted into grayscale frames using the background subtraction method. We used the mixture of Gaussians-based subtraction technique to implement the background subtraction. After converting the frames into grayscale frames, the moving objects can be identified easily.
In the next phase, we draw contours around the moving objects. Object detection is thereby performed, with contours assigned to the moving objects in the video. From these moving objects, we need to classify the detected objects into classes named Birds, Blade, Trees, and Clouds.
The later phase consists of classification [54], where we use the weights file, the configuration file, and the objects file. From the objects file, we have filtered the four object classes that need to be classified. For this purpose, the configuration file has been used to mark the four object classes. While giving the raw video as input, we create a blob with the height, width, and shape of the given object frames, which is used for classification with the weights file. Figure 5.6 shows the classified image of a bird and clouds with their accuracy.
Figure 5.6: Three objects: one classified as a bird, marked by a green rectangle, and two clouds, marked by larger black rectangles.
To classify, we give the weights file to two inputs: the raw input and the cropped frame input. By giving the weights file to these two inputs, we can show the classified objects in both the full frame and the cropped frame.
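The blob creation and box decoding are condensed from the appendix listing in the sketch below; net, outputlayers, and the 0.3 confidence threshold come from that listing, and frame stands for either of the two inputs:

import cv2
import numpy as np

height, width = frame.shape[:2]
# 0.00392 ≈ 1/255 scales pixel values to [0, 1]; (320, 320) is the network input size
blob = cv2.dnn.blobFromImage(frame, 0.00392, (320, 320), (0, 0, 0), True, crop=False)
net.setInput(blob)
outs = net.forward(outputlayers)

for out in outs:
    for detection in out:
        scores = detection[5:]  # class scores follow the 4 box values and objectness
        class_id = np.argmax(scores)
        confidence = scores[class_id]
        if confidence > 0.3:
            # YOLO returns normalized box center and size
            center_x, center_y = int(detection[0] * width), int(detection[1] * height)
            w, h = int(detection[2] * width), int(detection[3] * height)
            x, y = int(center_x - w / 2), int(center_y - h / 2)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 3)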
In the final phase, we created a confusion matrix [55] that calculates the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). We calculate them by comparing the manual data with the predicted data. The manual data is what we provide, while the predicted data is what the software has predicted. By comparing these two, we obtain the TP, TN, FP, and FN values. Figure 5.7 contains the image of the classified blade, cloud, and trees with accuracy, and Figure 5.8 contains the image of the classified blade and clouds.
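A minimal sketch of this comparison with Scikit-learn, using short illustrative label lists in place of the real data:

from sklearn import metrics

y_true = ['bird', 'cloud', 'bird', 'blade']  # manually labelled classes
y_pred = ['bird', 'cloud', 'blade', 'blade']  # classes predicted by the software

# in Scikit-learn, rows correspond to the true classes and columns to the predictions
cm = metrics.confusion_matrix(y_true, y_pred, labels=['bird', 'blade', 'tree', 'cloud'])
print(cm)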
The confusion matrix shown in Figure 6.2 has been obtained using the frames from a video file; an example of a frame from that video file is shown in Figure 5.6.
The obtained true positive count is 119 for the blade, 65 for the cloud, and 36 for the tree. The number of frames saved in the output was 184. The confusion matrix shown in Figure 6.3 has been obtained using the frames from a video file; an example of a frame from that video file is shown in Figure 5.7.
Figure 6.4 shows the predicted data and the calculated true positives for the blade and the cloud. In this figure, we can see the predicted results: the confusion matrix count is 1405 for the blade and 42 for the cloud. The confusion matrix shown in Figure 6.4 has been obtained using the frames from a video file; an example of a frame from that video file is shown in Figure 5.6. The number of frames saved in the output was 400. This shows that the order defined by Equation 6.1 works in such a way that, if the video does not contain any bird detections, the priority shifts to the blade, and so on.
Chapter 7
Conclusions and Future Work
In this developing world, software that can identify and classify birds and other objects is necessary. Saving rare bird species and preventing damage to windmills can decrease financial losses as well as preserve the bird species. This software helps identify moving objects in the air near windmills, terrestrial places, and airports, and can detect objects with great accuracy.
We also tried the MobileNet-SSD classifier. It gave good results and could pinpoint and classify the detected objects. However, with this classifier we were not able to apply the weights files to two inputs simultaneously, and creating a confusion matrix from this classifier was not possible. Therefore, we chose the YOLO v3 classifier.
From this thesis, we conclude that the YOLO v3 classifier gives the best accuracy in classifying the different objects compared with other classifiers like the Haar Cascade and MobileNet-SSD classifiers. The YOLO v3 classifier has different classes, so it gives better accuracy, and it erases the false positives, which increases the efficiency of the detection system.
This thesis provides a better classification, which can be extended to improve classification speed by using multi-threaded processing, so that it can be used on videos with a high number of frames per second. The classification with bounding boxes can be extended to segmentation of the objects. The classification can be improved by developing new deep learning classifiers with a stronger focus on eradicating false positives.
In this thesis, we used the YOLO v3 classifier to detect and classify birds, windmills, clouds, and trees. Future researchers can use different classifiers with more complex neural networks, such as deep convolutional neural networks (DCNN), which may give higher accuracy in detection and classification. This software can be further developed to work in live operation. The classification tasks can also be expanded to differentiate various objects, classify rare bird species, estimate the distance between the birds and the windmills, and alert us and scare the birds away to prevent collisions with the windmill blades.
References
[2] K Samantha Nichols, Tania Homayoun, Joanna Eckles, and Robert B Blair.
Bird-building collision risk: An assessment of the collision risk of birds with
buildings by phylogeny and behavior using two citizen-science datasets. PloS
one, 13(8):e0201558, 2018.
[4] Vyacheslav Merculov and Dmitry Ivchenko. Simulation of bird collision with
aircraft laminated glazing. In Advances in Design, Simulation and Man-
ufacturing III: Proceedings of the 3rd International Conference on Design,
Simulation, Manufacturing: The Innovation Exchange, DSMIE-2020, June
9-12, 2020, Kharkiv, Ukraine–Volume 2: Mechanical and Chemical Engi-
neering, page 179. Springer Nature, 2020.
[5] Da Li, Bodong Liang, and Weigang Zhang. Real-time moving vehicle detection, tracking, and counting system implemented with OpenCV. In 2014 4th IEEE International Conference on Information Science and Technology, pages 631–634. IEEE, 2014.
[6] Tuan Tu Trinh, Ryota Yoshihashi, Rei Kawakami, Makoto Iida, and Takeshi Naemura. Bird detection near wind turbines from high-resolution video using LSTM networks. In World Wind Energy Conference (WWEC), volume 2, page 6, 2016.
[8] Thomas Grill and Jan Schlüter. Two convolutional neural networks for bird
detection in audio signals. In 2017 25th European Signal Processing Confer-
ence (EUSIPCO), pages 1764–1768. IEEE, 2017.
[9] Suk-Ju Hong, Yunhyeok Han, Sang-Yeon Kim, Ah-Yeong Lee, and Ghiseok
Kim. Application of deep-learning methods to bird detection using un-
manned aerial vehicle imagery. Sensors, 19(7):1651, 2019.
[12] Roelof Frans May, Øyvind Hamre, Roald Vang, and Torgeir Nygård. Evaluation of the DTBird video-system at the Smøla wind-power plant: detection capabilities for capturing near-turbine avian behaviour. NINA Rapport, 2012.
[13] Gustavo Gil, Giovanni Savino, Simone Piantini, and Marco Pierini. Is stereo
vision a suitable remote sensing approach for motorcycle safety? an analysis
of lidar, radar, and machine vision technologies subjected to the dynamics
of a tilting vehicle. Proceedings of the 7th Transport Research Arena TRA,
Vienna, Austria, 12, 2017.
[14] Xin Feng, Youni Jiang, Xuejiao Yang, Ming Du, and Xin Li. Computer
vision algorithms and hardware implementations: A survey. Integration,
69:309–320, 2019.
[15] William Gage Maurer. Bird and bat interaction vision-based detection sys-
tem for wind turbines. 2016.
[17] Uma D Nadimpalli, Randy R Price, Steven G Hall, and Pallavi Bomma. A
comparison of image processing techniques for bird recognition. Biotechnol-
ogy progress, 22(1):9–13, 2006.
[18] Gábor Bakó, Márton Tolnai, and Ádám Takács. Introduction and testing of a
monitoring and colony-mapping method for waterbird populations that uses
high-speed and ultra-detailed aerial remote sensing. Sensors, 14(7):12828–
12846, 2014.
[20] Jeongjin Jo, Junwon Park, Jinyoung Han, Minsun Lee, and Anthony H
Smith. Dynamic bird detection using image processing and neural network.
In 2019 7th International Conference on Robot Intelligence Technology and
Applications (RiTA), pages 210–214. IEEE, 2019.
[22] Anthony D Fox and Patrick DL Beasley. David Lack and the birth of radar ornithology. Archives of Natural History, 37(2):325–332, 2010.
[23] Adriaan M Dokter, Peter Desmet, Jurriaan H Spaaks, Stijn van Hoey, Lourens Veen, Liesbeth Verlinden, Cecilia Nilsson, Günther Haase, Hidde Leijnse, Andrew Farnsworth, et al. bioRad: biological analysis and visualization of weather radar data. Ecography, 42(5):852–860, 2019.
[24] Hans van Gasteren, Karen L Krijgsveld, Nadine Klauke, Yossi Leshem, Isabel C Metz, Michal Skakuj, Serge Sorbi, Inbal Schekler, and Judy Shamoun-Baranes. Aeroecology meets aviation safety: early warning systems in Europe and the Middle East prevent collisions between birds and aircraft. Ecography, 42(5):899–911, 2019.
[25] William L Thompson. Towards reliable bird surveys: accounting for individ-
uals present but not detected. The Auk, 119(1):18–25, 2002.
[26] Scott R Loss, Tom Will, and Peter P Marra. Estimates of bird collision mortality at wind facilities in the contiguous United States. Biological Conservation, 168:201–209, 2013.
[27] Holger Flatt, Holger Blume, and Peter Pirsch. Mapping of a real-time object detection application onto a configurable RISC/coprocessor architecture at full HD resolution. In 2010 International Conference on Reconfigurable Computing and FPGAs, pages 452–457. IEEE, 2010.
[28] Jing Yi Tou and Chen Chuan Toh. Optical flow-based bird tracking and
counting for congregating flocks. In Asian Conference on Intelligent Infor-
mation and Database Systems, pages 514–523. Springer, 2012.
[29] Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, and Gang Hua. A
convolutional neural network cascade for face detection. In Proceedings of
the IEEE conference on computer vision and pattern recognition, pages 5325–
5334, 2015.
[30] Sander Soo. Object detection using haar-cascade classifier. Institute of Com-
puter Science, University of Tartu, 2(3):1–12, 2014.
[31] Min-Seok Choi and Whoi-Yul Kim. A novel two stage template match-
ing method for rotation and illumination invariance. Pattern recognition,
35(1):119–129, 2002.
[32] J Cezar Silveira Jacques, Claudio Rosito Jung, and Soraia Raupp Musse.
Background subtraction and shadow detection in grayscale video sequences.
In XVIII Brazilian symposium on computer graphics and image processing
(SIBGRAPI’05), pages 189–196. IEEE, 2005.
[33] Damian Dziak, Bartosz Jachimczyk, and Wlodek J Kulesza. IoT-based information system for healthcare application: design methodology approach. Applied Sciences, 7(6):596, 2017.
[36] Gary Bradski and Adrian Kaehler. OpenCV. Dr. Dobb's Journal of Software Tools, 3:2, 2000.
[37] Liquan Zhao and Shuaiyang Li. Object detection algorithm based on improved YOLOv3. Electronics, 9(3):537, 2020.
[39] Ahmed Elgammal, David Harwood, and Larry Davis. Non-parametric model
for background subtraction. In European conference on computer vision,
pages 751–767. Springer, 2000.
[40] Shahrizat Shaik Mohamed, Nooritawati Md Tahir, and Ramli Adnan. Back-
ground modelling and background subtraction performance for object de-
tection. In 2010 6th International Colloquium on Signal Processing & its
Applications, pages 1–6. IEEE, 2010.
[41] Sriram Varadarajan, Paul Miller, and Huiyu Zhou. Spatial mixture of gaus-
sians for dynamic background modelling. In 2013 10th IEEE International
Conference on Advanced Video and Signal Based Surveillance, pages 63–68.
IEEE, 2013.
[42] Thierry Bouwmans, Fida El Baf, and Bertrand Vachon. Background mod-
eling using mixture of gaussians for foreground detection-a survey. Recent
patents on computer science, 1(3):219–237, 2008.
[43] Fida El Baf, Thierry Bouwmans, and Bertrand Vachon. Fuzzy statistical
modeling of dynamic backgrounds for moving object detection in infrared
videos. In 2009 IEEE Computer Society Conference on Computer Vision
and Pattern Recognition Workshops, pages 60–65. IEEE, 2009.
[44] Pablo Arbelaez, Michael Maire, Charless Fowlkes, and Jitendra Malik. Con-
tour detection and hierarchical image segmentation. IEEE transactions on
pattern analysis and machine intelligence, 33(5):898–916, 2010.
[46] Teddy Surya Gunawan, Arselan Ashraf, Bob Subhan Riza, Edy Victor
Haryanto, Rika Rosnelly, Mira Kartiwi, and Zuriati Janin. Development
of video-based emotion recognition using deep learning with google colab.
TELKOMNIKA, 18(5):2463–2471, 2020.
[47] Zoran Zivkovic. Improved adaptive gaussian mixture model for background
subtraction. In Proceedings of the 17th International Conference on Pattern
Recognition, 2004. ICPR 2004., volume 2, pages 28–31. IEEE, 2004.
[48] David E Allen, Michael McAleer, and Bernardo da Veiga. Modelling and forecasting dynamic VaR thresholds for risk management and regulation. Available at SSRN 926270, 2005.
[49] Xin-Yi Gong, Hu Su, De Xu, Zheng-Tao Zhang, Fei Shen, and Hua-Bin
Yang. An overview of contour detection approaches. International Journal
of Automation and Computing, 15(6):656–672, 2018.
[50] Ruchi Manish Gurav and Premanand K Kadbe. Real time finger tracking and contour detection for gesture recognition using OpenCV. In 2015 International Conference on Industrial Instrumentation and Control (ICIC), pages 974–977. IEEE, 2015.
[51] Xhensila Poda and Olti Qirici. Shape detection and classification using OpenCV and Arduino Uno. RTA-CSIT, 2280:128–136, 2018.
[52] Yiting Li, Haisong Huang, Qingsheng Xie, Liguo Yao, and Qipeng Chen. Research on a surface defect detection algorithm based on MobileNet-SSD. Applied Sciences, 8(9):1678, 2018.
[53] Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, and Xiangyu
Zhang. Bounding box regression with uncertainty for accurate object de-
tection. In Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition, pages 2888–2897, 2019.
[54] Wei Fang, Lin Wang, and Peiming Ren. Tinier-YOLO: A real-time object detection method for constrained environments. IEEE Access, 8:1935–1944, 2019.
[55] Nadav David Marom, Lior Rokach, and Armin Shmilovici. Using the confusion matrix for improving ensemble classifiers. In 2010 IEEE 26th Convention of Electrical and Electronics Engineers in Israel, pages 000555–000559. IEEE, 2010.
[56] Guanqing Li, Zhiyong Song, and Qiang Fu. A new method of image detection for small datasets under the framework of the YOLO network. In 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), pages 1031–1035. IEEE, 2018.
We have achieved the desired output by developing and using the following code.
import cv2  # to install cv2, run "pip install opencv-python" in the command prompt
import numpy as np
import time
from sklearn import metrics

# greater than or equal to variable "maxObjPerFrame" (object count) => frame will be saved
maxObjPerFrame = 2
# keep the weights and .cfg file in the same folder as this code file
net = cv2.dnn.readNet("yolov3_custom2_last.weights", "yolov3_custom2.cfg")

# defining parameters related to the video saving function
codec = cv2.VideoWriter_fourcc('X', 'V', 'I', 'D')
framerate = 29
resolution = (640, 480)

classifierOut = cv2.VideoWriter("C:\\Users\\PC\\Desktop\\WorkSpace\\ImageProcessingProject\\secondSetData\\classifierResult.avi", codec, framerate, resolution)
croppedOut = cv2.VideoWriter("C:\\Users\\PC\\Desktop\\WorkSpace\\ImageProcessingProject\\secondSetData\\croppedResult.avi", codec, framerate, resolution)
substractionOut = cv2.VideoWriter("C:\\Users\\PC\\Desktop\\WorkSpace\\ImageProcessingProject\\secondSetData\\substractionResult.avi", codec, framerate, resolution)

classes = []
with open("obj.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

print(classes)

layer_names = net.getLayerNames()
outputlayers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]

colors = np.random.uniform(0, 255, size=(len(classes), 3))

# loading the video; keep it in the same folder as this code file
cap = cv2.VideoCapture("birdcloud1.mp4")
backSub = cv2.createBackgroundSubtractorMOG2(history=2, varThreshold=10, detectShadows=False)
font = cv2.FONT_HERSHEY_DUPLEX
starting_time = time.time()
frame_id = 0

# Important: create a folder named "Images" first
# image saving function; saves into the Images folder
pictureCount = 1
def saveImage(img, pictureCount):
    name = './Images/image_' + str(pictureCount) + '.jpg'
    print('Creating...' + name)
    cv2.imwrite(name, img)

# previously created cropped and subtraction function
def otherWindows(image):
    output_img = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # converting image to grayscale
    fgMask = backSub.apply(output_img)  # applying background subtraction
    contours, _ = cv2.findContours(fgMask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    # detections = []
    for cnt in contours:  # going through each detected object
        # the area is calculated and all small elements are removed
        cont_ar = cv2.contourArea(cnt)  # counting the area in pixels of an object
        if (cont_ar > 15 and cont_ar < 20000):
            # ... (the remainder of this function is not included in this extract)
# ... (listing resumes; intervening lines are not included in this extract)
blob_c = cv2.dnn.blobFromImage(img, 0.00392, (320, 320), (0, 0, 0), True, crop=False)
net.setInput(blob_c)
outs_c = net.forward(outputlayers)

for out in outs_c:
    for detection in out:
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]

        if confidence > 0.3:
            center_x = int(detection[0] * width)
            center_y = int(detection[1] * height)
            w = int(detection[2] * width)
            h = int(detection[3] * height)

            x = int(center_x - w / 2)
            y = int(center_y - h / 2)

            cv2.rectangle(img, (x, y), (x + w, y + h), colors[class_id], 3)

y_true = ['bird', 'cloud', 'bird', 'bird', 'cloud', 'bird', 'bird', 'bird', 'bird',
          'bird', 'cloud', 'cloud', 'cloud', 'cloud', 'cloud', 'cloud', 'cloud', 'bird',
          'bird', 'bird', 'cloud', 'cloud', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird',
          'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird',
          'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'cloud', 'bird', 'bird',
          'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'cloud', 'bird', 'bird',
          'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird',
          'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird', 'bird',
          # ... (list continues; the remaining entries are not included in this extract)
          ]

# ... (listing resumes; intervening lines are not included in this extract)
                cv2.rectangle(frame, (x, y), (x + w, y + h), colors[class_id], 3)
                l = len(label + " " + str(round(confidence, 2)))
References 41