CLASSIFICATION OF ANIMALS BASED ON
CHARACTERISTIC BODY MARKINGS
A PROJECT REPORT
Submitted in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
BY
Maharaj Vijayaram Gajapathi Raj (MVGR) College of Engineering (Autonomous)
Vizianagaram
CERTIFICATE
This is to certify that the project report entitled “Classification of Animals based on
Characteristic Body Marking” being submitted by Vadlana Subhash, Voonna Abhishek,
Sagi Gayathri, Tirumalaraju Mohith Varma bearing registered numbers 18331A05G3,
18331A05G9, 18331A05C7, 18331A05F8 respectively, in partial fulfilment for the award of
the degree of “Bachelor of Technology” in Computer Science and Engineering is a record
of bonafide work done by them under my supervision during the academic year 2021-2022.
External Examiner
DECLARATION
ACKNOWLEDGEMENTS
The success and final outcome of this project required a lot of guidance and assistance from
many people, and we are extremely privileged to have received this support throughout the
completion of our project. All that we have done is only due to such supervision and
assistance, and we would not forget to thank them.
We place on record our heartfelt appreciation and gratitude to Mrs. M. Beulah Rani for her
immense cooperation and guidance as our mentor in bringing out the project work under her
supervision. We are deeply indebted to her for her excellent, enlightened and enriched guidance.
We also thank Dr. K. V. L. Raju, Principal, and Dr. P. Ravi Kiran Varma, Head of the
Department, for extending their utmost support and co-operation in providing all the
provisions for the successful completion of the project.
We are also thankful to, and fortunate to have received, constant encouragement and guidance
from our panel members Mrs. B. Aruna Kumari, Mrs. K. Santosh Jhansi and Mr. K. Leela
Prasad. We sincerely thank all the members of the staff in the Department of Computer
Science & Engineering for their sustained help in our pursuits. We thank all those who
contributed directly or indirectly in successfully carrying out this work.
Vadlana Subhash
Voonna Abhishek
Sagi Gayathri
ABSTRACT
The objective of our project is to help classify the animals present in images
into four species, namely Tiger, Cheetah, Jaguar and Hyena.
LIST OF CONTENTS
CONTENTS
LIST OF ABBREVIATIONS
LIST OF FIGURES
1. INTRODUCTION
3. SYSTEM DESIGN
4. IMPLEMENTATIONS
4.1 DATASET
4.2 SAMPLE IMAGES
4.3 CODE
5. EXPERIMENTAL RESULTS
6. CONCLUSION
7. REFERENCES
LIST OF ABBREVIATIONS
AI - Artificial Intelligence
DL - Deep Learning
ML - Machine Learning
LIST OF FIGURES
Figure 2-1 Existing System Flow Chart
Figure 2-9 Filter sliding over the input image to form the output matrix
CHAPTER 1: INTRODUCTION
1.1 PROJECT OVERVIEW
Checking the movements and population of wild animals is a tiresome
process when done manually; without such monitoring, wild animals may cause
damage to villages in rural areas, or their populations may decline until a species
becomes endangered. It also leaves wild animals vulnerable to poachers and illegal
trafficking groups.
To reduce the burden on officials and conservationists, we developed this
project, classification of animals based on their characteristic body markings.
The project identifies the species of the animal present in a given input image
among four species, namely tiger, cheetah, jaguar and hyena.
To identify the species of the animal, we perform multi-class classification
among the four chosen species using the ResNet-18 architecture, a convolutional
neural network (CNN) that is 18 layers deep. After being trained on the images in
the dataset, the model becomes capable of identifying the species of the animal.
After training, we use the model for prediction: we take an image as input
and display the most probable species name for it.
1.2 PURPOSE
The purpose of our project is to help people such as biologists, ecologists and
conservationists identify the species of animals present in images. The project
can be used to help increase the population of wild animals by keeping an eye on
the animals present in a territory and on their numbers.
It can also help keep poachers and illegal traffickers in check in
sanctuaries and wild forests by keeping a close eye on the animals present in the
territories, which may help improve the numbers of a few endangered species of
animals.
1.3 OBJECTIVE
The objective of our project is to help classify the animals present in
images into four species, namely Tiger, Cheetah, Jaguar and Hyena.
In general, it is difficult to distinguish among these four species, which
look very similar to each other. Using this tool, we can easily identify the
species of the animal.
1.4 LITERATURE SURVEY
Many image classification algorithms have previously been developed for the cat
family, but most of them perform binary classification (trained on only one
animal). We implement multi-class classification over four similar species,
where we obtain more than 90% accuracy.
[1] Animal Detection Using Deep Learning Algorithm – N. Banupriya, S. Saranya,
Rashmi Swaminathan, Sanchithaa Harikumar, Sukitha Palanisamy
Many existing systems are implemented for different animals, for example the
tiger and animals outside the cat family; we achieve higher accuracy than many
of them. Checking wild animals in their natural environment is crucial. The
proposed work develops an algorithm to detect animals in the wild. Since there
are many different animals, manually identifying them can be a difficult task. The
algorithm classifies animals based on their images so that we can monitor them more
efficiently. Animal detection and classification can help prevent animal-vehicle
accidents, trace animals and prevent theft. This can be achieved by applying
effective deep learning algorithms.
Thus, this project uses the Convolutional Neural Network (CNN) algorithm to
detect wild animals. The algorithm classifies animals efficiently with a good
level of accuracy, and the image of the detected animal is displayed for better
interpretation, so that it can be used for other purposes such as detecting wild
animals entering human habitats and preventing wildlife poaching.
systems produce good results; however, none of them obtains a high-accuracy
classification because of the lack of information. Doñana National Park is a very
rich environment with various endangered animal species. Thereby, this park
requires a more accurate and efficient monitoring system to act quickly against
animal behaviours that may endanger certain species. In this letter, we propose a
hierarchical wireless sensor network installed in this park to collect information
about animals' behaviour using intelligent devices placed on them, which contain
a neural network implementation to classify their behaviour based on sensory
information. Once a behaviour is detected, the network redirects this information
to an external database for further treatment. This solution reduces power
consumption and facilitates the monitoring of animals' behaviour for biologists.
1.5.2 SOFTWARE SPECIFICATIONS
1.5.2.1 OPERATING SYSTEM
Libraries Used
1.Torch
PyTorch is an open-source machine learning library for Python, used for
applications such as natural language processing. It was initially developed by
Facebook's artificial-intelligence research group; Uber's Pyro software for
probabilistic programming is built on it.
The major features of PyTorch are mentioned below −
● Easy Interface − PyTorch offers an easy-to-use API; hence it is considered
very simple to operate, and it runs on Python. Code execution in this
framework is quite easy.
● Python usage − This library is considered Pythonic and smoothly
integrates with the Python data science stack. Thus, it can leverage all the
services and functionality offered by the Python environment.
● Computational graphs − PyTorch provides an excellent platform which
offers dynamic computational graphs. Thus a user can change them during
runtime. This is highly useful when a developer has no idea of how much
memory is required for creating a neural network model.
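As a small illustration of this dynamic behaviour, consider the minimal sketch below (illustrative only, not part of our project code); the graph is built as the operations execute, so ordinary Python control flow can change its shape on each run:

import torch

x = torch.tensor([2.0, -1.0], requires_grad=True)
y = (x ** 2).sum() if x.sum() > 0 else x.sum()   # graph built at runtime
y.backward()                                     # gradients follow whichever branch actually ran
print(x.grad)                                    # tensor([ 4., -2.])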
2.Matplotlib
Matplotlib is a comprehensive library for creating static, animated and
interactive visualizations. It produces publication-quality figures in a
variety of hardcopy formats and interactive environments across platforms.
Matplotlib can be used in Python scripts, the Python and IPython shell, web
application servers, and various graphical user interface toolkits.
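For example, a few lines like the following (with made-up loss values, purely for illustration) produce a labelled figure:

import matplotlib.pyplot as plt

epochs = [1, 2, 3, 4, 5]            # illustrative training history
loss = [1.2, 0.8, 0.5, 0.35, 0.3]
plt.plot(epochs, loss, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training loss over epochs')
plt.show()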
3.OS
The os module in Python provides functions for interacting with the
operating system. It comes under Python's standard utility modules and
provides a portable way of using operating-system-dependent functionality.
The os and os.path modules include many functions to interact
with the file system.
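For instance, a short sketch like the one below walks the dataset folders described in Chapter 4 (the '/content/ODS' path is the one used in our training code):

import os

data_dir = '/content/ODS'
for split in os.listdir(data_dir):              # e.g. training, validation
    split_path = os.path.join(data_dir, split)
    for cls in os.listdir(split_path):          # tiger, cheetah, jaguar, hyena
        n = len(os.listdir(os.path.join(split_path, cls)))
        print(split, cls, n, 'images')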
CHAPTER 2: SYSTEM STUDY AND ANALYSIS
the animal easily.
● Once the animal is recognized, the animal is displayed along with the
probabilistic class.
● In this project we perform a multi-class classification to identify the species
of animal present in the image.
Of the various classification techniques, the most common ones are the following:
Risk,” or “Unsafe” would then be classed as a multi-class classification problem.
Note – Each observation can belong to only one class; multiple classes can't
be assigned to a single observation. Thus, here an observation will be either
“Safe”, “At-Risk” or “Unsafe”, and can't be multiple things at once.
In deep learning, we create an artificial structure called an artificial neural net,
where we have nodes or neurons. We have some neurons for input values and some
for output values, and in between there may be many interconnected neurons in the
hidden layers.
● Artificial Neural Network
Input data is taken in, transformed and applied. Repeating these steps
allows the artificial neural network to learn several layers of non-linear
features, and it ultimately creates a prediction as the final (output) layer.
It learns by generating an error signal that measures the difference between
its predictions and the actual values.
● Layer
Deep learning models are built from building blocks, and layers are the
highest-level building blocks; the layers between the input and output
layers are the hidden layers. A layer's received weighted input is
transformed and then passed on as output to the next layer.
● Artificial Neuron or Unit
A unit refers to the activation function together with its numerous incoming
and outgoing connections. More complex units are referred to as long
short-term memory (LSTM) units.
2.3.4 WHY DEEP LEARNING?
Deep learning lies at the forefront of AI, helping shape the tools we use to
achieve tremendous levels of accuracy. Advances in deep learning have pushed
this tool to the point where it outperforms humans in some tasks,
such as classifying objects in images.
The most important difference between deep learning and traditional
machine learning is its performance as the scale of data increases. When the data
is small, deep learning algorithms don’t perform that well. This is because deep
learning algorithms need a large amount of data to understand it perfectly. On the
other hand, traditional machine learning algorithms with their handcrafted rules
prevail in this scenario.
In traditional machine learning techniques, most of the applied features
need to be identified by a domain expert in order to reduce the complexity of the
data and make patterns more visible for learning algorithms to work. The biggest
advantage of deep learning algorithms is that they try to learn high-level
features from data incrementally. This eliminates the need for domain
expertise and hard-core feature extraction.
● The architecture of a CNN is analogous to that of the connectivity pattern
of neurons in the human brain and was inspired by the organization of the
Visual Cortex.
● Individual neurons respond to stimuli only in a restricted region of the visual
field known as the Receptive Field. A collection of such fields overlap to
cover the entire visual area.
● A CNN generally consists of:
1. Input Layer
2. Hidden Layers
3. Output layer
CNN image classification takes an input image, processes it and classifies it under
certain categories. A computer sees an input image as an array of pixels, depending
on the image resolution: it sees h x w x d
(h = Height, w = Width, d = Dimension).
For example, an RGB image may be a 6 x 6 x 3 array (3 refers to the RGB channels),
while a grayscale image may be a 4 x 4 x 1 array.
Technically, in deep learning CNN models, each input image used for training and
testing is passed through a series of convolution layers with filters (kernels),
pooling layers and fully connected (FC) layers, and a softmax function is applied
to classify the object with probabilistic values between 0 and 1.
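A minimal sketch of this pipeline (layer sizes chosen only for illustration; our actual models appear in Chapter 4):

import torch
import torch.nn as nn

tiny_cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),  # convolution over a 3-channel RGB image
    nn.ReLU(),
    nn.MaxPool2d(2, 2),                         # halves height and width
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 4),                  # fully connected layer for 4 classes
    nn.Softmax(dim=1))                          # probabilistic values between 0 and 1

x = torch.randn(1, 3, 32, 32)                   # one 32 x 32 image
print(tiny_cnn(x).sum())                        # the class probabilities sum to 1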
2.3.6 CNN LAYERS
1.Convolution Layer
The convolutional layer is the core building block of a CNN. The layer’s
parameters consist of a set of learnable filters (or kernels), which have a small
receptive field, but extend through the full depth of the input volume.
Convolution Operation:
The filter slides over the input image one pixel at a time starting from the top left.
The filter multiplies its own values with the overlapping values of the image while
sliding over it and adds all of them up to output a single value for each overlap
until the entire image is traversed:
Figure 2-9 Filter sliding over the input image to form the output matrix
In Figure 2-9, the value 4 (top left) in the output matrix corresponds to
the filter overlap on the top left of the image, computed as the sum of the
element-wise products of the filter values and the overlapped image values.
Similarly, we compute the other values of the output matrix. Each value in the
output matrix is sensitive to only a particular region of the original image.
Thus, the feature map is derived.
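The same sliding computation can be reproduced numerically; in the small sketch below (values illustrative, not those of the figure), every 3 x 3 overlap on an all-ones image sums nine ones:

import torch
import torch.nn.functional as F

image = torch.ones(1, 1, 4, 4)     # a 4 x 4 single-channel image
kernel = torch.ones(1, 1, 3, 3)    # a 3 x 3 filter
out = F.conv2d(image, kernel)      # slides one pixel at a time
print(out.shape)                   # torch.Size([1, 1, 2, 2])
print(out[0, 0])                   # every entry is 9.0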
The size of the Feature Map (Convolved Feature) is controlled by three
parameters that we need to decide before the convolution step is performed:
● Depth: Depth corresponds to the number of filters we use for the
convolution operation. The number of distinct filters used produces the same
number of feature maps, which can be thought of as stacked 2D matrices.
So, the 'depth' of the feature map is the number of feature maps derived.
● Stride: Stride is the number of pixels by which we slide our filter matrix
over the input matrix. When the stride is 1, we move the filter one pixel
at a time. When the stride is 2, the filter jumps 2 pixels at a time as we
slide it around. A larger stride produces smaller feature maps.
● Zero-padding: Sometimes, it is convenient to pad the input matrix with
zeros around the border, so that we can apply the filter to the bordering
elements of our input image matrix. A nice feature of zero padding is that it
allows us to control the size of the feature maps. Adding zero-padding is also
called wide convolution, and not using zero-padding is a
narrow convolution.
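Together, these three parameters fix the feature map size through the standard relation output size = (W − F + 2P) / S + 1, where W is the input width, F the filter size, P the padding and S the stride. A small sketch verifying this:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 6, 6)              # 6 x 6 input
k = torch.randn(1, 1, 3, 3)              # 3 x 3 filter
print(F.conv2d(x, k).shape)              # (6-3)/1+1 = 4 -> [1, 1, 4, 4]
print(F.conv2d(x, k, stride=2).shape)    # (6-3)/2+1 = 2 -> [1, 1, 2, 2]
print(F.conv2d(x, k, padding=1).shape)   # (6-3+2)/1+1 = 6 -> [1, 1, 6, 6]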
2.ReLU Layer
An additional operation called ReLU is used after every convolution operation.
ReLU is a non-linear operation whose output is given by:
Output = max(0, input)
ReLU is an element-wise operation (applied per pixel) that replaces all negative
pixel values in the feature map by zero. The purpose of ReLU is to introduce non-
linearity into our ConvNet.
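A one-line illustration of this element-wise behaviour:

import torch
import torch.nn.functional as F

fmap = torch.tensor([[-3.0, 1.5], [0.0, -0.5]])
print(F.relu(fmap))    # tensor([[0.0000, 1.5000], [0.0000, 0.0000]])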
3.Pooling Layer
The layer after a convolutional layer is usually a pooling layer in a CNN architecture.
It partitions the input image into a set of non-overlapping rectangles and, for each
such sub-region, outputs a value. The intuition is that the exact location of a feature
is less important than its rough location relative to other features.
There are three types of pooling
● Max Pooling
● Average Pooling
● Global Pooling
1. Max Pooling
Max pooling is a pooling operation that selects the maximum element from
the region of the feature map covered by the filter. Thus, the output after a max-
pooling layer is a feature map containing the most prominent features
of the previous feature map.
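A small sketch of 2 x 2 max pooling with stride 2 (values illustrative):

import torch
import torch.nn as nn

fmap = torch.tensor([[[[1.0, 3.0, 2.0, 4.0],
                       [5.0, 6.0, 1.0, 2.0],
                       [7.0, 2.0, 9.0, 0.0],
                       [1.0, 8.0, 3.0, 4.0]]]])
pool = nn.MaxPool2d(kernel_size=2, stride=2)
print(pool(fmap))   # tensor([[[[6., 4.], [8., 9.]]]]) - the maximum of each 2 x 2 region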
The convolutional layer and the pooling layer together form the i-th layer
of a convolutional neural network. Depending on the complexity of the
images, the number of such layers may be increased to capture low-level
details even further, but at the cost of more computational power.
4.Flattening Layer
In between the convolutional layer and the fully connected layer, there is a 'Flatten'
layer. Flattening transforms a two-dimensional matrix of features into a vector that
can be fed into a fully connected neural network classifier.
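For example, flattening a 150 x 16 x 16 stack of pooled feature maps (the shape noted in our model code) yields a 38,400-element vector per image:

import torch
import torch.nn as nn

fmaps = torch.randn(1, 150, 16, 16)   # a batch of one pooled feature-map stack
flat = nn.Flatten()(fmaps)
print(flat.shape)                     # torch.Size([1, 38400]), i.e. 150 * 16 * 16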
6.SoftMax Layer
Softmax layer is typically the final output layer in a neural network that performs
multi-class classification and hence the output can be interpreted as a probabilistic
value.
The name comes from the softmax function, which takes a number of score values
as input and squashes them into values between 0 and 1 that sum to 1;
therefore, they represent a true probability distribution.
The standard (unit) softmax function is defined by the formula:
σ(z)ᵢ = exp(zᵢ) / Σⱼ exp(zⱼ), for i = 1, …, K,
where z is the vector of K input scores.
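Numerically, for an illustrative score vector:

import torch

scores = torch.tensor([2.0, 1.0, 0.1, -1.0])   # raw class scores
probs = torch.softmax(scores, dim=0)
print(probs)          # approx. tensor([0.638, 0.235, 0.095, 0.032])
print(probs.sum())    # tensor(1.)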
The approach used for animal classification is similar to the one applied for
general image classification, which seeks deep-seated features by increasing the
number of DL layers. Therefore, our research focused on finding a DL model
suitable for animal multi-class classification by changing the ResNet hierarchy.
The result is an improved version of the ResNet-18 baseline CNN model. Due to
the ResNet-18 characteristics, the CNN can extract more features by increasing
the number of convolutional layers while achieving improved accuracy.
CNN and RESNETS:
Compared with traditional neural networks, CNNs have two characteristics,
weight sharing and local connection, which greatly improve their ability to extract
features and lead to improved efficiency and reduced number of training
parameters. The main structure of a traditional CNN includes an input layer, a
convolutional layer, a pooling layer, a fully connected layer, and an output layer,
whereby the output of one layer serves as an input for the subsequent layer in the
structure. Usually, the convolutional and pooling layers are used alternately in
the structure.
The convolutional layer, the core of a CNN, contains multiple feature maps,
whereby each feature map contains multiple neurons. When a CNN is used for
image classification, for example, this layer scans the image through the
convolution kernel and makes full use of the information of the adjacent areas in
the image to extract image features. After the activation function is applied,
the feature map of the image is obtained. A pooling operation (denoted ⊗) is then
applied to the feature map; the main pooling methods include maximum pooling,
average pooling, and median pooling. In the fully connected
layer, the maximum likelihood function is used to calculate the probability of each
sample, and the learned features are mapped to the target label. The label with the
highest probability is used as the classification result to realize the CNN-based
classification. The deeper the CNN, the better its performance tends to be.
However, with deepening of the network, two major problems arise: (1) the
gradient dissipates, which affects the network convergence, and (2) the accuracy
tends to saturate. In order to solve the problems of gradient vanishing/explosion
and the performance degradation caused by the increased depth, residual networks
(ResNets) were proposed in [9]; they are easier to optimize and can gain accuracy
from considerably increased depth. The ResNet approach won first place in the
ILSVRC 2015 classification task.
Figure 2-18 The structure of the improved ResNet-18 model
The structure of the elaborated improved ResNet-18 model consists of four
parts: a convolutional layer, a classic ResNet-18 layer, an improved ResNet-18
layer, and a fully connected layer. The first part, the convolutional layer, is used
mainly to perform basic feature extraction on the input data in order to prepare
them for the next, deeper level. The second part uses the classic ResNet-18, which
is known as one of the best models for animal multi-class classification. In this
part, the input data are convolved twice, and the rectified linear unit, ReLU, is
applied between the two convolutions. ReLU zeroes the output of some neurons, which
makes the network sparse and reduces the interdependence of parameters. It also
alleviates the occurrence of overfitting problems. On the other hand, the data
before convolution are fed into a maximum pooling layer, which divides the sample
into feature regions and uses the maximum value in each region as its
representative, reducing the amount of calculation and the number of parameters.
Finally, the two kinds of data, which have the same dimension after their different
processing, are added together to complete the creation of the block module. The
purpose of this step is to inherit the optimization effect of the previous step and
make the model continue to converge.
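A minimal sketch of such a skip addition, in the style of the classic ResNet-18 basic block (channel count illustrative; the improved variant described above additionally routes the input through a max pooling layer before the addition):

import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    # Two convolutions plus a skip connection, as in ResNet-18
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()
    def forward(self, x):
        out = self.relu(self.conv1(x))   # first convolution, with ReLU in between
        out = self.conv2(out)            # second convolution
        return self.relu(out + x)        # add the unprocessed input (the skip)

x = torch.randn(1, 64, 56, 56)
print(BasicBlock(64)(x).shape)           # torch.Size([1, 64, 56, 56])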
In order to achieve better performance, we use an improved ResNet-18 in the third
part. A batch norm is added before the classical ResNet-18 structure to accelerate
the training of the neural network, increase the convergence speed, and maintain
the stability of the algorithm. The elaborated model goes through this structure
seven times, and then the data are sent to the fourth part, which is a fully
connected layer.
Finally, the output data features are mapped from the fully connected layer to a
one-dimensional vector, and the vector is regressed by a softmax function [38]
(also called a normalized exponential function), which is suitable for
multi-objective classification. The goal is to transform the output feature vector
of the fully connected layer by an exponential function, mapping an n-dimensional
real vector into another n-dimensional vector. Finally, all the results are added
and normalized to present the multi-classification results in the form of
probabilities. The softmax function used is the one defined in Section 2.3.6 above.
CHAPTER 3: SYSTEM DESIGN
3.1 BLOCK DIAGRAM
Figure 3-1 is the block diagram of our project. First, the dataset is taken
from Kaggle. It is then split into a training set (80% of the dataset) and a
test set (20%). After splitting the dataset, we train our model with ResNet-18
and save it for prediction. For prediction, the user gives an input image to
the saved model, which outputs the animal's name.
CHAPTER 4: IMPLEMENTATION
4.1 DATASET
The dataset is collected from the Kaggle website.
The dataset consists of images organized into two folders, Training and
Validation, each of which further contains four folders representing the four
species of animals: tiger, cheetah, hyena and jaguar. Each folder contains 900
images.
The model is trained using the training images from each class, and the
accuracy of the model is determined by its predictions on the images in the
Validation folder.
4.2.1 CHEETAH
4.2.2 JAGUAR
4.3 CODE
TRAINING CODE:
from google.colab import drive
drive.mount('/content/drive')
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
from torchvision.utils import make_grid
from torch.utils.data import random_split
from torch.utils.data.dataloader import DataLoader
import matplotlib.pyplot as plt
data_dir = '/content/ODS'
print('Folders :', os.listdir(data_dir))
classes = os.listdir(data_dir + "/training")
print(len(classes),'classes :', classes)
dataset = ImageFolder(data_dir + '/training', transform=ToTensor())  # restored: definition fell on an elided page
def show_example(img, label):
    print('Label: ', dataset.classes[label], "("+str(label)+")")
    plt.imshow(img.permute(1, 2, 0))
show_example(*dataset[0])
torch.manual_seed(43)
val_size = 400
train_size = len(dataset) - val_size
train_ds, val_ds = random_split(dataset, [train_size, val_size])  # restored: split fell on an elided page
test = ImageFolder(data_dir + '/validation', transform=ToTensor())  # assumed: Validation folder used as test set
batch_size = 32
train_loader = DataLoader(train_ds, batch_size, shuffle=True, num_workers=4,
                          pin_memory=True)
val_loader = DataLoader(val_ds, batch_size*2, num_workers=4,
                        pin_memory=True)
test_loader = DataLoader(test, batch_size*2, num_workers=4,
                         pin_memory=True)
def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

device = get_default_device()
device

def to_device(data, device):   # header restored from an elided page
    """Move tensor(s) to the chosen device"""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    def __init__(self, dl, device):   # constructor restored from an elided page
        self.dl = dl
        self.device = device
    def __iter__(self):
        for b in self.dl:
            yield to_device(b, self.device)
class ImageClassificationBase(nn.Module):
    def training_step(self, batch):
        images, labels = batch
        out = self(images)                    # Generate predictions
        loss = F.cross_entropy(out, labels)   # Calculate loss
        return loss
class CnnModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Conv2d(3, 100, kernel_size=3, padding=1),   # 3 input channels (RGB), not the full input shape
            nn.ReLU(),
            nn.Conv2d(100, 150, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),   # output: 150 x 16 x 16
            nn.Flatten(),
            nn.Linear(128, 64),   # as listed; intermediate layers of the listing fell on elided pages
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(32, 3))     # as listed (the four species would need nn.Linear(32, 4))
class CnnModel2(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        # Use a pretrained model
        self.network = models.resnet18(pretrained=True)
        # Replace the last layer with a 4-class output
        num_ftrs = self.network.fc.in_features
        self.network.fc = nn.Linear(num_ftrs, 4)
def validation_epoch_end(self, outputs):
batch_losses = [x['val_loss'] for x in outputs]
epoch_loss = torch.stack(batch_losses).mean() # Combine losses
batch_accs = [x['val_acc'] for x in outputs]
epoch_acc = torch.stack(batch_accs).mean() # Combine accuracies
return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}
def forward(self, xb):
return self.network(xb)
class CnnModel2(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        # Use a pretrained model
        self.network = models.resnet18(pretrained=True)
        # Replace the last layer with a 4-class output
        num_ftrs = self.network.fc.in_features
        self.network.fc = nn.Linear(num_ftrs, 4)
    def forward(self, xb):
        return torch.sigmoid(self.network(xb))
import cv2
from PIL import Image
from skimage import io
from torchvision import transforms
from torch.autograd import Variable
from google.colab.patches import cv2_imshow   # imports restored; originals fell on elided pages
model = torch.load("/content/drive/MyDrive/model_weights1.pt")
train_transform = transforms.Compose([
    transforms.Resize([512, 512]),   # transforms.Scale is deprecated; Resize replaces it
    transforms.ToTensor(),])
url=input('Enter URL of Image')
img=io.imread(url)
disp=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2_imshow(disp)
img2 = Image.open(url)
img2 = train_transform(img2)
img2 = img2.unsqueeze(0)
img2 = Variable(img2)
if torch.cuda.is_available():
img2 = img2.cuda()
model.eval()
target = model(img2)
_, pred = torch.max(target.data, 1)
classes = {0: 'Cheetah', 1: 'Hyena', 2: 'Jaguar', 3: 'Tiger'}
a = int(pred[0])   # class index taken directly, rather than parsing str(pred[0])
print('Prediction: ', classes[a])
CHAPTER 5: EXPERIMENTAL RESULTS
5.1 CLASSIFICATION RESULTS
The classification is applied to find the species to which the animal in the given
image belongs. The classification result of our model is measured by the metric of
accuracy.
Accuracy: the higher the value, the more reliable our system is and the better
trained our model.
The Accuracy of a model is obtained by the formula:
Accuracy = No. of Correct Predictions / Total No. of Predictions
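Over a batch of predictions this can be computed directly (a small illustrative sketch):

import torch

preds = torch.tensor([0, 2, 1, 3, 3])    # predicted class indices
labels = torch.tensor([0, 2, 2, 3, 3])   # ground-truth labels
accuracy = (preds == labels).float().mean()
print(accuracy)                          # tensor(0.8000), i.e. 4 of 5 correct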
The accuracy of our model was found to be 96.88%.
5.2 GRAPH
Two plots can be created from the training history on the animal data.
1. A plot of loss on the training and validation datasets over training epochs.
2. A plot of accuracy on the training and validation datasets over training epochs.
A loss is a number indicating how bad the model's prediction was on a single
example. If the model's prediction is perfect, the loss is zero; otherwise, the loss
is greater. The loss is calculated on both the training and validation sets, and its
interpretation is based on how well the model is doing on these two sets. It is the
sum of the errors made for each example in the training or validation set.
An accuracy metric is used to measure the algorithm's performance in an
interpretable way. The accuracy of a model is usually determined after the model
parameters have been learned and fixed, and it is expressed as a percentage. It is
the measure of how accurate the model's predictions are compared to the true data.
5.3 TEST CASE SCENARIOS
The prediction of our model depends upon the visibility of the body markings of
the animal in the input image; the prediction therefore varies with a few criteria
such as brightness, the posture of the animal in the image, and the similarity of
the animal in the image to our chosen species.
training the SVM model, which is not suitable for consideration. So, we thought
of implementing other models to acquire better accuracy.
After implementing both the SVM and ResNet algorithms, we chose ResNet
because we obtained more accurate results (96%) compared to the SVM's 65%
accuracy.
Our model utilizes the ResNet-18 architecture; compared to the SVM, it is
trained in less time and also makes predictions faster.
CHAPTER 6: CONCLUSION
In this project, an improved ResNet-18 model has been proposed for animal
classification; it helps classify the animals present in images into four
different species, namely Tiger, Cheetah, Jaguar and Hyena, and the improved
ResNet-18 model can effectively be used to identify animal classes. Therefore,
the model has great application prospects and is worthy of further study and
elaboration. Another idea could be to identify the action being performed by
the animals (e.g., eating, standing, walking); the number of animals in an
image could also be identified. Both could be implemented in future work.
CHAPTER 7: REFERENCES
[1] Gooliaff, T.J., Hodges, K.E., 2018. Measuring agreement among experts in
classifying camera images of similar species. Ecology and Evolution 8,
11009–11021. https://doi.org/10.1002/ece3.4567
[2] Güthlin, D., Storch, I., Küchenhoff, H., 2014. Is it possible to individually
identify red foxes from photographs? Wildl. Soc. Bull. 38, 205–210.
https://doi.org/10.1002/wsb.377
[3] Hallgren, W., Santana, F., Low-Choy, S., Zhao, Y., Mackey, B., 2019. Species
distribution models can be highly sensitive to algorithm configuration. Ecol.
Model. 408, 108719. https://doi.org/10.1016/j.ecolmodel.2019.108719
[4] Mendoza, E., Martineau, P., Brenner, E., Dirzo, R., 2011. A novel method to
improve individual animal identification based on camera-trapping data.
J. Wildl. Manag. 75, 973–979. https://doi.org/10.1002/jwmg.120
[5] Nguyen, H., Maclagan, S.J., Nguyen, T.D., Nguyen, T., Flemons, P., Andrews,
K., Ritchie, E.G., Phung, D., 2017. Animal recognition and identification with
deep convolutional neural networks for automated wildlife monitoring. In:
2017 IEEE International Conference on Data Science and Advanced
Analytics (DSAA). https://doi.org/10.1109/DSAA.2017.31
[6] Norouzzadeh, M.S., Nguyen, A., Kosmala, M., Swanson, A., Palmer, M.S.,
Packer, C., Clune, J., 2018. Automatically identifying, counting, and
describing wild animals in camera-trap images with deep learning. Proc. Natl.
Acad. Sci. Unit. States Am. 115, E5716–E5725.
https://doi.org/10.1073/pnas.1719367115
[7] Tabak, M.A., Norouzzadeh, M.S., Wolfson, D.W., Sweeney, S.J., Vercauteren,
K.C., Snow, N.P., Halseth, J.M., Salvo, P.A.D., Lewis, J.S., White, M.D.,
Teton, B., Beasley, J.C., Schlichting, P.E., Boughton, R.K., Wight, B.,
Newkirk, E.S., Ivan, J.S., Odell, E.A., Brook, R.K., Lukacs, P.M., Moeller,
A.K., Mandeville, E.G., Clune, J., Miller, R.S., 2019. Machine learning to
classify animal species in camera trap images: applications in ecology.
Methods in Ecology and Evolution 10, 585–590.
https://doi.org/10.1111/2041-210X.13120