Cat Vs Dog Classification Using Python
Cat Vs Dog Classification Using Python
the degree of
BACHELOR OF TECHNOLOGY
in
Guided by
Adgaonker Shashank
Designation
1
CANDIDATE’S DECLARATION
It is hereby certified that the work which is being presented in the B. Tech Industrial/In-
house training Report entitled "Cats vs Dogs Classification" in partial fulfilment of the
requirements for the award of the degree of Bachelor of Technology and submitted in the
Department of Electronics & Communication Engineering of BHARATI
VIDYAPEETH’S COLLEGE OF ENGINEERING, New Delhi (Affiliated to Guru
Gobind Singh Indraprastha University, Delhi) is an authentic record of our own work
carried out during a period from February2021 to March 2021 under the guidance of
Adgaonker Shashank , Designation.
The matter presented in the B. Tech Industrial/In-house training Report has not been
submitted by me for the award of any other degree of this or any other Institute.
This is to certify that the above statement made by the candidate is correct to the best of my
knowledge. He/She/They are permitted to appear in the External Industrial/In-house training
Examination
Dr S B Kumar
2
TABLE OF CONTENTS
List of Figures………………………………………………………………………………....4
Acknowledgements……………………………………………………………………………5
Abstract………………………………………………………………………………………..6
1.4 Algorithms………………………………………………………………………8
1.5 Sections…………………………………………………………………………..8
2.3 CNN…………………………………………………………………………....10
References……………………………………………………………………..22
3
LIST OF FIGURES
4
ACKNOWLEDGEMENT
We express our deep gratitude to Adgaonker Shashank, Designation for his valuable
guidance and suggestion throughout my project work. We are thankful to Winter training
for their valuable guidance.
We would like to extend my sincere thanks to Head of the Department, Prof. Kirti Gupta
for her time to time suggestions to complete my project work. I am also thankful to
Dharmender Saini, Principal for providing me the facilities to carry out my project work.
5
ABSTRACT
The topic of our project is Cats vs Dogs classification and for building the model we have used
Deep Learning algorithm CNN (Convolutional Neural Network) A Convolutional Neural
Network (ConvNet/CNN) is a Deep Learning algorithm which can take in an input image,
assign importance (learnable weights and biases) to various aspects/objects in the image and
be able to differentiate one from the other. The pre-processing required in a ConvNet is much
lower as compared to other classification algorithms. While in primitive methods filters are
hand-engineered, with enough training, ConvNets have the ability to learn these
filters/characteristics. And we have used metrics to find accuracy to avulate our model
6
Chapter 1
Introduction
This project aims to classify the input image as either a dog or a cat image. The image input
which you give to the system will be analyzed and the predicted result will be given as output.
Machine learning algorithm [Convolutional Neural Networks] is used to classify the image.
The model thus implemented can be extended to a mobile device or any website as per the
developer’s need..
Deep learning deploys artificial neural networks to recognize the hidden patterns of data in the dataset
provided. These algorithms are trained over an adequate amount of time and applied to a data set.
7
Deep learning uses artificial neural networks (ANN) to find the hidden patterns. These patterns
are the connection between various variables present in a dataset.
ANN algorithms are trained over a high volume of sample data and then applied to a new
dataset. Such algorithms stimulate the way for information processing and communicate
experiences similar to the biological nervous system. Deep learning has become a part of our
everyday lives: from search engines to self-driving cars that demand high computational power.
1.4 Algorithm
This model is created using the machine & and deep learning algorithms i.e CNN
Convolutional Neural Network (CNN) is an algorithm taking an image as input then assigning
weights and biases to all the aspects of an image and thus differentiates one from the other.
Neural networks can be trained by using batches of images, each of them having a label to
identify the real nature of the image (cat or dog here). A batch can contain few tenths to
hundreds of images. For each and every image, the network prediction is compared with the
corresponding existing label, and the distance between network prediction and the truth is
evaluated for the whole batch. Then, the network parameters are modified to minimize the
distance and thus the prediction capability of the network is increased. The training process
continues for every batch similarly.
CNN does the processing of Images with the help of matrixes of weights known as filters. They
detect low-level features like vertical and horizontal edges etc. Through each layer, the filters
recognize high-level features.
1.5 Sections
The proposed report organized as follows. In section 2, various works in the field are discussed
and the gap in exploring using machine learning and deep learning techniques available in
python has been highlighted. Section 3 discusses the methodology of the approaches applied
in this report using a block diagram. Results and discussions are detailed in the section 4. This
section explores the results for better understanding. It is also important that the performance
metrics derived from the models is proving the high accuracy and efficiency of the built model.
Section 5 concludes the work done in this report.
8
Chapter 2
Literature Review
This project aims to classify the input image as either a dog or a cat image. The image input
which you give to the system will be analyzed and the predicted result will be given as
output. Machine learning algorithm [Convolutional Neural Networks] is used to classify the
image. The model thus implemented can be extended to a mobile device or any website as
per the developer’s need.
2.1 challenges
Image classification is one of the major task because of its data size, handling a image data is
not an easy game. Image classification is where a computer can analyse an image and identify
the 'class' the image falls under. (Or a probability of the image being part of a 'class'.) A class
is essentially a label, for instance, 'car', 'animal', 'building' and so on. For example, you input
an image of a sheep. Simply put, image classification is where machines can look at an image
and assign a (correct) label to it. It’s a key part of computer vision, allowing computers to see
the world as we do. And with the invention of deep learning, image classification has become
more widespread.
Deeper exploration into image classification and deep learning involves understanding
convolutional neural networks. But for now, you have a simple overview of image
classification and the clever computing behind it.
9
Machine Learning (ML) image classification
1. Artificial Neural Network.
2. Convolutional Neural Network.
3. K nearest neighbor.
4. Decision tree.
5. Support Vector Machines.
In the classification, six different modalities, linear discriminant analysis (LDA), quadratic
discriminant analysis (QDA), -nearest neighbour ( NN), the Naïve Bayes approach, support
vector machine (SVM), and artificial neural networks (ANN), were utilized.
2.3 CNN
In the past few decades, Deep Learning has proved to be a very powerful tool because of its
ability to handle large amounts of data. The interest to use hidden layers has surpassed
traditional techniques, especially in pattern recognition. One of the most popular deep neural
networks is Convolutional Neural Networks.
Since the 1950s, the early days of AI, researchers have struggled to make a system that can
understand visual data. In the following years, this field came to be known as Computer Vision.
In 2012, computer vision took a quantum leap when a group of researchers from the University
of Toronto developed an AI model that surpassed the best image recognition algorithms and
that too by a large margin.
Background of CNNs
CNN’s were first developed and used around the 1980s. The most that a CNN could do at that
time was recognize handwritten digits. It was mostly used in the postal sectors to read zip
codes, pin codes, etc. The important thing to remember about any deep learning model is that
it requires a large amount of data to train and also requires a lot of computing resources. This
was a major drawback for CNNs at that period and hence CNNs were only limited to the postal
sectors and it failed to enter the world of machine learning.
10
Fig. 2.2: Layering
Cnn working
This article was published as a part of the Data Science Blogathon. In 2012 Alex Krizhevsky
realized that it was time to bring back the branch of deep learning that uses multi-layered neural
networks. The availability of large sets of data, to be more specific ImageNet datasets with
millions of labeled images and an abundance of computing resources enabled researchers to
revive CNNs.
Fig.2.3: Working
But we don’t really need to go behind the mathematics part to understand what a CNN is or
how it works.
Bottom line is that the role of the ConvNet is to reduce the images into a form that is easier to
process, without losing features that are critical for getting a good prediction.
11
How does it work?
Before we go to the working of CNN’s let’s cover the basics such as what is an image and how
is it represented. An RGB image is nothing but a matrix of pixel values having three planes
whereas a grayscale image is the same but it has a single plane. Take a look at this image to
understand more.
Convolutional neural networks are composed of multiple layers of artificial neurons. Artificial
neurons, a rough imitation of their biological counterparts, are mathematical functions that
calculate the weighted sum of multiple inputs and outputs an activation value. When you input
an image in a ConvNet, each layer generates several activation functions that are passed on to
the next layer.
The first layer usually extracts basic features such as horizontal or diagonal edges. This output
is passed on to the next layer which detects more complex features such as corners or
combinational edges. As we move deeper into the network it can identify even more complex
features such as objects, faces, etc.
KERAS IN CNN
A great way to use deep learning to classify images is to build a convolutional neural
network (CNN). The Keras library in Python makes it pretty simple to build a CNN. Computers
see images using pixels. ... This dataset consists of 70,000 images of handwritten digits from
0–9. We will attempt to identify them using a CNN.
Prediction in cnn
They have proven so effective that they are the ready to use method for any type of prediction
problem involving image data as an input. The benefit of using CNNs is their ability to develop
an internal representation of a two-dimensional image, Keras is used for creating deep
models which can be productized on smartphones. Keras is also used for distributed training
of deep learning models. Keras is used by companies such as Netflix, Yelp, Uber, etc
12
Chapter 3
Work Carried Out
Architecture
This below architecture of our project shows that we extracted the key features from the raw
data and then split it into a training dataset and testing dataset. The machine learning model
then trained using the training dataset and tested using the testing dataset. The model then
predicts the match outcome.
13
Features Provided:
Own image can be tested to verify the accuracy of the model
This code can directly be integrated with your current project or can be extended as a mobile
application or a site.
To extend the project to classify different entities, all you need to do is find the suitable dataset,
change the dataset accordingly and train the model
For example, let’s load and plot the first nine photos of dogs in a single figure
14
Running the example creates a figure showing the first nine photos of dogs in the dataset.
We can see that some photos are landscape format, some are portrait format, and some are
square
Again, we can see that the photos are all different sizes.
We can also see a photo where the cat is barely visible (bottom left corner) and another that
has two cats (lower right corner). This suggests that any classifier fit on this problem will
have to be robust.
15
There are many ways to achieve this, although the most common is a simple resize operation
that will stretch and deform the aspect ratio of each image and force it into the new shape.
We could load all photos and look at the distribution of the photo widths and heights, then
design a new photo size that best reflects what we are most likely to see in practice.
Smaller inputs mean a model that is faster to train, and typically this concern dominates the
choice of image size. In this case, we will follow this approach and choose a fixed size of
200×200 pixels
That is 25,000 images with 200x200x3 pixels each, or 3,000,000,000 32-bit pixel values. We
could load all of the images, reshape them, and store them as a single NumPy array. This could
fit into RAM on many modern machines, but not all, especially if you only have 8 gigabytes to
work with. We can write custom code to load the images into memory and resize them as part
of the loading process, then save them ready for modeling.
The example below uses the Keras image processing API to load all 25,000 photos in the
training dataset and reshapes them to 200×200 square photos. The label is also determined for
each photo based on the filenames. A tuple of photos and labels is then saved.
Pooling
The pooling operation provides spatial variance making the system capable of recognizing an
object with some varied appearance. It involves adding a 2Dfilter over each channel of the
feature map and thus summarise features lying in that region covered by the filter.
So, pooling basically helps reduce the number of parameters and computations present in the
network. It progressively reduces the spatial size of the network and thus controls over fitting.
There are two types of operations in this layer; Average pooling and maximum pooling. Here,
we are using max-pooling which according to its name will only take out the maximum from
a pool. This is possible with the help of filters sliding through the input and at each stride, the
maximum parameter will be taken out and the rest will be dropped.
The pooling layer does not modify the depth of the network unlike in the convolution layer.
16
Chapter 4
Experimental Results and Comparison
The Dogs vs. Cats dataset is a standard computer vision dataset that involves classifying photos
as either containing a dog or cat.
Although the problem sounds simple, it was only effectively addressed in the last few years
using deep learning convolutional neural networks. While the dataset is effectively solved, it
can be used as the basis for learning and practicing how to develop, evaluate, and use
convolutional deep learning neural networks for image classification from scratch.
This includes how to develop a robust test harness for estimating the performance of the model,
how to explore improvements to the model, and how to save the model and later load it to make
predictions on new data.
In this tutorial, you will discover how to develop a convolutional neural network to classify
photos of dogs and cats. After completing this tutorial, you will know: How to load and prepare
photos of dogs and cats for modeling.
How to develop a convolutional neural network for photo classification from scratch and
improve model performance. How to develop a model for photo classification using transfer
learning.
17
Fig. 4.2 X test data
Visualization
18
Fig. 4.5: Loss and Accuracy graph
19
Chapter 5
Summary &Conclusions
Summary
In this report, we first briefly explained our motivation of this project and showed some
background materials. Then, we precisely illustrated our task, including the learning task and
the performance task. After that, we introduced our solution in detail, mainly including two
approaches. The first approach is a traditional pattern recognition model, by which we learned
the classification model from some human-crafted features, mainly including color feature,
Dense-SIFT feature, and a combination of the two. To improve the performance, we also
applied image segmentation approach to preprocess the data. However, due to poor
segmentation result, we did not achieve any improvement. The best accuracy we got from the
first method is only 71.47% (from an SVM classifier).
In terms of classifiers, we mainly considered SVMs and BP Neural Networks, taking our high
dimensional feature space into account. Various parameter settings were explored to improve
classification accuracy on the test dataset. For example, for the BP Neural Networks, we tried
different hidden layers and hidden units; for the SVMs, different kernel functions and C
parameters were used. Table 1 illustrates the best results of each model and related parameters.
20
Conclusions
In this paper we build a deep convolutional neural network for image classification (cat and
dog images). Despite of using only a subset of the images and accuracy of 88.10% was
obtained. If the whole dataset was being used the accuracy would have been even better.
This paper shows a classification process using convolutional neural networks which is a deep
learning architecture. Although there are many algorithms that perform image classification,
convolutional neural network is considered to be a standard image classification technique.
Convolutional neural network uses GPU technology because of large number of layers which
increases the number of computers. Therefore, in this paper, we presented a very small
convolutional neural network which can work on CPU as well. This network classifies the
images into one the two predefined classes say Cat and Dog. This same network can be used
for other datasets as well.
21
References
1) Golle, P. (2008, October). Machine learning attacks against the Asirra CAPTCHA. In
Proceedings of the 15th ACM conference on Computer and communications security
(pp. 535-542). ACM.
2) J. Elson, J. Douceur, J. Howell and J. Saul. Asirra: a CAPTCHA that exploits interest-
aligned manual image categorization. Proc. of ACM CCS 2007, pp. 366-374.
3) Muthukrishnan Ramprasath, M.Vijay Anand and Shanmugasundaram Hariharan,
Image Classification using Convolutional Neural Networks. International Journal of
Pure and Applied Mathematics, Volume 119 No. 17 2018, 1307-1319
4) Parkhi, O. M., Vedaldi, A., Zisserman, A., & Jawahar, C. V. (2012, June). Cats and
dogs. In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
(pp. 3498-3505). IEEE.
6) [6] Bang Liu, Yan Liu, Kai Zhou “Image Classification for Dogs and Cats”
7) Tianmei Guo, Jiwen Dong ,Henjian Li'Yunxing Gao, “Simple Convolutional Neural
Network on Image Classification” , IEEE 2nd International Conference on Big Data
Analytics at Beijing, China on 10-12 March 2017
9) Travis Williams, Robert Li, “Advanced Image Classification using Wavelets and
Convolutional Neural Networks”, 15th International Conference on Machine Learning
and Applications at Anaheim, CA, USA on 18-20 December 2016.
22
12) Qing Li, Weidong Cai, Xiaogang Wang, Yun Zhou, David Dagan Feng and
Mei Chen, “Medical Image Classification with Convolutional Neural Network”, 13th
International Conference on Control Automation Robotics and Vision at Singapore on
10-12 December 2014.
23