Seminar
On
Latest trends in Object Detection using A.I.
MASTER OF TECHNOLOGY
In
Under Guidance of
Dr. Brajesh Kumar Singh
SESSION – 2021-2022
The PG Department of CSE
R.B.S. ENGINEERING TECHNICAL CAMPUS BICHPURI, AGRA
DECLARATION
I declare that the seminar work presented in this report titled “Latest trends in Object Detection
using A.I.”, submitted to the Computer Science and Engineering Department, R.B.S. Engineering
Technical Campus Bichpuri, Agra, for the award of the Master of Technology degree in
Computer Science and Engineering, is my original work. I have not plagiarized or submitted the
same work for the award of any other degree.
This is to certify that the Seminar entitled “LATEST TRENDS IN OBJECT DETECTION USING
A.I” has been submitted by SHAKTI PUNJ in partial fulfillment of the degree of Master of
Technology in Computer Science & Engineering of “R.B.S. Engineering Technical Campus
Bichpuri, Agra” in the academic session 2021-22 (2nd Semester).
SHAKTI PUNJ
ABSTRACT
Artificial intelligence (AI) is treated here in terms of the information about language
structure that is transmitted to the machine: this should result in a more intuitive and faster
solution, based on a learning algorithm that recognizes repeated patterns in new data. Good
results are obtained in imitating
the cognitive process whose several layers of densely connected biological subsystems are
invariant to many input transformations. This invariant so sought after by AI and cognitive
computing is in the universal structure of language, provider of the universal language
algorithm. The representation property to improve machine learning (ML) generalizes the
execution of a set of underlying variation factors that must be described in the form of other
simpler underlying variation factors, preventing the “curse of dimensionality.” The universal
model specifies a generalized function (representational capacity of the model) in the
universal algorithm, serving as a framework for the algorithm to be applied in a specific
circumstance.
LIST OF FIGURES
ACRONYMS AND ABBREVIATION
PR Pattern Recognition
CV Computer Vision
TABLE OF CONTENTS
DECLARATION
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION
1.1 Introduction
1.2 Aim of the Seminar
1.3 Scope of the Seminar
1.4 Methodology
1.5 Algorithm and Techniques used
1.5.1 CIFAR10/100
1.5.2 The MNIST dataset
2 LITERATURE REVIEW
3 SEMINAR DESCRIPTION
3.1 Existing System
3.1.1 Disadvantages
3.2 Proposed System
3.2.1 Advantages
3.2.2 Disadvantages
3.3 Feasibility Study
4 METHODOLOGIES
4.1 Artificial Neural Networks
4.2 Image Recognition
4.2.1 Image classification with localization
4.2.2 Object detection
4.2.3 Object (semantic) segmentation
4.2.4 Instance segmentation
4.3 Pattern Recognition
4.3.1 Sensing
4.3.2 Segmentation
4.3.3 Post Processing
7 APPENDIX
A BIOLOGICAL POSTULATES USED
B PERCEPTRON
REFERENCE
Chapter 1 INTRODUCTION
1.1 Introduction
Human problem solving is largely based on recognizing the inputs provided, whether
images or other patterns. In any pattern recognition or image processing task, humans
perceive patterns in the input data and manipulate the patterns and images directly. In
this paper we discuss attempts at developing computing models based on artificial
neural networks (ANN) to deal with various pattern and image recognition situations.
Like conventional algorithms, AI methods also need a clear specification of the
problem, and a mapping of the problem into a form to which the methods can be
applied. For example, in order to apply heuristic search methods, one needs to map
the problem as a search problem. Likewise, to solve a problem using a rule-based
approach, it is necessary to explicitly state the rules governing it. Artificial
Neural Networks are a comparatively recent tool modeled on biological neural
networks. The strength of this tool is its ability to solve problems that are
very hard to solve by traditional computing methods (e.g. by explicit algorithms). This
work briefly explains Artificial Neural Networks and their applications, describing
how to implement a simple ANN for image recognition.
1.2 Aim of the Seminar
The aim of this seminar is to explore and analyze how patterns and images are
recognized using artificial neural networks, and thereby to understand how a
machine can interpret them. It also covers the advantages of these techniques and
their uses in modern technologies.
1.3 Scope of the Seminar
The future scope of image and pattern recognition is vast. Pattern recognition
can be used in classification, identification, verification, feature extraction,
indexing, reconstruction and other processes across all biometric features. Robotics
is one of the leading areas where image recognition is used: humanoids are trained
with it, and it is also applied in translating languages, performing surgery,
reprogramming defects in human DNA, and automatically driving all forms of transport.
1.4 Methodology
Image Recognition -
1. Object Detection
2. Semantic Segmentation
3. Instance Segmentation
Pattern Recognition–
1. Sensing
2. Segmentation
3. Post Processing
1.5 Algorithm and Techniques used
1.5.1 CIFAR10/100
This dataset includes 60,000 colour images, divided into 10 (CIFAR-10) or 100
(CIFAR-100) mutually exclusive classes. We ran two experiments on it, one with
data augmentation and one without, referred to as Exp1 and Exp2
respectively. The two experiments behaved very differently. With
data augmentation (Exp1) we obtained 76 percent accuracy.
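The report does not say which augmentations were applied in Exp1. As a minimal sketch, assuming two common choices (a random horizontal flip and a random shift implemented as pad-and-crop), augmentation of a single image could look like the following. The function names are hypothetical, images are plain lists of pixel rows, and only the Python standard library is used.

```python
import random

def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def random_crop(image, pad, rng):
    """Zero-pad the image by `pad` pixels on every side, then cut out a
    window of the original size at a random offset."""
    height, width = len(image), len(image[0])
    padded_width = width + 2 * pad
    padded = [[0] * padded_width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * padded_width for _ in range(pad)]
    top = rng.randint(0, 2 * pad)
    left = rng.randint(0, 2 * pad)
    return [row[left:left + width] for row in padded[top:top + height]]

def augment(image, rng, pad=1):
    """Return a randomly flipped and shifted copy of `image`."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return random_crop(image, pad, rng)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
tiny = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
augmented = augment(tiny, rng)  # same 3 x 3 size, possibly flipped and shifted
```

Each call to `augment` yields a slightly different view of the same image, which is how augmentation enlarges the effective training set without collecting new data.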
1.5.2 The MNIST dataset
This dataset has a total of 70,000 images of handwritten digits, each on a 28x28
pixel grid. The 70,000 images comprise 60,000 for training and 10,000 for
testing.
Chapter 2 LITERATURE REVIEW
This seminar mainly describes the techniques used for image and pattern
recognition with artificial neural networks. It explains how the recognition of
patterns and pictures works, and its uses and applications in technologies
that are yet to arrive. It provides a detailed explanation of the
methodologies involved and the analytic techniques used to achieve it.
Chapter 3 SEMINAR DESCRIPTION
3.1 Existing System
Existing methods for processing digital images deal with enhancing properties
such as picture quality, brightness and color. The existing system does not use
artificial neural networks, so it has no ability to recognize inputs by itself
and cannot remember images and patterns. These image properties can degrade due
to the aging process, and an existing image processing algorithm concentrates on
improving one such property alone. These algorithms are not designed to analyze
and repair cracked regions; crack removal has to be handled in a different
manner. Existing pattern recognition algorithms, in turn, require high-end
hardware and are complex. These issues can be addressed in the proposed system.
3.1.1 Disadvantages
3.2 Proposed System
The proposed system uses a digital image processing technique that detects images
by comparing them with already recognized images and patterns, which makes recognition
easier. Related work on crack restoration includes a system that is capable of tracking
and interpolating cracks, in which the user must manually select a point on each crack
to be restored, and a method for the detection of cracks using multi-oriented Gabor
filters. Crack detection and removal bears certain similarities to methods proposed for
the detection and removal of scratches and other artifacts from motion picture films.
However, such methods rely on information obtained over several adjacent frames for
both artifact detection and filling and, thus, are not directly applicable in the case
of painting cracks.
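To illustrate what a bank of multi-oriented Gabor filters does, here is a minimal sketch in plain Python. It is not the cited crack detection method: the kernel size, the four orientations, and the `sigma`, `wavelength` and `gamma` values are assumptions chosen for illustration, and the "crack" is a synthetic one-pixel-wide vertical line.

```python
import math

def gabor_kernel(size, theta, sigma=2.0, wavelength=4.0, gamma=0.5):
    """Real-valued Gabor kernel of shape size x size (size must be odd).
    The cosine wave varies along direction `theta` (radians), so the
    kernel's stripes are perpendicular to that direction."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel

def response_at(image, kernel, row, col):
    """Filter response with the kernel centred on image[row][col]."""
    half = len(kernel) // 2
    return sum(kernel[i][j] * image[row - half + i][col - half + j]
               for i in range(len(kernel)) for j in range(len(kernel)))

# Four orientations: 0, 45, 90 and 135 degrees
orientations = [k * math.pi / 4 for k in range(4)]
bank = [gabor_kernel(9, theta) for theta in orientations]

# Synthetic "crack": a one-pixel-wide vertical line in a 15 x 15 image
image = [[1.0 if c == 7 else 0.0 for c in range(15)] for r in range(15)]
responses = [abs(response_at(image, k, 7, 7)) for k in bank]
strongest = max(range(len(bank)), key=lambda i: responses[i])
```

The filter whose stripes run parallel to the line (orientation index 0 here) gives the largest response, which is why a bank of several orientations can pick up cracks running in different directions.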
3.2.1 Advantages
3.2.2 Disadvantages
1. Complex algorithm
2. Requires high cost for upgrading
3.3 Feasibility Study
Image processing using artificial neural networks (ANN) has been successfully
used in various fields of activity such as geotechnics, civil engineering, mechanics,
industrial surveillance, defence, automation and transport. Image pre-processing,
data reduction, segmentation and recognition are the processes used in
handling images with ANN. An image can be represented as a matrix, each element
of the matrix containing colour information for a pixel. The matrix is used as input
data to the neural network. Small image dimensions help the network learn easily
and quickly, and they determine the size of each input vector and the number of
input vectors. The transfer function used is a sigmoidal function.
The current usage of terms like AI systems, intelligent systems, knowledge-based
systems, expert systems etc., is intended to reflect the urge to build machines
that can demonstrate intelligence similar to human beings in performing some
simple tasks. In these tasks we look at the performance of a machine and compare
it with the performance of a person. We attribute intelligence to the machine if
the performances match. But the ways the tasks are performed by a machine and by
a human being are basically different: the machine performs the task in a
step-by-step sequential manner dictated by an algorithm, modified by some known
heuristics.
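The matrix-to-input-vector step and the sigmoidal transfer function described in this section can be sketched as follows. This is an illustration only: it assumes a greyscale image with intensities from 0 to 255, and the weights and bias are arbitrary values rather than trained ones.

```python
import math

def image_to_vector(image, max_value=255):
    """Flatten a 2-D matrix of pixel intensities row by row and scale
    every value to the range [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]

def sigmoid(x):
    """Sigmoidal transfer function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs passed through the sigmoid."""
    return sigmoid(sum(w * v for w, v in zip(weights, inputs)) + bias)

image = [[0, 255],
         [255, 0]]                      # a 2 x 2 greyscale image
vector = image_to_vector(image)         # 4 inputs, one per pixel
output = neuron_output(vector, weights=[0.5, -0.5, 0.5, -0.5], bias=0.0)
```

A 2 x 2 image gives an input vector of length 4; a 28 x 28 image would give 784 inputs, which is why small images keep the network small and quick to train.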
Chapter 4 METHODOLOGIES
4.1 Artificial Neural Networks
Artificial Neural Networks (ANNs) are an approach to problem solving that differs
from traditional computing methods. Since conventional computers use an
algorithmic approach, if the specific steps that the computer needs to
follow are not known, the computer cannot solve the problem. That means traditional
computing methods can only solve problems that we already understand and
know how to solve. ANNs are, in some ways, much more powerful because
they can solve problems that we do not exactly know how to solve. That is why
their usage has lately been spreading over a wide range of areas, including virus
detection, robot control, intrusion detection systems, pattern (image, fingerprint,
noise, ...) recognition and so on. ANN models such as GAN, CNN, RNN, MLP, DBN,
reservoir computing, and Transformer models perform very well in their application
to PR tasks.
A typical Back Propagation ANN is as depicted below. The black nodes (on the
extreme left) are the initial inputs. Training such a network involves two phases. In
the first phase, the inputs are propagated forward to compute the output of each
output node. Each of these outputs is then subtracted from its desired output, giving
an error for each output node. In the second phase, each of these output
errors is passed backward and the weights are adjusted. These two phases are repeated
until the sum of squares of the output errors reaches an acceptable value.
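The two training phases can be sketched in plain Python as below. This is a minimal illustration, not the network in the figure: it assumes one hidden layer of two sigmoid units, a single sigmoid output, the logical OR function as training data, and arbitrary values for the learning rate, tolerance and random seed.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(samples, n_hidden=2, lr=0.5, tol=0.01, max_epochs=10000, seed=0):
    """Train a small network by back propagation and return the sum of
    squared output errors recorded for every epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # The last weight in every row is the bias weight.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(max_epochs):
        sse = 0.0
        for x, target in samples:
            # Phase 1: propagate the inputs forward to get the output.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
            hb = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
            error = target - y
            sse += error * error
            # Phase 2: pass the error backward and adjust the weights.
            delta_out = error * y * (1 - y)
            delta_hidden = [delta_out * w_out[j] * h[j] * (1 - h[j])
                            for j in range(n_hidden)]
            w_out = [w + lr * delta_out * v for w, v in zip(w_out, hb)]
            for j in range(n_hidden):
                w_hidden[j] = [w + lr * delta_hidden[j] * v
                               for w, v in zip(w_hidden[j], xb)]
        history.append(sse)
        if sse <= tol:  # stop once the error is acceptable
            break
    return history

OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
history = train(OR_SAMPLES)
```

`history[0]` is the error before any useful learning has happened and `history[-1]` is the error when training stopped, either because the tolerance was reached or because `max_epochs` ran out.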
4.2 Image Recognition
Image recognition is the task of identifying images and categorizing them into one of
several predefined distinct classes. So, image recognition software and apps can
define what’s depicted in a picture and distinguish one object from another. The field
of study aimed at giving machines this ability is called computer vision. Being
one of the computer vision (CV) tasks, image classification serves as the foundation
for solving different CV problems, including:
4.2.1 Image classification with localization
Placing an image in a given class and drawing a bounding box around an object to
show where it’s located in the image.
4.2.2 Object detection
Categorizing multiple different objects in the image and showing the location of each
of them with bounding boxes. So, it’s a variation of the image classification with
localization task for numerous objects.
4.2.4 Instance segmentation
Differentiating multiple objects (instances) belonging to the same class, for example
each person in a group.
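The difference between these tasks is easiest to see in the shape of their outputs. The sketch below uses hypothetical names (`Detection`, `box_from_mask`) and a made-up 3 x 4 image: instance segmentation yields one binary mask per object, detection yields one labelled bounding box per object, and classification with localization yields a single label and box.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive

@dataclass
class Detection:
    label: str
    box: Box

def box_from_mask(mask: List[List[int]]) -> Box:
    """Smallest box enclosing all non-zero pixels of an instance mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return (rows[0], cols[0], rows[-1], cols[-1])

# Instance segmentation: one binary mask per object of the same class
person_masks = [
    [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 1]],
]

# Object detection: one labelled bounding box per object
detections = [Detection("person", box_from_mask(m)) for m in person_masks]

# Classification with localization: a single label and a single box
single = detections[0]
```

Here each box is derived from its mask only to show that a mask carries strictly more information than a box; a real detector predicts boxes directly.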
4.3 Pattern Recognition
The methodology used for pattern recognition consists of several steps: sensing,
segmentation and post processing. These steps are explained below.
4.3.1 Sensing
The input to a pattern recognition system is often some kind of transducer, such
as a camera or a microphone array. The difficulty of the problem may well depend on
the characteristics and limitations of the transducer: its bandwidth, resolution,
sensitivity, distortion, signal-to-noise ratio, latency, etc.
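Of the transducer characteristics listed, the signal-to-noise ratio is the simplest to quantify. A minimal sketch, assuming the clean signal and the noise are available as separate sample lists (in practice the noise usually has to be estimated):

```python
import math

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, from mean signal and noise power."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(signal_power / noise_power)

clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.1, -0.1, 0.1, -0.1]
ratio = snr_db(clean, noise)   # power ratio of 100, i.e. about 20 dB
```

A lower ratio means the sensed pattern is buried deeper in noise, which makes every later stage of the recognition pipeline harder.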
4.3.2 Segmentation
4.3.3 Post Processing
In post processing, the obtained pattern is processed with the ANN. The result is
generally used to recommend actions (put this fish in this bucket, put that fish in
that bucket), which amounts to comparing the input with previously saved fish
patterns, each action having an associated cost. The post-processor uses the output
of the classifier to decide on the recommended action. Through these steps the
patterns are classified and recognized.
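One way to combine the classifier output with per-action costs is to pick the action with the lowest expected cost. The sketch below assumes the classifier outputs class probabilities; the class names, bucket names and cost values are hypothetical.

```python
def recommend_action(class_probs, costs):
    """Pick the action with the lowest expected cost.

    class_probs: {class_name: probability} as produced by the classifier
    costs: {action: {class_name: cost of taking the action when the
            input really belongs to that class}}
    """
    def expected_cost(action):
        return sum(class_probs[c] * costs[action][c] for c in class_probs)
    return min(costs, key=expected_cost)

costs = {
    "bucket_a": {"fish_a": 0.0, "fish_b": 5.0},  # a fish_b in bucket_a is costly
    "bucket_b": {"fish_a": 1.0, "fish_b": 0.0},
}
action = recommend_action({"fish_a": 0.7, "fish_b": 0.3}, costs)
```

With these costs the recommended action is `bucket_b` even though `fish_a` is the more likely class, because the expected cost of `bucket_a` (0.3 x 5.0 = 1.5) exceeds that of `bucket_b` (0.7 x 1.0 = 0.7). The post-processor therefore does more than repeat the classifier's most probable label.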
The seminar report was helpful as it provided the necessary information about
existing image and pattern recognition techniques and about improvements
in upcoming technologies in various fields using artificial neural networks. Using
them, many conventional problems can be reduced. When these techniques
are applied effectively, they enable many applications such as detection,
identification, indexing, language translation, reprogramming etc.
The implementation of these techniques in our daily life affects and develops the
field of artificial intelligence. One example is robotics, where
humanoid robots can be developed and used effectively in daily activities.
These can bring many dramatic and useful changes to human lives and can
lead to a more efficient way of life.
We also identify some key features that an object detection algorithm
must have:
Light weight
For better and faster results, the input and requirements must be light weight
and the architecture should be light weight. Image searching should be done by
matching similar fragments and should not require many layers.
Fast speed
Search and detection engines must work fast so as to reduce the time involved
in fragmentation, comparison and delivery of results. Accuracy with speed
is the best combination, which can be approached by altering the matrix and
changing the number of epochs.
Whether the images are of good quality or deformed, the detection process
should be equally good and must follow similar principles. Max pooling layers
in a CNN are used for this reason: each keeps only the largest value in a small
neighbourhood of pixels, so small shifts and distortions change the result less.
Less Equipment
It should not involve many tools and should be cost efficient. The network has
some hidden layers, which also carry weights. The score of each class is
calculated by multiplying each input by its corresponding weight and then adding
the products to give a single result.
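Two of the operations mentioned in this list, max pooling and the weighted sum that produces a class score, can be sketched in a few lines of plain Python. The 4 x 4 feature map and the weights are made-up values for illustration.

```python
def max_pool_2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling (height and width must be even)."""
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, len(feature_map[0]), 2)]
        for r in range(0, len(feature_map), 2)
    ]

def weighted_sum(inputs, weights):
    """Multiply each input by its weight and add the products."""
    return sum(x * w for x, w in zip(inputs, weights))

feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]]
pooled = max_pool_2x2(feature_map)          # 4 x 4 reduced to 2 x 2
flat = [v for row in pooled for v in row]   # flatten for the next layer
score = weighted_sum(flat, [0.5, -1.0, 0.25, 1.0])
```

Pooling quarters the number of values passed on to the next layer, which also serves the light weight and speed requirements above.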
Chapter 6
6.1 Conclusion
While investigating the works chronologically, we noticed that although each
individual work has its merits and demerits, applying an ANN in each
pattern recognition case always gave better results than working without
one. The accuracy of forecasting on the basis of the present data set
(experience) was always better.
There are many present applications, and more will come in the future, such as
Interactive Voice Response (IVR) with pattern recognition based on neural
networks. The addition of voice pattern recognition to the authentication process can
potentially further enhance the security level. The developed system is fully
compliant with the landline phone system. In image recognition, other applications
such as advanced facial recognition will arrive as new technologies
in the future. These are some future enhancements in these fields.
APPENDIX A
When we observe things with the naked eye, several neurons work together to cover
all the objects in the field of view. The receptive field is the area in which a
stimulus activates a given neuron. Similarly, each neuron in a CNN responds only
to its own region of the input.
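The idea that each CNN neuron sees only its own region can be made concrete with a small sliding-window filter. In this sketch every output value depends only on a 2 x 2 patch of the input, which is that output's receptive field. As is usual in CNNs, the kernel is applied without flipping; the image and kernel values are made up.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding. Output value (r, c)
    depends only on the k x k patch of the image starting at (r, c)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [sum(kernel[i][j] * image[r + i][c + j]
             for i in range(k) for j in range(k))
         for c in range(out_w)]
        for r in range(out_h)
    ]

image = [[0, 0, 0, 0],
         [0, 9, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
summing = [[1, 1],
           [1, 1]]
feature_map = conv2d_valid(image, summing)
```

Only the four outputs whose 2 x 2 patch covers the bright pixel at row 1, column 1 become non-zero; the other outputs never see that pixel at all.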
NEURON
Also called a brain cell, a neuron is stimulated as soon as the eyes observe anything
or there is a stimulus in the human body, and it passes messages from the brain to
each part of the body and vice versa. Neurons are connected to each other head to
tail or head to head.
PERCEPTRON
Significance
With its discovery it became known that objects can be detected and
identified, but later research showed that a single-layer perceptron cannot
learn patterns that are not linearly separable, such as XOR.
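The classic perceptron learning rule, and its well-known limitation, can be sketched as follows. A single-layer perceptron learns linearly separable functions such as logical AND, but no choice of weights lets it represent XOR. The learning rate and epoch limit are arbitrary illustrative values.

```python
def predict(weights, bias, x):
    """Fire (1) when the weighted sum plus bias is positive, else 0."""
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, epochs=50, lr=1):
    """Perceptron learning rule: nudge the weights after every mistake."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error:
                mistakes += 1
                weights = [w + lr * error * v for w, v in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:   # every sample classified correctly
            break
    return weights, bias

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(samples):
    """Train on the samples, then report the fraction classified correctly."""
    weights, bias = train_perceptron(samples)
    hits = sum(predict(weights, bias, x) == t for x, t in samples)
    return hits / len(samples)

and_accuracy = accuracy(AND)   # AND is linearly separable
xor_accuracy = accuracy(XOR)   # XOR is not
```

Training on AND converges to a perfect classifier, while on XOR at least one of the four samples is always misclassified, however long training runs. Multi-layer networks trained with back propagation overcome this limitation.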
REFERENCE
[1] Snapp R., CS 295: Pattern Recognition, Course Notes, Department of Computer
Science, University of Vermont, (http://www.cs.uvm.edu/
snapp/teaching/CS295PR/whatispr.html)
[2] Duda, R.O., Hart, P.E., and Stork D.G., (2001). Pattern Classification. (2nd ed.).
New York: Wiley-Interscience Publication.