
Chapter 34

Deep Learning for Robot Vision

Mamilla Keerthikeshwar and S. Anto

Abstract Deep learning is a class of machine learning used for high-level tasks such as image recognition. It has been applied to pattern recognition across a vast range of areas, largely replacing hand-crafted feature extraction with representations learned from data, and at present it has gained great significance in robot vision. In this paper, we show how neural networks play a vital role in robot vision. Image segmentation, the initial step, is used to preprocess images and videos. Multilayered artificial neural networks have many further applications, from drug detection to military use. The main objective of this paper is to review how deep learning algorithms and deep networks can be used in various areas of robot vision. Some predefined deep learning algorithms available in the market are used here to perform this comparative study, which should give readers a clear insight when building vision systems with deep learning.

34.1 Introduction

Deep learning is a booming research topic that has gained considerable attention in the last couple of years. It is central to both machine learning and robotics, and many conferences and workshops are devoted to it [1–3]. The convolutional neural network has many applications in robot vision, and many more algorithms have been developed for it.
The main intention of this paper is to act as a guide for new developers who are keenly interested in robot vision. Convolutional neural networks play an important role here and are the networks most commonly employed, in some cases for pedestrian detection or for finding fast-moving objects using semantic segmentation; we also show how different datasets are used.

M. Keerthikeshwar (B) · S. Anto


School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil
Nadu, India
e-mail: [email protected]
S. Anto
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
A. N. R. Reddy et al. (eds.), Intelligent Manufacturing and Energy Sustainability,
Smart Innovation, Systems and Technologies 213,
https://doi.org/10.1007/978-981-33-4443-3_34


34.2 Deep Learning

Deep learning is a subset of machine learning that uses artificial neural networks to preprocess data and to build the patterns used for decision making. It is a sequential process that proceeds step by step, and because it can learn from unstructured data, it is used, for example, to detect fraud.
Deep learning has many applications. It is used across research areas: in military base camps to detect human movements and estimate pose, and in image processing, where images are segmented pixel by pixel, which is applied, for instance, to predict genes.

34.3 Convolutional Neural Networks

The convolutional neural network is a class of neural network used for analyzing visual imagery. It is organized in layers; unlike a fully connected network, in which every neuron in one layer connects to all the neurons in the next layer, each neuron in a convolutional layer connects only to a local region of the previous layer. Image representation is done by convolutional neural networks. Deep networks with a convolutional temporal architecture and stacked LSTM cells can be used for classifying video, with different temporal pooling strategies: late pooling applies temporal pooling to the final layers' features, while slow pooling pools over smaller temporal windows in stages [4]. Human pose estimation is another application of convolutional neural networks. Because estimated poses can deviate from the true pose, novel structure-aware convolutional networks are used for training deep networks; the approach has three parts: (i) a novel network design, (ii) a multi-task network, and (iii) evaluation of the estimated human pose [5]. Another convolutional neural network is the residual attention network, which incorporates a state-of-the-art bottom-up top-down structure and has two branches: (i) the mask branch, with four convolutional layers, employed in image segmentation for robots, and (ii) the trunk branch, employed for video classification, which classifies videos layer by layer using an end-to-end approach [6]. The fully convolutional network of [2] is used for region-based object detection; it uses position-sensitive score maps and RoI pooling, with region proposals produced by a region proposal network over deep features [2]. A Bayesian convolutional neural network can recover the six-DOF camera pose from a single RGB image [7].
Using convolutional neural networks, one can detect pedestrians, with LIDAR used to emit light and sense objects; a very popular framework here is Caffe [8]. Pedestrian detection can also make use of a support vector machine (SVM), with a Kalman filter used for tracking pedestrians [9, 10].

Convolutional neural networks have many more applications. For example, they are used to build 3D scene layouts from a single image; this approach uses machine learning with a CRF for refinement and draws on a database source that provides no 3D display, and the network used is a fully convolutional network [11]. Different scale-specific detectors can be combined to produce a strong object detector [12]. Accurate single-stage detection is achieved by recurrent rolling convolution, a convolutional neural network used for object detection [1].
The shared convolutional neural network is another network used for object discovery, reaching a speed of about 100 frames per second on a GPU [13]. ADE20K is a dataset used for perspective-adaptive convolutions [32]. For robots with limited computational resources, an efficient gradient-based algorithm for online training of network trajectories is analyzed, and detectors based on XNOR-Net and SqueezeNet are used [14]. SegNet segments an image pixel-wise by encoding and decoding it; it adopts the 13 convolutional layers of the VGG16 network and is specially designed for robotic vision. It takes an input picture and separates the image into several colors: regions of the same type receive the same color, while regions of different types receive other colors [15]. The 3D scene work uses the NYU Depth V2 and SUN RGB-D datasets [11]. Segmentation using fully convolutional networks uses PASCAL VOC, NYUDv2, and SIFT Flow as its datasets [10, 16, 17]. CNNs can also be used for playing games such as soccer, using backpropagation through time (BPTT) and real-time recurrent learning (RTRL) [18].
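
To make pixel-wise segmentation concrete, the following minimal PyTorch sketch runs a pretrained fully convolutional network on a single image and assigns each pixel a class label, in the spirit of [10, 15]. The model choice (fcn_resnet50), the file name scene.jpg, and the normalization constants are our illustrative assumptions, not details taken from the cited papers.

```python
# Minimal sketch: pixel-wise segmentation with a pretrained fully
# convolutional network (assumes a recent torchvision; the model and
# preprocessing values are illustrative, not from the cited papers).
import torch
from torchvision import models, transforms
from PIL import Image

model = models.segmentation.fcn_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    # Standard ImageNet normalization used by torchvision models.
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("scene.jpg").convert("RGB")   # hypothetical input file
batch = preprocess(img).unsqueeze(0)           # shape: (1, 3, H, W)

with torch.no_grad():
    out = model(batch)["out"]                  # (1, num_classes, H, W)
labels = out.argmax(dim=1)                     # per-pixel class indices
print(labels.shape, labels.unique())
```

Coloring each pixel by its index in `labels` reproduces the "same type, same color" visualization that SegNet produces.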

34.3.1 Fast RCNN

We mainly focus on fast R-CNN for object detection. Here, convolutional feature maps are generated from the input images; regions are then extracted from these maps and reshaped to a fixed size.
Algorithm:
Step 1: The image is taken as input and preprocessed.
Step 2: The ConvNet processes the image and generates the regions we want to extract.
Step 3: An RoI pooling layer resizes the regions extracted from the ConvNet's output to a fixed size; the result is then passed to a fully connected network.
Step 4: The output of the fully connected layer is passed on to a softmax layer and a linear regression layer.
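
A minimal sketch of the RoI pooling operation in Step 3, using torchvision's roi_pool operator; the feature-map shape and box coordinates are made-up illustrative values.

```python
# Minimal sketch of RoI pooling (Step 3 above) with torchvision.
# The feature map and box coordinates are made-up illustrative values.
import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 256, 50, 50)   # ConvNet output: (N, C, H, W)

# Region proposals as (batch_index, x1, y1, x2, y2) in feature-map coords.
boxes = torch.tensor([[0,  4.0,  4.0, 24.0, 24.0],
                      [0, 10.0, 12.0, 40.0, 30.0]])

# Every region, whatever its size, is pooled to a fixed 7x7 grid,
# so the fully connected layers always see the same input shape.
pooled = roi_pool(features, boxes, output_size=(7, 7))
print(pooled.shape)   # torch.Size([2, 256, 7, 7])
```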

34.4 Generative Adversarial Networks

Generative adversarial networks are used for semi-supervised learning and help robots interact with objects. A GAN is a minimax game between a generator (G) and a discriminator (D), where G maps noise z drawn from a noise distribution p(z) into samples; the function of the discriminator is to differentiate between real data and generator samples. Robots such as self-driving cars or warehouse robots need depth perception, which also supports 3D object recognition and path planning. The robot must therefore know the depth of the ground, and obtaining it normally requires highly expensive hardware; to overcome this problem, Y-GAN uses data from cameras on all sides to estimate depth [19, 20]. GANs have many applications, image resolution and classification among them. To measure a GAN's classification accuracy, datasets of approximately 7000 series were taken, with 1000 series per material; the result over the 7000 samples was obtained by sixfold cross-validation, with each fold holding 1000 samples [21].
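
The minimax game above can be sketched in a few lines of PyTorch. This toy example, with arbitrary layer sizes and synthetic "real" data, is only an illustration of the G-versus-D training loop, not the Y-GAN of [19].

```python
# Minimal sketch of the GAN minimax game on toy 2-D data.
# Layer sizes, learning rates, and the "real" distribution are arbitrary.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(1000):
    real = torch.randn(64, 2) + 3.0   # stand-in for real data
    z = torch.randn(64, 8)            # noise z drawn from p(z)
    fake = G(z)

    # D is trained to label real data 1 and generator samples 0.
    loss_d = bce(D(real), torch.ones(64, 1)) + \
             bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # G is trained to make D label its samples as real.
    loss_g = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```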

34.5 Restricted Boltzmann Machine

The restricted Boltzmann machine is an unsupervised model that produces never-before-seen data from the original data. It takes the form of layers, with one visible layer and one hidden layer. The restricted Boltzmann machine differs from the ordinary Boltzmann machine: in the restricted Boltzmann machine, the visible nodes and the hidden nodes are not linked among themselves, whereas in the Boltzmann machine all nodes are linked to each other. A deep belief network stacks multiple RBMs together, and the stack can be fine-tuned by backpropagation. In the restricted Boltzmann machine, all neurons behave individually [22].
Restricted Boltzmann machines have many applications, and automatic hand sign language recognition is one of them. RGB and depth inputs are taken and sent to an RBM; the output of that RBM is passed to another RBM, and the model is trained on datasets [23].
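
As a concrete illustration of how an RBM learns, the sketch below performs one step of contrastive divergence (CD-1) with binary units; the layer sizes, learning rate, and random data are arbitrary assumptions.

```python
# Minimal sketch of one contrastive-divergence (CD-1) step for an RBM
# with binary units; sizes, learning rate, and data are arbitrary.
import torch

n_visible, n_hidden, lr = 784, 128, 0.01
W = torch.randn(n_visible, n_hidden) * 0.01
b_v = torch.zeros(n_visible)   # visible-layer bias
b_h = torch.zeros(n_hidden)    # hidden-layer bias

v0 = (torch.rand(32, n_visible) > 0.5).float()   # stand-in data batch

# Upward pass: hidden units are conditionally independent given v
# (no hidden-hidden links -- the "restricted" part).
p_h0 = torch.sigmoid(v0 @ W + b_h)
h0 = torch.bernoulli(p_h0)

# Downward pass: reconstruct the visible layer, then re-infer hidden.
p_v1 = torch.sigmoid(h0 @ W.t() + b_v)
p_h1 = torch.sigmoid(p_v1 @ W + b_h)

# CD-1 update: data-driven minus model-driven statistics.
W += lr * (v0.t() @ p_h0 - p_v1.t() @ p_h1) / v0.shape[0]
b_v += lr * (v0 - p_v1).mean(dim=0)
b_h += lr * (p_h0 - p_h1).mean(dim=0)
```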

34.6 Recurrent Neural Networks

Recurrent neural networks can be used to integrate a robot's drawing behavior with its vision. Kazuma Sasaki et al. conducted two experiments: the first shows that the model can learn 15 drawing shapes through a bottom-up process; in the second, four image types are trained with four deformed variations each, and the images are classified by type using drawings and image classification [24]. A recurrent neural network can also be part of developing a 3D scene layout: a robotic camera is installed and captures images, which are filtered into RGB, depth, and foreground and then converted into a 3D tensor; to obtain the 3D scene layout, this tensor is passed through a recurrent neural network [25]. Recurrent neural networks can also be used as planners for bio-inspired robotic motion, applying long short-term memory (LSTM) networks to sequential data and making use of simulated fish trajectories. With this approach, robots can be operated without even knowing their position; the work draws on animal behavior, which is then used to operate the robots [26]. Recurrent neural networks also work effectively for path planning and obstacle avoidance, where the problem is generally represented in the form of neurons [27].
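
As an illustration of RNN-based planning, the following sketch uses an LSTM to predict the next 2D waypoint of a trajectory from its history, loosely in the spirit of the planner in [26]; the WaypointLSTM class, its sizes, and the random trajectories are our own illustrative assumptions.

```python
# Minimal sketch: an LSTM that predicts the next 2-D waypoint from a
# trajectory history. Architecture and data are illustrative only.
import torch
import torch.nn as nn

class WaypointLSTM(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden,
                            batch_first=True)
        self.head = nn.Linear(hidden, 2)   # (x, y) of the next waypoint

    def forward(self, traj):               # traj: (batch, time, 2)
        out, _ = self.lstm(traj)
        return self.head(out[:, -1])       # predict from the last step

model = WaypointLSTM()
traj = torch.randn(8, 20, 2)   # 8 simulated trajectories, 20 steps each
next_point = model(traj)       # shape: (8, 2)
print(next_point.shape)
```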

34.7 CNN Architectures

In the last couple of years, we have witnessed an enormous increase in the number of CNN architectures. These architectures are used by feeding them input datasets. In this paper, we have taken the most used CNN architectures and datasets and describe how they are useful in robot vision.

34.7.1 AlexNet

AlexNet is one of the convolutional neural networks. It has eight layers, of which five are convolutional layers and the remaining three are fully connected layers. Some networks use the tanh function, but AlexNet uses rectified linear units (ReLU). It also supports multi-GPU training, which reduces training time and makes it run faster, and it includes measures against overfitting [4]. To predict a real-time 3D scene, the fully convolutional neural network of [11] uses AlexNet instead of VGG. Image segmentation by a convolutional network also uses AlexNet [10]. Its architecture is similar to that of LeNet. AlexNet uses rectified linear units instead of tanh functions because this accelerates training about six times at the same accuracy, and it uses dropout with probability 0.5 to overcome overfitting, at the cost of roughly doubling training time.
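
The layer structure described above can be verified directly against torchvision's AlexNet implementation; this short sketch assumes a recent torchvision and simply counts the layer types.

```python
# Minimal sketch: load torchvision's AlexNet and confirm the properties
# mentioned above (ReLU activations, dropout, 5 conv + 3 fc layers).
import torch.nn as nn
from torchvision import models

alexnet = models.alexnet(weights=None)   # weights="DEFAULT" for pretrained

convs = [m for m in alexnet.modules() if isinstance(m, nn.Conv2d)]
fcs   = [m for m in alexnet.modules() if isinstance(m, nn.Linear)]
relus = [m for m in alexnet.modules() if isinstance(m, nn.ReLU)]
drops = [m for m in alexnet.modules() if isinstance(m, nn.Dropout)]
print(len(convs), len(fcs), len(relus), len(drops))   # 5 conv, 3 fc
```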

34.7.2 GoogLeNet

GoogLeNet is another convolutional neural network that is available pretrained. Unlike AlexNet's eight layers, GoogLeNet has 22 layers. Its images can be trained using ImageNet or the Places365 dataset, and it allows a single video to be processed by multiple image-processing streams [4]. Semantic segmentation uses GoogLeNet as one of its base networks [10]. The performance of GoogLeNet approaches human-level performance, and it takes trained humans to beat GoogLeNet. The architecture combines groups of small convolutions in order to reduce the number of parameters: AlexNet has over 60 million parameters, whereas GoogLeNet has roughly 4 million, which is quite small. GoogLeNet is also more accurate.
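
The parameter-count comparison can be reproduced with torchvision's implementations, which may differ slightly from the original papers' counts; this sketch simply sums the parameters of each model.

```python
# Minimal sketch: compare parameter counts of the two architectures
# using torchvision's implementations (counts may differ slightly from
# the original papers' figures).
from torchvision import models

def n_params(model):
    return sum(p.numel() for p in model.parameters())

print("AlexNet:  ", n_params(models.alexnet(weights=None)))  # ~61 million
# Drop the auxiliary classifiers, which are used only during training.
print("GoogLeNet:", n_params(models.googlenet(weights=None,
                                              aux_logits=False)))
```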

34.7.3 RGB-D

RGB-D combines a depth image with an RGB image; in the depth image, each pixel records the distance from the image plane to the corresponding point in the scene. It is used for object detection. The architecture for RGB-D has three steps: (i) process the input image, (ii) train the network, and (iii) classify the depth images [28, 29]. The RGB-D dataset is the largest dataset of household objects. Each object is recorded in a video sequence while being rotated on a circular plane, so that it is captured from all sides; the video is recorded using a Kinect-style 3D camera. The dataset also contains indoor and outdoor environments such as a garden, kitchen, and living room, and it captures these scenes from long distances even when objects appear only partially or fully within the frame.
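
One simple way to feed RGB-D data to a CNN is early fusion: stack the depth map as a fourth input channel and widen the network's first convolution. The sketch below shows this idea with a ResNet-18 stem; it is an illustration of the general approach, not the exact pipeline of [28, 29].

```python
# Minimal sketch of early RGB-D fusion: depth becomes a fourth input
# channel, and the CNN's first convolution is widened to accept it.
import torch
import torch.nn as nn
from torchvision import models

rgb = torch.rand(1, 3, 224, 224)        # color image
depth = torch.rand(1, 1, 224, 224)      # per-pixel distance to the plane
rgbd = torch.cat([rgb, depth], dim=1)   # (1, 4, 224, 224)

net = models.resnet18(weights=None)
# Replace the stem so it takes 4 channels instead of 3.
net.conv1 = nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3,
                      bias=False)

out = net(rgbd)   # logits over 1000 classes by default
print(out.shape)
```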

34.7.4 KITTI

KITTI is a dataset used for the detection of moving objects; as pedestrians move from one place to another, the KITTI dataset is used to detect them [8, 9]. KITTI also includes Velodyne LIDAR scans. Fast object detection has been performed using the KITTI and Caltech datasets [15]. The single-stage detector RRC, using novel recurrent rolling convolution, has achieved a benchmark result on KITTI [1]. Scale-dependent pooling and cascaded rejection classifiers make use of the PASCAL object detection challenge and KITTI [16]. The recording platform has two color cameras and two gray-scale cameras installed, which are used to detect objects [16].
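
For readers who want to browse KITTI's object labels, recent torchvision versions ship a built-in Kitti dataset class; the sketch below assumes such a version and uses a placeholder root directory.

```python
# Minimal sketch: browse KITTI object-detection labels via torchvision
# (assumes a recent torchvision with a built-in Kitti class; the root
# path is a placeholder, and download fetches several gigabytes).
from torchvision.datasets import Kitti

kitti = Kitti(root="./data", train=True, download=True)
image, targets = kitti[0]
# Each target describes one annotated object, e.g. a pedestrian or car.
for t in targets:
    print(t["type"], t["bbox"])
```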

34.7.5 ImageNet

ImageNet is a database that contains hundreds of images per node. The performance of the residual attention network is evaluated on ImageNet [6]. ImageNet was created for educators and researchers who need large numbers of images for training; to make this easy, a large database of images was created and termed ImageNet. It does not own the copyright for any of the images; instead, it only holds the URLs that point to the images.

34.7.6 CIFAR-100

CIFAR-100 is the same as CIFAR-10 but has 100 classes with 600 images per class (500 training and 100 test), and these 100 classes are grouped into 20 superclasses. Like CIFAR-10, it is distributed as training and test batches, which may contain images from any class. It is used in the evaluation of the residual attention network [6].
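
The class structure described above can be checked directly with torchvision's CIFAR-100 loader:

```python
# Minimal sketch: load CIFAR-100 with torchvision and verify the class
# structure described above.
from torchvision.datasets import CIFAR100

train = CIFAR100(root="./data", train=True, download=True)
test = CIFAR100(root="./data", train=False, download=True)

print(len(train.classes))      # 100 fine-grained classes
print(len(train), len(test))   # 50000 training and 10000 test images
img, label = train[0]
print(train.classes[label])
```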

34.8 Discussion

As mentioned earlier, deep neural networks are well suited to robotics applications that must deal with limited camera resolution and human pose. For biomedical images to be segmented, elastic deformation is provided with a fully convolutional implementation [30]. As noted, convolutional neural networks are used to estimate human pose, and PoseNet's discriminator is used to distinguish original poses from fake ones [5]. Image classification is done by the residual attention network, which can capture attention and can be extended to any convolutional network [6]. ResNet is used for image classification in the region-based fully convolutional network [2]. A trajectory-centric RL algorithm is able to learn different types of skills, and these are used with deep spatial autoencoders [31]. For pedestrian detection, LIDAR and fusion methods are employed, and fusion performs well [8]. A convolutional neural network is also employed for the real-time 3D scene and uses a CRF to refine boundaries and remove extra groups [11]. For scene parsing, perspective-adaptive convolutions are run on parallel GPUs, and this improves accuracy [32]. SegNet is another architecture that is smaller, faster, and more efficient [15]. Noise-aware training is accurate and improves recognition accuracy [28]. Shared convolutional neural networks are employed in object detection and perform better than a single model [13]. A gradient-based algorithm for online training of recurrent network trajectories [18] and detectors built on XNOR-Net and SqueezeNet [14] suit platforms with limited resources. A GTX TITAN X GPU was used for testing grasp detection [13]. Image descriptions in the wild (IDW) are used to improve segmentation accuracy using weak supervision [33].

34.9 Conclusion

This paper has addressed the use of convolutional networks and image segmentation in the area of robot vision. It mainly focused on object detection and pedestrian detection and showed how different networks have been used in robot vision development [9]. This paper will provide a valuable guide for developers and researchers working in robot vision, since it gives them a basic idea of all the algorithms used and the different datasets that have been used. Training with more sets in RGB-D object detection does not show any better results [28].
It is expected that robot vision using deep learning will grow in the next few years, thanks to better understanding and adaptation of DNNs for robot vision. We all know that robots should interact with the environment and with human beings; they should adapt to their surroundings, and to do this they must be trained properly. Geometry-based and deep-learning-based methods will both be part of state-of-the-art vision systems, which will lead to an increase in robotic autonomy and in the training of DNNs. Different CNN architectures are used for robot vision, and there are many other architectures of CNNs and other networks. GoogLeNet and AlexNet are the most used CNN architectures. When a single-model architecture on original images is considered, ResNet has the top accuracy, followed by the VGG architecture; AlexNet and GoogLeNet have the least accuracy. When the error rate is considered, GoogLeNet has a higher error rate than the other architectures. When preprocessed images are taken into consideration, AlexNet tops the list.

References

1. J. Ren, X. Chen, J. Liu, W. Sun, J. Pang, Q. Yan, Y.-W. Tai, L. Xu, Accurate single stage
detector using recurrent rolling convolution, in CVPR (2017)
2. J. Dai, Y. Li, K. He, J. Sun, R-FCN: object detection via region-based fully convolutional
networks. arXiv:1605.06409
3. G. Wang, P. Luo, L. Lin, X. Wang, Learning object interactions and descriptions for semantic
image segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (2017), pp. 5859–5867
4. Y.H. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, G. Toderici, Beyond
short snippets: deep networks for video classification, in IEEE Conference on Computer Vision
and Pattern Recognition[CVPR] (2015), pp. 4694–4702
5. Y. Chen, C. Shen, X.-S. Wei, L. Liu, J. Yang, Adversarial PoseNet: a structure-aware
convolutional network for human pose estimation. arXiv:1705.00389
6. F. Wang, M. Jiang, C. Qian, et al. Residual attention network for image classification (2017).
arXiv preprint arXiv:1704.06904
7. A. Kendall, R. Cipolla, Modelling uncertainty in deep learning for camera relocalization, in
IEEE International Conference on Robotics and Automation [ICRA] (May 2016)
8. J. Schlosser, C.K. Chow, Z. Kira. Fusing LIDAR and images for pedestrian detection using
convolutional neural networks, in IEEE International Conference on Robotics and Automation
[ICRA] (May 2016)
9. M. Szarvas, A. Yoshizawa, M. Yamamoto, J. Ogata, Pedestrian detection with convolutional
neural networks. IEEE Proc. Intell. Veh. Symp. 2005, 224–229 (2005). https://doi.org/10.1109/IVS.2005.1505106
10. J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in
CVPR 2015 [best paper honorable mention]
11. S. Yang, D. Maturana, S. Scherer, Real-time 3D scene layout from a single image using convolu-
tional neural networks, in IEEE International Conference on Robotics and Automation [ICRA]
(2016)
12. Z. Cai, Q. Fan, R.S. Feris, N. Vasconcelos, A unified multi-scale deep convolutional neural
network for fast object detection, in European Conference on Computer Vision (Springer
International Publishing, 2016), pp. 354–370
13. D. Guo, T. Kong, F. Sun, H. Liu, Object discovery and grasp detection with a shared convolu-
tional neural network, in IEEE International Conference on Robotics and Automation [ICRA]
(2016)

14. N. Cruz, K. Lobos-Tsunekawa, J. Ruiz-del-Solar, Using convolutional neural networks in robots


with limited computational resources: detecting NAO robots while playing soccer (2017).
arXiv:1706.06702
15. V. Badrinarayanan, A. Kendall, R. Cipolla, SegNet: a deep convolutional encoder-decoder
architecture for image segmentation (2015). arXiv:1511.00561
16. F. Yang, W. Choi, Y. Lin, Exploit all the layers: fast and accurate CNN object detector with scale
dependent pooling and cascaded rejection classifiers, in Proceedings of the IEEE International
Conference on Computer Vision and Pattern Recognition (2016)
17. J. Fu, J. Liu, Y. Wang, H. Lu, Stacked deconvolutional network for semantic segmentation.
arXiv preprint arXiv:1708.04943 [2017]
18. R.J. Williams, J. Peng, An efficient gradient-based algorithm for on-line training of recurrent
network trajectories. Neural Comput. 2(4), 490–501 (1990)
19. M. Alonso Jr, Y-GAN: a generative adversarial network for depthmap estimation from multi-
camera stereo images, 3 Jun 2019. arXiv preprint arXiv:1906.00932
20. A. Pronobis, R.P. Rao, Learning deep generative spatial models for mobile robots, in 2017
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 24 Sep 2017.
IEEE, pp. 755–762
21. Z. Erickson, S. Chernova, C.C. Kemp, Semi-supervised haptic material recognition for robots
using generative adversarial networks (2017). arXiv:1707.02796v2
22. A. Al, M. Zain Amin, A brief explanation of the restricted Boltzmann machine with practical implementation in PyTorch
23. R. Rastgoo, K. Kiani, S. Escalera, Multi-modal deep hand sign language recognition in still images using restricted Boltzmann machine. Entropy (23 Oct 2018)
24. K. Sasaki, K. Noda, T. Ogata, Visual motor integration of robot’s drawing behavior using
recurrent neural network. Rob. Auton. Syst. 86, 184–195 (2016)
25. R. Cheng, Z. Wang, K. Fragkiadaki, Geometry-aware recurrent neural networks for active
visual recognition, in 32nd Conference on Neural Information Processing Systems (NeurIPS
2018), Montréal, Canada
26. A. Khan, F. Zhang, Using recurrent neural networks (RNNs) as planners for bio-inspired robotic
motion, in 2017 IEEE Conference on Control Technology and Applications (CCTA), 27 Aug
2017. IEEE, pp. 1025–1030
27. N. Bin, C. Xiong, Z. Liming, X. Wendong, Recurrent neural network for robot path plan-
ning, in International Conference on Parallel and Distributed Computing: Applications and
Technologies, 8 Dec 2004 (Springer, Berlin, Heidelberg, 2004), pp. 188–191
28. A. Eitel, J.T. Springenberg, L. Spinello, M. Riedmiller, W. Burgard, Multimodal deep learning
for robust RGB-D object recognition, in IEEE/RSJ International Conference on Intelligent
Robots and Systems [IROS], Hamburg, Germany (2015)
29. M. Schwarz, H. Schulz, S. Behnke, RGB-D object recognition and pose estimation based on
pre-trained convolutional neural network features, in ICRA (2015)
30. O. Ronneberger, P. Fischer, T. Brox. U-net: convolutional networks for biomedical image
segmentation (2015). arXiv preprint arXiv:1505.04597
31. C. Finn, X.Y. Tan, Y. Duan, T. Darrell, S. Levine, P. Abbeel, Deep spatial autoencoders for
visuomotor learning, in IEEE International Conference on Robotics and Automation [ICRA]
(2016)
32. R. Zhang, S. Tang, Y. Zhang, J. Li, S. Yan, Perspective-adaptive convolutions for scene parsing,
in IEEE Transaction on Pattern Analysis and Machine Intelligence [Early Access]
33. P. Luo, G. Wang, L. Lin, X. Wang, Deep dual learning for semantic image segmentation, in
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017),
pp. 2718–2726
