CV Sce

Gkv

Uploaded by

sudarshan chaugule


VIIT Department of Electronics and Telecommunications

Presentation on

Monocular Deep Depth Estimation

Amit Jambhale - B.Tech - 22110333 - 413018
Sudarshan Chaugule - B.Tech - 22111173 - 413044
Vedant Tupe - B.Tech - 22111226 - 413049
Vedant Kulkarni - B.Tech - 22220153 - 413063

Subject: Computer Vision

Guided by
Dr. Yogesh H. Dandawate
Department of Electronics and Telecommunications
Vishwakarma Institute of Information Technology, Pune

Outline

• Introduction
• Motivation
• Literature Survey
• Objectives
• Methodology
• Dataset
• Results
• Conclusions
• References

Introduction

• Monocular depth estimation is a computer vision task that predicts the depth
of each pixel in a scene from a single 2D image, enabling reconstruction of the
scene's 3D structure.
• Convolutional Neural Networks (CNNs) play a central role because they learn
spatial features directly from images, without manual feature design.
• CNN-based methods remove the reliance on traditional pipelines that require
stereo image pairs or hand-crafted features, which makes monocular depth
estimation more accessible and versatile.
• Most CNN-based depth estimation models use an encoder-decoder architecture.
• The encoder extracts hierarchical features from the input image through
convolution and pooling layers, which reduce spatial resolution. The decoder
then reconstructs a depth map at the input resolution using upsampling or
deconvolution (transposed convolution) layers.
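
The resolution change through an encoder-decoder can be illustrated with a minimal sketch. It is not the model used in this project: it has no learned convolution weights and only mimics the two resizing operations named above, with 2x2 average pooling standing in for the encoder and nearest-neighbour upsampling standing in for the decoder. The function names and the 8x8 toy image are assumptions made for illustration.

```python
def avg_pool_2x2(img):
    """Halve height and width by averaging each 2x2 block (encoder-style step)."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
            for c in range(0, w - 1, 2)
        ]
        for r in range(0, h - 1, 2)
    ]


def upsample_nearest_2x(fmap):
    """Double height and width by repeating each value (decoder-style step)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def shape(grid):
    """Return (height, width) of a nested-list grid."""
    return (len(grid), len(grid[0]))


# Toy 8x8 single-channel "image" (hypothetical values).
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]

encoded = avg_pool_2x2(avg_pool_2x2(image))                  # 8x8 -> 4x4 -> 2x2
decoded = upsample_nearest_2x(upsample_nearest_2x(encoded))  # 2x2 -> 4x4 -> 8x8

print(shape(image), shape(encoded), shape(decoded))
```

The output grid has the same height and width as the input, which is the property a depth decoder needs. In a real network the pooling and upsampling steps are interleaved with learned convolutions.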

Motivation

• Monocular depth estimation simplifies tasks such as augmented reality and
autonomous driving because it needs only a single RGB camera.
• Conventional passive methods have strict capture requirements: stereo vision
needs two calibrated views, and shape-from-focus needs several images taken at
different focus settings. Monocular deep learning methods avoid these
requirements.
• Active techniques (e.g., LiDAR) are costly and complex, while monocular
methods are more affordable and work with ordinary RGB cameras.
• Traditional stereo methods rely on texture matching between images, which
becomes difficult in textureless regions or when objects are occluded.
Monocular deep learning models estimate depth from contextual cues in the
image, which can make them more robust in these situations.


Aim and Objectives

• Aim: To build and evaluate a monocular deep depth estimation model using the
DIODE dataset.

• Objectives: To preprocess the dataset, train the model with effective
optimization techniques, validate its performance, and visualize predictions.
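
The slides do not say which measures are used to validate performance. As an assumption, the sketch below computes three measures that are common in depth estimation work: absolute relative error, root mean squared error, and the fraction of pixels with max(pred/gt, gt/pred) < 1.25. The function name and the rule that a ground-truth value of 0 marks an invalid pixel are illustrative choices, not details from the project.

```python
import math


def depth_metrics(pred, target):
    """Compare predicted and ground-truth depths given as flat lists of floats."""
    # Assumption: target <= 0 marks a pixel without valid ground truth.
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)
    abs_rel = sum(abs(p - t) / t for p, t in pairs) / n
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in pairs) / n)
    # A non-positive prediction can never satisfy the ratio threshold.
    delta1 = sum(1 for p, t in pairs if p > 0 and max(p / t, t / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


# Toy values in metres (hypothetical); the last pixel has no ground truth.
metrics = depth_metrics(pred=[1.0, 2.0, 4.0, 3.0], target=[1.0, 2.0, 2.0, 0.0])
print(metrics)
```

Lower abs_rel and rmse are better, while delta1 is an accuracy and should be close to 1.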


Literature Survey
Paper: Deep Ordinal Regression Network for Monocular Depth Estimation
Authors: Fu, H., Gong, M., Wang, C., Batmanghelich, K., & Tao, D.
Solution proposed: Fu et al. (2018) frame monocular depth estimation as an
ordinal regression problem. The authors introduce a spacing-increasing
discretization (SID) strategy to discretize depth values, addressing the slow
convergence and poor solutions common in traditional regression methods. Their
method uses a multi-scale network structure that avoids unnecessary spatial
pooling, resulting in higher accuracy and faster convergence.

Paper: Deep Convolutional Neural Fields for Depth Estimation from a Single Image
Authors: Fayao Liu, Chunhua Shen, Guosheng Lin
Solution proposed: The paper combines deep convolutional neural networks (CNNs)
with continuous conditional random fields (CRFs) for depth estimation. Unlike
previous methods, the model does not rely on geometric priors or extra
information. It learns both the unary and pairwise potentials within a unified
deep CNN framework, enabling precise depth estimation in indoor and outdoor
scenes.
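
The SID strategy from Fu et al. (2018) can be sketched as follows. It places the bin edges uniformly in log space, t_i = exp(log(alpha) + log(beta / alpha) * i / K) for i = 0..K, so the bins widen as depth increases. Here alpha and beta are the ends of the depth range and K is the number of bins. The function names, the example range of 1 to 16, and K = 4 are illustrative assumptions, and the sketch covers only the discretization, not the network or its loss.

```python
import bisect
import math


def sid_thresholds(alpha, beta, k):
    """Return the K + 1 SID bin edges t_0..t_K covering [alpha, beta]."""
    if not 0 < alpha < beta or k < 1:
        raise ValueError("need 0 < alpha < beta and k >= 1")
    log_ratio = math.log(beta / alpha)
    return [math.exp(math.log(alpha) + log_ratio * i / k) for i in range(k + 1)]


def sid_label(depth, thresholds):
    """Map a depth value to its ordinal bin index in [0, K - 1]."""
    index = bisect.bisect_right(thresholds, depth) - 1
    # Clamp depths outside [alpha, beta] to the first or last bin.
    return min(max(index, 0), len(thresholds) - 2)


edges = sid_thresholds(alpha=1.0, beta=16.0, k=4)  # illustrative range and bin count
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print(edges)
print(widths)
```

Each bin is wider than the one before it, so nearby depths are resolved more finely than distant ones.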


Methodology


Dataset Used

• The dataset used in our project is DIODE (Dense Indoor and Outdoor Depth),
a widely used benchmark dataset for depth estimation.
• DIODE is designed for depth estimation in both indoor and outdoor
environments.
• It contains RGB images with corresponding dense ground-truth depth maps,
making it a useful resource for training and evaluating depth estimation models.
• We use only the official validation set and split it into our own training
and evaluation subsets.
• We use the validation set rather than the original training set because the
training set is 81 GB, which is difficult to download, whereas the validation
set is only 2.6 GB.
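
Dividing the validation set into training and evaluation subsets can be sketched as below. The slides do not give the split ratio or the file layout, so the 80/20 fraction, the fixed seed, and the `scene_XXXX` sample names are all hypothetical.

```python
import random


def split_samples(sample_ids, train_fraction=0.8, seed=42):
    """Shuffle sample IDs reproducibly and split them into two disjoint lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    ids = sorted(sample_ids)            # fixed starting order for reproducibility
    random.Random(seed).shuffle(ids)    # local generator, no global state touched
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


# Hypothetical sample identifiers standing in for image/depth-map pairs.
sample_ids = [f"scene_{i:04d}" for i in range(10)]
train_ids, eval_ids = split_samples(sample_ids)
print(len(train_ids), len(eval_ids))
```

Sorting before shuffling with a seeded generator keeps the split identical across runs, so evaluation numbers stay comparable between experiments.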


Results


Conclusion

• Monocular Deep Depth Estimation (MDE) is a practical and cost-effective
alternative to traditional depth sensing methods such as stereo vision and
LiDAR.
• By leveraging deep learning, MDE estimates depth from a single RGB image,
removing the need for complex and expensive multi-camera or active sensing
setups.
• This approach can benefit applications in augmented reality, autonomous
driving, and robotics by enabling depth estimation in challenging conditions,
including textureless regions and occluded objects.


References

• Zhao, Chaoqiang, et al. "Monocular depth estimation based on deep learning:
An overview." Science China Technological Sciences 63.9 (2020): 1612-1627.
• Fu, Huan, et al. "Deep ordinal regression network for monocular depth
estimation." Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition. 2018.
• Ming, Yue, et al. "Deep learning for monocular depth estimation: A review."
Neurocomputing 438 (2021): 14-33.
• Masoumian, Armin, et al. "Monocular depth estimation using deep learning: A
review." Sensors 22.14 (2022): 5353.
