
21AI601 - COMPUTER VISION

Unit I & LP3 - DEPTH ESTIMATION AND MULTI CAMERA VIEWS: PERSPECTIVE, BINOCULAR STEREOPSIS: CAMERA AND EPIPOLAR GEOMETRY

1. DEPTH ESTIMATION AND MULTI CAMERA VIEWS


 Depth is a critical part of computer vision, which gives the computer information about the
distance of objects to the camera.
 Image depth estimation is about figuring out how far away objects in an image are.
 Depth estimation involves determining the distance of each pixel in relation to the camera.
 Depth is extracted from either monocular (single) or stereo (multiple views of a scene)
images.
 The task takes an RGB image as input and outputs a depth image.
 The depth image includes information about the distance of the objects in the image from
the viewpoint, which is usually the camera taking the image.
 Applications of depth estimation include smoothing blurred parts of an image, better
rendering of 3D scenes, grasping in robotics, robot-assisted surgery, automatic 2D-to-3D
conversion in film, and shadow mapping in 3D computer graphics.
 Another significant application is self-driving cars, which need to know how far away the
vehicle ahead is in order to avoid collisions.
 Depth estimation is an important problem in computer vision because it supports tasks such
as creating 3D models, augmented reality, and self-driving cars.
 One way of obtaining depth information is through stereo vision, which uses two cameras,
usually side by side. Both cameras take a picture of the same scene.
 Objects that appear in both images will be at slightly different positions; this difference in
position is called the disparity.
 Objects close to the cameras will have a larger horizontal disparity in the images, whereas
faraway objects will have a smaller disparity.
 In the past, depth was estimated with techniques like stereo vision or special sensors. More
recently, deep-learning models such as Dense Prediction Transformers (DPTs) predict
depth directly from images.
 Depth estimation and multi-camera views in computer vision are closely related concepts,
as depth information is often derived from multiple camera views to create a more accurate
representation of the 3D structure of a scene.

1.1 How do we estimate depth?


 Our eyes estimate depth by comparing the images obtained by our left and right eyes.
 The minor displacement between both viewpoints is enough to calculate an approximate
depth map. We call the pair of images obtained by our eyes a stereo pair.
 This, combined with the variable focal length of our lenses and general experience of
“seeing things”, gives us seamless 3D vision.

1.2 Methods for depth estimation


1.2.1 Passive Methods:
Passive methods use information from a single image or a stereo pair of images without actively
projecting any additional light or signals.
Stereo Vision:
Principle: Stereo vision involves using two or more cameras to capture a scene from slightly
different viewpoints. The disparity between corresponding points in the left and right images is
used to calculate depth through triangulation. The baseline (distance between the cameras) affects
the accuracy of depth estimation.
Multi-camera Setup: Multiple cameras are positioned at different locations to capture a scene
from various perspectives, enhancing the accuracy of depth estimation.
Applications: Stereo vision is widely used in robotics, autonomous vehicles, and 3D
reconstruction.
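
As an illustration of the stereo principle, the sketch below computes a disparity map from a rectified left/right image pair with OpenCV's block matcher. The file names and matcher parameters are illustrative, not part of the course material.

    import cv2

    # Load a rectified stereo pair as grayscale (file names are placeholders).
    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    # Block matcher: numDisparities must be a multiple of 16, blockSize must be odd.
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)

    # StereoBM returns fixed-point disparities scaled by 16.
    disparity = stereo.compute(left, right).astype("float32") / 16.0
    # Larger disparity means the point is closer to the cameras.
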
Depth from Defocus:
This method uses the blur information in images to estimate depth. Objects at different distances
will have different amounts of blur in the image. Cameras with controllable aperture sizes can
exploit this information to estimate depth.

Structure from Motion (SfM):


SfM involves analyzing the motion of features across multiple frames of a video sequence. By
tracking the movement of these features, the depth information can be inferred.
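
A minimal sketch of the two-frame SfM idea, assuming OpenCV and an intrinsic matrix K from calibration (the K values and file names below are illustrative):

    import cv2
    import numpy as np

    # Illustrative intrinsics; in practice K comes from camera calibration.
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])

    frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
    frame2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

    # Detect corners in the first frame and track them into the second frame.
    pts1 = cv2.goodFeaturesToTrack(frame1, maxCorners=500, qualityLevel=0.01, minDistance=7)
    pts2, status, _ = cv2.calcOpticalFlowPyrLK(frame1, frame2, pts1, None)
    pts1, pts2 = pts1[status.ravel() == 1], pts2[status.ravel() == 1]

    # Recover the relative camera motion (rotation R, translation t up to scale).
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    # With R and t known, the tracked points can be triangulated into 3D (up to scale).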

Depth from Semantic Segmentation:


By using deep learning techniques, such as convolutional neural networks (CNNs), depth can be
estimated based on semantic segmentation information. The network learns to associate certain
object classes with specific depths.

1.2.2 Active Methods:


Active methods involve actively projecting light or signals into the scene and measuring their
interactions with objects to determine depth.

Time-of-Flight (ToF):
ToF cameras emit light pulses and measure the time it takes for the light to travel to the object and
back. This information is used to calculate the distance between the camera and the object.
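
The distance calculation itself follows directly from the round-trip time; a minimal sketch with illustrative values:

    SPEED_OF_LIGHT = 299_792_458.0  # metres per second

    def tof_distance(round_trip_time_s: float) -> float:
        # The pulse travels to the object and back, so divide the path by 2.
        return SPEED_OF_LIGHT * round_trip_time_s / 2.0

    # A round trip of 20 nanoseconds corresponds to roughly 3 metres.
    print(tof_distance(20e-9))
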
Structured Light:
Principle: Structured light systems project known patterns onto a scene; depth is then calculated
by analyzing the deformation of the pattern on the object surfaces. Using multiple cameras can
enhance the accuracy and coverage of the depth information.
Applications: Multi-camera structured light setups are used in industrial applications, such as
quality control and 3D scanning.

Lidar (Light Detection and Ranging):


Lidar systems use laser beams to measure the distance to objects. By analyzing the time it takes
for the laser beams to travel to the object and back, a 3D point cloud of the scene can be generated.
Lidar can also be combined with multi-camera setups to create more comprehensive 3D
representations.
Sensor Fusion: Integrating lidar data with information from multiple cameras helps overcome
limitations, such as occlusions or difficulties in textureless areas.
Applications: Autonomous vehicles often use a combination of lidar and camera data for robust
perception.
Active Stereo Vision:
Similar to stereo vision, but with the addition of active illumination. This can improve performance
in low-light conditions.
Deep Learning Approaches:
In recent years, deep learning methods, particularly convolutional neural networks (CNNs) and
recurrent neural networks (RNNs), have shown remarkable success in depth estimation. End-to-end
models can be trained to directly predict depth from monocular or stereo images.
Principle: Deep learning models can also be trained to estimate depth directly from multiple
camera views.
End-to-End Models: These models take advantage of the rich features captured by multiple views
and learn complex mappings from images to depth maps.
Applications: Multi-view deep learning models are applied in areas like augmented reality and
3D scene understanding.
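
As a concrete example of an end-to-end model, the sketch below runs a pretrained monocular depth network (MiDaS) through torch.hub. It assumes PyTorch, OpenCV, and the intel-isl/MiDaS hub entry point are available; model and transform names may differ between releases.

    import cv2
    import torch

    # Load a small pretrained MiDaS model and its matching input transform
    # (assumes the intel-isl/MiDaS torch.hub repository is reachable).
    midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
    transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
    midas.eval()

    img = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
    batch = transforms.small_transform(img)

    with torch.no_grad():
        prediction = midas(batch)              # relative (inverse-)depth map
    depth = prediction.squeeze().cpu().numpy()
    # 'depth' is a per-pixel relative depth estimate, not a metric distance.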

2. PERSPECTIVE, BINOCULAR STEREOPSIS


2.1 Binocular Stereopsis
Binocular stereopsis in computer vision is a technique that mimics the human visual system's ability
to perceive depth by using information from both eyes. This method involves capturing and
analyzing images from two slightly offset cameras, simulating the way human eyes provide
different viewpoints of the same scene. Binocular stereopsis is a fundamental concept in stereo
vision, and it plays a crucial role in tasks such as depth perception and 3D reconstruction.

2.1.1. Principle of Binocular Stereopsis:


Binocular Disparity: The key idea behind binocular stereopsis is the disparity between the images
captured by the left and right cameras. Disparity refers to the horizontal shift or difference in the
apparent position of an object in the left and right images.
Triangulation: By analyzing the disparity, the depth of objects in the scene can be determined
using triangulation principles.
2.1.2. Stereo Camera Setup:
Camera Configuration: Two cameras are positioned at a slight horizontal offset, simulating the
separation between human eyes. This offset is referred to as the baseline.
Calibration: Precise calibration of the cameras is essential to ensure accurate correspondence
between points in the left and right images.

2.1.3. Depth Estimation using Binocular Stereopsis:


Correspondence Matching: The process begins with identifying corresponding points in the left
and right images. This involves finding features or patterns that can be matched between the two
images.
Disparity Calculation: Once correspondences are established, the disparity between these points
is calculated. The greater the disparity, the closer the object is to the cameras.
Depth Map Generation: By mapping the disparity values across the entire image, a depth map can
be created, providing the depth information for each pixel.
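
The last step follows directly from triangulation: depth Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity. A minimal sketch, assuming a disparity map has already been computed (for example with the block matcher shown in section 1.2.1) and using illustrative calibration values:

    import numpy as np

    focal_px = 700.0    # focal length in pixels, from calibration (illustrative)
    baseline_m = 0.12   # distance between the two cameras in metres (illustrative)

    def disparity_to_depth(disparity: np.ndarray) -> np.ndarray:
        # Triangulation: Z = f * B / d; non-positive disparities are invalid.
        depth = np.zeros_like(disparity, dtype=np.float32)
        valid = disparity > 0
        depth[valid] = focal_px * baseline_m / disparity[valid]
        return depth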

2.1.4. Challenges and Solutions:


Occlusions: Occluded regions pose challenges to stereo vision. Advanced algorithms are employed
to handle occlusions and interpolate depth information in such areas.
Textureless Surfaces: Regions with little or no texture may result in ambiguous correspondences.
Proper handling of such situations is critical for accurate depth estimation.

2.1.5. Applications:
Robotics: Binocular stereopsis is widely used in robotics for tasks like navigation and object
manipulation.
Autonomous Vehicles: Depth perception is crucial for autonomous vehicles to understand the
environment and make informed decisions.
3D Reconstruction: Binocular stereopsis is a fundamental technique for creating detailed 3D
models of scenes and objects.

2.1.6. Improvements with Machine Learning:


Deep Learning: Convolutional Neural Networks (CNNs) are employed for feature extraction and
correspondence matching, improving the robustness of binocular stereopsis.
End-to-End Models: Some approaches use end-to-end learning to directly predict depth maps from
stereo image pairs.

2.1.7. Limitations:
Calibration Sensitivity: Precise calibration is critical, and small errors in camera alignment can
lead to inaccuracies.
Limited Baseline: A smaller baseline may result in less accurate depth estimation, especially for
distant objects.

2.2 Perspective Stereopsis


Perspective stereopsis in computer vision refers to the use of perspective cues to estimate depth
information in a scene, often relying on monocular (single-camera) imagery. Unlike binocular
stereopsis, which uses information from two offset cameras, perspective stereopsis exploits the
inherent depth cues present in a single image captured by a camera.
2.2.1 Key aspects of perspective stereopsis in computer vision:
Perspective Cues:
Size and Position of Objects: In a perspective image, objects that are closer to the camera appear
larger than those farther away. The position of an object in the image also provides cues about its
depth.
Perspective Projection:
Projection Geometry: The projection of 3D points onto a 2D image plane follows the principles of
perspective projection. This results in depth information being encoded in the image through the
size and position of objects.
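
A minimal sketch of perspective projection under a pinhole model (the focal length and principal point below are illustrative). The projected size of an object shrinks in proportion to 1/Z, which is exactly the size cue described above.

    def project_point(X, Y, Z, f=800.0, cx=320.0, cy=240.0):
        # Pinhole projection: image coordinates scale with 1/Z, so more
        # distant points (larger Z) project closer together in the image.
        u = f * X / Z + cx
        v = f * Y / Z + cy
        return u, v

    # The same 1 m wide object spans 400 px at Z = 2 m but only 80 px at Z = 10 m.
    print(project_point(0.5, 0.0, 2.0)[0] - project_point(-0.5, 0.0, 2.0)[0])    # 400.0
    print(project_point(0.5, 0.0, 10.0)[0] - project_point(-0.5, 0.0, 10.0)[0])  # 80.0
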
Depth from Motion:
Motion Parallax: Perspective stereopsis can leverage motion parallax, where objects at different
depths move at different rates when the camera or the objects are in motion. Analyzing this motion
provides depth information.
Texture Gradients:
Perspective-induced Gradients: The rate of change of texture across an image is indicative of
depth. As surfaces recede from the camera, their texture elements project to smaller image regions,
so the texture appears finer and more densely packed.
Single-Camera Setup:
Monocular Vision: Perspective stereopsis relies on a single camera to capture images. It doesn't
require the use of multiple cameras or stereo pairs.
Depth Estimation Techniques:
Depth from Focus: By analyzing the sharpness or blurriness of different regions in the image,
depth information can be estimated. Regions in focus lie near the focal plane of the lens, while
blurrier regions lie farther from it.
Depth from Defocus: Similar to depth from focus, but it involves deliberately defocusing certain
regions and analyzing the resulting blur for depth estimation.
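
A common proxy for local sharpness is the variance of the Laplacian. The sketch below scores image patches this way (the patch size is an arbitrary choice); it is one ingredient of a depth-from-focus pipeline rather than a complete method.

    import cv2
    import numpy as np

    def sharpness_map(gray: np.ndarray, patch: int = 32) -> np.ndarray:
        # Variance of the Laplacian per patch: higher value = sharper (better focused).
        lap = cv2.Laplacian(gray, cv2.CV_64F)
        rows, cols = gray.shape[0] // patch, gray.shape[1] // patch
        scores = np.zeros((rows, cols))
        for i in range(rows):
            for j in range(cols):
                block = lap[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
                scores[i, j] = block.var()
        return scores

    img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
    print(sharpness_map(img).round(1))
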
Machine Learning Approaches:
Deep Learning: Convolutional Neural Networks (CNNs) and other deep learning architectures can
be trained to directly predict depth from monocular images, leveraging a large dataset with ground
truth depth information.
Applications:
Autonomous Systems: Perspective stereopsis is used in robotics and autonomous systems for tasks
such as obstacle avoidance and navigation.
Augmented Reality: Depth information from a single camera is valuable for overlaying virtual
objects onto the real world with proper depth cues.
Challenges:
Ambiguity: Monocular depth estimation can be inherently ambiguous, especially when objects
have similar appearances but different depths.
Scene Complexity: Highly cluttered or occluded scenes may pose challenges for accurate depth
estimation.
Integration with Other Techniques:
Sensor Fusion: Combining perspective stereopsis with data from other sensors (such as IMUs or
additional cameras) can enhance the robustness of depth estimation.
Perspective stereopsis is a valuable approach in scenarios where only a single camera is available,
and it complements binocular and multi-camera methods in computer vision applications.
Advances in machine learning have led to improvements in the accuracy and robustness of
monocular depth estimation techniques, making them increasingly relevant in various real-world
applications.

3. CAMERA AND EPIPOLAR GEOMETRY


3.1 Epipolar Geometry
 Epipolar geometry is a fundamental concept in computer vision and stereo vision that
describes the geometric relationship between two cameras capturing the same scene from
different viewpoints.
 This concept is particularly useful in tasks such as stereo reconstruction, 3D scene
reconstruction, and structure from motion.
 The epipolar geometry between two views is essentially the geometry of the intersection
of the image planes with the pencil of planes having the baseline as axis (the baseline is
the line joining the camera centres).

3.2 The geometric entities involved in epipolar geometry


Epipolar Line:
Given two cameras, each capturing an image of the same scene from a different perspective, the
epipolar line in one image corresponds to the line along which the corresponding point must lie in
the other image.
Epipole:
The epipole is the point of intersection of the line connecting the camera centres (the baseline)
with the image plane. Equivalently, the epipole in one image is the image of the other camera's
centre.
Epipolar Constraint:
The epipolar constraint states that the match for a point in one image must lie on the
corresponding epipolar line in the other image. This constraint reduces the search for
corresponding points from the whole image to a single line.
Epipolar Geometry Matrix (Fundamental Matrix):
The relationship between corresponding points in two images can be mathematically expressed
using the Fundamental Matrix. The Fundamental Matrix relates the pixel coordinates in one image
to the epipolar lines in the other image.
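
In homogeneous pixel coordinates the Fundamental Matrix satisfies x'^T F x = 0 for corresponding points x and x'. A minimal OpenCV sketch, assuming pts1 and pts2 are (N, 2) arrays of matched pixel coordinates obtained beforehand from any feature detector and matcher:

    import cv2
    import numpy as np

    # pts1, pts2: (N, 2) float arrays of matched points from the two images (assumed given).
    F, inlier_mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)

    # Epipolar line in image 2 for each point in image 1: l' = F x.
    lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F).reshape(-1, 3)

    # A correct correspondence should (approximately) satisfy x'^T F x = 0.
    x = np.append(pts1[0], 1.0)
    x_prime = np.append(pts2[0], 1.0)
    print(x_prime @ F @ x)   # close to zero for an inlier match
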
Essential Matrix:
The Essential Matrix is a 3x3 matrix that encapsulates the relative rotation and translation between
two calibrated cameras (cameras whose intrinsic parameters are known). It is closely related to the
Fundamental Matrix (E = K'^T F K, where K and K' are the intrinsic matrices) and is used in the
process of recovering the relative pose between the cameras.
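
With known intrinsics (assumed here to be the same matrix K for both cameras, a simplification), the Essential Matrix can be obtained from the Fundamental Matrix and decomposed into the relative rotation and translation. The sketch below reuses F, pts1 and pts2 from the previous sketch; the K values are illustrative.

    import cv2
    import numpy as np

    # Illustrative shared intrinsic matrix; in practice each camera is calibrated.
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])

    # E = K'^T F K (here K' = K).
    E = K.T @ F @ K

    # Decompose E into rotation R and translation t (t is known only up to scale).
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)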

Triangulation:
Once the epipolar geometry is established, triangulation can be used to compute the 3D position
of a point in the scene by finding the intersection of the corresponding rays in the two cameras.
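
A minimal triangulation sketch, continuing from the quantities recovered above (K, R, t, pts1, pts2); camera 1 is taken as the origin and the reconstruction is only defined up to the unknown translation scale.

    import cv2
    import numpy as np

    # Projection matrices: camera 1 at the origin, camera 2 at [R | t].
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])

    # triangulatePoints expects 2xN arrays of pixel coordinates and returns
    # homogeneous 4xN points; divide by the last row to get 3D coordinates.
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T.astype(np.float64), pts2.T.astype(np.float64))
    pts3d = (pts4d[:3] / pts4d[3]).T   # (N, 3) reconstructed points, up to scale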
