Part 09 MD
Ji Hui
Stereopsis and eyes
Multi-camera systems for 3D perception
Using multiple (stereo) cameras to capture 3D information from two or more images
Depth from Stereopsis
3D perception for stereo imaging
World co-ordinate vs image co-ordinate
Geometric camera calibration
Estimating the co-ordinate mappings among the image frame, the camera frame and the world frame
Intrinsic: pixel co-ordinates ↔ camera co-ordinates
Extrinsic: camera co-ordinates ↔ world co-ordinates
Intrinsic mapping
The intrinsic mapping describes the relation between the camera frame and the image frame in practice.
The mapping between the two, in homogeneous form:
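As a minimal sketch of such a mapping, assuming the common pinhole parametrisation with focal lengths `fx`, `fy` (in pixels), principal point `(cx, cy)` and an optional skew term, a point given in the camera frame can be mapped to pixel co-ordinates as follows. The function names and all numeric values are illustrative assumptions.

```python
import numpy as np

def make_intrinsics(fx, fy, cx, cy, skew=0.0):
    """Build a 3x3 intrinsic matrix (assumed pinhole parametrisation)."""
    return np.array([[fx, skew, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def camera_to_pixel(K, X_cam):
    """Map a 3D point in the camera frame to pixel co-ordinates."""
    x_h = K @ np.asarray(X_cam, dtype=float)  # homogeneous pixel co-ordinates
    return x_h[:2] / x_h[2]                   # divide by the third component

# Illustrative values only
K = make_intrinsics(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
pixel = camera_to_pixel(K, [0.1, -0.05, 2.0])
print(pixel)
```

The division by the third component is what turns the linear homogeneous mapping into the perspective projection.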
Calibration matrix
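Combining both the extrinsic and intrinsic mappings gives a single $3 \times 4$ calibration matrix $M$, which maps homogeneous world co-ordinates to homogeneous pixel co-ordinates.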
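The combination can be sketched as follows, assuming the standard factorisation into an intrinsic matrix `K`, a rotation `R` and a translation `t` (these symbol names and the example values are assumptions).

```python
import numpy as np

def calibration_matrix(K, R, t):
    """Compose the 3x4 matrix K [R | t] (assumed standard factorisation)."""
    Rt = np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])
    return K @ Rt

def project(M, X_world):
    """Project a 3D world point to pixel co-ordinates with a 3x4 matrix M."""
    X_h = np.append(np.asarray(X_world, dtype=float), 1.0)  # homogeneous point
    x_h = M @ X_h
    return x_h[:2] / x_h[2]

# Illustrative values only: camera axes aligned with the world axes,
# world origin 2 units in front of the camera
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
M = calibration_matrix(K, R, t)
pixel = project(M, [0.1, -0.05, 0.0])
print(M.shape, pixel)
```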
Calibration with known correspondence
Calibration: how to estimate $M$
Use the projection equation of each known 3D-to-2D correspondence to estimate the parameters of the calibration matrix
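One common way to carry out this estimation is the direct linear transform: each correspondence contributes two linear equations in the 12 entries of the matrix, and the solution is the null vector of the stacked system. The sketch below assumes at least six correspondences in general position; the function name and the synthetic data are illustrative.

```python
import numpy as np

def estimate_calibration_matrix(X_world, x_pixel):
    """Direct linear estimate of a 3x4 projection matrix (up to scale)."""
    rows = []
    zeros = np.zeros(4)
    for X, (u, v) in zip(np.asarray(X_world, float), np.asarray(x_pixel, float)):
        P = np.append(X, 1.0)                            # homogeneous world point
        rows.append(np.concatenate([P, zeros, -u * P]))  # equation for u
        rows.append(np.concatenate([zeros, P, -v * P]))  # equation for v
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)  # right singular vector of the smallest singular value

# Synthetic, noise-free check with an illustrative ground-truth matrix
rng = np.random.default_rng(0)
M_true = np.array([[800.0, 0.0, 320.0, 100.0],
                   [0.0, 800.0, 240.0, 50.0],
                   [0.0, 0.0, 1.0, 2.0]])
X = rng.uniform(-1.0, 1.0, size=(8, 3))
proj = np.hstack([X, np.ones((8, 1))]) @ M_true.T
x = proj[:, :2] / proj[:, 2:3]

M_est = estimate_calibration_matrix(X, x)
M_est = M_est * (M_true[2, 3] / M_est[2, 3])  # remove the scale ambiguity
print(np.round(M_est, 3))
```

With noisy measurements the same system is solved in the least-squares sense, and normalising the point co-ordinates beforehand usually improves conditioning.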
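Finding the intrinsic and rotation factors of $M$ via RQ decomposition
RQ decomposition is based on QR decomposition, which decomposes a matrix into the product of an orthogonal matrix and an upper-triangular matrix; RQ returns the two factors in the opposite order (upper-triangular first).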
SciPy: RQ decomposition
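import numpy as np
from scipy import linalg

np.set_printoptions(precision=4)

# Random 4x5 test matrix; linalg.rq factors it as A = R @ Q
A = np.random.randn(4, 5)
R, Q = linalg.rq(A)

print(A, ': The matrix A')
print(R, ': The upper-triangular matrix')
print(Q, ': The unitary matrix')
# Sanity check: the two factors reproduce A
print(np.allclose(A, R @ Q), ': A equals R @ Q')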
Camera calibration: estimating the intrinsic and extrinsic parameters
First stage: Estimate the calibration matrix $M$ from point correspondences.
Second stage: Decompose $M$ into internal and external mappings.
i. Estimating translation.
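Non-homogeneous co-ordinates for the camera centre $c$: $M \begin{pmatrix} c \\ 1 \end{pmatrix} = 0$. Thus, for $M = [A \mid b]$,
$c = -A^{-1} b$.
Homogeneous co-ordinates for $c$: the null vector of $M$.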
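The second stage can be sketched with `scipy.linalg.rq`, assuming the factorisation $M \sim K [R \mid t]$ with an upper-triangular `K` that has a positive diagonal and a proper rotation `R`. The symbols `K`, `R`, `t`, `c` and the function name are assumptions, and the sign handling below is one possible convention.

```python
import numpy as np
from scipy import linalg

def decompose_calibration_matrix(M):
    """Split a 3x4 matrix M ~ K [R | t] into K, R, t and the camera centre c."""
    M = np.asarray(M, dtype=float)
    K, R = linalg.rq(M[:, :3])            # M[:, :3] = K @ R
    S = np.diag(np.sign(np.diag(K)))      # S @ S = I, makes diag(K) positive
    K, R = K @ S, S @ R
    if np.linalg.det(R) < 0:              # M was scaled by a negative factor
        R = -R
    c = -np.linalg.solve(M[:, :3], M[:, 3])  # centre: M @ [c, 1] = 0
    t = -R @ c                               # translation from the centre
    return K / K[2, 2], R, t, c

# Illustrative ground truth, scaled by an arbitrary factor
K_true = np.array([[800.0, 0.0, 320.0],
                   [0.0, 820.0, 240.0],
                   [0.0, 0.0, 1.0]])
a = 0.3
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a), np.cos(a), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 2.0])
M = 2.5 * K_true @ np.hstack([R_true, t_true.reshape(3, 1)])

K, R, t, c = decompose_calibration_matrix(M)
print(np.round(K, 3), np.round(t, 3))
```

Computing the translation from the camera centre keeps it independent of the unknown overall scale of $M$.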
Stereo vision
For a point in the world, a single view cannot determine its location in space
Stereo vision
For a point in the world, two different views can determine its location in space
The 3D co-ordinates of the point can be found by triangulation.
Pixel correspondence
Triangulation needs dense pixel correspondence
RANSAC-based matching only gives sparse point correspondences
Epipolar geometry: for each pixel, its correspondence lies on a line, the epipolar line.
For two parallel cameras, the correspondence (in the right image) of each pixel (in the left image) lies on a horizontal line
Epipolar geometry
epipoles (three equivalent descriptions)
= intersection of the baseline with the image plane
= projection of the other camera's projection center into the image
= vanishing point of the camera motion direction
epipolar plane = plane containing baseline (1-D family)
epipolar line = intersection of epipolar plane with image
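The second description of the epipole can be checked numerically. The sketch below assumes the first camera is `K1 [I | 0]` and that the second camera frame is related to the first by `X2 = R @ X1 + t`; all names and values are illustrative assumptions.

```python
import numpy as np

def epipole_in_first_image(K1, R, t):
    """Homogeneous image of the second camera's centre in the first image."""
    c2 = -R.T @ t      # second camera centre, in first-camera co-ordinates
    return K1 @ c2     # first camera assumed to be K1 [I | 0]

K1 = np.array([[800.0, 0.0, 320.0],
               [0.0, 800.0, 240.0],
               [0.0, 0.0, 1.0]])

# Pure forward motion: the epipole is the vanishing point of the motion
# direction, which here coincides with the principal point
e_forward = epipole_in_first_image(K1, np.eye(3), np.array([0.0, 0.0, -1.0]))
e_forward = e_forward / e_forward[2]

# Parallel (side-by-side) cameras: the epipole lies at infinity,
# so its third homogeneous co-ordinate is zero
e_parallel = epipole_in_first_image(K1, np.eye(3), np.array([-0.1, 0.0, 0.0]))
print(e_forward, e_parallel)
```

An epipole at infinity in the horizontal direction is what makes all epipolar lines horizontal for parallel cameras.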
Examples of other configurations
Converging camera
Calibrated camera: essential matrix
Set the first camera's co-ordinate system as the world co-ordinate system and define the rigid motion,
rotation $R$ and translation $t$, that maps from the first camera's co-ordinates to the second camera's co-ordinates
Calibrated camera: essential matrix
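The relationship $X' = RX + t$ between the co-ordinates $X$ and $X'$ of a point in the two camera frames
shows that the vectors $X'$, $RX$ and $t$ are co-planar, as the vector $X'$ is the summation of the
vectors $RX$ and $t$. That means,
$$X'^{\top} (t \times RX) = X'^{\top} [t]_{\times} R X = 0,$$
where $[t]_{\times}$ is the skew-symmetric matrix with $[t]_{\times} v = t \times v$.
Thus, we have $X'^{\top} E X = 0$ with $E = [t]_{\times} R$.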
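The co-planarity argument can be verified numerically. The sketch assumes the standard form $E = [t]_{\times} R$ of the essential matrix with the convention `X2 = R @ X1 + t`; the helper names and numbers are illustrative.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix such that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def essential_matrix(R, t):
    """Essential matrix for the rigid motion X2 = R @ X1 + t."""
    return skew(t) @ R

a = 0.1  # small rotation about the y axis (illustrative)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-0.2, 0.0, 0.05])
E = essential_matrix(R, t)

X1 = np.array([0.3, -0.2, 4.0])   # point in the first camera frame
X2 = R @ X1 + t                   # same point in the second camera frame
residual = X2 @ E @ X1            # should vanish up to rounding
print(residual)
```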
Uncalibrated camera: fundamental matrix
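Writing the pixel co-ordinates as $x \sim KX$ and $x' \sim K'X'$, the essential-matrix constraint $X'^{\top} E X = 0$ leads to
$x'^{\top} F x = 0$, where $F = K'^{-\top} E K^{-1}$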
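A small numerical sketch of the pixel-level constraint, assuming the standard relation between the fundamental matrix, the essential matrix and the two intrinsic matrices `K1`, `K2`. The example uses two parallel cameras, so the epipolar line should come out horizontal; all names and values are illustrative assumptions.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix such that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])

def fundamental_from_essential(E, K1, K2):
    """F = K2^{-T} E K1^{-1} (assumed standard relation)."""
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_distance(F, x1, x2):
    """Distance in pixels from x2 to the epipolar line F @ x1."""
    line = F @ x1
    return abs(line @ x2) / np.hypot(line[0], line[1])

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([-0.1, 0.0, 0.0])   # parallel cameras
F = fundamental_from_essential(skew(t) @ R, K, K)

X1 = np.array([0.3, -0.2, 4.0])   # point in the first camera frame
X2 = R @ X1 + t                   # same point in the second camera frame
x1 = K @ X1 / X1[2]               # homogeneous pixel co-ordinates
x2 = K @ X2 / X2[2]

line = F @ x1                     # epipolar line of x1 in the second image
dist = epipolar_distance(F, x1, x2)
print(line, dist)
```

Here the line has no x component and passes through the same image row as `x1`, which matches the horizontal-line property of parallel cameras.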
Properties of fundamental (essential) matrix
Fundamental matrix $F$:
is of rank 2
has 7 degrees of freedom
there are 9 elements, but the overall scale can be omitted and $\det F = 0$
Essential matrix $E$:
is of rank 2
its two nonzero singular values are equal
has only 5 degrees of freedom: 3 for rotation, 2 for translation (its scale cannot be recovered)
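The rank and singular-value properties of the essential matrix can be checked on an example. The sketch assumes $E = [t]_{\times} R$; the rotation and translation values are illustrative.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix such that skew(t) @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])

a = 0.2  # rotation about the z axis (illustrative)
R = np.array([[np.cos(a), -np.sin(a), 0.0],
              [np.sin(a), np.cos(a), 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.3, -0.1, 0.2])
E = skew(t) @ R

singular_values = np.linalg.svd(E, compute_uv=False)  # sorted, largest first
rank = np.linalg.matrix_rank(E, tol=1e-10)
print(singular_values, rank)
```

Both nonzero singular values equal the length of `t`, which is why only the direction of the translation is encoded in the constraint once the overall scale is dropped.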
8-point algorithm
Recall that the fundamental matrix is determined by the correspondence pairs
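A minimal sketch of estimating the fundamental matrix from correspondence pairs, assuming the normalised eight-point variant (point normalisation, linear solve, rank-2 projection). Function names and the synthetic scene are illustrative assumptions, not the slides' own implementation.

```python
import numpy as np

def normalize_points(pts):
    """Shift to zero mean and scale to mean distance sqrt(2); return points and T."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return pts_h @ T.T, T

def eight_point(x1, x2):
    """Estimate F with x2^T F x1 = 0 from n >= 8 pixel correspondences."""
    p1, T1 = normalize_points(x1)
    p2, T2 = normalize_points(x2)
    # One row per correspondence, unknowns are the entries of F in row-major order
    A = np.column_stack([p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
                         p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
                         p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    U, s, Vt = np.linalg.svd(F)                  # enforce rank 2
    F = U @ np.diag([s[0], s[1], 0.0]) @ Vt
    F = T2.T @ F @ T1                            # undo the normalisation
    return F / np.linalg.norm(F)

# Synthetic, noise-free scene (illustrative values)
rng = np.random.default_rng(1)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-0.3, 0.05, 0.1])
X1 = np.column_stack([rng.uniform(-1, 1, 12), rng.uniform(-1, 1, 12), rng.uniform(3, 6, 12)])
X2 = X1 @ R.T + t
x1 = (X1 @ K.T)[:, :2] / X1[:, 2:3]
x2 = (X2 @ K.T)[:, :2] / X2[:, 2:3]

F = eight_point(x1, x2)
x1_h = np.hstack([x1, np.ones((12, 1))])
x2_h = np.hstack([x2, np.ones((12, 1))])
lines = x1_h @ F.T                               # row i is F @ x1_i
dist = np.abs(np.sum(lines * x2_h, axis=1)) / np.hypot(lines[:, 0], lines[:, 1])
print(dist.max())
```

With real, noisy matches the same routine is usually wrapped in a robust loop such as RANSAC to reject outliers.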
Triangulation: calibrated camera
Finding the 3D point as the midpoint of the common perpendicular to the two rays in space.
Linear triangulation:
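Both approaches can be sketched as follows, assuming two known $3 \times 4$ projection matrices `M1`, `M2` and camera centres `c1`, `c2` with ray directions `d1`, `d2` expressed in the first camera frame. All names and values are illustrative assumptions.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + u*d2."""
    A = np.column_stack([d1, -d2])
    (s, u), *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    return 0.5 * ((c1 + s * d1) + (c2 + u * d2))

def triangulate_linear(M1, M2, x1, x2):
    """Linear triangulation: null vector of the stacked projection equations."""
    A = np.stack([x1[0] * M1[2] - M1[0],
                  x1[1] * M1[2] - M1[1],
                  x2[0] * M2[2] - M2[0],
                  x2[1] * M2[2] - M2[1]])
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]
    return X_h[:3] / X_h[3]

# Two parallel cameras, second camera centre at (0.1, 0, 0) (illustrative)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
t2 = np.array([-0.1, 0.0, 0.0])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), t2.reshape(3, 1)])

X_true = np.array([0.2, 0.1, 3.0])
x1_h = M1 @ np.append(X_true, 1.0)
x2_h = M2 @ np.append(X_true, 1.0)
x1, x2 = x1_h[:2] / x1_h[2], x2_h[:2] / x2_h[2]

X_lin = triangulate_linear(M1, M2, x1, x2)

# Back-projected rays (the rotation of the second camera is the identity here)
c1, c2 = np.zeros(3), -t2
d1 = np.linalg.inv(K) @ np.append(x1, 1.0)
d2 = np.linalg.inv(K) @ np.append(x2, 1.0)
X_mid = triangulate_midpoint(c1, d1, c2, d2)
print(X_lin, X_mid)
```

With noise-free measurements both estimates coincide with the true point; with noise the rays no longer intersect and the two methods generally give slightly different answers.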
Reconstruction via minimizing geometric error
Finding a pair of image points whose rays intersect and that is close to the measured pair:
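One common way to pose this is to search over a single 3D point and minimise the squared distances between its two reprojections and the measured image points, since the reprojections of one 3D point always have intersecting rays. The sketch below uses `scipy.optimize.least_squares` for the minimisation; the function names, the starting guess and the numbers are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

def project(M, X):
    """Project a 3D point with a 3x4 matrix and return pixel co-ordinates."""
    x_h = M @ np.append(X, 1.0)
    return x_h[:2] / x_h[2]

def triangulate_geometric(M1, M2, x1, x2, X0):
    """Refine a 3D point by minimising the reprojection error in both images."""
    def residuals(X):
        return np.concatenate([project(M1, X) - x1, project(M2, X) - x2])
    return least_squares(residuals, X0).x

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.2, 0.1, 3.0])
x1, x2 = project(M1, X_true), project(M2, X_true)

# Start from a perturbed guess, e.g. the output of a linear method
X_hat = triangulate_geometric(M1, M2, x1, x2, X_true + np.array([0.05, -0.05, 0.3]))
print(X_hat)
```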
Rectification
Camera rectification for simplifying reconstruction
Re-project the image planes onto a common plane such that all epipolar lines are horizontal, i.e.
the two cameras are parallel.
The distance between the two optical centers is called the stereo baseline, and is assumed to
be known.
Depth and disparity from rectified camera
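Point correspondence, $x_l$ and $x_r$ each in its own image co-ordinates, in cameras with baseline $B$ and focal length $f$
The measurement $d = x_l - x_r$ is called the disparity of the matched point pair
The distance between the two image points is $B - d$
Notice that, by similar triangles, $\frac{B - d}{Z - f} = \frac{B}{Z}$
Thus, the disparity is proportional to the inverse depth $1/Z$: $d = \frac{fB}{Z}$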
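The inverse relation between disparity and depth can be sketched as a small helper, assuming the standard rectified-stereo relation depth = focal length × baseline / disparity, with the focal length in pixels and the baseline in metres. The function name and the numbers are illustrative.

```python
import numpy as np

def depth_from_disparity(disparity, focal_px, baseline_m):
    """Convert an array of disparities (pixels) to depths (metres).

    Non-positive disparities carry no depth information and map to infinity.
    """
    disparity = np.asarray(disparity, dtype=float)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

depths = depth_from_disparity([40.0, 20.0, 10.0, 0.0], focal_px=800.0, baseline_m=0.1)
print(depths)
```

Halving the disparity doubles the depth, so distant points occupy only a narrow range of small disparities.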
Correlation-based dense correspondence
3D scene reconstruction requires dense correspondences
For rectified cameras, correspondence is done as follows: for each epipolar line and each
point in the left image, find the point in the right image with the closest intensity.
Often, it is not possible without additional constraints, e.g.
Window matching
Issue: ambiguity exists when comparing only a single intensity
Idea: compare the neighboring window
For each window, match to the closest window on the epipolar line in the other image.
Two commonly seen matching metrics: the sum of squared differences (SSD) and correlation.
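The two window metrics can be sketched as follows, assuming the usual definitions of the sum of squared differences and of zero-mean normalised cross-correlation (the slides only name "SSD" and "correlation", so the exact normalisation is an assumption).

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences: 0 for identical patches, lower is better."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.sum((a - b) ** 2))

def ncc(a, b):
    """Zero-mean normalised cross-correlation in [-1, 1], higher is better."""
    a0 = np.asarray(a, float) - np.mean(a)
    b0 = np.asarray(b, float) - np.mean(b)
    denom = np.linalg.norm(a0) * np.linalg.norm(b0)
    if denom == 0.0:          # flat patch: correlation is undefined, return 0
        return 0.0
    return float(np.sum(a0 * b0) / denom)

rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 255.0, size=(5, 5))
brighter = 2.0 * patch + 5.0   # same texture under a gain and offset change

print(ssd(patch, patch), ssd(patch, brighter))
print(ncc(patch, brighter))
```

SSD penalises the brightness change heavily, while the normalised correlation is unaffected by it, which is one reason to prefer correlation when the two cameras differ in exposure.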
Demonstration of matching in stereo
Given a point in the left image, we scan the corresponding scanline in the right image and find the local window patch that is
most similar to the one in the left image.
The similarity of two patches is measured by a matching cost, which could be SSD or
correlation.
The corresponding pixel is the one with the lowest matching cost (for correlation, the highest score).
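The scanline search described above can be sketched as a brute-force block matcher with an SSD cost. It assumes rectified grayscale images and the convention that a left pixel at column `x` matches the right pixel at column `x - d` with disparity `d >= 0`; the function name and parameters are illustrative, and the loops are written for clarity rather than speed.

```python
import numpy as np

def disparity_ssd(left, right, half_window=2, max_disparity=16):
    """Per-pixel disparity by exhaustive SSD window matching along scanlines."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    h, w = left.shape
    hw = half_window
    disp = np.zeros((h, w), dtype=int)
    for y in range(hw, h - hw):
        for x in range(hw, w - hw):
            patch = left[y - hw:y + hw + 1, x - hw:x + hw + 1]
            best_cost, best_d = np.inf, 0
            # Candidate windows must stay inside the right image
            for d in range(0, min(max_disparity, x - hw) + 1):
                cand = right[y - hw:y + hw + 1, x - d - hw:x - d + hw + 1]
                cost = np.sum((patch - cand) ** 2)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic test: the left image is the right image shifted by 4 pixels
rng = np.random.default_rng(0)
right_img = rng.uniform(0.0, 1.0, size=(12, 40))
left_img = np.roll(right_img, shift=4, axis=1)

disp = disparity_ssd(left_img, right_img, half_window=2, max_disparity=8)
print(disp[6, 10:20])
```

Pixels near the left border cannot be matched at the true disparity because the candidate window would leave the image, which is why only the interior is checked.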
Results with different patch sizes
Smaller patches: more detail, but noisy. Bigger: less detail, but smooth
Demonstration
Depth map from Stereo
Solutions with rectified camera systems
Intel RealSense depth system
Cheap: 200+ USD as of 2023-03
Long-range sensing
Depth accuracy decreases when increasing range
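The loss of accuracy with range follows from the inverse relation between depth and disparity: under the standard rectified-stereo model, a fixed disparity error produces a depth error that grows with the square of the depth. The sketch below uses first-order error propagation; the focal length, baseline and disparity error are illustrative values, not the specification of any particular sensor.

```python
import numpy as np

def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """First-order depth error: depth^2 * disparity_error / (focal * baseline)."""
    depth_m = np.asarray(depth_m, dtype=float)
    return depth_m ** 2 * disparity_error_px / (focal_px * baseline_m)

ranges = np.array([1.0, 2.0, 4.0, 8.0])
errors = depth_error(ranges, focal_px=800.0, baseline_m=0.1, disparity_error_px=0.25)
for z, e in zip(ranges, errors):
    print(f"range {z:.0f} m -> depth error about {e * 100:.1f} cm")
```

Doubling the range quadruples the expected error, so a longer baseline or a longer focal length is needed to keep accuracy at distance.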
Solutions with other sensors
3D data
Solutions with other sensors
LiDAR in iPhone
3D face data