Part 09 MD

computer vision slides

Uploaded by yyyangwhu

Visual Information Interpretation

3D vision: Camera calibration and Stereopsis

Ji Hui

1
Stereopsis and eyes

2
Multi-camera systems for 3D perception

Using multiple (stereo) cameras to capture 3D information from two or more images

Multi-camera surveillance Stereo camera for driverless car

3
Depth from Stereopsis

Shape from "disparity" between two views


Recovering the 3D shape of a scene from two (or more) images taken from different viewpoints

4
3D perception for stereo imaging

Depth is represented by color

5
World co-ordinate vs image co-ordinate

6
Geometric camera calibration
Estimating the co-ordinate mapping among image frame, camera frame and world frame
Intrinsic: pixel co-ordinates ↔ camera co-ordinates
Extrinsic: camera co-ordinates ↔ world co-ordinates

Usage of the calibration mapping

Measure the size of an object in world units
Determine the location of the camera in the world
Warp the images to preferred co-ordinates.
7
Extrinsic mapping
Extrinsic mapping describes the relation between the camera co-ordinate frame and the world co-ordinate frame.
C describes the position of the origin of the camera frame with respect to the world frame.
R describes the rotation aligning the camera frame with the world frame.
Consider a point:
X_c: co-ordinates of the point in the camera frame
X_w: co-ordinates of the point in the world frame
The mapping between the two co-ordinates:

    X_c = R (X_w - C) = R X_w + t,   with t = -R C
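As a small sketch, the extrinsic mapping can be applied numerically; the rotation and translation below are made-up values, not from the slides:

```python
import numpy as np

# Hypothetical extrinsics: 90-degree rotation about the z-axis plus a translation.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
t = np.array([1.0, 2.0, 3.0])

X_w = np.array([1.0, 0.0, 0.0])   # point in the world frame
X_c = R @ X_w + t                 # the same point in the camera frame
print(X_c)                        # [1. 3. 3.]
```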

8
Intrinsic mapping

Intrinsic mapping describes the relation between the camera frame and the image frame in practice.
The mapping between the two, in homogeneous form:

    λ [u, v, 1]^T = K X_c,   K = [[f_x, s, c_x], [0, f_y, c_y], [0, 0, 1]]

where f_x, f_y are the focal lengths in pixels, (c_x, c_y) is the principal point and s is the skew.
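A minimal numeric sketch of the intrinsic projection; the focal lengths and principal point below are hypothetical:

```python
import numpy as np

# Hypothetical intrinsics: fx = fy = 800 pixels, principal point (320, 240), zero skew.
K = np.array([[800.,   0., 320.],
              [  0., 800., 240.],
              [  0.,   0.,   1.]])

X_c = np.array([0.1, -0.05, 2.0])   # a point in the camera frame (metres)
x_h = K @ X_c                       # homogeneous image coordinates
u, v = x_h[:2] / x_h[2]             # divide by depth to get pixel coordinates
print(u, v)                         # 360.0 220.0
```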

9
Calibration matrix
Combining both extrinsic and intrinsic mappings:

    λ [u, v, 1]^T = M X~_w

with

    M = K [R | t]

where X~_w = [X_w; 1] is the world point in homogeneous co-ordinates.

M is a 3x4 matrix defined up to scale, and has 11 degrees of freedom (5 internal parameters for the intrinsic matrix, 3 rotation parameters, 3 translation parameters).
The left 3x3 submatrix K R is non-singular.

10
Calibration with known correspondence

Point correspondence: matching the corners of a calibration board:

11
Calibration: how to estimate M
Using the following equation to estimate the parameters of the calibration matrix:

    λ_i [u_i, v_i, 1]^T = M X~_i

How to estimate M from the point correspondences (x_i, X~_i)?

Let m denote the 12 entries of matrix M. These constraints are transformed into a linear system

    A m = 0

It can be solved by the least-squares estimation:

    min ||A m||^2  subject to  ||m|| = 1

whose solution is the eigenvector of A^T A with the least eigenvalue in magnitude.

The solution is not optimal, as it does not reflect the structure of matrix M.
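The estimation above can be sketched end-to-end on synthetic data; the intrinsics, pose and points below are made-up values, and the null vector is taken from the SVD (equivalent to the smallest eigenvector of A^T A):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: intrinsics K0 and a simple pose (identity rotation).
K0 = np.array([[800., 0., 320.],
               [0., 800., 240.],
               [0., 0., 1.]])
t0 = np.array([0.1, -0.2, 5.0])
M_true = K0 @ np.hstack([np.eye(3), t0[:, None]])

# Synthetic correspondences: world points and their pixel projections.
Xw = np.hstack([rng.uniform(-1, 1, (6, 3)), np.ones((6, 1))])   # homogeneous
xh = (M_true @ Xw.T).T
x = xh[:, :2] / xh[:, 2:]

# Each correspondence contributes two rows of the linear system A m = 0.
rows = []
for (u, v), Xi in zip(x, Xw):
    rows.append(np.concatenate([Xi, np.zeros(4), -u * Xi]))
    rows.append(np.concatenate([np.zeros(4), Xi, -v * Xi]))
A = np.array(rows)

# Least squares: the right singular vector of the smallest singular value.
m = np.linalg.svd(A)[2][-1]
M_est = m.reshape(3, 4)

# M is only defined up to scale, so compare after rescaling.
s = np.sum(M_est * M_true) / np.sum(M_est ** 2)
print(np.allclose(s * M_est, M_true, atol=1e-6))
```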

12
Finding K and R from M via RQ decomposition
RQ decomposition is based on QR decomposition, which decomposes a matrix A into

    A = Q R

Q is an orthogonal matrix satisfying Q^T Q = I
R is an upper triangular matrix.
QR can also be used for the RQ decomposition A = R Q. Define P as the permutation matrix that reverses the row order (so P^2 = I).
Let (P A)^T = Q~ R~ be the QR decomposition. Then A = R Q, where

    R = P R~^T P (upper triangular),   Q = P Q~^T (orthogonal)
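The reversal trick can be sketched with numpy's QR (a minimal version, assuming a square real matrix):

```python
import numpy as np

def rq(A):
    """RQ decomposition via QR, using the row/column reversal trick."""
    n = A.shape[0]
    P = np.flipud(np.eye(n))            # permutation that reverses row order
    Q_t, R_t = np.linalg.qr((P @ A).T)  # QR of (P A)^T
    R = P @ R_t.T @ P                   # upper triangular
    Q = P @ Q_t.T                       # orthogonal
    return R, Q

A = np.arange(9, dtype=float).reshape(3, 3) + np.eye(3)  # an arbitrary 3x3 matrix
R, Q = rq(A)
print(np.allclose(A, R @ Q))              # reconstruction holds
print(np.allclose(np.tril(R, -1), 0))     # R is upper triangular
print(np.allclose(Q @ Q.T, np.eye(3)))    # Q is orthogonal
```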

13
Scipy: RQ decomposition
import numpy as np
from scipy import linalg
np.set_printoptions(precision=4)
A = np.random.randn(4, 5)
R, Q = linalg.rq(A)   # A == R @ Q
print(A, ': The matrix A')
print(R, ': The upper-triangular matrix')
print(Q, ': The unitary matrix')

14
Camera calibration: Estimating K, R and t
First stage: estimate the calibration matrix M from point correspondences.
Second stage: decompose M into internal and external mappings.
i. Estimating translation.
Write M = [K R | K t] in block form. For the camera centre C (with t = -R C), the homogeneous co-ordinate C~ = [C; 1] satisfies

    M C~ = 0

Thus, C~ is defined as the eigenvector of M^T M with the smallest eigenvalue, in homogeneous co-ordinates.
ii. Estimating the camera rotation R and intrinsic parameters K.
Recall that

    M = [K R | K t]

where K is upper triangular and R is unitary (a rotation).

Running the RQ decomposition on the left 3x3 block of M, K R, yields K and R.
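A sketch of the second stage on a synthetic M; the intrinsics, rotation and translation below are made up, and the translation is read off the last column via M[:, 3] = K t rather than through the camera-centre null vector:

```python
import numpy as np
from scipy import linalg

# Hypothetical ground truth: intrinsics K0, a small rotation R0, translation t0.
K0 = np.array([[800., 5., 320.],
               [0., 850., 240.],
               [0., 0., 1.]])
c, s = np.cos(0.1), np.sin(0.1)
R0 = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
t0 = np.array([0.1, -0.2, 5.0])
M = K0 @ np.hstack([R0, t0[:, None]])

# RQ decomposition of the left 3x3 block K R gives K (upper triangular)
# and R (orthogonal); fix signs so that K has a positive diagonal.
K, R = linalg.rq(M[:, :3])
S = np.diag(np.sign(np.diag(K)))
K, R = K @ S, S @ R          # S @ S = I, so the product K R is unchanged

# Translation from the last column: M[:, 3] = K t.
t = np.linalg.solve(K, M[:, 3])

print(np.allclose(K, K0), np.allclose(R, R0), np.allclose(t, t0))
```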
15
3D vision: Stereopsis
Two viewing points provide disparity, which translates to depth.

Multi-view images can provide a 3D model of an object.

16
Stereo vision
For a point in the world, a single view cannot determine its location in space

17
Stereo vision
For a point in the world, two different views can determine its location in space
The 3D co-ordinates of the point can be found by triangulation.

18
Pixel correspondence
Triangulation needs dense pixel correspondence.
RANSAC-based matching only gives sparse point correspondences.

It appears that, given a pixel in the left image, finding its correspondence in the right image requires a 2D search over the whole image.
How do we match a point in the first image to a point in the second? How can we constrain our search?
19
Epipolar geometry: Pixel correspondence
Find pairs of points that correspond to the same scene point.

Epipolar geometry: for each pixel, its correspondence lies on a line, the epipolar line.

For two parallel cameras, the correspondence (right) of each pixel (left) lies on a horizontal line.
20
Epipolar geometry

Epipoles:
intersection of the baseline with the image plane
projection of the other camera's projection centre in the image
vanishing point of the camera motion direction
Epipolar plane = plane containing the baseline (a 1-D family)
Epipolar line = intersection of an epipolar plane with the image
21
Examples of other configurations
Converging cameras

Forward camera motion

22
Calibrated camera: essential matrix

Suppose we know the intrinsic mappings of the cameras: K and K'.

Convert to normalized co-ordinates by pre-multiplying all points with the inverse of the calibration matrix:

    x̂ = K⁻¹ x,   x̂' = K'⁻¹ x'

Set the first camera's co-ordinate system as world co-ordinates and define the rigid motion, rotation R and translation t, that maps from the first camera frame to the second:

    X' = R X + t

23
Calibrated camera: essential matrix
The relationship

    X' = R X + t

shows that the vectors x̂', t and R x̂ are co-planar, as the vector X' is the sum of the vectors R X and t. That means

    x̂' · (t × R x̂) = 0

where the cross product is written with the skew-symmetric matrix:

    E = [t]ₓ R

Thus, we have

    x̂'ᵀ E x̂ = 0
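A quick numeric check of the epipolar constraint; the relative pose and point below are hypothetical:

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

# Hypothetical relative pose between the two cameras.
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([1.0, 0.1, 0.0])
E = skew(t) @ R                      # essential matrix

# A 3D point in the first camera's frame and its normalized coordinates.
X1 = np.array([0.5, -0.3, 4.0])
X2 = R @ X1 + t                      # the same point in the second frame
x1, x2 = X1 / X1[2], X2 / X2[2]      # normalized (calibrated) coordinates

print(x2 @ E @ x1)                   # epipolar constraint: ~0
```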

24
Uncalibrated camera: fundamental matrix

If we do not know the intrinsic parameters, then substituting x̂ = K⁻¹ x and x̂' = K'⁻¹ x' into x̂'ᵀ E x̂ = 0

leads to

    x'ᵀ F x = 0

where

    F = K'⁻ᵀ E K⁻¹

25
Properties of the fundamental (essential) matrix

Fundamental matrix F:

F is of rank 2: det F = 0
F has 7 degrees of freedom:
there are 9 elements, but the scaling can be omitted and det F = 0
Essential matrix E:

E is of rank 2
Its two nonzero singular values are equal
E has only 5 degrees of freedom: 3 for rotation, 2 for translation (up to scale)

26
8-point algorithm
Recall that the fundamental matrix F is determined by the correspondence pairs (x_i, x'_i):

    x'_iᵀ F x_i = 0

Let f denote the 9 entries of the matrix F; we have A f = 0, where each correspondence pair contributes one row of A.

Normalized 8-point method

Normalize the points by shifting their centroid to the origin
Compute f by SVD, minimizing the mean squared error (MSE)
Enforce the rank-2 constraint
Output F by re-shifting back
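The four steps can be sketched as follows; a minimal version in which the synthetic pose and points are made up, and the calibration is the identity so F coincides with E:

```python
import numpy as np

def eight_point(x1, x2):
    """Normalized 8-point algorithm (sketch). x1, x2: (N, 2) matched points."""
    def normalize(pts):
        # Step 1: shift the centroid to the origin (and rescale).
        mean = pts.mean(axis=0)
        scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
        T = np.array([[scale, 0, -scale * mean[0]],
                      [0, scale, -scale * mean[1]],
                      [0, 0, 1.0]])
        ph = np.hstack([pts, np.ones((len(pts), 1))])
        return (T @ ph.T).T, T

    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)

    # Step 2: each match gives one row of A f = 0; solve by SVD (min MSE).
    A = np.array([np.outer(q2, q1).ravel() for q1, q2 in zip(p1, p2)])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)

    # Step 3: enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt

    # Step 4: undo the normalization.
    return T2.T @ F @ T1

# Synthetic matches from a known pose (hypothetical values).
rng = np.random.default_rng(1)
c, s = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([1.0, 0.1, 0.0])
X1 = rng.uniform(-1, 1, (12, 3)) + [0.0, 0.0, 5.0]
X2 = X1 @ R.T + t
x1 = X1[:, :2] / X1[:, 2:]
x2 = X2[:, :2] / X2[:, 2:]

F = eight_point(x1, x2)
res = [float(np.append(q2, 1) @ F @ np.append(q1, 1)) for q1, q2 in zip(x1, x2)]
print(max(abs(r) for r in res))      # epipolar residuals, near zero
```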

27
Triangulation: calibrated camera
Finding X as the midpoint of the common perpendicular to the two rays in space.

Linear triangulation:

    A X~ = 0

where A is determined by the pairs (x, M) and (x', M').

The linear system can be solved by taking the eigenvector of AᵀA with the smallest eigenvalue (the last right singular vector of A).
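A minimal linear-triangulation sketch; the camera matrices and point below are hypothetical:

```python
import numpy as np

def triangulate(M1, M2, x1, x2):
    """Linear triangulation: each view contributes two rows of A X = 0."""
    A = np.vstack([x1[0] * M1[2] - M1[0],
                   x1[1] * M1[2] - M1[1],
                   x2[0] * M2[2] - M2[0],
                   x2[1] * M2[2] - M2[1]])
    X = np.linalg.svd(A)[2][-1]       # right singular vector of smallest sigma
    return X[:3] / X[3]               # back to non-homogeneous co-ordinates

# Hypothetical camera pair: identity camera and one shifted along x.
M1 = np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

P = np.array([0.3, -0.2, 4.0])        # a known 3D point
Ph = np.append(P, 1.0)
x1 = (M1 @ Ph)[:2] / (M1 @ Ph)[2]
x2 = (M2 @ Ph)[:2] / (M2 @ Ph)[2]

print(triangulate(M1, M2, x1, x2))    # recovers P
```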

28
Reconstruction via minimizing geometric error
Finding a pair (x̂, x̂') whose rays intersect, i.e. x̂'ᵀ F x̂ = 0, and which is close to the measured pair (x, x'):

    min d(x, x̂)² + d(x', x̂')²  subject to  x̂'ᵀ F x̂ = 0

With d the Euclidean image distance, this is equivalent to minimizing the reprojection error.

29
Rectification
Camera rectification simplifies reconstruction:
Re-project the image planes onto a common plane such that all epipolar lines are horizontal, i.e. the two cameras are parallel.

The distance between the two optical centers is called the stereo baseline, and is assumed to be known.
30
Depth and disparity from rectified cameras
Point correspondence x = (u, v) and x' = (u', v'), each in its own co-ordinates, in cameras with baseline b.
The measurement d = u - u' is called the disparity of the matched point pair.
Notice that, by similar triangles, u = f X/Z and u' = f (X - b)/Z.
Thus, the disparity is proportional to the inverse depth 1/Z:

    d = u - u' = b f / Z
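A tiny numeric check of the relation d = b f / Z; the baseline and focal length below are made-up values:

```python
# Hypothetical rectified rig: baseline b in metres, focal length f in pixels.
b, f = 0.12, 700.0

# A matched pair on the same scanline; disparity is the horizontal offset.
u_left, u_right = 420.0, 385.0
d = u_left - u_right          # disparity (pixels)

Z = b * f / d                 # depth from d = b f / Z
print(Z)                      # 2.4 (metres)
```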

31
Correlation-based dense correspondence
3D scene reconstruction requires dense correspondences.

For a rectified camera pair, correspondence is done as follows: for each epipolar line and each point in the left image, find the point in the right image with the closest intensity.
Often, this is not possible without additional constraints.

32
Window matching
Issue: ambiguity exists when comparing only a single intensity.
Idea: compare the neighboring windows.
Window matching:
For each window, match to the closest window on the epipolar line in the other image.
Two often-used matching metrics: SSD and correlation.

Additional constraints on dense correspondence

Ordering constraint: the order of points in the two images is the same
Smoothness constraint: the disparity doesn't change too quickly
Uniqueness constraint: each feature has at most one match

33
Demonstration of matching in stereo
Given a point in the left image, we scan the scanline and find the local window patch that is most similar to the one in the left image.
The similarity of two patches is measured by a matching cost, which could be SSD or correlation.
The corresponding pixel is the one with the lowest matching cost.
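The scanline search with an SSD matching cost can be sketched as follows; the window size, search range and synthetic images are arbitrary choices:

```python
import numpy as np

def scanline_match(left, right, row, u, half=3, max_disp=40):
    """For pixel (row, u) in the left image, scan the same row of the right
    image and return the disparity whose window has the smallest SSD cost."""
    patch = left[row - half:row + half + 1, u - half:u + half + 1]
    best_d, best_cost = 0, np.inf
    for d in range(0, max_disp + 1):        # search only along the scanline
        if u - d - half < 0:
            break
        cand = right[row - half:row + half + 1, u - d - half:u - d + half + 1]
        cost = np.sum((patch - cand) ** 2)  # SSD matching cost
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Tiny synthetic test: the right image is the left shifted by 5 pixels.
rng = np.random.default_rng(0)
left = rng.uniform(0, 255, (40, 60))
right = np.roll(left, -5, axis=1)     # true disparity is 5
print(scanline_match(left, right, row=20, u=30))   # 5
```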

34
Results with different patch sizes
Smaller patches: more detail, but noisy. Bigger patches: less detail, but smoother.

35
Demonstration

36
Depth map from Stereo

Original image / Ground truth

Results with only window matching / Results with additional constraints

37
Solutions with rectified camera systems
Intel RealSense depth system
Cheap: 200+ USD as of 2023-03
Long-range sensing
Depth accuracy decreases as the range increases

38
Solutions with other sensors

Kinect: Structured infrared light

3D data

39
Solutions with other sensors

LiDAR in iPhone

3D face data

40
