3D Vision: Topic 9 Stereo Vision (I)


Introduction to

Computer Vision 3D Vision

CMPSCI 591A/691A
CMPSCI 570/670

Topic 9
Stereo Vision (I)
2D to 3D Inference

 Observations
 Objects are mostly 3D
 Images are 2D arrays of intensity, color values, etc.
 3D depth information is not explicitly encoded in images (it is
explicitly recorded in range images)
 However, 2D analysis implicitly uses 3D info
 3D structures are generally not random
 coherency in motion
 Man-made objects are of regular shapes and boundaries
 straight lines and smooth curves in images
 Explicit 3D information can be recovered using 2D shape cues
 disparities in stereo
 shading change due to orientation
 texture gradient due to viewpoint change, etc.
View Images as “windows” into the 3D world


Shape Inference Techniques
Stereo

 The Stereo Problem
 Reconstruct scene geometry from two or more calibrated images
 Basic Principle: Triangulation
 Gives reconstruction as intersection of two rays
 Requires point correspondence
 This is the hard part
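The triangulation principle above can be illustrated with a minimal Python sketch. It assumes each camera contributes a ray given by a centre and a direction, and because measured rays rarely intersect exactly, it returns the midpoint of the shortest segment joining them. The function name `triangulate_midpoint` and the example numbers are illustrative, not part of the lecture.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment joining two rays.

    c1, c2: camera centres; d1, d2: ray directions (all 3-tuples).
    Assumption: the rays come from calibrated cameras and a known match.
    """
    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique reconstruction")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two cameras one unit apart, both seeing the point (0.5, 0, 2).
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0))
```

The ray directions are only available once a point correspondence has been found, which is why the matching step dominates the difficulty.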
Stereo Vision

 Problem
 Infer 3D structure of a scene from two or more images taken from
different viewpoints

 Two primary Sub-problems


 Correspondence problem (stereo match) -> disparity map
 Similarity instead of identity
 Occlusion problem: some parts of the scene are visible in one eye only
 Reconstruction problem -> 3D
 What we need to know about the cameras’ parameters
 Often a stereo calibration problem
 Lectures on Stereo Vision
 Stereo Geometry – Epipolar Geometry (*)
 Correspondence Problem (*) – Two classes of approaches
 3D Reconstruction Problems – Three approaches
A Stereo Pair

 Problems
 Correspondence problem (stereo match) -> disparity map
 Reconstruction problem -> 3D
3D?

CMU CIL Stereo Dataset: Castle sequence
http://www-2.cs.cmu.edu/afs/cs/project/cil/ftp/html/cil-ster.html
More Images…
Stereo View

Left View Right View

Disparity
Stereo Disparity

 The separation between two matching objects is called the stereo disparity.
 Disparity is measured in pixels and can be positive or negative (conventions differ). It will vary across the image.
Disparity and Depth

 If the cameras are pointing in the same direction, the geometry is simple.
 b is the baseline of the camera system,
 Z is the depth of the object,
 d is the disparity (left x minus right x) and
 f is the focal length of the cameras.
 Then the unknown depth is given by

$$ Z = \frac{f\,b}{d} $$
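As a small worked example of the depth relation, the following Python sketch computes Z from f, b and d. It assumes the focal length and disparity are both in pixels and the baseline is in metres, so the depth comes out in metres. The function name and the numbers are illustrative only.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * b / d for parallel cameras.

    Assumes disparity is left x minus right x and is positive.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: f = 700 px, b = 0.12 m.
z_near = depth_from_disparity(700.0, 0.12, 14.0)  # about 6 m
z_far = depth_from_disparity(700.0, 0.12, 7.0)    # about 12 m
```

Halving the disparity doubles the depth, so a fixed error in d costs more depth accuracy for distant objects.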
Basic Stereo Configuration
Parallel Cameras

 Disparity is inversely proportional to depth, so stereo is most accurate for close objects.
 Once we have found depth, the other coordinates in 3-D follow easily, e.g. taking either one of the images:

$$ X = \frac{x\,Z}{f} $$

 where x is the image coordinate, and likewise for Y.
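The recovery of the remaining coordinates can be sketched in Python as follows. The sketch assumes the standard pinhole relations X = xZ/f and Y = yZ/f, with image coordinates measured in pixels from the principal point of the chosen image. The function name and sample values are hypothetical.

```python
def backproject(x_px, y_px, disparity_px, focal_px, baseline_m):
    """Return (X, Y, Z) for a pixel with known disparity.

    Assumes parallel cameras and image coordinates relative to the
    principal point of the image being used.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    z = focal_px * baseline_m / disparity_px
    return (x_px * z / focal_px, y_px * z / focal_px, z)


# Hypothetical pixel 70 px right of and 35 px above the principal point.
X, Y, Z = backproject(70.0, -35.0, 14.0, 700.0, 0.12)
```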


Converging Cameras

 This is the more realistic case.
 The depth at which the cameras converge, Z0, is the depth at which objects have zero disparity.
 Finding Z0 is part of stereo calibration.
 Closer objects have convergent disparity (numerically positive) and further objects have divergent disparity (numerically negative).

[Figure label: Object at depth Z]
The Correspondence Problem

 To measure disparity, we first have to find corresponding points in the two images.
 This turns out not to be easy.
 Our own visual systems can match at a low level, as shown by random-dot stereograms, in which the individual images have no structure above pixel scale, but which when fused show a clear 3-D shape.
Stereo Matching

 Stereo matchers need to start from some assumptions.
 Corresponding image regions are similar.
 A point in one image may only match a single point in the other image.
 If two matched features are close together in the images, then in most cases their disparities will be similar, because the environment is made of continuous surfaces separated by boundaries.
 Many matching methods exist. The basic distinction is between
 feature-based methods which start from image structure extracted by preprocessing; and
 correlation-based methods which start from individual grey levels.
Feature Based Methods

 Extract feature descriptions: regions, lines, ….
 Example region descriptors: size, aspect ratio, average grey level, etc.
 Pick a feature in the left image.
 Take each feature in the right image in turn (or just those close to the epipolar line), and measure how different it is from the original feature:
Feature-Based Methods

 S is a measure of similarity
 w0 etc. are weights, and
 the other symbols are different measures of the feature in the
right and left images, such as length, orientation, average grey
level and so on.
 Choose the right-image feature with the largest value of S
as the best match for the original left-image feature.
 Repeat starting from the matched feature in the right image,
to see if we achieve consistency.
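The matching loop described above can be sketched in Python. The exact form of S is not reproduced in this text, so the sketch assumes one plausible choice: S is the reciprocal of one plus a weighted sum of squared differences between feature measures. The feature fields, the weights and the function names are all hypothetical.

```python
def similarity(f, g, weights):
    """Assumed similarity: large when the weighted differences are small."""
    cost = sum(w * (f[k] - g[k]) ** 2 for k, w in weights.items())
    return 1.0 / (1.0 + cost)


def best_match(feature, candidates, weights):
    """Index of the candidate with the largest similarity S."""
    return max(range(len(candidates)),
               key=lambda j: similarity(feature, candidates[j], weights))


def consistent_matches(left, right, weights):
    """Keep (i, j) only if left i picks right j and right j picks left i."""
    matches = []
    for i, f in enumerate(left):
        j = best_match(f, right, weights)
        if best_match(right[j], left, weights) == i:
            matches.append((i, j))
    return matches


weights = {"length": 1.0, "orientation": 10.0, "grey": 0.1}
left = [
    {"length": 10, "orientation": 0.0, "grey": 100},
    {"length": 30, "orientation": 1.0, "grey": 50},
    {"length": 14, "orientation": 0.3, "grey": 90},  # no true counterpart
]
right = [
    {"length": 29, "orientation": 1.1, "grey": 52},
    {"length": 11, "orientation": 0.1, "grey": 98},
]
matches = consistent_matches(left, right, weights)
```

The third left feature prefers the second right feature, but that feature prefers the first left one, so the consistency check discards it.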
Feature Based Methods

 It is possible to use very simple features (just points, in effect) if the constraint that the disparity should vary smoothly is taken into account.
 Feature-based methods give a sparse set of disparities: disparities are only found at feature positions.
Correlation Based Methods

 Take a small patch of the left image as a mask and convolve it (strictly, cross-correlate it) with the part of the right image close to the epipolar line.
 Over an area
 The peak of the convolution output gives the position
of the matching area of the right image, and hence the
disparity of the best match.
Correlation-Based Methods

[Figure: a section of the left image is convolved with the right image; the convolution peak lies at the position of the corresponding patch in the right image. Where to search?]

 Convolution-based methods can give a dense set of disparities: disparities are found for every pixel.
 These methods can be very computationally intensive, but can be done efficiently on parallel hardware.
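A dense matcher of this kind can be sketched in plain Python. Note the substitution: instead of locating a convolution peak, this sketch minimises the sum of absolute differences (SAD) over a small window, which is a common stand-in for the same idea. It assumes a rectified pair, so the search runs along the same row, and it assumes disparity is left x minus right x. All names are illustrative.

```python
import random


def window_sad(left, right, y, x_left, x_right, half):
    """Sum of absolute differences between two square windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x_left + dx] - right[y + dy][x_right + dx])
    return total


def disparity_map(left, right, max_disp, half=1):
    """Dense disparity for a rectified pair of grey-level images.

    Border pixels where the window does not fit are left as None.
    """
    height, width = len(left), len(left[0])
    disp = [[None] * width for _ in range(height)]
    for y in range(half, height - half):
        for x in range(half, width - half):
            best_d, best_cost = None, None
            for d in range(max_disp + 1):
                x_right = x - d
                if x_right - half < 0:
                    break  # window would leave the right image
                cost = window_sad(left, right, y, x, x_right, half)
                if best_cost is None or cost < best_cost:
                    best_d, best_cost = d, cost
            disp[y][x] = best_d
    return disp


# Synthetic check: the right image is the left image shifted by 2 pixels.
rng = random.Random(0)
left = [[rng.randrange(256) for _ in range(12)] for _ in range(6)]
right = [[row[x + 2] if x + 2 < len(row) else 0 for x in range(len(row))]
         for row in left]
disp = disparity_map(left, right, max_disp=4)
```

The triple loop over pixels, disparities and window elements is what makes these methods expensive, and each pixel is independent of the others, which is why they map well onto parallel hardware.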
Stereo Correspondence

 Epipolar Constraint
 Reduces correspondence problem to 1D search along
conjugate epipolar lines

 Stereo rectification: make epipolar lines horizontal
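To make the 1D search concrete, here is a small Python sketch using the fundamental matrix, which is one standard way to express the epipolar constraint but is not introduced in these slides. It assumes the convention that a left-image point p maps to the right-image line l' = F p, with lines written as (a, b, c) for a x + b y + c = 0. The matrix `F_RECTIFIED` is the idealised form for a rectified pair with purely horizontal displacement.

```python
def epipolar_line(F, point):
    """Right-image epipolar line (a, b, c) for a left-image point (x, y)."""
    p = (point[0], point[1], 1.0)
    return tuple(sum(F[r][c] * p[c] for c in range(3)) for r in range(3))


def distance_to_line(line, point):
    """Perpendicular distance from a point (x, y) to the line (a, b, c)."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c) / (a * a + b * b) ** 0.5


# Idealised rectified pair: every epipolar line is the row y' = y.
F_RECTIFIED = ((0, 0, 0),
               (0, 0, -1),
               (0, 1, 0))

line = epipolar_line(F_RECTIFIED, (120, 45))
```

For the rectified case the returned line reduces to y' = 45, so candidate matches only need to be sought along that row.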


Correspondence Difficulties

 Why is the correspondence problem difficult?
 Some points in each image will have no corresponding points in the other image.
(1) the cameras might have different fields of view.
(2) due to occlusion.
 A stereo system must be able to determine the image parts that should not be matched.
Variations
 Scale-space. Methods that exploit scale-space can be very powerful.
 Smooth the image using Gaussian kernels.
 Match each feature with the nearest one in the other image; the location will be approximate because of blurring.
 Use the disparities found to guide the search for matches in a less smoothed image.
 Relaxation. This is important in the context of neural modelling.
 Set up possible feature matches in a network.
 Construct an energy function that captures the constraints of the problem.
 Make incremental changes to the matches so that the energy is steadily reduced.
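The relaxation idea can be sketched in Python on a single scanline. This is not a neural network model; it is a simple greedy scheme (iterated conditional modes) chosen for illustration. It assumes an energy made of a per-pixel matching cost plus a smoothness penalty `lam` times the absolute difference between neighbouring disparities. The cost table and all names are hypothetical.

```python
def energy(disp, data_cost, lam):
    """Matching cost plus a penalty on disparity jumps between neighbours."""
    e = sum(data_cost[i][d] for i, d in enumerate(disp))
    e += lam * sum(abs(disp[i] - disp[i + 1]) for i in range(len(disp) - 1))
    return e


def relax(data_cost, lam=1.0, max_sweeps=20):
    """Start from the best per-pixel match, then lower the energy greedily."""
    n = len(data_cost)
    labels = range(len(data_cost[0]))
    disp = [min(labels, key=lambda d: data_cost[i][d]) for i in range(n)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local(d):
                # Energy terms that depend on pixel i only.
                e = data_cost[i][d]
                if i > 0:
                    e += lam * abs(d - disp[i - 1])
                if i < n - 1:
                    e += lam * abs(d - disp[i + 1])
                return e
            best = min(labels, key=local)
            if local(best) < local(disp[i]):
                disp[i] = best
                changed = True
        if not changed:
            break
    return disp


# data_cost[i][d]: cost of giving pixel i disparity d (three candidates).
data_cost = [[5, 0, 5], [5, 0, 5], [3, 1, 0], [5, 0, 5], [5, 0, 5]]
initial = [1, 1, 2, 1, 1]  # per-pixel winners, with one outlier
smoothed = relax(data_cost, lam=1.0)
```

Each accepted change strictly lowers the energy, so the loop terminates, although it may stop in a local minimum.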
Other Aspects of Stereo

 Very precise stereo systems can be made to estimate disparity at sub-pixel accuracy. This is important for industrial inspection and mapping from aerial and satellite images.
 In controlled environments, structured light (e.g. stripes of light) can be used to provide easy matches for a stereo system.
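One common route to sub-pixel disparity, shown here as a Python sketch, is to fit a parabola through the matching cost at the best integer disparity and its two neighbours and take the vertex. The slides do not say which method is meant, so this is an assumed technique, and the function name and sample costs are illustrative.

```python
def subpixel_disparity(d_best, cost_minus, cost_best, cost_plus):
    """Refine an integer disparity by parabola fitting.

    cost_minus, cost_best, cost_plus are the matching costs at
    d_best - 1, d_best and d_best + 1 (lower cost means better match).
    """
    curvature = cost_minus - 2.0 * cost_best + cost_plus
    if curvature <= 0:
        return float(d_best)  # flat or inverted: no reliable refinement
    return d_best + (cost_minus - cost_plus) / (2.0 * curvature)


# Costs sampled from (d - 4.3)^2 at d = 3, 4, 5.
refined = subpixel_disparity(4, 1.69, 0.09, 0.49)
```

Because the sample costs are exactly quadratic, the fit recovers the true minimum at 4.3; real cost curves are only approximately parabolic near the minimum.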
Structured Light

 Structured lighting
 Feature-based methods are not applicable when the objects have smooth surfaces (i.e., sparse disparity maps make surface reconstruction difficult).
 Patterns of light are projected onto the surface of objects, creating interesting points even in regions which would be otherwise smooth.
 Finding and matching such points is simplified by knowing the geometry of the projected patterns.
Stanford’s Digital Michelangelo Project

 http://graphics.stanford.edu/projects/mich/

maximum height of gantry: 7.5 meters
weight including subbase: 800 kilograms
Stanford’s Digital Michelangelo Project

480 individually aimed scans
2 billion polygons
7,000 color images
32 gigabytes
30 nights of scanning
1,080 man-hours
22 people
Correspondence Problems

 Which point matches with which point?


Difficulties

 Multiple matches are always likely


 Simple features (e.g., black dots)
 large number of potential matches
 precise disparity
 Complex features (e.g., polygons)
 small number of potential matches
 less precise disparity
Trinocular Stereo

From H. G. Wells, War of the Worlds


Three Camera Stereo

 A powerful way to eliminate spurious matches
 Hypothesize matches between A & B
 Matches between A & C on green epipolar line
 Matches between B & C on red epipolar line
 There had better be something at the intersection
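The three-camera check can be sketched in Python. The sketch assumes the two epipolar lines in image C are already available as (a, b, c) triples for a x + b y + c = 0, intersects them with a cross product, and accepts the hypothesized A-B match only if some feature in C lies within a tolerance of the intersection. The names and the tolerance are hypothetical.

```python
def line_intersection(l1, l2):
    """Intersection of two image lines, or None if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-12:
        return None
    x = (b1 * c2 - b2 * c1) / w
    y = (c1 * a2 - c2 * a1) / w
    return (x, y)


def match_is_supported(line_from_a, line_from_b, features_c, tol=1.0):
    """True if a feature in image C lies near the epipolar intersection."""
    p = line_intersection(line_from_a, line_from_b)
    if p is None:
        return False
    return any((fx - p[0]) ** 2 + (fy - p[1]) ** 2 <= tol ** 2
               for fx, fy in features_c)


# Lines x = 3 and y = 5 in image C meet at (3, 5).
green = (1.0, 0.0, -3.0)
red = (0.0, 1.0, -5.0)
supported = match_is_supported(green, red, [(3.2, 4.9), (40.0, 7.0)])
spurious = match_is_supported(green, red, [(40.0, 7.0)])
```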
