Lecture20 Calibration Cont, Stereo

The document discusses camera calibration and stereo vision, emphasizing the importance of intrinsic and extrinsic camera parameters for accurate measurements and navigation. It outlines methods for estimating camera parameters, including using known 3D geometry and multiple views, and introduces stereo vision concepts such as correspondence and reconstruction. The document also covers the basic principles of stereo vision, including triangulation and the relationship between disparity and depth.


Computer Vision: Camera Calibration, Stereo Vision

Prof. Majid Komeili
COMP 4102, Winter 2025
Camera Calibration
Extrinsic Parameters and Proj. Matrix
Do we need intrinsic camera params?
• If we are just labelling objects in images, we do not need to know the intrinsic camera params
• Also true for other applications like panoramas.
• If we want to measure objects or navigate in an environment, we do need the camera params
• Also true for applications like stereo vision.
• For high accuracy we need the linear params (K) and the non-linear params that cancel radial and tangential distortions.
• But we can still get results with only K
• We can get an estimate of K from the image
• But for the non-linear params we need calibration
How to find the camera parameters
• Can use the EXIF tag of any digital image
• It has the focal length f in millimeters but not the pixel size
• But you can get the pixel size from the camera manual
• If there is not much image distortion due to optics, this approach is sufficient (it only gets the linear params K)
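As a small sketch of this EXIF route, the focal length in pixels is the focal length in millimeters divided by the pixel size in millimeters; the example numbers below are hypothetical:

```python
# Convert an EXIF focal length (millimeters) to pixel units, given the
# sensor pixel size from the camera manual. Example values are hypothetical.
def focal_length_pixels(f_mm, pixel_size_mm):
    """fx = f / sx: focal length in pixels along one image axis."""
    return f_mm / pixel_size_mm

# e.g. a 5 mm lens on a sensor with 0.002 mm (2 micron) pixels:
fx = focal_length_pixels(5.0, 0.002)  # -> 2500.0 pixels
```

The same computation with sy gives fy; square pixels make fx = fy.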
• Can perform explicit camera calibration
• Put a calibration pattern in front of the camera
• Take a number of different pictures of this pattern
• Then run a calibration algorithm (there are different types)
• The result is the intrinsic camera parameters (linear and non-linear) and the extrinsic camera parameters of all the images
• Explicit calibration is needed when high accuracy is required
Calibration using known 3D geometry
• Use a calibration object with known 3D geometry
(often a box, not planar)
• Write projection equations linking known coordinates
of a set of 3D points and their projections and solve for
camera parameters
• Given a set of one or more images of the calibration
pattern estimate
• Intrinsic camera parameters
o (depend only on camera characteristics)
• Extrinsic camera parameters
o (depend only on the camera's position and orientation)
• We do not estimate distortion parameters
• Consider only one view of the calibration object
Estimate camera params – one view
[Figure: calibration pattern and the projection matrix relating it to the image]
Camera parameters
• Intrinsic parameters (K matrix)
• There are 5 intrinsic parameters
• Focal length f
• Pixel size in x and y directions, sx and sy
• Principal point ox, oy
• But they are not independent
• Focal length fx = f / sx and fy = f / sy
• Principal point ox, oy
• This makes four intrinsic parameters
• Extrinsic parameters [R| T]
• Rotation matrix and translation vector of camera
• Relates the camera position to a known world frame
• Projection matrix
• 3 by 4 matrix P =K [R | T] is called projection matrix
Projection Equations
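As a minimal numeric sketch of the projection equations, a world point is mapped to pixels by P = K [R | T] followed by the perspective divide; all parameter values below are hypothetical:

```python
import numpy as np

# Build the 3x4 projection matrix P = K [R | T]; all values are hypothetical.
fx, fy, ox, oy = 800.0, 800.0, 320.0, 240.0
K = np.array([[fx, 0, ox],
              [0, fy, oy],
              [0,  0, 1.0]])
R = np.eye(3)                        # camera aligned with the world frame
T = np.array([[0.0], [0.0], [5.0]])  # world origin 5 units in front of camera
P = K @ np.hstack([R, T])

# Project a homogeneous world point to pixel coordinates
Xw = np.array([1.0, 0.5, 5.0, 1.0])
u, v, w = P @ Xw
x, y = u / w, v / w                  # perspective divide -> (400.0, 280.0)
```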
Two different calibration methods
• Both use a set of 3d points and 2d projections
• Direct approach (e.g., Tsai's method)
• Write the projection equations in terms of all the parameters
o That is, all the unknown intrinsic and extrinsic parameters
• Solve for these parameters from non-linear equations
• Projection matrix approach
• Compute the projection matrix (the 3x4 matrix M)

• Compute camera parameters as closed-form functions of M


Two different calibration methods
• Both approaches work with same data
• Projection matrix approach is simpler to explain than the
direct approach
• Direct approach requires an extra step
• There are also other calibration methods
• But all calibration methods
• Use patterns with known geometry or shape
• Take multiple views of these patterns
• Match the information across the different views
• Perform some mathematics to calculate the intrinsic
and extrinsic camera parameters
• We look at the simplified case of only one view!
Estimating the projection matrix

        [ m11  m12  m13  m14 ]
    M = [ m21  m22  m23  m24 ]
        [ m31  m32  m33  m34 ]
Estimating the projection matrix
World – frame transform
• Drop the "w" subscript on world coordinates
• N pairs (xi, yi) <-> (Xi, Yi, Zi)
Linear equations in m
• Each pair gives 2 equations: 2N equations, 11 independent unknowns (m is defined only up to scale)
• N >= 6, SVD => m up to an unknown scale
Solving this Homogeneous System
• 2N linear equations of the form 𝐴𝒙 = 0
• If we have a solution 𝑥1 s.t. 𝐴𝑥1 = 0, then 𝑐𝑥1 is also a solution: 𝐴(𝑐𝑥1) = 0
• Need to add a constraint on 𝒙
• Basically, make 𝒙 a unit vector: 𝒙T𝒙 = 1
• Can prove that the constrained solution is the eigenvector corresponding to the zero eigenvalue of the matrix ATA
• This can be computed using an eigenvector or SVD routine
• Then finding the zero eigenvalue (in practice, the smallest)
• Returning the associated eigenvector
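A sketch of this estimation in NumPy: each correspondence contributes two rows to A, and the last right singular vector of A is the unit-norm minimizer of ‖Am‖ (the point and array names below are assumptions, not from the slides):

```python
import numpy as np

def estimate_projection_matrix(pts3d, pts2d):
    """Direct linear transform: build the 2N x 12 homogeneous system A m = 0
    for the 3x4 projection matrix and solve it, up to scale, via SVD.
    pts3d: (N, 3) world points; pts2d: (N, 2) image points; N >= 6."""
    A = []
    for (X, Y, Z), (x, y) in zip(pts3d, pts2d):
        Pw = [X, Y, Z, 1.0]
        A.append(Pw + [0.0] * 4 + [-x * c for c in Pw])  # x equation
        A.append([0.0] * 4 + Pw + [-y * c for c in Pw])  # y equation
    A = np.asarray(A)
    # The unit-norm solution is the right singular vector for the
    # smallest singular value (last row of Vt).
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```

Because m is recovered only up to scale (and sign), compare or use the result after normalizing, e.g. by dividing by m34.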
Decompose projection matrix
• 3x4 projection matrix M computed previously
• Encodes both intrinsic (4) and extrinsic (6) parameters – 10 in total
• From M̂ = γM to parameters (p. 134-135)
• You get 12 numbers once you have computed the projection matrix
• But you know that each of these numbers equals an element of γK[R | T]
• Now you use the constraints on a rotation matrix (rows are all unit length and mutually orthogonal) along with other constraints
• You apply these one at a time and can compute all of R, T and K
• You need some simple algebraic manipulation to accomplish this
• Remember to use the constraints on the rows of R to simplify the equations
From M̂ = γM to parameters (p. 134-135)
• Find the scale |γ| using the fact that 𝑅3𝑇 is a unit vector: |γ| = ‖𝑞3‖
• Divide the computed M by |γ| to get a new M matrix
• Determine 𝑇𝑧 and the sign of γ from 𝑚34 (the third component of the fourth column)
• If necessary, change the sign of every element of the new M so that 𝑇𝑧 > 0
• Define 𝑞1 = (𝑚11, 𝑚12, 𝑚13), 𝑞2 = (𝑚21, 𝑚22, 𝑚23) and 𝑞3 = (𝑚31, 𝑚32, 𝑚33)
• Find (𝑜𝑥, 𝑜𝑦) by dot products of rows: 𝑜𝑥 = 𝑞1 · 𝑞3, 𝑜𝑦 = 𝑞2 · 𝑞3, using the orthogonality constraints of R
• Determine 𝑓𝑥 and 𝑓𝑦 from 𝑞1 · 𝑞1 and 𝑞2 · 𝑞2
• Now compute all the rest: 𝑅1𝑇, 𝑅2𝑇, 𝑇𝑥, 𝑇𝑦
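These steps can be sketched as follows, assuming a zero-skew K with positive focal lengths written as fx, fy, ox, oy (sign conventions vary between texts; this sketch uses the positive-focal-length form, not necessarily the textbook's exact signs):

```python
import numpy as np

def decompose_projection(M):
    """Recover K, R, T from a 3x4 projection matrix M ~ K [R | T],
    following the slide's steps: normalize by the scale |gamma| = ||q3||,
    fix the sign so Tz > 0, then peel off the parameters one at a time.
    Assumes zero skew and positive fx, fy."""
    M = np.asarray(M, float)
    gamma = np.linalg.norm(M[2, :3])     # row 3 of R is unit length
    M = M / gamma
    if M[2, 3] < 0:                      # enforce Tz > 0
        M = -M
    q1, q2, q3 = M[0, :3], M[1, :3], M[2, :3]
    ox, oy = q1 @ q3, q2 @ q3            # principal point via orthogonality of R
    fx = np.sqrt(q1 @ q1 - ox**2)
    fy = np.sqrt(q2 @ q2 - oy**2)
    R3 = q3                              # q3 itself is the third row of R
    R1 = (q1 - ox * R3) / fx
    R2 = (q2 - oy * R3) / fy
    Tz = M[2, 3]
    Tx = (M[0, 3] - ox * Tz) / fx
    Ty = (M[1, 3] - oy * Tz) / fy
    K = np.array([[fx, 0, ox], [0, fy, oy], [0, 0, 1.0]])
    return K, np.vstack([R1, R2, R3]), np.array([Tx, Ty, Tz])
```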


Multiple View/Camera Calibration
• Previous math describes the calibration process for
a single image
• We usually take multiple images of the same calibration
target (from a variety of different views)
• Simultaneously find all extrinsic parameters and all the
intrinsic parameters of the single camera
• Also calibrates radial distortion, using the fact that the pattern contains straight lines
• OpenCV code can do this using a checkerboard
pattern
• Zhang’s algorithm is the one most used in practice
Stereo Vision
A simple system
Stereo
• Stereo
• Ability to infer information on the 3-D structure and distance
of a scene from two or more images taken from different
viewpoints
• Humans use only two eyes/images (try thumb trick)
• Two important problems in stereo
• Correspondence and reconstruction
• Correspondence
• Which parts of the left and right images are projections of the same scene element?
• Reconstruction
• Given correspondences in left and right images, and possibly
information on stereo geometry, compute the 3D location
and structure of the observed objects
What is stereo vision?
• Narrower formulation: given a calibrated binocular stereo
pair, fuse it to produce a depth image
Stereo
Depth from two views
Stereo = Correspond + Reconstruct

• Basic Reconstruction Principle: Triangulation


• Gives reconstruction as intersection of two rays
• Requires
▪ Camera calibration
▪ Point correspondence
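With noisy correspondences the two back-projected rays rarely intersect exactly; a common way to realize this triangulation is the midpoint method, sketched below (this is one reasonable choice, not necessarily the exact method intended in the lecture):

```python
import numpy as np

def triangulate_midpoint(C1, d1, C2, d2):
    """Reconstruct a 3D point as the midpoint of the shortest segment
    between two viewing rays p_i(t_i) = C_i + t_i * d_i.
    Solves the 2x2 closest-approach system for t1, t2."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # Conditions: the segment p1(t1) - p2(t2) is orthogonal to both d1 and d2
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(C2 - C1) @ d1, (C2 - C1) @ d2])
    t1, t2 = np.linalg.solve(A, b)
    return 0.5 * ((C1 + t1 * d1) + (C2 + t2 * d2))
```

When the rays do intersect (noise-free data), the midpoint is exactly the intersection point.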
Active (Fixation) Stereo – not simple
• Eyes can rotate so that
they fixate at a point
• The black dots in the
image
• Fixation is common in
mammals
• It happens a lot and is fast
• But building stereo
systems that fixate is very
difficult
• The geometry of the
matching process changes
with the fixation point
Stereo

Slide credit: Babak Taati


Simplest Case: Parallel images
• Image planes of cameras are
parallel to each other and to the
baseline
• Camera centers are at same
height
• Focal lengths are the same
Simple Stereo Camera
Parallel Calibrated Cameras

Slide credit: Babak Taati


Simple Stereo System
• Left and right image planes are coplanar
• Represented by 𝐼𝐿 and 𝐼𝑅
• So this means that all matching features are on the same
horizontal line
• So the search for matches can proceed on the same line
[Figure: corresponding features on the same horizontal line in the left and right images]
Simple Stereo System
• Consider 𝑃𝑙. Search for the corresponding point in the right image plane under the following constraints:
1. Candidate points lie on the red (epipolar) line
2. Candidate points lie to the left of the dashed line
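A toy 1-D sketch of this constrained search, minimizing sum of absolute differences over a small window along the matching scanline (window size and disparity limit are arbitrary choices; this uses the usual pixel-coordinate convention where the match lies at x_right <= x_left):

```python
import numpy as np

def scanline_match(left_row, right_row, x_left, half_win=2, max_disp=20):
    """For the pixel at x_left on a left-image scanline, search the same row
    of the right image, only at columns x <= x_left (epipolar + disparity-sign
    constraints), minimizing SAD over a small window. Toy 1-D sketch."""
    w = half_win
    patch = left_row[x_left - w : x_left + w + 1]
    best_x, best_cost = None, np.inf
    for x in range(max(w, x_left - max_disp), x_left + 1):
        cost = np.abs(right_row[x - w : x + w + 1] - patch).sum()
        if cost < best_cost:
            best_cost, best_x = cost, x
    return best_x, x_left - best_x   # matched column and disparity
```

A real matcher works on 2-D windows and all scanlines, but the constraint structure is the same.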
Simple Stereo System (2D)
• Distance between centers of projection is called the
baseline B.
• Centers of projection of cameras 𝐶𝐿 and 𝐶𝑅
• Point P in 3D space projects to 𝑃𝐿 and 𝑃𝑅
• 𝑋𝐿 and 𝑋𝑅 are co-ordinates of 𝑃𝐿 and 𝑃𝑅 with
respect to principal points 𝐶𝐿 and 𝐶𝑅
• These are camera co-ordinates in millimeters
• Z is the distance between point P and the baseline
• Z is called the depth
Simple Stereo System
Basic Stereo Derivations
[Figure: two parallel cameras with centers of projection separated by baseline B; each image x-axis is marked + toward the left and − toward the right]
Derive an expression for Z as a function of 𝑥𝑙, 𝑥𝑟, f and B.
Here 𝑥𝑟 is the projection on the rightmost camera, with the image axes oriented looking out from the cameras.
Stereo Derivations (camera coords)
With the left camera center at the origin and the right camera center at (B, 0, 0), similar triangles give (using the axis orientation in the figure):
𝑥𝑙 = −f X / Z and 𝑥𝑟 = f (B − X) / Z
Subtracting:
d = 𝑥𝑟 − 𝑥𝑙 = f B / Z => Z = f B / d, so Z ∝ 1/d
Note that 𝑥𝑙 is always to the left of 𝑥𝑟, so the disparity d >= 0 in this case.
If the image axes increase in the opposite direction, swap 𝑥𝑟 and 𝑥𝑙 in the formula for d.
Stereo Derivations
The units of f, B, d, and Z must be consistent for the equation Z = f·B/d to hold.
• f (focal length): Units can be in pixels or millimeters (mm), depending on
the coordinate system used. In stereo vision with digital images, it's
usually in pixels.
• B (baseline): This is a physical distance between the two camera centers.
Units are typically in millimeters (mm), centimeters (cm), or meters (m).
• d (disparity): This is the difference in pixel coordinates, so the unit is in
pixels.
• Z (depth): since Z = f·B/d, the units of Z will match the units of B (typically
mm, cm, or m), provided that f and d are both in pixels.
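A direct sketch of this unit bookkeeping (the example numbers are hypothetical):

```python
# Depth from disparity for the parallel (rectified) stereo geometry.
def depth_from_disparity(f_px, baseline, d_px):
    """Z = f * B / d: f in pixels, baseline in the physical unit you want
    Z in, disparity in pixels."""
    if d_px <= 0:
        raise ValueError("disparity must be positive (zero means point at infinity)")
    return f_px * baseline / d_px

# e.g. f = 800 px, B = 0.1 m, d = 16 px  ->  Z = 5.0 m
Z = depth_from_disparity(800.0, 0.1, 16.0)
```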
Why 𝑥𝑟 – 𝑥𝑙 is always positive
Range Versus Disparity – Z fixed for d
[Figure: points along these lines have the same L→R displacement (disparity)]
[Figure: uncertainty area for a one-pixel disparity]