Lecture18
Camera Calibration
CSE 803– Fall 2024, MSU
Xiaoming Liu
Thanks to the many researchers who have made their slides and course materials available.
Our goal: Recovery of 3D structure
A. Criminisi, M. Kemp, and A. Zisserman, Bringing Pictorial Space to Life: computer techniques for the
analysis of paintings, Proc. Computers and the History of Art, 2002
Next few classes
Photosensitive Material
Slide inspired by S. Seitz; image from Michigan Engineering
Projection Matrix
f
O
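\[
\begin{bmatrix} fx \\ fy \\ z \end{bmatrix}
\equiv
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\rightarrow
\begin{bmatrix} fx/z \\ fy/z \end{bmatrix}
\]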
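A minimal sketch of the projection above, assuming NumPy; the helper name `project_pinhole` and the numeric values are illustrative and do not come from the slides. It applies the 3x4 matrix to a homogeneous point and then divides by depth.

```python
import numpy as np

def project_pinhole(point_3d, f):
    """Project a 3D point (x, y, z) in camera coordinates with focal length f."""
    # 3x4 projection matrix: [f 0 0 0; 0 f 0 0; 0 0 1 0]
    P = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    X_h = np.append(np.asarray(point_3d, dtype=float), 1.0)  # homogeneous 4-vector
    fx, fy, z = P @ X_h                  # equals (f*x, f*y, z)
    return np.array([fx / z, fy / z])    # divide by depth

uv_near = project_pinhole([2.0, 1.0, 4.0], f=2.0)
uv_far = project_pinhole([4.0, 2.0, 8.0], f=2.0)   # same ray, twice as far away
print(uv_near, uv_far)
```

Both points land on the same pixel because they lie on the same ray through the camera center, which is why a single view cannot recover depth on its own.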
X?
X?
X?
http://en.wikipedia.org/wiki/Ames_room
Slide Credit: J. Hays
Single-view Ambiguity
x
x
Exploit disparity as
depth cue using single
image.
(Single image random
dot stereogram, Single
image stereogram)
R,t
• One option: move, find correspondence.
• If you know how you moved and have a
calibrated camera, can solve for X
Original diagram credit: S. Lazebnik
Knowing R,t
credit: D Fouhey
Yeah, yeah, but…
You haven’t been here before, yet you probably
have a fairly good understanding of this scene.
credit: D Fouhey
Pictorial Cues – Shading
[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]
Pictorial Cues – Perspective effects
Desk surface:
probably flat
credit: D Fouhey
Reality of 3D Perception
Really fantastic article on cues for 3D from Cutting and Vishton, 1995: https://pmvish.people.wm.edu/cutting%26vishton1995.pdf
Multi-view geometry problems
Calibration:
We need camera
intrinsics / K in order
to figure out where
the rays are
Camera 1
K
credit: D Fouhey
Slide credit: Noah Snavely
Multi-view geometry problems
Recovering structure:
Given cameras and correspondences, find 3D.
Camera 1 Camera 2 Camera 3
R1,t1 R2,t2 R3,t3
Slide credit: Noah Snavely
Multi-view geometry problems
Stereo/Epipolar Geometry:
Given 2 cameras, find where a point could be.
Camera 1 Camera 2 Camera 3
R1,t1 R2,t2 R3,t3
Slide credit: Noah Snavely
Multi-view geometry problems
Motion:
Figure out R, t for a set of cameras given correspondences.
Camera 1 Camera 2 Camera 3
R1,t1 ? R2,t2 ? R3,t3 ?
Slide credit: Noah Snavely
Outline
• (Today) Calibration:
• Getting intrinsic matrix/K
• Single view geometry:
• measurements with 1 image
• Stereo/Epipolar geometry:
• 2 pictures → depthmap
• Structure from motion (SfM):
• 2+ pictures → cameras, pointcloud
credit: D Fouhey
Typical Perspective Model
principal point (image coords
of camera origin on retina)
Just moves camera origin
focal length
rotation translation
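\[
\mathbf{p} \equiv
\begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R}_{3\times3} & \mathbf{t}_{3\times1} \end{bmatrix}
\mathbf{X}_{4\times1}
\]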
2D Projection of X 3D point
credit: D Fouhey
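A small sketch of the perspective model p ≡ K [R | t] X, assuming NumPy. The function name `project` and the values of f, u0, v0, R, and t are made up for illustration and are not from the slides.

```python
import numpy as np

def project(K, R, t, X):
    """Project a 3D world point X (shape (3,)) to pixel coordinates via p = K [R | t] X."""
    Rt = np.hstack([R, t.reshape(3, 1)])   # 3x4 extrinsics [R | t]
    p = K @ Rt @ np.append(X, 1.0)         # homogeneous 3-vector
    return p[:2] / p[2]                    # divide out the homogeneous scale

f, u0, v0 = 500.0, 320.0, 240.0            # illustrative intrinsics
K = np.array([[f, 0.0, u0],
              [0.0, f, v0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                              # no rotation
t = np.array([0.0, 0.0, 5.0])              # world origin sits 5 units in front of the camera

center = project(K, R, t, np.array([0.0, 0.0, 0.0]))
offset = project(K, R, t, np.array([1.0, 0.0, 0.0]))
print(center, offset)
```

A point on the optical axis lands on the principal point (u0, v0); moving it one unit along X at depth 5 shifts the pixel by f/5 horizontally.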
Camera Calibration
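\[
\mathbf{p} \equiv
\begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R}_{3\times3} & \mathbf{t}_{3\times1} \end{bmatrix}
\mathbf{X}_{4\times1}
\]
\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
\equiv
\mathbf{M}_{3\times4}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\]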
If I can get pairs of [X,Y,Z] and [u,v]
→ equations to constrain M
How do I get [X,Y,Z], [u,v]?
credit: D Fouhey
Camera Calibration
A funny object with multiple planes.
credit: D Fouhey
Camera Calibration Targets
Using a tape measure
Known 2D image coords (u v)    Known 3D locations (X Y Z)
880 214 312.747 309.140 30.086
43 203 305.796 311.649 30.356
270 197 307.694 312.358 30.418
886 347 310.149 307.186 29.298
745 302 311.937 310.105 29.216
943 128 311.202 307.572 30.682
476 590 307.106 306.876 28.660
419 214 309.317 312.490 30.230
317 335 307.435 310.151 29.318
783 521 308.253 306.300 28.881
235 427 306.650 309.301 28.905
665 429 308.069 306.831 29.189
655 362 309.671 308.834 29.029
427 333 308.255 309.955 29.267
412 415 307.546 308.613 28.963
746 351 311.036 309.206 28.913
434 415 307.518 308.175 29.069
525 234 309.950 311.262 29.990
716 308 312.160 310.772 29.080
602 187 311.988 312.709 30.514
Image credit: J. Hays
Camera Calibration Targets
A set of views of a plane (not covered today)
…
credit: D Fouhey
Camera Calibration Targets
A single, huge plane. What’s this for?
credit: D Fouhey
Camera calibration
pi
\[
\mathbf{p}_i \times \mathbf{M}\mathbf{X}_i = \mathbf{0}
\]
credit: D Fouhey
Camera Calibration: Linear Method
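\[
\mathbf{p}_i \times \mathbf{M}\mathbf{X}_i = \mathbf{0}
\]
\[
\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix}
\times
\begin{bmatrix} \mathbf{M}_1 \mathbf{X}_i \\ \mathbf{M}_2 \mathbf{X}_i \\ \mathbf{M}_3 \mathbf{X}_i \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
…Some tedious math occurs…
(see Homography derivation)
\[
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{X}_i^T & v_i \mathbf{X}_i^T \\
\mathbf{X}_i^T & \mathbf{0}^T & -u_i \mathbf{X}_i^T \\
-v_i \mathbf{X}_i^T & u_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{bmatrix} \mathbf{M}_1^T \\ \mathbf{M}_2^T \\ \mathbf{M}_3^T \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]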
credit: D Fouhey
Camera Calibration: Linear Method
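\[
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{X}_i^T & v_i \mathbf{X}_i^T \\
\mathbf{X}_i^T & \mathbf{0}^T & -u_i \mathbf{X}_i^T \\
-v_i \mathbf{X}_i^T & u_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{bmatrix} \mathbf{M}_1^T \\ \mathbf{M}_2^T \\ \mathbf{M}_3^T \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]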
Derivation from S. Lazebnik; note we negate one of the equations from the cross product
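The linear method can be sketched as follows, assuming NumPy. The function name `calibrate_dlt` and the synthetic camera are illustrative, not from the slides. Each correspondence contributes the first two rows of the 3-row system (the third row is a linear combination of them), and the stacked homogeneous system is solved with the SVD.

```python
import numpy as np

def calibrate_dlt(pts_2d, pts_3d):
    """Estimate the 3x4 matrix M (up to scale) from N >= 6 correspondences."""
    rows = []
    for (u, v), X in zip(pts_2d, pts_3d):
        Xh = np.append(X, 1.0)                 # homogeneous 3D point, shape (4,)
        zero = np.zeros(4)
        rows.append(np.concatenate([zero, -Xh, v * Xh]))   # [0^T, -X^T, v X^T]
        rows.append(np.concatenate([Xh, zero, -u * Xh]))   # [X^T, 0^T, -u X^T]
    A = np.array(rows)                         # shape (2N, 12)
    # The unit vector minimizing ||A m|| is the right singular vector
    # belonging to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)                # rows are M_1, M_2, M_3

# Synthetic check with made-up values (not from the slides).
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.1], [-0.2], [5.0]])])
M_true = K @ Rt
pts_3d = rng.uniform(-1.0, 1.0, size=(10, 3))              # non-coplanar points
proj = (M_true @ np.c_[pts_3d, np.ones(10)].T).T
pts_2d = proj[:, :2] / proj[:, 2:]

M_est = calibrate_dlt(pts_2d, pts_3d)
M_est = M_est * (M_true[2, 3] / M_est[2, 3])               # fix the arbitrary scale before comparing
proj_est = (M_est @ np.c_[pts_3d, np.ones(10)].T).T
max_err = np.abs(proj_est[:, :2] / proj_est[:, 2:] - pts_2d).max()
print(max_err)
```

M is only defined up to scale, so the estimate is rescaled before it is compared with the ground truth; the reprojection error is unaffected by that scale.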
In Practice
Degenerate configurations (e.g., all points on one
plane) are an issue. Usually need multiplane targets.
credit: D Fouhey
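A quick numerical illustration of the degeneracy, assuming NumPy and a made-up camera matrix; the helper names `dlt_matrix` and `project_all` are hypothetical. It builds the 2N x 12 linear system for non-coplanar points and for points on the plane Z = 0, then compares the ranks.

```python
import numpy as np

def dlt_matrix(pts_2d, pts_3d):
    """Stack two linear constraints on the 12 entries of M per correspondence."""
    rows = []
    for (u, v), X in zip(pts_2d, pts_3d):
        Xh = np.append(X, 1.0)
        zero = np.zeros(4)
        rows.append(np.concatenate([zero, -Xh, v * Xh]))
        rows.append(np.concatenate([Xh, zero, -u * Xh]))
    return np.array(rows)

def project_all(M, pts_3d):
    p = (M @ np.c_[pts_3d, np.ones(len(pts_3d))].T).T
    return p[:, :2] / p[:, 2:]

rng = np.random.default_rng(0)
M = np.array([[800.0, 0.0, 320.0, 100.0],      # illustrative camera, not from the slides
              [0.0, 800.0, 240.0, -50.0],
              [0.0, 0.0, 1.0, 5.0]])
general = rng.uniform(-1.0, 1.0, size=(12, 3))
planar = general.copy()
planar[:, 2] = 0.0                             # force every point onto the plane Z = 0

rank_general = np.linalg.matrix_rank(dlt_matrix(project_all(M, general), general), tol=1e-8)
rank_planar = np.linalg.matrix_rank(dlt_matrix(project_all(M, planar), planar), tol=1e-8)
print(rank_general, rank_planar)
```

With non-coplanar points the system has rank 11, leaving a one-dimensional null space (M up to scale). With coplanar points the rank drops to 8, so a whole family of matrices fits the data equally well and M cannot be recovered from that single plane.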
In Practice
I pulled a fast one.
credit: D Fouhey
In Practice
If pi ≡ MXi is overconstrained, the objective the linear method minimizes (an algebraic error)
isn’t actually the one you care about.
Instead:
1) initialize parameters with linear model
2) Apply off-the-shelf non-linear optimizer to:
X?
p1 p2
credit: D Fouhey
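A minimal sketch of the two-step recipe above, assuming the non-linear objective is the squared reprojection error between the observed pixels and the projections of M X_i. It uses NumPy only, with a hand-written Gauss-Newton loop standing in for an off-the-shelf optimizer such as `scipy.optimize.least_squares`. The helper names, the noise level, and the perturbed starting matrix (a stand-in for the linear estimate) are all illustrative.

```python
import numpy as np

def reproject(M, pts_3d):
    """Pixel coordinates of 3D points under the 3x4 matrix M."""
    p = (M @ np.c_[pts_3d, np.ones(len(pts_3d))].T).T
    return p[:, :2] / p[:, 2:]

def unpack(params):
    """11 free parameters -> 3x4 matrix with M[2, 3] fixed to 1 (removes the scale ambiguity)."""
    return np.append(params, 1.0).reshape(3, 4)

def residuals(params, pts_2d, pts_3d):
    return (reproject(unpack(params), pts_3d) - pts_2d).ravel()

def refine(M0, pts_2d, pts_3d, iters=20, eps=1e-6):
    """Gauss-Newton on reprojection error with a finite-difference Jacobian.
    Assumes M0[2, 3] != 0."""
    params = (M0 / M0[2, 3]).ravel()[:11]
    r = residuals(params, pts_2d, pts_3d)
    for _ in range(iters):
        J = np.empty((r.size, params.size))
        for k in range(params.size):
            d = np.zeros_like(params)
            d[k] = eps
            J[:, k] = (residuals(params + d, pts_2d, pts_3d) - r) / eps
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        scale = 1.0
        while scale > 1e-4:                    # backtrack so the cost never increases
            r_new = residuals(params + scale * step, pts_2d, pts_3d)
            if r_new @ r_new <= r @ r:
                params, r = params + scale * step, r_new
                break
            scale *= 0.5
    return unpack(params)

# Synthetic data with made-up values (not from the slides).
rng = np.random.default_rng(0)
M_true = np.array([[800.0, 0.0, 320.0, 100.0],
                   [0.0, 800.0, 240.0, -50.0],
                   [0.0, 0.0, 1.0, 5.0]])
pts_3d = rng.uniform(-1.0, 1.0, size=(20, 3))
pts_2d = reproject(M_true, pts_3d) + rng.normal(scale=0.5, size=(20, 2))  # noisy detections
M0 = M_true * (1.0 + 0.01 * rng.normal(size=(3, 4)))   # stand-in for the linear estimate

def rms_error(M):
    return np.sqrt(np.mean((reproject(M, pts_3d) - pts_2d) ** 2))

rms_init = rms_error(M0)
M_refined = refine(M0, pts_2d, pts_3d)
rms_refined = rms_error(M_refined)
print(rms_init, rms_refined)
```

Fixing M[2, 3] = 1 is one simple way to remove the scale ambiguity; it fails if that entry is zero, in which case a different normalization would be needed.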
Triangulation
Rays in principle should intersect, but in practice
usually don’t exactly due to noise, numerical errors.
X?
p1 p2
credit: D Fouhey
Triangulation – Geometry
Find shortest segment between viewing rays, set X to
be the midpoint of the segment.
p1 p2
credit: D Fouhey
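The midpoint construction can be sketched as below, assuming NumPy and assuming the ray origins (camera centers) and directions have already been computed from calibrated cameras. The function name `triangulate_midpoint` and the example rays are illustrative.

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between the lines o1 + s*d1 and o2 + t*d2."""
    # Choose s, t minimizing ||(o1 + s*d1) - (o2 + t*d2)|| in the least-squares sense.
    A = np.stack([d1, -d2], axis=1)                       # shape (3, 2)
    (s, t), *_ = np.linalg.lstsq(A, o2 - o1, rcond=None)
    closest_1 = o1 + s * d1
    closest_2 = o2 + t * d2
    return 0.5 * (closest_1 + closest_2)

o1 = np.array([0.0, 0.0, 0.0])
o2 = np.array([1.0, 0.0, 0.0])
X_true = np.array([0.5, 0.2, 4.0])
d1 = X_true - o1
d2 = X_true - o2

X_exact = triangulate_midpoint(o1, d1, o2, d2)                               # rays intersect
X_noisy = triangulate_midpoint(o1, d1, o2, d2 + np.array([0.0, 0.01, 0.0]))  # rays miss slightly
print(X_exact, X_noisy)
```

This sketch treats the rays as full lines, so it does not check that the closest points lie in front of the cameras, and nearly parallel rays give an ill-conditioned solve.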
Triangulation – Non-linear Optim.
Find X minimizing \( d(\mathbf{p}_1, \mathbf{M}_1\mathbf{X})^2 + d(\mathbf{p}_2, \mathbf{M}_2\mathbf{X})^2 \)
p1 p2
M1X M2X
credit: D Fouhey
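A short sketch of the objective itself, assuming NumPy and taking d to be Euclidean distance in pixels. The two cameras and the 3D point are made up for illustration. Any non-linear optimizer could minimize this cost over X, typically starting from a linear estimate.

```python
import numpy as np

def project(M, X):
    """Pixel coordinates of the 3D point X under the 3x4 matrix M."""
    p = M @ np.append(X, 1.0)
    return p[:2] / p[2]

def reprojection_cost(X, Ms, ps):
    """Sum over views of d(p_k, M_k X)^2, with d the Euclidean pixel distance."""
    return sum(np.sum((project(M, X) - p) ** 2) for M, p in zip(Ms, ps))

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                       # camera at the origin
M2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])       # camera shifted 1 unit along x
X_true = np.array([0.5, 0.2, 4.0])
p1, p2 = project(M1, X_true), project(M2, X_true)

cost_at_truth = reprojection_cost(X_true, [M1, M2], [p1, p2])
cost_nearby = reprojection_cost(X_true + np.array([0.0, 0.0, 0.5]), [M1, M2], [p1, p2])
print(cost_at_truth, cost_nearby)
```

With noise-free observations the cost vanishes at the true point and grows as X moves away from it; with noisy observations the minimum is generally nonzero.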
Triangulation – Linear Optimization
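\[
\mathbf{p}_1 \equiv \mathbf{M}_1 \mathbf{X}
\;\Rightarrow\;
\mathbf{p}_1 \times \mathbf{M}_1 \mathbf{X} = \mathbf{0}
\;\Rightarrow\;
[\mathbf{p}_1]_\times \mathbf{M}_1 \mathbf{X} = \mathbf{0}
\]
\[
\mathbf{p}_2 \equiv \mathbf{M}_2 \mathbf{X}
\;\Rightarrow\;
\mathbf{p}_2 \times \mathbf{M}_2 \mathbf{X} = \mathbf{0}
\;\Rightarrow\;
[\mathbf{p}_2]_\times \mathbf{M}_2 \mathbf{X} = \mathbf{0}
\]
Cross product as matrix:
\[
\mathbf{a} \times \mathbf{b} =
\begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= [\mathbf{a}]_\times \mathbf{b}
\]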
credit: D Fouhey
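A minimal sketch of linear triangulation, assuming NumPy. The function names `skew` and `triangulate_linear` and the two example cameras are illustrative. The constraints [p_k]_x M_k X = 0 from all views are stacked and the homogeneous X is read off the SVD.

```python
import numpy as np

def skew(a):
    """Matrix [a]_x such that skew(a) @ b equals np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def triangulate_linear(ps, Ms):
    """Solve the stacked system [p_k]_x M_k X = 0 for homogeneous X via SVD."""
    A = np.vstack([skew(np.append(p, 1.0)) @ M for p, M in zip(ps, Ms)])  # (3 * views, 4)
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]                       # right singular vector of the smallest singular value
    return X_h[:3] / X_h[3]            # assumes the point is not at infinity

def project(M, X):
    p = M @ np.append(X, 1.0)
    return p[:2] / p[2]

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
M2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
p1, p2 = project(M1, X_true), project(M2, X_true)

X_est = triangulate_linear([p1, p2], [M1, M2])
print(X_est)
```

With noise-free pixels the recovered point matches the true one up to numerical precision; with noisy pixels this gives an algebraic least-squares estimate.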
Incidence Field is a Learnable Monocular 3D Prior
[Figure: pipeline diagram with panels labeled Depth & Normal, Pixel-wise Incidence Field, RANSAC Solver, and Intrinsic]
Tame a wild camera: in-the-wild monocular camera calibration. Shengjie Zhu, et al. NeurIPS’2023.
CVL Computer Vision Lab
Application: Image Resize & Crop Detection and Restoration