Lecture18

Uploaded by

jm.zhang.97
Copyright
© All Rights Reserved

Intro to 3D +

Camera Calibration
CSE 803– Fall 2024, MSU

Xiaoming Liu

Thanks to the many researchers who have made their slides and course materials available
Our goal: Recovery of 3D structure

J. Vermeer, Music Lesson, 1662

A. Criminisi, M. Kemp, and A. Zisserman, Bringing Pictorial Space to Life: computer techniques for the
analysis of paintings, Proc. Computers and the History of Art, 2002
Next few classes

• First: some intuitions and examples from biological vision about 3D perception
• But first, a brief review
Let’s Take a Picture!

Photosensitive Material
Slide inspired by S. Seitz; image from Michigan Engineering
Projection Matrix

Projection (fx/z, fy/z) is matrix multiplication

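\[
\begin{bmatrix} fx \\ fy \\ z \end{bmatrix}
\equiv
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\;\rightarrow\;
\begin{bmatrix} fx/z \\ fy/z \end{bmatrix}
\]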

Slide inspired by S. Lazebnik
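A minimal NumPy sketch of the projection above, assuming points are already expressed in the camera frame with z > 0. The function name `project` and the sample values are illustrative choices, not taken from the slides. The two sample points lie on the same ray through the origin, so they land on the same pixel.

```python
import numpy as np

def project(points_xyz, f):
    """Project Nx3 camera-frame points with the 3x4 matrix from the slide."""
    P = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    # Append a 1 to each point to get homogeneous coordinates (x, y, z, 1).
    homog = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    fx_fy_z = homog @ P.T            # each row is (f*x, f*y, z)
    # Dividing by the last coordinate yields (f*x/z, f*y/z).
    return fx_fy_z[:, :2] / fx_fy_z[:, 2:3]

# Hypothetical sample points: the second is the first scaled by 2.
pts = np.array([[1.0, 2.0, 4.0],
                [2.0, 4.0, 8.0]])
uv = project(pts, f=2.0)
print(uv)   # both rows are (0.5, 1.0)
```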


Single-view Ambiguity

X?
X?
X?

• Given a calibrated camera and an image, we only know the ray corresponding to each pixel.
• Nowhere near enough constraints for X
Diagram credit: S. Lazebnik
Single-view Ambiguity

http://en.wikipedia.org/wiki/Ames_room
Slide Credit: J. Hays
Single-view Ambiguity

Diagram credit: J. Hays


Single-view Ambiguity

Rashad Alakbarov shadow sculptures


credit: D Fouhey
Resolving Single-view Ambiguity

• Shoot light (lasers etc.) out of your eyes!
• Con: not so biologically plausible, dangerous?
credit: D Fouhey
Resolving Single-view Ambiguity
X

x
x

• Stereo: given 2 calibrated cameras in different views and correspondences, can solve for X
Original diagram credit: S. Lazebnik
Human stereopsis: disparity

Human eyes fixate on a point in space: they rotate so that corresponding images form at the centers of the foveae.
credit: D Fouhey
Human stereopsis: disparity

Disparity occurs when eyes fixate on one object; others appear at different visual angles
Stereo photography and stereo viewers
Take two pictures of the same subject from two slightly
different viewpoints and display so that each eye sees
only one of the images.

Image from fisher-price.com


Invented by Sir Charles Wheatstone, 1838

Slide credit: J. Hays


http://www.johnsonshawmuseum.org
Slide credit: J. Hays
http://www.well.com/~jimg/stereo/stereo_list.html
Slide credit: J. Hays
http://www.well.com/~jimg/stereo/stereo_list.html
Slide credit: J. Hays
Autostereograms

Exploit disparity as
depth cue using single
image.
(Single image random
dot stereogram, Single
image stereogram)

Slide credit: J. Hays, Images from magiceye.com


Autostereograms

Slide credit: J. Hays, Images from magiceye.com


Yeah, yeah, but…
Not all animals see stereo:
Prey animals (large field of view to spot predators)
Stereoblind people
Resolving Single-view Ambiguity

R,t
• One option: move, find correspondence.
• If you know how you moved and have a
calibrated camera, can solve for X
Original diagram credit: S. Lazebnik
Knowing R,t

• How do you know how far you moved?
• Can solve via vision
• Can solve via ears
• Why does your inner ear have 3 ducts?
• Can solve via signals sent to muscles

credit: D Fouhey
Yeah, yeah, but…
You haven’t been here before, yet you probably
have a fairly good understanding of this scene.

credit: D Fouhey
Pictorial Cues – Shading

[Figure from Prados & Faugeras 2006]


Pictorial Cues – Texture

[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]
Pictorial Cues – Perspective effects

Image credit: S. Seitz


Pictorial Cues – Familiar Objects

Monitor: probably not


12 feet wide.

Desk surface:
probably flat
credit: D Fouhey
Reality of 3D Perception

• 3D perception is absurdly complex and involves integration of many cues:
• Learned cues for 3D
• Stereo between eyes
• Stereo via motion
• Integration of known motion signals to muscles (efference copy), acceleration sensed via ears
• Past experience of touching objects
• All connect: learned cues for 3D probably come from stereo/motion cues in large part

Really fantastic article on cues for 3D from Cutting and Vishton, 1995: https://pmvish.people.wm.edu/cutting%26vishton1995.pdf
Multi-view geometry problems
Calibration:
We need camera
intrinsics / K in order
to figure out where
the rays are

Camera 1
K
credit: D Fouhey
? Slide credit:
Noah Snavely
Multi-view geometry problems
Recovering structure:
Given cameras and correspondences, find 3D.

Camera 1 Camera 3
Camera 2
R1,t1 R2,t2 R3,t3 Slide credit:
Noah Snavely
Multi-view geometry problems
Stereo/Epipolar Geometry:
Given 2 cameras, find where a point could be

Camera 1 Camera 3
Camera 2
R1,t1 R2,t2 R3,t3 Slide credit:
Noah Snavely
Multi-view geometry problems
Motion:
Figure out R, t for a
set of cameras given
correspondences

Camera 1
R1,t1 ? Camera 2
R2,t2 ? ?
Camera 3
R3,t3 Slide credit:
Noah Snavely
Outline

• (Today) Calibration:
• Getting intrinsic matrix/K
• Single view geometry:
• measurements with 1 image
• Stereo/Epipolar geometry:
• 2 pictures → depthmap
• Structure from motion (SfM):
• 2+ pictures → cameras, pointcloud

credit: D Fouhey
Typical Perspective Model
principal point (image coords
of camera origin on retina)
Just moves camera origin
focal length

rotation translation

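\[
\mathbf{p} \equiv
\begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R}_{3\times 3} & \mathbf{t}_{3\times 1} \end{bmatrix}
\mathbf{X}_{4\times 1}
\]
p: 2D projection of X. X: 3D point.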
credit: D Fouhey
Camera Calibration
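\[
\mathbf{p} \equiv
\begin{bmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mathbf{R}_{3\times 3} & \mathbf{t}_{3\times 1} \end{bmatrix}
\mathbf{X}_{4\times 1}
\]
\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
\equiv \mathbf{M}_{3\times 4}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\]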
If I can get pairs of [X,Y,Z] and [u,v]
→ equations to constrain M
How do I get [X,Y,Z], [u,v]?
credit: D Fouhey
Camera Calibration
A funny object with multiple planes.

credit: D Fouhey
Camera Calibration Targets
Using a tape measure
Known 2D image coords    Known 3D locations
   u     v                  X         Y        Z
 880   214            312.747   309.140   30.086
  43   203            305.796   311.649   30.356
 270   197            307.694   312.358   30.418
 886   347            310.149   307.186   29.298
 745   302            311.937   310.105   29.216
 943   128            311.202   307.572   30.682
 476   590            307.106   306.876   28.660
 419   214            309.317   312.490   30.230
 317   335            307.435   310.151   29.318
 783   521            308.253   306.300   28.881
 235   427            306.650   309.301   28.905
 665   429            308.069   306.831   29.189
 655   362            309.671   308.834   29.029
 427   333            308.255   309.955   29.267
 412   415            307.546   308.613   28.963
 746   351            311.036   309.206   28.913
 434   415            307.518   308.175   29.069
 525   234            309.950   311.262   29.990
 716   308            312.160   310.772   29.080
 602   187            311.988   312.709   30.514
Image credit: J. Hays
Camera Calibration Targets
A set of views of a plane (not covered today)


credit: D Fouhey
Camera Calibration Targets
A single, huge plane. What’s this for?

credit: D Fouhey
Camera calibration

• Given n points with known 3D coordinates X_i and known image projections p_i, estimate the camera parameters

Slide credit: S. Lazebnik


Camera Calibration: Linear Method
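\[ \mathbf{p}_i \equiv \mathbf{M}\mathbf{X}_i \]
Remember (from geometry): this implies M X_i and p_i are scaled copies of each other
\[ \mathbf{p}_i = \lambda \mathbf{M}\mathbf{X}_i, \quad \lambda \neq 0 \]
Remember (from homography fitting): this implies their cross product is 0
\[ \mathbf{p}_i \times \mathbf{M}\mathbf{X}_i = \mathbf{0} \]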

credit: D Fouhey
Camera Calibration: Linear Method
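\[ \mathbf{p}_i \times \mathbf{M}\mathbf{X}_i = \mathbf{0} \]
\[
\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} \times
\begin{bmatrix} \mathbf{M}_1 \mathbf{X}_i \\ \mathbf{M}_2 \mathbf{X}_i \\ \mathbf{M}_3 \mathbf{X}_i \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
…Some tedious math occurs…
(see Homography derivation)
\[
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{X}_i^T & v_i \mathbf{X}_i^T \\
\mathbf{X}_i^T & \mathbf{0}^T & -u_i \mathbf{X}_i^T \\
-v_i \mathbf{X}_i^T & u_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{bmatrix} \mathbf{M}_1^T \\ \mathbf{M}_2^T \\ \mathbf{M}_3^T \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]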
credit: D Fouhey
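The skipped step can be written out directly. This is a sketch that assumes M_k denotes the k-th row of M, which is the reading consistent with the stacked vector of M_k^T in the final equation.

```latex
\[
\begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} \times
\begin{bmatrix} \mathbf{M}_1 \mathbf{X}_i \\ \mathbf{M}_2 \mathbf{X}_i \\ \mathbf{M}_3 \mathbf{X}_i \end{bmatrix}
=
\begin{bmatrix}
v_i \,\mathbf{M}_3 \mathbf{X}_i - \mathbf{M}_2 \mathbf{X}_i \\
\mathbf{M}_1 \mathbf{X}_i - u_i \,\mathbf{M}_3 \mathbf{X}_i \\
u_i \,\mathbf{M}_2 \mathbf{X}_i - v_i \,\mathbf{M}_1 \mathbf{X}_i
\end{bmatrix}
= \mathbf{0},
\qquad
\mathbf{M}_k \mathbf{X}_i = \mathbf{X}_i^T \mathbf{M}_k^T .
\]
```

Substituting the scalar identity on the right into each component and collecting the coefficients of M_1^T, M_2^T, M_3^T gives the three rows of the block matrix. The third component equals -u_i times the first minus v_i times the second, so only two of the three equations are linearly independent.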
Camera Calibration: Linear Method
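\[
\begin{bmatrix}
\mathbf{0}^T & -\mathbf{X}_i^T & v_i \mathbf{X}_i^T \\
\mathbf{X}_i^T & \mathbf{0}^T & -u_i \mathbf{X}_i^T \\
-v_i \mathbf{X}_i^T & u_i \mathbf{X}_i^T & \mathbf{0}^T
\end{bmatrix}
\begin{bmatrix} \mathbf{M}_1^T \\ \mathbf{M}_2^T \\ \mathbf{M}_3^T \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

How many linearly independent equations?
2
How many equations per [u,v] + [X,Y,Z] pair?
2
If M is 3x4, how many degrees of freedom?
11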
credit: D Fouhey
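The counts above also fix the minimum number of correspondences. A short worked version of the arithmetic, where n is the number of [u,v] + [X,Y,Z] pairs:

```latex
\[
\underbrace{3 \times 4}_{12\ \text{entries of } \mathbf{M}}
- \underbrace{1}_{\text{overall scale}}
= 11\ \text{degrees of freedom},
\qquad
2n \ge 11 \;\Rightarrow\; n \ge 6 .
\]
```

The scale is free because p_i ≡ M X_i holds only up to a nonzero multiple, so M and any nonzero multiple of M describe the same camera.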
Camera Calibration: Linear Method
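\[
\begin{bmatrix}
\mathbf{0}^T & \mathbf{X}_1^T & -v_1 \mathbf{X}_1^T \\
\mathbf{X}_1^T & \mathbf{0}^T & -u_1 \mathbf{X}_1^T \\
\vdots & \vdots & \vdots \\
\mathbf{0}^T & \mathbf{X}_n^T & -v_n \mathbf{X}_n^T \\
\mathbf{X}_n^T & \mathbf{0}^T & -u_n \mathbf{X}_n^T
\end{bmatrix}
\begin{bmatrix} \mathbf{M}_1^T \\ \mathbf{M}_2^T \\ \mathbf{M}_3^T \end{bmatrix}
= \mathbf{0}
\]
How do we solve problems of the form
\[ \arg\min_{\mathbf{n}} \|\mathbf{A}\mathbf{n}\|_2^2, \quad \|\mathbf{n}\|_2^2 = 1 \; ? \]
Eigenvector of A^T A with smallest eigenvalue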

Derivation from S. Lazebnik; note we negate one of the equations from the cross product
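The linear method can be sketched with NumPy as below. It builds the stacked 2n x 12 matrix with the row layout shown above and takes the last right singular vector, which is the eigenvector of A^T A with the smallest eigenvalue. The function names and the synthetic camera are assumptions made for the demonstration, and the data is noiseless.

```python
import numpy as np

def build_dlt_matrix(X_world, uv):
    """Stack two rows per correspondence into the 2n x 12 matrix A."""
    n = len(X_world)
    X_h = np.hstack([X_world, np.ones((n, 1))])   # n x 4 homogeneous points
    A = np.zeros((2 * n, 12))
    for i, (Xi, (u, v)) in enumerate(zip(X_h, uv)):
        A[2 * i, 4:8] = Xi              # row [0^T, X_i^T, -v_i X_i^T]
        A[2 * i, 8:12] = -v * Xi
        A[2 * i + 1, 0:4] = Xi          # row [X_i^T, 0^T, -u_i X_i^T]
        A[2 * i + 1, 8:12] = -u * Xi
    return A

def calibrate_linear(X_world, uv):
    """Return a 3x4 matrix M with unit norm that minimizes ||A m||."""
    A = build_dlt_matrix(X_world, uv)
    # Rows of Vt are right singular vectors, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

# Synthetic check with a made-up camera and random non-coplanar points.
rng = np.random.default_rng(0)
M_true = np.array([[500.0, 0.0, 320.0, 100.0],
                   [0.0, 500.0, 240.0, 50.0],
                   [0.0, 0.0, 1.0, 5.0]])
X = rng.uniform(-1.0, 1.0, size=(10, 3))
p = np.hstack([X, np.ones((10, 1))]) @ M_true.T
uv = p[:, :2] / p[:, 2:3]

M_est = calibrate_linear(X, uv)
# M is only defined up to scale, so rescale before comparing.
M_est = M_est * (M_true[2, 3] / M_est[2, 3])
print(np.round(M_est, 3))
```

Because the solution is a unit-norm vector, the estimate matches the true matrix only after rescaling; the sign of the singular vector is arbitrary as well.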
In Practice
Degenerate configurations (e.g., all points on one plane) are an issue. Usually need multiplane targets.

credit: D Fouhey
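One way to see the degeneracy numerically is to compare the rank of the stacked calibration matrix for general and coplanar points. The sketch below uses the same row layout as the linear method, a made-up camera, and the plane Z = 0 as the assumed degenerate case; `numerical_rank` is a hypothetical helper with an arbitrary relative threshold.

```python
import numpy as np

def dlt_matrix(X_world, uv):
    """2n x 12 matrix whose rows are [0, X, -vX] and [X, 0, -uX]."""
    X_h = np.hstack([X_world, np.ones((len(X_world), 1))])
    zeros = np.zeros_like(X_h)
    rows_v = np.hstack([zeros, X_h, -uv[:, 1:2] * X_h])
    rows_u = np.hstack([X_h, zeros, -uv[:, 0:1] * X_h])
    return np.vstack([rows_v, rows_u])   # row order does not change the rank

def project(M, X_world):
    X_h = np.hstack([X_world, np.ones((len(X_world), 1))])
    p = X_h @ M.T
    return p[:, :2] / p[:, 2:3]

def numerical_rank(A, rel_tol=1e-9):
    """Count singular values above rel_tol times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(1)
M = np.array([[500.0, 0.0, 320.0, 100.0],
              [0.0, 500.0, 240.0, 50.0],
              [0.0, 0.0, 1.0, 5.0]])

X_general = rng.uniform(-1.0, 1.0, size=(12, 3))
X_planar = X_general.copy()
X_planar[:, 2] = 0.0    # force every point onto the plane Z = 0

rank_general = numerical_rank(dlt_matrix(X_general, project(M, X_general)))
rank_planar = numerical_rank(dlt_matrix(X_planar, project(M, X_planar)))
print(rank_general, rank_planar)
```

With every Z equal to zero, three columns of the matrix vanish, so its rank is at most 9 and the null space no longer pins down a single M. Points in general position give rank 11, leaving exactly the one-dimensional null space the linear method needs.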
In Practice
I pulled a fast one.

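We want: \( \mathbf{p} \equiv \mathbf{K}_{3\times 3} [\mathbf{R}_{3\times 3}, \mathbf{t}_{3\times 1}] \mathbf{X}_{4\times 1} \)

We get: \( \mathbf{p} \equiv \mathbf{M}_{3\times 4} \mathbf{X}_{4\times 1} \)
What’s the difference between K[R,t] and M?

Solution: QR-decomposition on left-most 3x3 matrix (strictly the RQ variant, since the upper triangular factor comes first)
→ finite options of an upper triangular matrix * rotation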

credit: D Fouhey
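A hedged sketch of this factorization using only NumPy. It obtains an RQ decomposition from `np.linalg.qr` by flipping rows and columns, then picks one of the finite sign options by assuming K has a positive diagonal and R has determinant +1. Those conventions, the function name `decompose_projection`, and the test camera are assumptions for illustration; the left 3x3 block of M is assumed invertible.

```python
import numpy as np

def decompose_projection(M):
    """Split M (known up to scale) into K, R, t with M ~ K [R | t]."""
    A = M[:, :3]
    P = np.flipud(np.eye(3))                    # row/column reversal matrix
    Q_tilde, R_tilde = np.linalg.qr((P @ A).T)  # QR of the flipped transpose
    K = P @ R_tilde.T @ P                       # upper triangular factor
    R = P @ Q_tilde.T                           # orthogonal factor, A = K @ R
    # Assumed convention: make the diagonal of K positive.
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    # Assumed convention: a proper rotation. If det(R) = -1, M had a negative
    # overall scale, so flip the sign of R and of the last column of M.
    sign = 1.0
    if np.linalg.det(R) < 0:
        R, sign = -R, -1.0
    t = np.linalg.solve(K, sign * M[:, 3])
    return K / K[2, 2], R, t

# Made-up ground truth, scaled by an arbitrary negative factor.
K_true = np.array([[500.0, 0.0, 320.0],
                   [0.0, 500.0, 240.0],
                   [0.0, 0.0, 1.0]])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s],
                   [0.0, 1.0, 0.0],
                   [-s, 0.0, c]])
t_true = np.array([0.1, -0.2, 5.0])
M = -2.5 * K_true @ np.hstack([R_true, t_true[:, None]])

K, R, t = decompose_projection(M)
print(np.round(K, 3))
```

The final division by K[2, 2] removes the unknown scale so that the recovered K has a 1 in its bottom-right corner, matching the intrinsic matrix form used earlier.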
In Practice
If p_i = M X_i is overconstrained, the linear objective isn’t actually the one you care about.
Instead:
1) Initialize parameters with the linear model
2) Apply an off-the-shelf non-linear optimizer to:

\[ \sum_{i=1}^{n} \left\| \mathrm{proj}(\mathbf{M}\mathbf{X}_i) - [u_i, v_i]^T \right\|_2^2 \]

Advantage: can also add radial distortion, not optimize over known variables, add constraints
credit: D Fouhey
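The objective in step 2 is the sum of squared pixel distances between the projected 3D points and the measured image points. The sketch below only evaluates that objective; the optimizer itself is left out, and any off-the-shelf non-linear least-squares routine (for example SciPy's `least_squares`) could be applied to it. The helper names and the sample camera are assumptions.

```python
import numpy as np

def proj(M, X_world):
    """Project Nx3 points with a 3x4 matrix and divide out the scale."""
    X_h = np.hstack([X_world, np.ones((len(X_world), 1))])
    p = X_h @ M.T
    return p[:, :2] / p[:, 2:3]

def reprojection_error(M, X_world, uv):
    """Sum over i of || proj(M X_i) - [u_i, v_i] ||^2, in squared pixels."""
    residuals = proj(M, X_world) - uv
    return float(np.sum(residuals ** 2))

# Made-up camera and points; uv is generated without noise.
rng = np.random.default_rng(2)
M_true = np.array([[500.0, 0.0, 320.0, 100.0],
                   [0.0, 500.0, 240.0, 50.0],
                   [0.0, 0.0, 1.0, 5.0]])
X = rng.uniform(-1.0, 1.0, size=(8, 3))
uv = proj(M_true, X)

err_true = reprojection_error(M_true, X, uv)          # zero on noiseless data
err_scaled = reprojection_error(3.0 * M_true, X, uv)  # scale of M does not matter

M_off = M_true.copy()
M_off[0, 2] += 10.0                                   # shift the principal point
err_off = reprojection_error(M_off, X, uv)            # strictly positive
print(err_true, err_scaled, err_off)
```

Unlike the linear objective, this error is measured in the image, so it does not change when M is rescaled.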
What Does This Get You?
Given projection pi of unknown 3D point X in two or
more images (with known cameras Mi), find X
Triangulation
Given projection pi of unknown 3D point X in two or
more images (with known cameras Mi), find X
Why is the calibration here important?

X?

p1 p2

credit: D Fouhey
Triangulation
Rays in principle should intersect, but in practice
usually don’t exactly due to noise, numerical errors.

X?

p1 p2

credit: D Fouhey
Triangulation – Geometry
Find shortest segment between viewing rays, set X to
be the midpoint of the segment.

p1 p2

credit: D Fouhey
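A minimal sketch of the midpoint construction, assuming each viewing ray is given in world coordinates as a camera center o and a direction d, and that the two rays are not parallel. The function name and the sample rays are hypothetical.

```python
import numpy as np

def triangulate_midpoint(o1, d1, o2, d2):
    """Midpoint of the shortest segment between rays o1 + s*d1 and o2 + t*d2."""
    # Minimize ||(o1 + s*d1) - (o2 + t*d2)||^2 over s and t: a 3x2 least squares.
    A = np.stack([d1, -d2], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, o2 - o1, rcond=None)
    closest_on_ray1 = o1 + s * d1
    closest_on_ray2 = o2 + t * d2
    return 0.5 * (closest_on_ray1 + closest_on_ray2)

# Two skew rays that never meet: one along the z axis, one along x at y=1, z=2.
o1, d1 = np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])
o2, d2 = np.array([-1.0, 1.0, 2.0]), np.array([1.0, 0.0, 0.0])
X_mid = triangulate_midpoint(o1, d1, o2, d2)
print(X_mid)   # halfway between (0, 0, 2) and (0, 1, 2)

# Two rays that do intersect recover the intersection point itself.
X_true = np.array([0.5, 0.2, 4.0])
c1, c2 = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
X_hit = triangulate_midpoint(c1, X_true - c1, c2, X_true - c2)
```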
Triangulation – Non-linear Optim.
Find X minimizing
\[ d(\mathbf{p}_1, \mathbf{M}_1 \mathbf{X})^2 + d(\mathbf{p}_2, \mathbf{M}_2 \mathbf{X})^2 \]

p1 p2
M1X M2X

credit: D Fouhey
Triangulation – Linear Optimization
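\[
\mathbf{p}_1 \equiv \mathbf{M}_1 \mathbf{X}
\;\Rightarrow\; \mathbf{p}_1 \times \mathbf{M}_1 \mathbf{X} = \mathbf{0}
\;\Rightarrow\; [\mathbf{p}_{1\times}] \mathbf{M}_1 \mathbf{X} = \mathbf{0}
\]
\[
\mathbf{p}_2 \equiv \mathbf{M}_2 \mathbf{X}
\;\Rightarrow\; \mathbf{p}_2 \times \mathbf{M}_2 \mathbf{X} = \mathbf{0}
\;\Rightarrow\; [\mathbf{p}_{2\times}] \mathbf{M}_2 \mathbf{X} = \mathbf{0}
\]

Cross product as matrix:
\[
\mathbf{a} \times \mathbf{b} =
\begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
= [\mathbf{a}_\times] \mathbf{b}
\]

\[
[\mathbf{p}_{1\times}] \mathbf{M}_1 \mathbf{X} = \mathbf{0} \;\Rightarrow\; ([\mathbf{p}_{1\times}] \mathbf{M}_1) \mathbf{X} = \mathbf{0}
\]
\[
[\mathbf{p}_{2\times}] \mathbf{M}_2 \mathbf{X} = \mathbf{0} \;\Rightarrow\; ([\mathbf{p}_{2\times}] \mathbf{M}_2) \mathbf{X} = \mathbf{0}
\]
Two eqns per camera for 3 unkn. in X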

credit: D Fouhey
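The linear system above can be solved with an SVD, in the same way as the calibration problem. The sketch below stacks the 3x4 block [p_x] M for each camera and takes the null vector as the homogeneous X. It keeps all three rows per camera for simplicity, although only two are independent. The function names and the two test cameras are assumptions for illustration.

```python
import numpy as np

def skew(a):
    """Matrix [a_x] such that skew(a) @ b equals np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def triangulate_linear(points_2d, cameras):
    """Solve ([p_x] M) X = 0 over all cameras for one 3D point."""
    blocks = [skew(np.array([u, v, 1.0])) @ M
              for (u, v), M in zip(points_2d, cameras)]
    A = np.vstack(blocks)               # (3 * num_cameras) x 4
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]                        # homogeneous solution, up to scale
    return X_h[:3] / X_h[3]             # assumes a finite point (X_h[3] != 0)

def project(M, X):
    p = M @ np.append(X, 1.0)
    return p[:2] / p[2]

# Made-up calibrated pair: same intrinsics, second camera shifted along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
M1 = K @ np.hstack([np.eye(3), np.array([[0.0], [0.0], [0.0]])])
M2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
p1, p2 = project(M1, X_true), project(M2, X_true)
X_est = triangulate_linear([p1, p2], [M1, M2])
print(np.round(X_est, 6))
```

The same function accepts more than two views: each extra camera simply adds another block of rows to the system.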
Incidence Field is a Learnable Monocular 3D Prior

[Figure: pipeline diagram. Recoverable labels: RGB, Learnable, Pixel-wise Depth & Normal, Incidence Field, RANSAC Solver, Intrinsic]

Tame a wild camera: in-the-wild monocular camera calibration. Shengjie Zhu, et al. NeurIPS’2023.
CVL Computer Vision Lab
Application: Image Resize & Crop Detection and Restoration

• Original (Natural) Image: • Resized & Cropped (Modified) Image:

Tame a wild camera: in-the-wild monocular camera calibration. Shengjie Zhu, et al. NeurIPS’2023.