
Structure From Motion: Class 9

The document discusses structure from motion techniques including factorization, sequential structure from motion, and calibrated structure from motion. It covers factorization methods like affine, projective, and perspective factorization. It also describes sequential structure from motion which initializes from two views and extends to additional views, as well as techniques for calibrated cameras including the 5-point algorithm for relative motion and 3-point algorithm for pose estimation. Finally it mentions recent work on minimal solvers using Groebner bases.

Uploaded by

Ashoka Vanjare
Copyright
© Attribution Non-Commercial (BY-NC)

Structure from motion

Class 9
Read Chapter 5
Geometric Computer Vision course schedule (tentative)

Date      Lecture                                                               Exercise
Sept. 16  Introduction                                                          -
Sept. 23  Geometry & Camera model                                               Camera calibration
Sept. 30  Single View Metrology (Changchang Wu)                                 Measuring in images
Oct. 7    Feature Tracking/Matching                                             Correspondence computation
Oct. 14   Epipolar Geometry                                                     F-matrix computation
Oct. 21   Shape-from-Silhouettes                                                Visual-hull computation
Oct. 28   Stereo matching                                                       Papers
Nov. 4    Stereo matching (continued)                                           Project proposals
Nov. 11   Structured light and active range sensing                             Papers
Nov. 18   Structure from motion and visual SLAM                                 Papers
Nov. 25   Multi-view geometry and self-calibration                              Papers
Dec. 2    3D modeling, registration and range/depth fusion (Christopher Zach?)  Papers
Dec. 9    Shape-from-X and image-based rendering                                Papers
Dec. 16   Final project presentations                                           Final project presentations
Today's class
Structure from motion

factorization
sequential

bundle adjustment


Factorization
Factorise observations into structure of the scene
and motion/calibration of the camera

Use all points in all images at the same time

Affine factorisation
Projective factorisation
Affine camera
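The affine projection equations are

$$
\begin{pmatrix} x_{ij} \\ y_{ij} \end{pmatrix}
= \begin{pmatrix} P_i^x \\ P_i^y \end{pmatrix}
\begin{pmatrix} X_j \\ Y_j \\ Z_j \\ 1 \end{pmatrix}
$$

or, in homogeneous form,

$$
\begin{pmatrix} x_{ij} \\ y_{ij} \\ 1 \end{pmatrix}
= \begin{pmatrix} P_i^x \\ P_i^y \\ 0\;0\;0\;1 \end{pmatrix}
\begin{pmatrix} X_j \\ Y_j \\ Z_j \\ 1 \end{pmatrix}
$$

Moving the translation (fourth entry of each row) to the left-hand side gives

$$
\begin{pmatrix} x_{ij} - P^x_{i4} \\ y_{ij} - P^y_{i4} \end{pmatrix}
= \begin{pmatrix} \tilde x_{ij} \\ \tilde y_{ij} \end{pmatrix}
= \begin{pmatrix} P_i^x \\ P_i^y \end{pmatrix}
\begin{pmatrix} X_j \\ Y_j \\ Z_j \end{pmatrix}
$$

where $P_i^x$, $P_i^y$ on the right now denote the first three entries of each row.

how to find the origin? or for that matter a 3D reference point?
affine projection preserves center of gravity, so with the 3D origin placed at the centroid of the n points, the translation in each view is the centroid of the image points:

$$
\tilde x_{ij} = x_{ij} - \frac{1}{n}\sum_{j} x_{ij}, \qquad
\tilde y_{ij} = y_{ij} - \frac{1}{n}\sum_{j} y_{ij}
$$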
Orthographic factorization
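The orthographic projection equations are

$$ m_{ij} = P_i M_j, \quad i = 1,\dots,m, \quad j = 1,\dots,n $$

where

$$
m_{ij} = \begin{pmatrix} \tilde x_{ij} \\ \tilde y_{ij} \end{pmatrix}, \quad
P_i = \begin{pmatrix} P_i^x \\ P_i^y \end{pmatrix}, \quad
M_j = \begin{pmatrix} X_j \\ Y_j \\ Z_j \end{pmatrix}
$$

All equations can be collected for all i and j

$$ m = P M $$

where

$$
m = \begin{pmatrix}
m_{11} & m_{12} & \cdots & m_{1n} \\
m_{21} & m_{22} & \cdots & m_{2n} \\
\vdots & \vdots &        & \vdots \\
m_{m1} & m_{m2} & \cdots & m_{mn}
\end{pmatrix}, \quad
P = \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{pmatrix}, \quad
M = [M_1, M_2, \dots, M_n]
$$

Note that P and M are resp. 2m x 3 and 3 x n matrices and
therefore the rank of m is at most 3
(Tomasi & Kanade 92)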
Orthographic factorization
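Factorize m through singular value decomposition

$$ m = U \Sigma V^T $$

An affine reconstruction is obtained as follows

$$ \tilde P = U, \quad \tilde M = \Sigma V^T $$

keeping only the three largest singular values, with the corresponding columns of U and rows of $V^T$.
(Tomasi & Kanade 92)

This solves

$$
\min_{P, M} \left\|
\begin{pmatrix}
m_{11} & m_{12} & \cdots & m_{1n} \\
m_{21} & m_{22} & \cdots & m_{2n} \\
\vdots & \vdots &        & \vdots \\
m_{m1} & m_{m2} & \cdots & m_{mn}
\end{pmatrix}
- \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{pmatrix}
[M_1, M_2, \dots, M_n]
\right\|_F
$$

Closest rank-3 approximation yields MLE! (for i.i.d. Gaussian noise on the image coordinates)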
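The factorization step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the cited papers: the function name `affine_factorization`, the row ordering of the measurement matrix (x and y rows interleaved per view) and the synthetic data are all assumptions made for the example, and every point is assumed visible in every view.

```python
import numpy as np

def affine_factorization(x, y):
    """Rank-3 factorization of tracked points (hypothetical helper).

    x, y: (m, n) arrays with the image coordinates of n points in m views.
    Returns P_aff (2m x 3), M_aff (3 x n) and all singular values.
    """
    m, n = x.shape
    # Centre each view on the centroid of its image points.
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    # Measurement matrix with rows x_1, y_1, x_2, y_2, ...
    W = np.empty((2 * m, n))
    W[0::2] = xc
    W[1::2] = yc
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    P_aff = U[:, :3]                 # first three columns of U
    M_aff = s[:3, None] * Vt[:3]     # Sigma V^T restricted to rank 3
    return P_aff, M_aff, s

# Synthetic check: 4 orthographic views of 20 random points (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 20))
x = np.empty((4, 20))
y = np.empty((4, 20))
for i in range(4):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # matrix with orthonormal rows
    t = rng.normal(size=2)
    x[i] = R[0] @ X + t[0]
    y[i] = R[1] @ X + t[1]

P_aff, M_aff, s = affine_factorization(x, y)
rank3_residual = s[3] / s[0]   # close to zero for noise-free data
```

With noisy tracks the fourth singular value is no longer zero, and its size relative to the third gives a rough indication of how well the affine model fits.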


Orthographic factorization
Factorize m through singular value decomposition

$$ m = U \Sigma V^T $$

An affine reconstruction is obtained as follows

$$ \tilde P = U, \quad \tilde M = \Sigma V^T $$

A metric reconstruction is obtained as follows

$$ P = \tilde P A^{-1}, \quad M = A \tilde M $$

where A is computed from the orthonormality of the rows of each camera:

$$
P_i^x P_i^{xT} = 1, \qquad
P_i^y P_i^{yT} = 1, \qquad
P_i^x P_i^{yT} = 0
$$

$$
\tilde P_i^x A^{-1} A^{-T} \tilde P_i^{xT} = 1, \qquad
\tilde P_i^y A^{-1} A^{-T} \tilde P_i^{yT} = 1, \qquad
\tilde P_i^x A^{-1} A^{-T} \tilde P_i^{yT} = 0
$$

$$
\tilde P_i^x C \tilde P_i^{xT} = 1, \qquad
\tilde P_i^y C \tilde P_i^{yT} = 1, \qquad
\tilde P_i^x C \tilde P_i^{yT} = 0, \qquad
C = A^{-1} A^{-T}
$$

3 linear equations per view on
symmetric matrix C (6DOF)

A can be obtained from C
through Cholesky factorisation
and inversion
(Tomasi & Kanade 92)
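The metric upgrade can be illustrated as follows. This is a sketch under stated assumptions: `metric_upgrade` is a hypothetical name, the rows of the affine camera matrix are assumed to be interleaved as x, y per view, the data are noise-free, and the recovered C is assumed positive definite so that the Cholesky factorisation exists.

```python
import numpy as np

def metric_upgrade(P_aff, M_aff):
    """Upgrade an affine factorization to metric (hypothetical helper).

    P_aff: (2m, 3) with rows Px_1, Py_1, Px_2, Py_2, ...; M_aff: (3, n).
    """
    def row(a, b):
        # Coefficients of a C b^T in the unknowns (c11, c12, c13, c22, c23, c33).
        return np.array([a[0] * b[0], a[0] * b[1] + a[1] * b[0],
                         a[0] * b[2] + a[2] * b[0], a[1] * b[1],
                         a[1] * b[2] + a[2] * b[1], a[2] * b[2]])

    rows, rhs = [], []
    for px, py in zip(P_aff[0::2], P_aff[1::2]):
        rows += [row(px, px), row(py, py), row(px, py)]   # 3 equations per view
        rhs += [1.0, 1.0, 0.0]
    c = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    C = np.array([[c[0], c[1], c[2]],
                  [c[1], c[3], c[4]],
                  [c[2], c[4], c[5]]])
    L = np.linalg.cholesky(C)    # C = L L^T = A^{-1} A^{-T}, so A^{-1} = L
    A = np.linalg.inv(L)
    return P_aff @ L, A @ M_aff  # P = P~ A^{-1}, M = A M~

# Synthetic check with 5 views (made-up data).
rng = np.random.default_rng(1)
P_true = np.vstack([np.linalg.qr(rng.normal(size=(3, 3)))[0][:2] for _ in range(5)])
M_true = rng.normal(size=(3, 12))
Q = rng.normal(size=(3, 3)) + 3 * np.eye(3)   # stands in for the unknown affine ambiguity
P_aff, M_aff = P_true @ Q, np.linalg.inv(Q) @ M_true
P_met, M_met = metric_upgrade(P_aff, M_aff)
```

The result agrees with the true cameras only up to a rotation, since C fixes A only up to an orthogonal factor; with noisy data C may fail to be positive definite, in which case this sketch raises an error from `np.linalg.cholesky`.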
Examples
Tomasi Kanade92,
Poelman & Kanade94
Perspective factorization
The camera equations

$$ \lambda_{ij} m_{ij} = P_i M_j, \quad i = 1,\dots,m, \quad j = 1,\dots,n $$

for a fixed image i can be written in matrix
form as

$$ m_i \Lambda_i = P_i M $$

where

$$
m_i = [m_{i1}, m_{i2}, \dots, m_{in}], \quad
M = [M_1, M_2, \dots, M_n], \quad
\Lambda_i = \mathrm{diag}(\lambda_{i1}, \lambda_{i2}, \dots, \lambda_{in})
$$
Perspective factorization
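All equations can be collected for all i as

$$ m = P M $$

where

$$
m = \begin{pmatrix} m_1 \Lambda_1 \\ m_2 \Lambda_2 \\ \vdots \\ m_m \Lambda_m \end{pmatrix}, \quad
P = \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_m \end{pmatrix}
$$

In these formulas the image points $m_i$ are known, but $\Lambda_i$, P and M
are unknown
Observe that PM is a product of a 3m x 4 matrix
and a 4 x n matrix, i.e. its rank is at most 4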
Perspective factorization
algorithm
Assume that $\Lambda_i$ are known, then PM is known.

Use the singular value decomposition

$$ PM = U \Sigma V^T $$

In the noise-free case

$$ \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3, \sigma_4, 0, \dots, 0) $$

and a reconstruction can be obtained by setting:

P = the first four columns of $U\Sigma$.
M = the first four rows of $V^T$.
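For the known-depth case, the factorization can be sketched as below. The function name `perspective_factorization`, the array layout (views x 3 x points) and the synthetic cameras are assumptions made for illustration; the iterative depth estimation for unknown depths is not included.

```python
import numpy as np

def perspective_factorization(m_h, lam):
    """Rank-4 factorization with known projective depths (hypothetical helper).

    m_h: (m, 3, n) homogeneous image points, lam: (m, n) depths lambda_ij.
    Returns cameras (m, 3, 4), points (4, n) and all singular values.
    """
    n_views, _, n_pts = m_h.shape
    # Stack the rescaled measurements m_i Lambda_i into a 3m x n matrix.
    W = (m_h * lam[:, None, :]).reshape(3 * n_views, n_pts)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    P = U[:, :4] * s[:4]     # first four columns of U Sigma
    M = Vt[:4]               # first four rows of V^T
    return P.reshape(n_views, 3, 4), M, s

# Synthetic check: 3 views of 12 points in front of the cameras (made-up data).
rng = np.random.default_rng(2)
n_views, n_pts = 3, 12
M_true = np.vstack([rng.uniform(-1, 1, size=(3, n_pts)), np.ones(n_pts)])
P_true = np.stack([
    np.hstack([np.linalg.qr(rng.normal(size=(3, 3)))[0], [[0.0], [0.0], [5.0]]])
    for _ in range(n_views)
])
proj = P_true @ M_true            # (views, 3, points)
lam = proj[:, 2, :]               # true projective depths
m_h = proj / lam[:, None, :]      # image points with last coordinate 1

P_hat, M_hat, s = perspective_factorization(m_h, lam)
reproj = P_hat @ M_hat            # reprojected homogeneous points
```

The recovered `P_hat` and `M_hat` differ from the true cameras and points by a 4 x 4 projective transformation, yet they reproject onto the same image points.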

Iterative perspective
factorization
When $\Lambda_i$ are unknown the following algorithm can be used:
1. Set $\lambda_{ij} = 1$ (affine approximation).
2. Factorize PM and obtain an estimate of P and M.
If $\sigma_5$ is sufficiently small then STOP.
3. Use m, P and M to estimate $\Lambda_i$ from the camera
equations (linearly) $m_i \Lambda_i = P_i M$
4. Goto 2.
In general the algorithm minimizes the proximity
measure $P(\Lambda, P, M) = \sigma_5$

Note that structure and motion are recovered
up to an arbitrary projective transformation
Further Factorization work
Factorization with uncertainty
(Irani & Anandan, IJCV02)

Factorization for dynamic scenes
(Costeira and Kanade 94)
(Bregler et al. 00, Brand 01)
(Yan and Pollefeys, 05/06)
practical structure and motion
recovery from images
Obtain reliable matches using matching or
tracking and 2/3-view relations
Compute initial structure and motion
Refine structure and motion
Auto-calibrate
Refine metric structure and motion

Sequential Structure and
Motion Computation
Initialize Motion
($P_1$, $P_2$ compatible with F)
Initialize Structure
(minimize reprojection error)
Extend motion
(compute pose through matches
seen in 2 or more previous views)
Extend structure
(Initialize new structure,
refine existing structure)
Computation of initial
structure and motion
according to Hartley and Zisserman
this area is still to some extent a black art
Not all features are visible in all images
No direct method (factorization not applicable)
Build partial reconstructions and assemble
(more views is more stable, but fewer correspondences)

1) Sequential structure and motion recovery
2) Hierarchical structure and motion recovery
Sequential structure and
motion recovery
Initialize structure and motion from two
views
For each additional view
Determine pose
Refine and extend structure

Determine correspondences robustly by
jointly estimating matches and epipolar
geometry



Initial structure and motion
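Epipolar geometry <-> Projective calibration

$$ m_2^T F m_1 = 0 $$

$$
P_1 = [\, I \mid 0 \,], \qquad
P_2 = [\, [e]_\times F + e a^T \mid e \,]
$$

compatible with F, where e is the epipole in the second image ($e^T F = 0$) and a is an arbitrary 3-vector
Yields correct projective camera setup
(Faugeras 92, Hartley 92)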
Obtain structure through triangulation
Use reprojection error for minimization
Avoid measurements in projective space
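A small sketch of this initialization: build a camera pair compatible with a given F, then triangulate each match linearly. The helper names (`skew`, `cameras_from_F`, `triangulate`), the choice a = 0 and the synthetic calibrated setup are assumptions for the example. The linear (DLT) triangulation shown here minimizes an algebraic error, so in practice it would only serve as a starting point for the reprojection-error minimization the slide recommends.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

def cameras_from_F(F, a=None):
    """Projective camera pair compatible with F (m2^T F m1 = 0)."""
    a = np.zeros(3) if a is None else a
    e = np.linalg.svd(F)[0][:, -1]          # epipole in image 2: e^T F = 0
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e) @ F + np.outer(e, a), e[:, None]])
    return P1, P2

def triangulate(P1, P2, m1, m2):
    """Linear triangulation of one match; returns a homogeneous 4-vector."""
    A = np.array([m1[0] * P1[2] - P1[0], m1[1] * P1[2] - P1[1],
                  m2[0] * P2[2] - P2[0], m2[1] * P2[2] - P2[1]])
    return np.linalg.svd(A)[2][-1]

def reproject(P, Xh):
    p = P @ Xh
    return p[:2] / p[2]

# Synthetic two-view setup with identity calibration (made-up data).
rng = np.random.default_rng(3)
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([1.0, 0.2, 0.1])
F = skew(t) @ R                              # F for cameras [I|0] and [R|t]
X_true = rng.uniform(-1, 1, size=(10, 3)) + np.array([0, 0, 5.0])
m1 = X_true / X_true[:, 2:3]
c2 = X_true @ R.T + t
m2 = c2 / c2[:, 2:3]

P1, P2 = cameras_from_F(F)
err = 0.0
for a1, a2 in zip(m1, m2):
    Xh = triangulate(P1, P2, a1, a2)
    err = max(err, np.abs(reproject(P1, Xh) - a1[:2]).max(),
              np.abs(reproject(P2, Xh) - a2[:2]).max())
```

The triangulated points differ from `X_true` by a projective transformation, which is why the check compares reprojections in the images rather than 3D coordinates.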
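Determine pose towards existing structure
Compute $P_{i+1}$ using robust approach (6-point RANSAC)
Extend and refine reconstruction

$$ x_i = P_i X(x_1, \dots, x_{i-1}) $$

[Figure: a new view is linked to the previous view by 2D-2D matches ($m_i$, $m_{i+1}$) and to the existing structure M by 2D-3D matches]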
Compute P with 6-point RANSAC
Generate hypothesis using 6 points

Count inliers
Projection error: $d(P_i X(x_1, \dots, x_{i-1}), x_i) < t$ ?
Back-projection error: $d(F_{ij} x_i, x_j) < t$ ?, $\forall j < i$
Re-projection error: $d(P_i X(x_1, \dots, x_{i-1}, x_i), x_i) < t$
3D error: $d_{3D}(P_i^{-1}(x_i), X) < t$ ?
Projection error with covariance: $d_\Lambda(P_i X(x_1, \dots, x_{i-1}), x_i) < t$
Expensive testing? Abort early if not promising
Verify at random, abort if e.g. P(wrong)>0.95
(Chum and Matas, BMVC02)
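The hypothesize-and-verify loop can be sketched as follows, using a linear (DLT) resection from 6 points as the hypothesis generator and the plain projection error as the inlier test. The names `resect_dlt`, `project` and `ransac_pose`, the threshold, the iteration count and the synthetic data are assumptions for the example; the early-abort test of Chum and Matas is not implemented here.

```python
import numpy as np

def resect_dlt(X, x):
    """Camera matrix from n >= 6 matches between 3D points X (n,3) and image points x (n,2)."""
    Xh = np.hstack([X, np.ones((len(X), 1))])
    rows = []
    for Xi, (u, v) in zip(Xh, x):
        rows.append(np.hstack([Xi, np.zeros(4), -u * Xi]))
        rows.append(np.hstack([np.zeros(4), Xi, -v * Xi]))
    return np.linalg.svd(np.array(rows))[2][-1].reshape(3, 4)

def project(P, X):
    p = np.hstack([X, np.ones((len(X), 1))]) @ P.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return p[:, :2] / p[:, 2:3]

def ransac_pose(X, x, t=1e-3, iters=200, seed=0):
    """6-point RANSAC; inliers have projection error below t."""
    rng = np.random.default_rng(seed)
    best_inl = np.zeros(len(X), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(X), size=6, replace=False)   # minimal sample
        P = resect_dlt(X[idx], x[idx])
        with np.errstate(invalid="ignore"):
            inl = np.linalg.norm(project(P, X) - x, axis=1) < t
        if inl.sum() > best_inl.sum():
            best_inl = inl
    # Final estimate from all inliers of the best hypothesis.
    return resect_dlt(X[best_inl], x[best_inl]), best_inl

# Synthetic check: 30 exact matches followed by 10 gross outliers (made-up data).
rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(40, 3))
c, s = np.cos(0.2), np.sin(0.2)
P_true = np.array([[c, 0, s, 0.1], [0, 1, 0, -0.2], [-s, 0, c, 5.0]])
x = project(P_true, X)
x[30:] += rng.uniform(0.1, 0.3, size=(10, 2))
P_est, inliers = ransac_pose(X, x)
```

A fixed iteration count is used for simplicity; an adaptive stopping rule based on the observed inlier ratio is the more common choice.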
Calibrated structure from motion
Equations more complicated, but fewer degeneracies
For calibrated cameras:
5-point relative motion (5DOF)
Nister CVPR03
3-point pose estimation (6DOF)
Haralick et al. IJCV94



D. Nistér, An efficient solution to the five-point relative pose
problem, In Proc. IEEE Computer Society Conference on Computer Vision
and Pattern Recognition (CVPR 2003), Volume 2, pages 195-202, 2003.

R. Haralick, C. Lee, K. Ottenberg, M. Nölle. Review and Analysis of
Solutions of the Three Point Perspective Pose Estimation Problem.
Int'l Journal of Computer Vision, 13, 3, 331-356, 1994.

5-point relative motion
Linear equations for 5 points

Linear solution space

$$ E = xX + yY + zZ + wW $$

scale does not matter, choose $w = 1$

Non-linear constraints

$$ \det E = 0 $$

$$ E E^\top E - \tfrac{1}{2}\,\mathrm{trace}(E E^\top)\,E = 0 $$

10 cubic polynomials
(Nistér, CVPR03)
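The two ingredients above, the 4-dimensional linear solution space and the cubic constraints, can be checked numerically on synthetic data. This sketch does not solve the 5-point problem; it only verifies that a true essential matrix $E = [t]_\times R$ lies in the null space spanned by X, Y, Z, W and satisfies both non-linear constraints. The variable names and the synthetic motion are assumptions for the example.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])

# Made-up relative motion and its essential matrix.
rng = np.random.default_rng(5)
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([1.0, -0.5, 0.2])
E = skew(t) @ R

# Five matches: rays q1 in camera 1 and q2 in camera 2 (any scale is fine).
X1 = rng.uniform(-1, 1, size=(5, 3)) + np.array([0, 0, 4.0])
X2 = X1 @ R.T + t

# Each match gives one linear equation q2^T E q1 = 0 in the 9 entries of E.
A = np.array([np.kron(q2, q1) for q1, q2 in zip(X1, X2)])   # 5 x 9
basis = np.linalg.svd(A)[2][5:]     # 4 x 9: rows play the role of X, Y, Z, W

# The true E must be a linear combination of the four basis vectors.
e = E.ravel()
span_residual = np.linalg.norm(e - basis.T @ (basis @ e))

# Non-linear constraints satisfied by any essential matrix.
det_residual = abs(np.linalg.det(E))
cubic_residual = np.linalg.norm(E @ E.T @ E - 0.5 * np.trace(E @ E.T) * E)
```

A full solver would substitute the linear combination into these constraints and solve the resulting polynomial system for x, y, z.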
5-point relative motion
Perform Gauss-Jordan elimination on polynomials
[Figure: elimination table in which three rows are marked "-z"]
[n] represents polynomial of degree n in z
(Nistér, CVPR03)
Three points perspective pose (P3P)
(Haralick et al., IJCV94)
All techniques yield a 4th order
polynomial

Haralick et al. recommend using
Finsterwalder's technique as it
yields the best results numerically
(Finsterwalder 1903; Grunert 1841)
Minimal solvers
Lots of recent activity using Groebner bases:
Henrik Stewénius, David Nistér, Fredrik Kahl, Frederik Schaffalitzky: A
Minimal Solution for Relative Pose with Unknown Focal Length,
CVPR 2005.
H. Stewénius, D. Nistér, M. Oskarsson, and K. Åström. Solutions to
minimal generalized relative pose problems. Omnivis 2005.
D. Nistér, A Minimal solution to the generalised 3-point pose
problem, CVPR 2004.
Martin Bujnak, Zuzana Kukelova, Tomáš Pajdla: A general solution to
the P4P problem for camera with unknown focal length. CVPR
2008.
Brian Clipp, Christopher Zach, Jan-Michael Frahm and Marc Pollefeys, A
New Minimal Solution to the Relative Pose of a Calibrated Stereo
Camera with Small Field of View Overlap, ICCV 2009.
Zuzana Kukelova, Martin Bujnak, Tomáš Pajdla: Automatic Generator
of Minimal Problem Solvers. ECCV 2008.





Changchang's SfM code

for iconic graph
uses 5-point+RANSAC for 2-view initialization
uses 3-point+RANSAC for adding views
performs bundle adjustment
For additional images
use 3-point+RANSAC pose estimation
Hierarchical structure and motion
recovery
Compute 2-view
Compute 3-view
Stitch 3-view reconstructions
Merge and refine reconstruction

[Figure: diagram of the hierarchy, with stages labelled F, T, H and PM]
Stitching 3-view reconstructions
Different possibilities
1. Align $(P_2, P_3)$ with $(P'_1, P'_2)$

$$ \arg\min_H \; d_A(P_2, P'_1 H^{-1}) + d_A(P_3, P'_2 H^{-1}) $$

2. Align X, X' (and C, C')

$$ \arg\min_H \sum_j d_A(X_j, H X'_j) $$

3. Minimize reproj. error

$$ \arg\min_H \sum_j d(P H X'_j, x_j) + \sum_j d(P' H^{-1} X_j, x'_j) $$

4. MLE (merge)

$$ \arg\min_{P, X} \sum_j d(P X_j, x_j) $$
Refining structure and motion
Minimize reprojection error

$$ \min_{\hat P_k, \hat M_i} \sum_{k=1}^{m} \sum_{i=1}^{n} D(m_{ki}, \hat P_k \hat M_i)^2 $$

Maximum Likelihood Estimation
(if error zero-mean Gaussian noise)
Huge problem but can be solved efficiently
(Bundle adjustment)
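The cost being minimized can be written down directly. The sketch below assumes a particular data layout that is not part of the slides: cameras as an (m, 3, 4) array, points as a (4, n) homogeneous array, and observations as an (m, n, 2) array in which unobserved points are marked with NaN. The name `reprojection_cost` is likewise made up for the example.

```python
import numpy as np

def reprojection_cost(P, M, obs):
    """Sum of squared reprojection errors over all observed image points.

    P: (m, 3, 4) cameras, M: (4, n) homogeneous points,
    obs: (m, n, 2) image points, NaN where a point is not visible.
    """
    proj = P @ M                                           # (m, 3, n)
    xy = (proj[:, :2] / proj[:, 2:3]).transpose(0, 2, 1)   # (m, n, 2)
    return np.nansum((xy - obs) ** 2)                      # NaN entries contribute 0

# Small synthetic example (made-up data): 3 cameras, 8 points.
rng = np.random.default_rng(6)
M = np.vstack([rng.uniform(-1, 1, size=(3, 8)), np.ones(8)])
P = np.stack([np.hstack([np.eye(3), [[0.1 * k], [0.0], [5.0]]]) for k in range(3)])
proj = P @ M
obs = (proj[:, :2] / proj[:, 2:3]).transpose(0, 2, 1).copy()

cost_exact = reprojection_cost(P, M, obs)          # perfect observations

obs_shifted = obs.copy()
obs_shifted[0, 0] += [3.0, 4.0]                    # one point off by (3, 4)
cost_shifted = reprojection_cost(P, M, obs_shifted)

obs_missing = obs_shifted.copy()
obs_missing[0, 0] = np.nan                         # same point marked unobserved
cost_missing = reprojection_cost(P, M, obs_missing)
```

Bundle adjustment minimizes this quantity jointly over all entries of `P` and `M`.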
Non-linear least-squares


Newton iteration
Levenberg-Marquardt
Sparse Levenberg-Marquardt
$$ X = f(P) $$

$$ \arg\min_P \| X - f(P) \| $$
Newton iteration
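Taylor approximation

$$ f(P_0 + \Delta) \approx f(P_0) + J\Delta, \qquad J = \frac{\partial X}{\partial P} \quad \text{(Jacobian)} $$

With $P_1 = P_0 + \Delta$ and $e_0 = X - f(P_0)$:

$$ X - f(P_1) \approx X - f(P_0) - J\Delta = e_0 - J\Delta $$

normal eq.

$$ J^T J \Delta = J^T e_0 \quad\Rightarrow\quad \Delta = (J^T J)^{-1} J^T e_0 $$

$$ P_{i+1} = P_i + \Delta $$

With measurement covariance $\Sigma$:

$$ \Delta = (J^T \Sigma^{-1} J)^{-1} J^T \Sigma^{-1} e_0 $$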
Levenberg-Marquardt
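Normal equations

$$ J^T J \Delta = N \Delta = J^T e_0 $$

Augmented normal equations

$$ N' \Delta = J^T e_0, \qquad N' = J^T J + \lambda\,\mathrm{diag}(J^T J) $$

$$ \lambda_0 = 10^{-3} $$

success (accept): $\lambda_{i+1} = \lambda_i / 10$
failure (solve again): $\lambda_i \leftarrow 10\,\lambda_i$

$\lambda$ small ~ Newton (quadratic convergence)
$\lambda$ large ~ descent (guaranteed decrease)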
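A compact version of this iteration, with a numerically differentiated Jacobian, might look like the following. It is a sketch for small dense problems only: the function names, the stopping rules and the exponential test model are assumptions made for the example, not part of the slides.

```python
import numpy as np

def numerical_jacobian(f, P, eps=1e-6):
    """Forward-difference Jacobian of f at P."""
    f0 = f(P)
    J = np.zeros((f0.size, P.size))
    for k in range(P.size):
        dP = np.zeros_like(P)
        dP[k] = eps
        J[:, k] = (f(P + dP) - f0) / eps
    return J

def levenberg_marquardt(f, X, P0, lam=1e-3, iters=50, tol=1e-12):
    """Minimize ||X - f(P)||^2 starting from P0."""
    P = np.asarray(P0, dtype=float).copy()
    e = X - f(P)
    cost = e @ e
    for _ in range(iters):
        J = numerical_jacobian(f, P)
        N = J.T @ J
        g = J.T @ e
        while True:
            N_aug = N + lam * np.diag(np.diag(N))    # augmented normal equations
            delta = np.linalg.solve(N_aug, g)
            e_new = X - f(P + delta)
            cost_new = e_new @ e_new
            if cost_new < cost:                      # success: accept, decrease lambda
                improvement = cost - cost_new
                P, e, cost, lam = P + delta, e_new, cost_new, lam / 10
                break
            lam *= 10                                # failure: increase lambda, solve again
            if lam > 1e12:
                return P
        if improvement < tol:
            break
    return P

# Made-up test problem: fit a * exp(b * t) to exact samples.
t = np.linspace(0.0, 1.0, 20)
model = lambda P: P[0] * np.exp(P[1] * t)
P_true = np.array([2.0, -1.5])
P_fit = levenberg_marquardt(model, model(P_true), [1.0, 0.0])
```

Only the function f and a start value are required; an analytic Jacobian can replace `numerical_jacobian` when it is available.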
Levenberg-Marquardt
Requirements for minimization
Function to compute f
Start value $P_0$

Optionally, function to compute J
(but numerical ok, too)

Sparse Levenberg-Marquardt
complexity for solving $\Delta = N'^{-1} J^T e_0$ is cubic ($N^3$) in the number of unknowns N,
prohibitive for large problems
(100 views, 10,000 points ~ 30,000 unknowns)

Partition parameters
partition A
partition B (only dependent on A and itself)
Sparse bundle adjustment
residuals:
normal equations:

with






note: tie points should be in partition A
Sparse bundle adjustment
normal equations:
modified normal equations:
solve in two parts:
Sparse bundle adjustment
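Jacobian of

$$ \sum_{k=1}^{m} \sum_{i=1}^{n} D(m_{ki}, \hat P_k \hat M_i)^2 $$

has sparse block structure

[Figure: block structure of J; rows are grouped by the image points of each view (im.pts. view 1, ...), columns by the camera blocks $P_1$, $P_2$, $P_3$ (12 x m parameters) and the structure block M (3 x n parameters, in general much larger)]

$$
N = J^T J = \begin{pmatrix} U & W \\ W^T & V \end{pmatrix}, \qquad
U = \mathrm{diag}(U_1, U_2, U_3)
$$

Needed for non-linear minimization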
Sparse bundle adjustment
Eliminate dependence of camera/motion
parameters on structure parameters
Note in general 3n >> 11m

$$
\begin{pmatrix} I & -W V^{-1} \\ 0 & I \end{pmatrix} N
= \begin{pmatrix} U - W V^{-1} W^T & 0 \\ W^T & V \end{pmatrix}
$$

(camera block: 11 x m parameters; structure block: 3 x n parameters)

Allows much more efficient
computations
e.g. 100 views, 10000 points,
solve 1000x1000, not 30000x30000

Often still band diagonal
use sparse linear algebra algorithms
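The elimination can be sketched as a two-part solve: first the reduced camera system, then a back-substitution for each point. The function name `solve_schur`, the storage of V as a stack of 3 x 3 blocks and the random test system are assumptions made for illustration; a real bundle adjuster would also exploit the sparsity of W.

```python
import numpy as np

def solve_schur(U, W, V_blocks, ea, eb):
    """Solve [[U, W], [W^T, V]] [da; db] = [ea; eb] with block-diagonal V.

    U: (a, a) camera block, W: (a, 3n), V_blocks: (n, 3, 3) point blocks.
    """
    n = len(V_blocks)
    V_inv = np.linalg.inv(V_blocks)          # inverts each 3x3 block separately
    # Y = W V^{-1}, computed one point block at a time.
    Y = np.hstack([W[:, 3 * i:3 * i + 3] @ V_inv[i] for i in range(n)])
    S = U - Y @ W.T                          # reduced camera system U - W V^{-1} W^T
    da = np.linalg.solve(S, ea - Y @ eb)     # part 1: camera update
    rhs = eb - W.T @ da
    db = np.concatenate([V_inv[i] @ rhs[3 * i:3 * i + 3] for i in range(n)])  # part 2
    return da, db

# Made-up system: 6 camera parameters, 4 points.
rng = np.random.default_rng(8)
a, n = 6, 4
B = rng.normal(size=(a, a))
U = B @ B.T + a * np.eye(a)
V_blocks = np.empty((n, 3, 3))
for i in range(n):
    C = rng.normal(size=(3, 3))
    V_blocks[i] = C @ C.T + 3 * np.eye(3)
W = 0.1 * rng.normal(size=(a, 3 * n))
ea, eb = rng.normal(size=a), rng.normal(size=3 * n)

da, db = solve_schur(U, W, V_blocks, ea, eb)

# Reference: dense solve of the full normal equations.
V = np.zeros((3 * n, 3 * n))
for i in range(n):
    V[3 * i:3 * i + 3, 3 * i:3 * i + 3] = V_blocks[i]
dense = np.linalg.solve(np.block([[U, W], [W.T, V]]), np.concatenate([ea, eb]))
```

Only the a x a system `S` needs a general solver; the point updates cost one 3 x 3 multiplication each, which is where the saving over the dense solve comes from.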
Sparse bundle adjustment
Covariance estimation
$$ Y = W V^{-1} $$

$$ \Sigma_a = (U - W V^{-1} W^T)^{+} $$

$$ \Sigma_b = Y^T \Sigma_a Y + V^{-1} $$

$$ \Sigma_{ab} = -\Sigma_a Y $$
Related problems
On-line structure from motion and
SLAM (Simultaneous Localization
and Mapping)
Kalman filter (linear)
Particle filters (non-linear)
Open challenges
Large scale structure from motion
Complete building
Complete city


Next class:
Multi-View Geometry
and Self-Calibration
