Lowe Gordon 2005

Augmenting Reality, Naturally:

Scene Modelling, Recognition and Tracking


with Invariant Image Features

by
Iryna Gordon

in collaboration with David G. Lowe


Laboratory for Computational Intelligence
Department of Computer Science
University of British Columbia, Canada

1
the highlights

automation:
– acquisition of scene representation
– camera auto-calibration
– scene recognition from arbitrary viewpoints

versatility:
– easy setup
– unconstrained scene geometry
– unconstrained camera motion
– distinctive natural features

2
natural features
Scale Invariant Feature Transform (SIFT)

 characterized by image location, scale, orientation and a descriptor vector

 invariant to image scale and orientation

 partially invariant to illumination & viewpoint changes

 robust to image noise

 highly distinctive and plentiful

David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 2004.

3
what the system needs

 computer

 off-the-shelf video camera

 set of reference images:

- unordered
- acquired with a handheld camera
- unknown viewpoints
- at least 2 images

4
what the system does

5
modelling reality: feature matching
• best match – smallest Euclidean distance between descriptor vectors
• 2-view matches found via Best-Bin-First (BBF) search on a k-d tree
• epipolar constraints computed for N−1 image pairs with RANSAC
• image pairs selected by constructing a spanning tree on the image set:

F. Schaffalitzky and A. Zisserman. Multi-view matching for unordered image sets, or “How do I organize my holiday snaps?”. ECCV, 2002.
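The matching step above can be sketched with a k-d tree over descriptor vectors. A minimal Python illustration, using random vectors in place of real 128-D SIFT descriptors and SciPy's exact cKDTree query standing in for a true Best-Bin-First search (all data here is synthetic):

```python
# Sketch of the 2-view matching step: nearest-neighbour search over
# descriptor vectors plus Lowe's ratio test. Descriptors are random
# stand-ins; a real system would use extracted SIFT descriptors.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
desc_a = rng.normal(size=(50, 128))                       # features in image A
desc_b = desc_a[:30] + 0.01 * rng.normal(size=(30, 128))  # 30 true matches in B

tree = cKDTree(desc_b)
dist, idx = tree.query(desc_a, k=2)   # two nearest neighbours per query

# Ratio test: accept a match only if the best neighbour is clearly
# closer than the second best.
good = dist[:, 0] < 0.8 * dist[:, 1]
matches = [(int(i), int(idx[i, 0])) for i in np.flatnonzero(good)]
print(len(matches))
```

The ratio threshold (0.8 here) trades match count against reliability; the surviving matches would then feed the RANSAC epipolar-geometry step.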

6
modelling reality: scene structure

• Euclidean 3D structure & auto-calibration from multi-view matches via


direct bundle adjustment:

min Σ_ij w_ij ‖x_ij − x̃_ij‖²   over camera parameters a_i and scene points X_j

x̃_ij = Π(R_i X_j + t_i),   Π([X′, Y′, Z′]ᵀ) = [ f X′/Z′ + p_u ,  a f Y′/Z′ + p_v ]ᵀ

(f = focal length, a = aspect ratio, (p_u, p_v) = principal point)
R. Szeliski and Sing Bing Kang. Recovering 3D shape and motion from image streams using non-linear least squares. Cambridge Research Laboratory, 1993.
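The projection model Π in the objective above can be written out directly. A toy numeric check in Python, where the camera values (f = 700, aspect ratio 1, principal point (320, 240)) and the 3D points are made up for illustration:

```python
# Projection model from the bundle-adjustment objective:
# x = Pi(R X + t), with intrinsics f, a and principal point (pu, pv).
import numpy as np

def project(X, R, t, f, a, pu, pv):
    """Project a 3D point X into the image of camera (R, t)."""
    Xc = R @ X + t                    # world -> camera coordinates
    return np.array([f * Xc[0] / Xc[2] + pu,
                     a * f * Xc[1] / Xc[2] + pv])

def reprojection_error(obs, pts, R, t, f, a, pu, pv, w=None):
    """Weighted sum of squared residuals, the quantity bundle adjustment minimises."""
    w = np.ones(len(obs)) if w is None else w
    res = [obs[j] - project(pts[j], R, t, f, a, pu, pv) for j in range(len(obs))]
    return float(sum(wj * r @ r for wj, r in zip(w, res)))

# Synthetic check: observations generated by the true camera give zero error.
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
pts = np.array([[0.0, 0.0, 1.0], [1.0, -1.0, 2.0], [-0.5, 0.3, 0.5]])
obs = np.array([project(X, R, t, 700.0, 1.0, 320.0, 240.0) for X in pts])
print(reprojection_error(obs, pts, R, t, 700.0, 1.0, 320.0, 240.0))  # ~0.0
```

In the real system this residual is minimised jointly over all cameras and points with a non-linear least-squares solver.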

7
modelling reality: an improvement
• Problem:
– computation time increases exponentially with the number of unknown parameters
– trouble converging if the cameras are too far apart (> 90 degrees)

• Solution:
– select a subset of images to construct a partial model
– incrementally update the model by resectioning and triangulation
– images processed in order automatically determined by the spanning tree
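The ordering step can be sketched as growing a maximum spanning tree over pairwise match counts, adding whichever outside image shares the most matches with the partial model so far. The count matrix below is hypothetical:

```python
# Greedy spanning-tree ordering over a (hypothetical) matrix of
# pairwise feature-match counts: start from the best-connected pair,
# then repeatedly add the image with the strongest link to the tree.
import numpy as np

counts = np.array([[ 0, 80, 10,  5],
                   [80,  0, 60, 15],
                   [10, 60,  0, 70],
                   [ 5, 15, 70,  0]])

n = counts.shape[0]
i, j = np.unravel_index(np.argmax(counts), counts.shape)
order, in_tree = [int(i), int(j)], {int(i), int(j)}
while len(order) < n:
    # pick the outside image with the strongest link to any tree image
    best = max((k for k in range(n) if k not in in_tree),
               key=lambda k: counts[list(in_tree), k].max())
    order.append(best)
    in_tree.add(best)
print(order)
```

Each newly added image can then be resectioned against the existing 3D points and used to triangulate new ones.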

8
modelling reality: object placement

initial placement in 2D → determining relative depth → adjusting size and pose

rendered object in reference images

9
camera pose estimation

• model points’ appearances in reference images are stored in a k-d tree


• 2D-to-3D matches (x̃_j^t, X_j) found with RANSAC for each video frame t
• camera pose computed via non-linear optimization:

min_{p^t} Σ_j w_j^t ‖x_j^t − x̃_j^t‖² + α² ‖W(p^t − p^{t−1})‖²
• we regularize the solution to reduce virtual jitter


• α iteratively adjusted for each video frame:

α² ‖W(p^t − p^{t−1})‖² = 2Nε²   ⇒   α² = 2Nε² / ‖W(p^t − p^{t−1})‖²

(N = number of matches, W = diagonal matrix of parameter weights)
10
video examples

11
in the future...

• optimize online computations for real-time performance:

– SIFT recognition with a frame-to-frame feature tracker

• introduce multiple feature types:

– SIFT features with edge-based image descriptors

• perform further testing:

– scalability to large environments

– multiple objects: real and virtual

12
13
thank you!

questions?

http://www.cs.ubc.ca/~skrypnyk/arproject/

14
modelling reality: an example

20 input images

0 iterations: error = 62.5 pixels
10 iterations: error = 4.2 pixels
20 iterations: error = 1.7 pixels
50 iterations: error = 0.2 pixels

15
registration accuracy
ground truth: ARToolKit marker; measurement: virtual square

stationary camera | moving camera | moving camera

16
