Motion Estimation

The document discusses motion estimation in images and video. It outlines the different causes of 2D motion, the difference between 2D motion and optical flow, and defines the optical flow equation. It also discusses ambiguities in motion estimation, known as the aperture problem: motion orthogonal to edges can be determined, but motion along edges cannot, because the optical flow equation is underdetermined in regions of flat texture or constant brightness.

Motion Estimation

Yao Wang
Tandon School of Engineering, New York University

Yao Wang, 2021 ECE-GY 6123: Image and Video Processing 1


Outline

• What causes 2D motion?
• 2-D motion vs. optical flow


• Optical flow equation and ambiguity in motion estimation
• General methodologies in motion estimation
– Motion representation
– Motion estimation criterion
– Optimization methods
• Lucas-Kanade Flow Estimation Method
– Lucas-Kanade method
– Pyramid LK
– KLT Tracker
• Block Matching Algorithm
– EBMA algorithm
– Half-pel EBMA
– Hierarchical EBMA (HBMA)
• Deep learning based methods
What Causes 2D Motion?

Static camera, moving scene Static scene, moving camera

Moving scene, moving camera Static camera, moving scene, changing lighting
From http://courses.cs.washington.edu/courses/cse576/16sp/Slides/15_Flow.pdf



3D Motion to 2D Motion

From Katsaggelos’s Coursera Course on Image Processing, Lecture on motion estimation.



3-D Motion -> 2-D Motion

[Figure: a 3-D motion vector (3-D MV) in the scene projects to a 2-D motion vector (2-D MV) on the image plane]


Sample 2D Motion Field

At each pixel (or center of a block) of the anchor image (right), the motion vector describes the 2D displacement between this pixel and its corresponding pixel in the target image (left).


Motion Field Definition

Anchor frame: ψ₁(x)
Target frame: ψ₂(x)
Motion parameters: a
Motion vector at a pixel in the anchor frame: d(x)
Motion field: d(x; a), x ∈ Λ
Mapping function: w(x; a) = x + d(x; a), x ∈ Λ

Constant-intensity assumption: ψ₁(x) = ψ₂(w(x; a)) = ψ₂(x + d(x; a))


Occlusion Effect

• Motion is undefined in uncovered regions in the anchor frame
• Ideally, a 2D motion field should indicate uncovered pixels as occluded instead of giving them false MVs

[Figure: target and anchor frames illustrating regions uncovered by a moving object]


2-D Motion vs. Optical Flow

• 2-D motion: projection of 3-D motion, depending on the 3D object motion and the projection operator
• Optical flow: "perceived" 2-D motion based on changes in the image pattern, depending on the actual 3D motion as well as the illumination and the object surface texture

On the left, a sphere is rotating under a constant ambient illumination, but the observed image does not change.

On the right, a point light source is rotating around a stationary sphere, causing the highlight point on the sphere to rotate.


True Motion vs. Perceived Motion over Small
Aperture

https://en.wikipedia.org/wiki/Barberpole_illusion



Problem definition: optical flow estimation

H(x,y) (anchor) I(x,y) (target)

How to estimate pixel motion from image H to image I?

• Solve the pixel correspondence problem
 – given a pixel in H, look for nearby pixels of the same color in I

Key assumptions
• color constancy: a point in H looks the same in I
 – For grayscale images, this is brightness constancy
• small motion: points do not move very far
Courtesy of Rob Fergus, http://cs.nyu.edu/~fergus/teaching/vision/5_6_Fitting_Matching_Opticalflow.pdf
Optical flow constraints
(Assuming grayscale images and small motion)

• brightness constancy: (x,y) in H moves to (x+u, y+v) in I
 – H(x,y) = I(x+u, y+v); (u,v) is called the flow vector or motion vector

• small motion: u and v are less than 1 pixel
 – Using a first-order Taylor series expansion of I:

   I(x+u, y+v) ≈ I(x,y) + I_x(x,y) u + I_y(x,y) v
Optical flow equation

Combining these two equations, and defining the temporal difference

  I_t(x,y) = I(x,y) − H(x,y)

gives the optical flow equation:

  I_x(x,y) u(x,y) + I_y(x,y) v(x,y) = H(x,y) − I(x,y)

or, in vector form, (∇I)ᵀ u = −I_t
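The equation can be checked numerically. Below is a hedged pure-Python sketch (the linear-ramp image and the flow (0.4, 0.2) are illustrative assumptions, not from the slides): for a translated image, the residual I_x u + I_y v + I_t should vanish.

```python
# Verify I_x u + I_y v + I_t = 0 on a linear-ramp image translated by (u, v)
u, v = 0.4, 0.2
W = 16
H_img = [[2 * x + 3 * y for x in range(W)] for y in range(W)]              # anchor
I_img = [[2 * (x - u) + 3 * (y - v) for x in range(W)] for y in range(W)]  # target: anchor shifted by (u, v)

worst = 0.0
for y in range(1, W - 1):
    for x in range(1, W - 1):
        Ix = (I_img[y][x + 1] - I_img[y][x - 1]) / 2.0   # central differences
        Iy = (I_img[y + 1][x] - I_img[y - 1][x]) / 2.0
        It = I_img[y][x] - H_img[y][x]                   # temporal difference
        worst = max(worst, abs(Ix * u + Iy * v + It))
print(worst)  # ~0 for this linear image
```

Central differences are exact for a linear image, so the residual is zero up to floating-point error; on real images the equation holds only approximately.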
Ambiguities in Motion Estimation
(Aperture Problem)

• The optical flow equation only constrains the flow component along the gradient direction (v_n)
• The flow component in the tangent direction (v_t) is under-determined
• In regions with constant brightness (∇ψ = 0), the flow is completely indeterminate
• → Motion estimation is unreliable in regions with flat texture. In non-flat regions, motion orthogonal to an edge can be determined reliably, but not motion along the edge. Motion estimation is most accurate around corners.
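The edge/corner distinction can be made concrete with the second moment (structure) tensor, whose eigenvalues are discussed later in the Lucas-Kanade section. This is a hypothetical pure-Python sketch on two synthetic 16x16 patches (the patches and sizes are assumptions): an edge patch has one eigenvalue near zero, a corner patch has two clearly positive eigenvalues.

```python
import math

def grad(img, y, x):
    # central-difference spatial gradient at an interior pixel
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy

def tensor_eigs(img):
    # second moment tensor summed over the interior of the patch
    a = b = c = 0.0
    n = len(img)
    for y in range(1, n - 1):
        for x in range(1, n - 1):
            gx, gy = grad(img, y, x)
            a += gx * gx; b += gx * gy; c += gy * gy
    d = math.sqrt((a - c) ** 2 + 4 * b * b)
    return (a + c + d) / 2.0, (a + c - d) / 2.0   # (larger, smaller) eigenvalue

n = 16
edge   = [[1.0 if x >= n // 2 else 0.0 for x in range(n)] for y in range(n)]
corner = [[1.0 if (x >= n // 2 and y >= n // 2) else 0.0 for x in range(n)] for y in range(n)]

l1e, l2e = tensor_eigs(edge)     # edge: one large eigenvalue, one ~zero
l1c, l2c = tensor_eigs(corner)   # corner: both eigenvalues clearly positive
print(l1e, l2e, l1c, l2c)
```

The near-zero eigenvalue of the edge patch is exactly the under-determined tangent component described above.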


Example

• Suppose the image is a vertical bar moving vertically over a flat background
 – Can you observe any change or detect the motion?
   • No!
 – What if the bar moves horizontally?
   • Yes
 – In this case, the gradient direction is horizontal and the tangent direction is vertical
 – Motion is indeterminate in the tangent direction


Example

• Suppose the imaged scene is part of the side of a big box with a constant green color, and the box is moving in the direction parallel to the side
 – Can you observe any change or detect the motion?
   • No
 – What if the box moves towards the camera?
   • No
 – In this case, the gradient magnitude is zero everywhere
 – Motion is indeterminate


Pop Quiz

• What is the difference between true 2D motion and optical flow?

• Can we differentiate optical flow due to camera motion, lighting change, and object motion?

• Which part of the flow vector cannot be determined from images?


Pop Quiz (w/answers)

• What is the difference between true 2D motion and optical flow?
 – Optical flow is the "perceived" 2D motion, which may not be the same as the true 2D motion

• Can we differentiate optical flow due to camera motion, lighting change, and object motion?
 – When we look only at a small area of the image (aperture), we cannot tell
 – When we look at the whole picture and use our accumulated knowledge about what we typically see (prior knowledge), we may be able to

• Which part of the flow vector cannot be determined from images?
 – The component along edges
 – Both components in flat regions


General Considerations
for Motion Estimation
• Two categories of approaches:
 – Feature based: find corresponding features in the two images, then derive the entire motion field from the motion vectors at the corresponding features
   • Last lecture!
   • More often used for estimating global motion between the two images due to camera motion (or view-angle difference)
 – Intensity based: directly find the MV at every pixel, or the parameters that characterize the motion field, based on the constant intensity assumption
   • More often used for motion compensated prediction and filtering, required in video coding and frame interpolation → focus of this lecture


Three Problems in Motion Estimation

• How to represent the motion field?


• What criteria to use to estimate motion parameters?
• How to search motion parameters?



Motion Representation

• Global: the entire motion field is represented by a few global parameters (affine, homography). a = global motion parameters.
• Pixel-based: one MV at each pixel, with some smoothness constraint between adjacent MVs. a = MV for each pixel.
• Block-based: the entire frame is divided into blocks, and the motion in each block is characterized by a few parameters (e.g., a constant MV). a = MV for each block.
• Region-based: the entire frame is divided into regions, each region corresponding to an object or sub-object with consistent motion, represented by a few parameters. a = motion parameters for each region.
Motion Representation: Mesh-Based

• Cover the image with a mesh (triangular, rectangular, or irregular), define MVs at the mesh nodes, and interpolate the MVs at all other pixels from the nodal MVs.
• A mesh is also known as a control grid. The interpolation is done using spline interpolation, so this method is also known as the spline-based method.
• Mostly used for deformable registration in medical images.


Motion Estimation Criterion

• To minimize the displaced frame difference (DFD) (based on the constant intensity assumption):

  E_DFD(a) = Σ_{x∈Λ} |ψ₂(x + d(x; a)) − ψ₁(x)|^p → min

  p = 1: MAD; p = 2: MSE. a is the motion parameter vector that defines the entire motion field.

• To satisfy the optical flow equation:

  E_OF(a) = Σ_{x∈Λ} |∇ψ₂(x)ᵀ d(x; a) + ψ₂(x) − ψ₁(x)|^p → min

• To impose an additional smoothness constraint using regularization (important in pixel- and block-based representations):

  E_s(a) = Σ_{x∈Λ} Σ_{y∈N_x} ||d(x; a) − d(y; a)||²

  w_DFD E_DFD(a) + w_s E_s(a) → min

• Bayesian (MAP) criterion: maximize the a posteriori probability

  P(D = d | ψ₂, ψ₁) → max

Relation to the previous notation: ψ₂ = I, ψ₁ = H
Relation Among Different Criteria

• OF criterion is good only if motion is small.


• OF criterion can yield closed-form solution when the
objective function is quadratic in MVs.
• When the motion is not small, can use coarse
exhaustive search to find a good initial solution, and
use this solution to deform target frame, and then apply
OF criterion between original anchor frame and the
deformed target frame.
• Bayesian criterion can be reduced to the DFD criterion
plus motion smoothness constraint



Optimization Methods
• Exhaustive search
 – Typically used for the DFD criterion with p=1 (MAD)
 – Guarantees reaching the global optimum
 – The computation required may be unacceptable when the number of parameters to search simultaneously is large!
 – Fast search algorithms reach a sub-optimal solution in a shorter time
• Gradient-based search
 – Typically used for the DFD or OF criterion with p=2 (MSE)
   • the gradient can often be calculated analytically
   • When used with the OF criterion, a closed-form solution may be obtained
 – Reaches the local optimum closest to the initial solution
• Multi-resolution search
 – Search from coarse to fine resolution; faster than exhaustive search
 – Avoids being trapped in a local minimum


Regularization for Dense Motion Estimation

• How do we estimate the motion at each pixel?
• Directly solving the optical flow equation at each pixel (e.g., applying it to the 3 color channels to get 3 equations) yields a chaotic motion field
• General idea:
 – Assume the motion field satisfies some properties (e.g., adjacent pixels have similar MVs)
 – Set up an optimization objective based on the DFD/OF as well as the prior knowledge over the neighborhood
 – Need to find the MVs at all pixels u(x) simultaneously or iteratively
 – Adding such prior knowledge is known as regularization


Lucas-Kanade Method for Flow Estimation

• Optical flow equation give 1 equation to 2 unknowns at each pixel

• Assume pixels in a small neighborhood around the center pixel


have the same motion (u,v)
• If we use a 5x5 window, that gives us 25 equations per pixel!

Courtesy of Rob Fergus, http://cs.nyu.edu/~fergus/teaching/vision/5_6_Fitting_Matching_Opticalflow.pdf


Yao Wang, 2021 ECE-GY 6123: Image and Video Processing 31
Lucas-Kanade Method

Problem: we have more equations than unknowns

Solution: solve a least squares problem
• The minimum least squares solution is given by the solution (in d = (u, v)ᵀ) of the normal equations:

  (AᵀA) d = Aᵀb

  where each row of A is [I_x I_y] at one pixel of the window, and the corresponding entry of b is −I_t.

• The summations are over all pixels in the K x K window, but only the MV at the center is determined from the solution.
• This technique was proposed by Lucas & Kanade (1981)
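The window solve can be sketched in a few lines of pure Python. This is a hypothetical illustration, not from the slides: the quadratic synthetic image, the window location (a 5x5 window centered at (10, 12)), and the true shift (0.3, -0.2) are all assumptions, and the 2x2 normal equations are solved by hand with Cramer's rule.

```python
# Single-window Lucas-Kanade on a smooth synthetic image (illustrative sketch)
u_true, v_true = 0.3, -0.2

def I0(x, y):                                   # quadratic "bowl" image
    return (x - 10.0) ** 2 + (y - 12.0) ** 2

H = lambda x, y: I0(x, y)                       # anchor frame
I = lambda x, y: I0(x - u_true, y - v_true)     # target: anchor shifted by (u, v)

# Accumulate the normal equations (A^T A) d = A^T b over a 5x5 window at (10, 12)
a11 = a12 = a22 = b1 = b2 = 0.0
for y in range(10, 15):
    for x in range(8, 13):
        Ix = (I(x + 1, y) - I(x - 1, y)) / 2.0   # central differences
        Iy = (I(x, y + 1) - I(x, y - 1)) / 2.0
        It = I(x, y) - H(x, y)
        a11 += Ix * Ix; a12 += Ix * Iy; a22 += Iy * Iy
        b1 -= Ix * It;  b2 -= Iy * It

det = a11 * a22 - a12 * a12     # solvable when A^T A is invertible
u_est = (a22 * b1 - a12 * b2) / det
v_est = (a11 * b2 - a12 * b1) / det
print(u_est, v_est)   # close to (0.3, -0.2)
```

The estimate is only approximate because the Taylor linearization discards the image's curvature; the iterative refinement described later removes most of this error.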
Conditions for solvability

• The optimal (u, v) satisfies the Lucas-Kanade equation (AᵀA) d = Aᵀb

When is this solvable?
• AᵀA should be invertible
• AᵀA should not be too small due to noise
 – the eigenvalues λ₁ and λ₂ of AᵀA should not be too small
• AᵀA should be well-conditioned
 – λ₁/λ₂ should not be too large (λ₁ = larger eigenvalue)


Eigenvectors of AᵀA

• Recall the Harris corner detector: M = AᵀA is the second moment matrix
• The eigenvectors and eigenvalues of M relate to edge direction and magnitude
 – The eigenvector associated with the larger eigenvalue points in the direction of fastest intensity change
 – The other eigenvector is orthogonal to it
• Motion estimation is reliable only at locations with corners!


Interpreting the eigenvalues

Classification of image points using the eigenvalues λ₁, λ₂ of the second moment matrix:
• "Edge": λ₁ >> λ₂ (or λ₂ >> λ₁)
• "Corner or multi-directional": λ₁ and λ₂ are both large, λ₁ ~ λ₂
• "Flat" region: λ₁ and λ₂ are both small
Edge
• Error surface of ||Ad − b||²:
 – large gradients orthogonal to the edge
 – large λ₁, small λ₂
• The error surface has a trough along the edge direction
 – Motion along the edge cannot be determined
Low texture region
 – gradients have small magnitude
 – small λ₁, small λ₂
• The error surface is fairly flat; local minima are unreliable
 – Motion in a flat region cannot be determined
High texture region
 – gradients in many directions, with large magnitudes
 – large λ₁, large λ₂
• The error surface has a unique minimum
 – Motion can be determined well
Errors in Lucas-Kanade

What are the potential causes of errors in this procedure?
 – Suppose AᵀA is easily invertible
 – Suppose there is not much noise in the image
Errors arise when our assumptions are violated:
• Brightness constancy is not satisfied
• The motion is not small
• A point does not move like its neighbors
 – the window size is too large
 – what is the ideal window size?


Problem when motion is large

• The optical flow equation is derived from a Taylor expansion
 – Only correct when the motion is small
• When the motion is large, there is ambiguity in estimating the motion by trying to match over a small neighborhood

[Figure: actual shift vs. estimated shift; the nearest match is correct when there is no aliasing, and incorrect when there is aliasing]

Which correspondence is correct?
Figure from Fergus
How to solve this ambiguity?
Iterative Refinement

• The optical flow equation is correct only if the true motion is small
• What if a point undergoes large motion (> 1 pixel)?

• Iterative Lucas-Kanade algorithm
 1. Estimate the velocity at each pixel by solving the Lucas-Kanade equations
 2. Warp H towards I using the estimated flow field
  - H'(x,y) = H(x-u, y-v)
  - use image warping techniques
 3. Repeat on I and H' until convergence
 4. The final motion is the sum of the motions determined at each step.
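The four steps above can be sketched as follows. This is a hypothetical pure-Python illustration (the synthetic quadratic image, window location, iteration count, and the shift (1.5, -1.0) are assumptions); warping is exact here because the anchor frame is defined analytically, whereas real implementations must interpolate.

```python
# Iterative Lucas-Kanade for motion larger than 1 pixel (illustrative sketch)
u_true, v_true = 1.5, -1.0

def I0(x, y):                                   # smooth synthetic image
    return (x - 10.0) ** 2 + (y - 12.0) ** 2

I = lambda x, y: I0(x - u_true, y - v_true)     # target frame

def lk_step(Hw, I):
    # one Lucas-Kanade solve over a 5x5 window centered at (10, 12)
    a11 = a12 = a22 = b1 = b2 = 0.0
    for y in range(10, 15):
        for x in range(8, 13):
            Ix = (I(x + 1, y) - I(x - 1, y)) / 2.0
            Iy = (I(x, y + 1) - I(x, y - 1)) / 2.0
            It = I(x, y) - Hw(x, y)
            a11 += Ix * Ix; a12 += Ix * Iy; a22 += Iy * Iy
            b1 -= Ix * It;  b2 -= Iy * It
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

u = v = 0.0
for _ in range(10):
    Hw = lambda x, y, u=u, v=v: I0(x - u, y - v)  # step 2: warp H by the flow so far
    du, dv = lk_step(Hw, I)                       # step 1: solve LK on the residual
    u += du; v += dv                              # step 4: accumulate the refinement
print(u, v)   # approaches (1.5, -1.0)
```

Each iteration removes most of the remaining displacement, so the accumulated flow converges to the true shift even though a single LK solve underestimates it.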


Optical Flow: Iterative Estimation

Some Implementation Issues:


– Warping is not easy (ensure that errors in warping are smaller
than the estimate refinement)
– Warp one image (H), take derivatives of the other (I) so you
don’t need to re-compute the gradient after each iteration.
– Often useful to low-pass filter the images before motion
estimation (for better derivative estimation, and linear
approximations to image intensity)



Another Way to Deal With Large Movement
Reduce the resolution!

Coarse-to-fine optical flow estimation

[Figure: Gaussian pyramids of image H and image I; a motion of u = 10 pixels at full resolution becomes u = 5, 2.5, and 1.25 pixels at successively coarser levels]
Coarse-to-fine optical flow estimation

[Figure: run iterative L-K at the coarsest level of the Gaussian pyramids of H and I, then repeatedly warp & upsample and run iterative L-K at the next finer level]
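A minimal two-level version of this coarse-to-fine scheme is sketched below, as a hypothetical pure-Python illustration (the synthetic image, the 2x subsampling, the window locations, and the true shift (3.0, -2.0) are all assumptions): estimate at half resolution, scale the flow by 2, then refine at full resolution with iterative LK.

```python
# Two-level coarse-to-fine Lucas-Kanade (illustrative sketch)
u_true, v_true = 3.0, -2.0

def I0(x, y):                                   # smooth synthetic image
    return (x - 10.0) ** 2 + (y - 12.0) ** 2

H = lambda x, y: I0(x, y)                       # anchor frame
I = lambda x, y: I0(x - u_true, y - v_true)     # target frame

def lk(Hf, If, cx, cy):
    # one Lucas-Kanade solve over a 5x5 window centered at (cx, cy)
    a11 = a12 = a22 = b1 = b2 = 0.0
    for y in range(cy - 2, cy + 3):
        for x in range(cx - 2, cx + 3):
            Ix = (If(x + 1, y) - If(x - 1, y)) / 2.0
            Iy = (If(x, y + 1) - If(x, y - 1)) / 2.0
            It = If(x, y) - Hf(x, y)
            a11 += Ix * Ix; a12 += Ix * Iy; a22 += Iy * Iy
            b1 -= Ix * It;  b2 -= Iy * It
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Coarse level: subsample both frames by 2; the true motion there is (1.5, -1.0)
Hc = lambda x, y: H(2 * x, 2 * y)
Ic = lambda x, y: I(2 * x, 2 * y)
uc, vc = lk(Hc, Ic, 5, 6)
u, v = 2 * uc, 2 * vc            # propagate: scale the coarse flow to the fine level

for _ in range(5):               # refine at full resolution (iterative LK)
    Hw = lambda x, y, u=u, v=v: H(x - u, y - v)
    du, dv = lk(Hw, I, 10, 12)
    u += du; v += dv
print(u, v)  # approaches (3.0, -2.0)
```

The coarse level halves the displacement so the small-motion assumption holds there, and the fine level only has to correct a sub-pixel residual. (A real implementation low-pass filters before subsampling.)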
From Khurram Hassan‐Shafique CAP5415 Computer Vision 2003


Open Competition for Optical Flow Estimation
http://vision.middlebury.edu/flow/

• A public-domain database containing sample pairs of images with ground truth flow
 – Designed for evaluating motion estimation algorithms between two frames
• Open to submission of algorithms and estimation results
• The site evaluates accuracy against the ground truth
• Advances in many areas of computer vision have been greatly facilitated by such challenges
 – Image classification using deep learning
 – Face detection and recognition
 – Object detection and tracking


Challenge for Optical Flow Estimation
http://vision.middlebury.edu/flow/

[Figure: sample image pairs and estimated flow fields, including Lucas-Kanade flow with pyramid and Layer++]

Courtesy of Ali Farhadi. From http://courses.cs.washington.edu/courses/cse576/16sp/Slides/15_Flow.pdf
Kanade-Lucas-Tomasi (KLT) Tracker

• Detect feature points in the first image where both eigenvalues of the moment matrix are large (similar to the Harris corner detector)
• For each feature point in the first image, estimate its motion between the first and second frames by applying the Lucas-Kanade flow estimation method to a small window surrounding it
• Repeat between frame 2 and frame 3, using the tracked feature points in frame 2
• Verify that corresponding features between non-adjacent frames satisfy a global affine mapping; features that are outliers are dropped
• References:
 – Bruce D. Lucas and Takeo Kanade. An Iterative Image Registration Technique with an Application to Stereo Vision. International Joint Conference on Artificial Intelligence, pages 674-679, 1981.
 – Carlo Tomasi and Takeo Kanade. Detection and Tracking of Point Features. Carnegie Mellon University Technical Report CMU-CS-91-132, April 1991.
 – Jianbo Shi and Carlo Tomasi. Good Features to Track. IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.


Pop Quiz

• What are the assumptions of the Lucas-Kanade method?

• In what regions can we not estimate motion reliably?

• How do we handle large motion?

• How does the KLT tracker work?


Pop Quiz (w/Answers)

• What are the assumptions of the Lucas-Kanade method?
 – Color constancy
 – Small motion
• In what regions can we not estimate motion reliably?
 – Flat regions
 – Regions with a single edge direction
• How do we handle large motion?
 – Pyramid representation and coarse-to-fine estimation
• How does the KLT tracker work?
 – Detect features in the first frame
 – Determine the motion of these features with respect to the second frame using the LK method, and correspondingly determine the positions of the features in the second frame
 – Repeat for the next frame
Other Methods for Pixel-Based Motion Estimation (optional)
• Horn-Schunck method
 – DFD + motion smoothness criterion
• Multipoint neighborhood method
 – Assumes every pixel in a small block surrounding a pixel has the same MV (the Lucas-Kanade method)
• Recommended reading:
 – Sun, Deqing, Stefan Roth, and Michael J. Black. "Secrets of optical flow estimation and their principles." In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2432-2439. IEEE, 2010.


Block-Based Motion Estimation

• Assume all pixels in a block undergo a coherent motion, and search for the motion parameters of each block independently
• Block matching algorithm (BMA): assumes translational motion, 1 MV per block (2 parameters)
 – Exhaustive BMA (EBMA)
 – Fast algorithms
• Deformable block matching algorithm (DBMA): allows more complex motion (affine, bilinear)


Block Matching Algorithm

• Overview:
 – Assume all pixels in a block undergo a translation, denoted by a single MV
 – Estimate the MV of each block independently, by minimizing the DFD error over the block
• Minimizing function:

  E_DFD(d_m) = Σ_{x∈B_m} |ψ₂(x + d_m) − ψ₁(x)|^p → min

• Optimization method:
 – Exhaustive search (feasible, as one only needs to search one MV at a time), typically using the MAD criterion (p=1)
 – Fast search algorithms
 – Integer- vs. fractional-pel accuracy search


Exhaustive Block Matching Algorithm
(EBMA)



Sample Matlab Script for
Integer-pel EBMA
%f1: anchor frame; f2: target frame; fp: predicted image;
%mvx,mvy: store the MV image
%widthxheight: image size; N: block size; R: search range

for i=1:N:height-N+1
  for j=1:N:width-N+1 %for every block in the anchor frame
    MAD_min=256*N*N; dx=0; dy=0;
    for k=-R:1:R
      for l=-R:1:R %for every search candidate (i+k etc. must be clipped to the image domain!)
        MAD=sum(sum(abs(f1(i:i+N-1,j:j+N-1)-f2(i+k:i+k+N-1,j+l:j+l+N-1))));
        % calculate MAD for this candidate
        if MAD<MAD_min
          MAD_min=MAD; dy=k; dx=l;
        end
      end
    end
    fp(i:i+N-1,j:j+N-1)=f2(i+dy:i+dy+N-1,j+dx:j+dx+N-1);
    %put the best matching block in the predicted image
    iblk=floor((i-1)/N)+1; jblk=floor((j-1)/N)+1; %block index
    mvx(iblk,jblk)=dx; mvy(iblk,jblk)=dy; %record the estimated MV
  end
end

Note: A real working program needs to check whether a pixel in the candidate matching block falls outside the image boundary; such pixels should not count in the MAD. This program is meant to illustrate the main operations involved, not to be a fully working Matlab script.
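For readers who want something runnable, here is a hedged Python counterpart of the integer-pel EBMA script above. The synthetic test image, block size N=8, and search range R=3 are assumptions for illustration; out-of-range candidates are simply skipped.

```python
def ebma(f1, f2, N, R):
    """Integer-pel exhaustive block matching: return (dy, dx) per block.
    f1 is the anchor frame, f2 the target frame (lists of lists)."""
    h, w = len(f1), len(f1[0])
    mvs = []
    for i in range(0, h - N + 1, N):          # every block in the anchor frame
        row = []
        for j in range(0, w - N + 1, N):
            best = (float("inf"), 0, 0)
            for k in range(-R, R + 1):        # every search candidate
                for l in range(-R, R + 1):
                    # the candidate block must lie inside the target frame
                    if not (0 <= i + k <= h - N and 0 <= j + l <= w - N):
                        continue
                    mad = sum(abs(f1[i + y][j + x] - f2[i + k + y][j + l + x])
                              for y in range(N) for x in range(N))
                    if mad < best[0]:
                        best = (mad, k, l)
            row.append((best[1], best[2]))    # (dy, dx) minimizing MAD
        mvs.append(row)
    return mvs

# Synthetic target; the anchor is the target shifted by the true MV (dy, dx) = (2, -1)
g = lambda x, y: 3 * x + 5 * y + x * y
h = w = 24
f2 = [[g(x, y) for x in range(w)] for y in range(h)]
f1 = [[g(x - 1, y + 2) for x in range(w)] for y in range(h)]
mvs = ebma(f1, f2, N=8, R=3)
print(mvs[1][1])  # (2, -1) for the center block
```

Blocks near the image border cannot reach the true shift within the frame, which is exactly the boundary issue the Matlab note warns about; the center block recovers the shift exactly.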
Complexity of Integer-Pel EBMA

• Assumption
– Image size: MxM
– Block size: NxN
– Search range: (-R,R) in each dimension
– Search stepsize: 1 pixel (assuming integer MV)
• Operation counts (1 operation=1 “-”, 1 “+”, 1 “*”):
– Each candidate position: N^2
– Each block going through all candidates: (2R+1)^2 N^2
– Entire frame: (M/N)^2 (2R+1)^2 N^2=M^2 (2R+1)^2
• Independent of block size!
• Example: M=512, N=16, R=16, 30 fps
– Total operation count = 2.85x10^8/frame =8.55x10^9/second
• Regular structure suitable for VLSI implementation
• Challenging for software-only implementation

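The counts above can be reproduced directly (a small sketch using the slide's own numbers M=512, N=16, R=16, 30 fps):

```python
# Operation count for integer-pel EBMA
M, N, R, fps = 512, 16, 16, 30
per_candidate = N ** 2                       # operations per candidate position
per_block = (2 * R + 1) ** 2 * per_candidate # all candidates for one block
per_frame = (M // N) ** 2 * per_block        # = M^2 (2R+1)^2, independent of N
per_second = per_frame * fps
print(per_frame, per_second)                 # 285474816, 8564244480
```

That is about 2.85x10^8 operations per frame and 8.6x10^9 per second, matching the slide (which rounds 2.85 before multiplying by 30).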


Fractional Accuracy EBMA

• The real MV may not always be a multiple of a pixel. To allow sub-pixel MVs, the search stepsize must be less than 1 pixel
• Half-pel EBMA: stepsize = 1/2 pixel in both dimensions
• Difficulty:
 – The target frame only has integer pels
• Solution:
 – Interpolate the target frame by a factor of two before searching
 – Bilinear interpolation is typically used
• Complexity:
 – The number of candidate MVs to search increases from (2R+1)² to (4R+1)², about 4 times that of integer-pel, plus additional operations for interpolation
• Fast algorithms:
 – Search at integer precision first, then refine in a small search region at half-pel accuracy


Half-Pel Accuracy EBMA



Bilinear Interpolation

Input samples (x,y), (x+1,y), (x,y+1), (x+1,y+1) map to output samples (2x,2y), (2x+1,2y), (2x,2y+1), (2x+1,2y+1):

O[2x,2y] = I[x,y]
O[2x+1,2y] = (I[x,y] + I[x+1,y]) / 2
O[2x,2y+1] = (I[x,y] + I[x,y+1]) / 2
O[2x+1,2y+1] = (I[x,y] + I[x+1,y] + I[x,y+1] + I[x+1,y+1]) / 4
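The four formulas translate directly into code. This is a pure-Python sketch (the border handling, replicating the last row/column, is an assumption not specified on the slide):

```python
# Factor-of-2 bilinear upsampling implementing the four formulas above
def upsample2(I):
    h, w = len(I), len(I[0])
    O = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            x1, y1 = min(x + 1, w - 1), min(y + 1, h - 1)  # clamp at the borders
            O[2 * y][2 * x] = I[y][x]
            O[2 * y][2 * x + 1] = (I[y][x] + I[y][x1]) / 2
            O[2 * y + 1][2 * x] = (I[y][x] + I[y1][x]) / 2
            O[2 * y + 1][2 * x + 1] = (I[y][x] + I[y][x1] + I[y1][x] + I[y1][x1]) / 4
    return O

I = [[0, 2], [4, 6]]
O = upsample2(I)
print(O[0])  # [0, 1.0, 2, 2.0]
print(O[1])  # [2.0, 3.0, 4.0, 4.0]
```

Even output positions copy the input samples; odd positions average their two or four integer-pel neighbors, which is exactly what the half-pel search below samples.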


Implementation for Half-Pel EBMA

%f1: anchor frame; f2: target frame; fp: predicted image
%mvx,mvy: arrays storing the MV field
%width x height: image size; N: block size; R: search range
%first upsample f2 by a factor of 2 in each direction
f3=imresize(f2,2,'bilinear'); %(or use your own bilinear interpolation)
for i=1:N:height-N+1, for j=1:N:width-N+1 %for every block in the anchor frame
  MAD_min=256*N*N; dy=0; dx=0;
  for k=-R:0.5:R, for l=-R:0.5:R %for every half-pel search candidate
    %indices into f3 are integers because k,l are multiples of 0.5
    MAD=sum(sum(abs(f1(i:i+N-1,j:j+N-1)-f3(2*(i+k):2:2*(i+k+N-1),2*(j+l):2:2*(j+l+N-1)))));
    if MAD<MAD_min
      MAD_min=MAD; dy=k; dx=l;
    end;
  end;end;
  fp(i:i+N-1,j:j+N-1)=f3(2*(i+dy):2:2*(i+dy+N-1),2*(j+dx):2:2*(j+dx+N-1));
  %put the best matching block in the predicted image
  iblk=floor((i-1)/N)+1; jblk=floor((j-1)/N)+1; %block index
  mvx(iblk,jblk)=dx; mvy(iblk,jblk)=dy; %record the estimated MV
end;end;
%Note: boundary checks (keeping i+k, j+l inside the frame) are omitted for brevity.
Example: Half-Pel EBMA

[Figures: target frame; anchor frame; predicted anchor frame (29.86dB); estimated motion field]
Pros and Cons with EBMA

• Blocking effect (discontinuity across block boundaries) in the predicted image
– Because the block-wise translation model is not accurate
– Fix: deformable BMA
• Motion field somewhat chaotic
– Because MVs are estimated independently from block to block
– Fix 1: mesh-based motion estimation
– Fix 2: imposing a smoothness constraint explicitly
• Noisy MVs in flat regions, or when the true motion is along the edge direction
• Nonetheless, widely used for motion-compensated prediction in video coding
– Because of its simplicity and optimality in minimizing prediction error
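That simplicity is visible in a minimal integer-pel EBMA for one block, written here in NumPy rather than the MATLAB used elsewhere in these slides (a sketch; the function name is mine and boundary candidates are simply skipped):

```python
import numpy as np

def ebma_block(f1, f2, i, j, N, R):
    """Integer-pel exhaustive search: MV (dy, dx) minimizing the MAD between
    the anchor block f1[i:i+N, j:j+N] and displaced blocks in the target f2."""
    H, W = f2.shape
    block = f1[i:i+N, j:j+N].astype(np.float64)
    best_mad, best_mv = np.inf, (0, 0)
    for k in range(-R, R + 1):        # vertical displacement candidates
        for l in range(-R, R + 1):    # horizontal displacement candidates
            if i + k < 0 or j + l < 0 or i + k + N > H or j + l + N > W:
                continue              # candidate block falls outside the frame
            cand = f2[i+k:i+k+N, j+l:j+l+N].astype(np.float64)
            m = np.abs(block - cand).sum()
            if m < best_mad:
                best_mad, best_mv = m, (k, l)
    return best_mv
```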



Fast Algorithms for BMA

• Key ideas to reduce the computation in EBMA:
– Reduce the number of search candidates:
• Only search candidates that are likely to produce small errors.
• Predict promising remaining candidates based on previous search results.
– Simplify the error measure (DFD) to reduce the computation for each candidate.
• Classical fast algorithms
– Three-step
– 2D-log
– Conjugate direction
• Many new fast algorithms have been developed since then
– Some suitable for software implementation, others for VLSI
implementation (memory access, etc)
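As an illustration of the candidate-reduction idea, a three-step search sketch in NumPy (names are mine; this follows the classical pattern of 9 candidates per stage with a halving step, so with step=4 it covers roughly a +/-7 range using at most 27 MAD evaluations instead of the 225 an exhaustive search over that range would need):

```python
import numpy as np

def mad(f1, f2, i, j, k, l, N):
    """MAD between the anchor block at (i, j) and the target block displaced by (k, l)."""
    H, W = f2.shape
    if i + k < 0 or j + l < 0 or i + k + N > H or j + l + N > W:
        return np.inf   # candidate falls outside the frame
    a = f1[i:i+N, j:j+N].astype(np.float64)
    b = f2[i+k:i+k+N, j+l:j+l+N].astype(np.float64)
    return np.abs(a - b).sum()

def three_step_search(f1, f2, i, j, N, step=4):
    """Classical three-step search: evaluate the centre and its 8 neighbours
    at distance `step`, re-centre on the best, halve the step, repeat."""
    dy = dx = 0
    while step >= 1:
        cands = [(dy + a * step, dx + b * step)
                 for a in (-1, 0, 1) for b in (-1, 0, 1)]
        dy, dx = min(cands, key=lambda c: mad(f1, f2, i, j, c[0], c[1], N))
        step //= 2
    return dy, dx
```

Note the search is greedy: on error surfaces that are not unimodal it can settle in a local minimum, which is the price paid for the reduced computation.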



Pop Quiz

• What is the assumption of the EBMA approach?
• Suppose you use a block size BxB, your search range is (-R, R) in both directions, and you only search integer MVs. How many additions and multiplications do you need for each block?
• Suppose your image size is MxN. How many additions and multiplications do you need for the entire image? Does it depend on the block size?
• How do the above numbers change if you use half-pel accuracy search?



Pop Quiz (w/ Answer)

• What is the assumption of the EBMA approach?
– All pixels in a block follow the same motion
• Suppose you use a block size BxB, your search range is (-R, R) in both directions, and you only search integer MVs. How many additions and multiplications do you need for each block?
– (2R+1)^2 B^2
• Suppose your image size is MxN. How many additions and multiplications do you need for the entire image? Does it depend on the block size?
– MN/B^2 * (2R+1)^2 B^2 = MN (2R+1)^2
– Independent of block size
• How do the above numbers change if you use half-pel accuracy search?
– 4x: candidates change from (2R+1)^2 to (4R+1)^2
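The counts above are easy to sanity-check in code (a sketch; the function names are mine):

```python
def ebma_ops_per_block(B, R, halfpel=False):
    """Per-block cost using the slide's counting: one operation per pixel
    per candidate MV; (2R+1)^2 integer candidates, (4R+1)^2 at half-pel."""
    n_cand = (4 * R + 1) ** 2 if halfpel else (2 * R + 1) ** 2
    return n_cand * B * B

def ebma_ops_per_frame(M, N, B, R, halfpel=False):
    """Whole-frame cost: number of blocks times per-block cost. The B^2
    factors cancel, so the total does not depend on the block size."""
    return (M * N) // (B * B) * ebma_ops_per_block(B, R, halfpel)
```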



Multi-resolution Motion Estimation

• Problems with BMA
– Unless exhaustive search is used, the solution may not be the global minimum
– Exhaustive search requires very large computation
– Estimated motion in a flat / non-corner block may not be accurate (multiple MVs can lead to the same minimal error!)
– Motion may not be the same across a block
• Multiresolution approach
– Aims to solve the first three problems
– First estimate the motion at a coarse resolution over a low-pass filtered, down-sampled image pair
• Usually leads to a solution close to the true motion field
– Then refine the initial solution at successively finer resolutions within a small search range
• Reduces the computation
– Can be applied to different motion representations, but we will focus on its application to BMA
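The coarse-to-fine procedure can be sketched for two levels in NumPy (illustrative only; names are mine, average pooling stands in for the low-pass filter plus down-sampling, and frame dimensions are assumed even):

```python
import numpy as np

def block_mad(f1, f2, i, j, k, l, N):
    """MAD between the anchor block at (i, j) and the target block displaced by (k, l)."""
    H, W = f2.shape
    if i + k < 0 or j + l < 0 or i + k + N > H or j + l + N > W:
        return np.inf
    a = f1[i:i+N, j:j+N].astype(np.float64)
    b = f2[i+k:i+k+N, j+l:j+l+N].astype(np.float64)
    return np.abs(a - b).sum()

def search_around(f1, f2, i, j, N, cy, cx, r):
    """Exhaustive search over a (2r+1)^2 window centred on candidate (cy, cx)."""
    cands = [(cy + a, cx + b) for a in range(-r, r + 1) for b in range(-r, r + 1)]
    return min(cands, key=lambda c: block_mad(f1, f2, i, j, c[0], c[1], N))

def hbma_two_level(f1, f2, i, j, N, R):
    """Two-level HBMA: coarse search over range R//2 on 2x down-sampled
    frames (average pooling), then a +/-1 refinement at full resolution."""
    d1 = f1.reshape(f1.shape[0] // 2, 2, f1.shape[1] // 2, 2).mean(axis=(1, 3))
    d2 = f2.reshape(f2.shape[0] // 2, 2, f2.shape[1] // 2, 2).mean(axis=(1, 3))
    ky, kx = search_around(d1, d2, i // 2, j // 2, N // 2, 0, 0, R // 2)  # coarse level
    return search_around(f1, f2, i, j, N, 2 * ky, 2 * kx, 1)              # refinement
```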



Hierarchical Block Matching Algorithm (HBMA)



[Figures: predicted anchor frame (29.32dB); comparison of EBMA, half-pel (29.86dB) and HBMA (29.32dB)]
Computation Requirement of HBMA

• Assumption
– Image size: MxM; block size: NxN at every level; number of levels: L
– Search range:
• 1st level: R/2^(L-1) (equivalent to R at the L-th level)
• Other levels: R/2^(L-1) (can be smaller)
• Operation count for EBMA
– Image size M, block size N, search range R
– # operations: M^2 (2R+1)^2
• Operation count at the l-th level (image size M/2^(L-l)):
– (M/2^(L-l))^2 (2R/2^(L-1) + 1)^2
• Total operation count:
– sum over l = 1..L of (M/2^(L-l))^2 (2R/2^(L-1) + 1)^2 ≈ (1/3) 4^(-(L-2)) · 4 M^2 R^2
• Saving factor (relative to EBMA):
– 3·4^(L-2) = 3 (L=2); 12 (L=3)
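A quick numeric check of the formulas above (a sketch; names are mine). For M=256, R=32, L=3 the exact sum gives a saving factor close to the predicted 3·4^(L-2) = 12:

```python
def ebma_ops(M, R):
    """EBMA count from the slide: M^2 (2R+1)^2."""
    return M * M * (2 * R + 1) ** 2

def hbma_ops(M, R, L):
    """Exact L-level HBMA count, with search range R/2^(L-1) at every level
    (the slide's assumption)."""
    total = 0
    for l in range(1, L + 1):
        size = M // 2 ** (L - l)            # image size at level l
        n_cand = 2 * R // 2 ** (L - 1) + 1  # candidates per dimension
        total += size * size * n_cand * n_cand
    return total
```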



Pop Quiz

• What is the benefit of using multi-resolution search?



Pop Quiz (w/answers)

• What is the benefit of using multi-resolution search?


– Reduce complexity
– Enforce motion smoothness (often physically more accurate
solution)



VcDemo Example

VcDemo: Image and Video Compression Learning Tool


Developed at Delft University of Technology

http://insy.ewi.tudelft.nl/content/image-and-video-compression-learning-tool-vcdemo

Use the ME tool to show the motion estimation results with different parameter choices



Deep-Learning Based Method

• [PWCnet] Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz, PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. CVPR 2018. https://arxiv.org/abs/1709.02371

• [Flownet2] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox, "FlowNet 2.0: Evolution of optical flow estimation with deep networks," in CVPR 2017, pp. 1647–1655, IEEE Computer Society. https://lmb.informatik.uni-freiburg.de/Publications/2017/IMKDB17/

• [FlowFields] C. Bailer, B. Taetz, and D. Stricker. Flow fields: Dense correspondence fields for highly accurate large displacement optical flow estimation. In IEEE Int. Conference on Computer Vision (ICCV), 2015.

PWCnet: Matching in Feature Space

[PWCnet] Deqing Sun, Xiaodong Yang, Ming-Yu Liu, Jan Kautz, PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume. CVPR 2018. https://arxiv.org/abs/1709.02371

Applications of Dense Motion Estimation

• Registration of two images with elastic deformation


– Medical applications
– Image morphing
• Video coding
– Predicting the next frame with motion-compensated prediction
– Code the error only
• Frame interpolation
– Increasing frame rate
– Filling in lost / missing frames due to transmission loss
• Video filtering
– To remove noise
– Temporal filtering along motion trajectory



Motion-Compensated Frame Interpolation

From Katsaggelos’s Coursera Course on Image Processing, Lecture on motion estimation.

If f(x,y,t2) = f(x+u,y+v,t1), then the interpolated frame at the midpoint time t is f(x+u/2, y+v/2, t) = 0.5 f(x,y,t2) + 0.5 f(x+u,y+v,t1)

Practical issues: ideally one needs to estimate the motion from t to t1 and from t to t2.
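For the special case of a single global, even-integer MV, the mid-point relation can be sketched directly (a simplification; names are mine, np.roll is used for warping so borders wrap, and a real interpolator would use the per-pixel flow and handle occlusions):

```python
import numpy as np

def interpolate_midframe(f1, f2, u, v):
    """Mid-point interpolation for one global even-integer MV (u, v) with
    f(x,y,t2) = f(x+u,y+v,t1): average the two frames, each warped half-way
    toward the middle time instant."""
    half1 = np.roll(f1, (-u // 2, -v // 2), axis=(0, 1))  # frame at t1 sampled at (x+u/2, y+v/2)
    half2 = np.roll(f2, (u // 2, v // 2), axis=(0, 1))    # frame at t2 sampled at (x-u/2, y-v/2)
    return 0.5 * half1 + 0.5 * half2
```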
Motion Compensated Temporal Filtering

From Katsaggelos’s Coursera Course on Image Processing, Lecture on motion estimation.

Should filter along the motion trajectory!


Summary 1: General Methodology
• What causes 2D motion?
– Object motion projected to 2D
– Camera motion
– Optical flow vs. true 2D motion
• Constraints for 2D motion
– Optical flow equation
– Derived from constant intensity and small motion assumption
– Ambiguity in motion estimation (uniquely determined only near
corners)
• Estimation criterion:
– DFD (constant intensity)
– OF (constant intensity+small motion)
– Regularization (motion smoothness or other prior
knowledge/assumptions)
• Search method:
– Exhaustive search, gradient-descent, multi-resolution
– Least squares solution under optical flow equation



Summary 2: Motion Estimation Methods

• Pixel-based motion estimation (also known as optical flow estimation)


– Most accurate representation, but also most costly to estimate
– Need to put additional constraints on motion of neighboring pixels
– Lucas-Kanade method
• Assuming motion in the neighborhood is the same
– How to handle large motion: iterative refinement, multiresolution
– KLT tracker: apply LK method on feature points only
– Automatically yield fractional accuracy
• Block-based motion estimation, assuming each block has a constant
motion
– Good trade-off between accuracy and speed
– EBMA and its fast but suboptimal variants are widely used in video coding for motion-compensated temporal prediction.
– HBMA can not only reduce computation but also yield physically more correct
motion estimates
• Deep learning method:
– Leveraging multiresolution representation, and matching in feature domain



Python Tools for Motion Estimation

• This page describes OpenCV's Python tools for motion estimation:
– http://docs.opencv.org/2.4/modules/video/doc/motion_analysis_and_object_tracking.html
• Some functions included:
– cv2.calcOpticalFlowPyrLK()
– This function computes the flow at a set of feature points, using
pyramid representation
– cv2.calcOpticalFlowFarneback()
– This function computes dense flow (at every pixel)
• KLT tracker
– https://github.com/TimSC/PyFeatureTrack (3rd-party package)



Video capture/read/write in Python

• Please see the following tutorial:
• http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html



Other Techniques (Optional)

• Deformable block matching algorithm (DBMA) ([Wang 2002])


– To allow more complex motion within each block
– Can set up equation for motion parameters using optical flow equation
• Mesh-based motion estimation (aka spline-based) ([Wang 2002])
– Estimate MVs at grid points of a mesh, and interpolate MV at other points
– Spline-based interpolation enforce motion smoothness
• Global motion estimation (next lecture)
• Segmentation of moving objects from background (next lecture)
• Block motion estimation using Phase correlation (optional material)
• Diffeomorphism for deformable image registration (optional
material)
• Deep learning based methods



Reading Assignments

• [Szeliski2010] Richard Szeliski, Computer Vision: Algorithms and Applications. 2021. Chap. 9. (We covered some of 9.1 and 9.3 in this lecture; you are encouraged to read 9.2 and 9.4.)
• Optional reading:
• [Wang2002] Wang, et al, Digital video processing and communications.
– Chap 5: Sec. 5.1, 5.5
– Chap 6: Sec. 6.1-6.6, Apx. A, B.
• Sun, Deqing, Stefan Roth, and Michael J. Black. "Secrets of optical flow
estimation and their principles." In Computer Vision and Pattern
Recognition (CVPR), 2010 IEEE Conference on, pp. 2432-2439. IEEE,
2010.
• Hill, Derek LG, et al. "Medical image registration." Physics in medicine and
biology 46.3 (2001): R1.
http://research.mholden.org/publications/hill2001pmb.pdf



Motion Estimation and Segmentation Database

• http://vision.middlebury.edu/flow/data/
• http://people.csail.mit.edu/celiu/motionAnnotation/

• More recent
• MPI Sintel flow dataset (animation video)
– http://sintel.is.tue.mpg.de/
• KITTI flow dataset 2015
– http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=flow



Written Assignment (1)

1) Consider the two successive frames shown below. Use the Lucas-Kanade method to determine the optical flow vector of the center pixel. To determine the horizontal and vertical gradient images, you can simply use the difference of two horizontally and vertically adjacent pixels. Your solution should show the horizontal, vertical, and temporal gradient images, the moment matrix, and the final solution. Use a 3x3 neighborhood block surrounding the center pixel.

Frame t           Frame t+1

 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10
10 10 10 10 10    10 10 10 10 10
10 10 10 10 10    10 10 10 10 10



Written Assignments (2)

2) Would you be able to derive the optical flow using the LK method, if the two frames
look like the following? Why? What if you use a block matching method?

Frame t           Frame t+1

 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10
 0  0 10 10 10     0  0  0 10 10



Written Assignments (3)

3) Consider half-pel accuracy exhaustive search for motion estimation. Assume the video frame size is MxN, the block size is NxN, and the motion search range is –R to R in both the horizontal and vertical directions. You use the sum of absolute errors as your matching criterion. What is the total number of multiplications and additions needed to estimate the motion between every two frames? You can ignore the computation for frame interpolation.
4) Consider HBMA. Suppose you use 3 resolution levels. Original image size is MxN.
You use a block size of NxN at all levels.
a) Suppose you want the effective search range to be –R to R at the original image resolution.
What should be the search range at the top level?
b) Suppose for the middle level, you use a search range of -2 to 2, and at the bottom level you use
a search range of -1 to 1. Suppose you use integer accuracy search in all levels, what will be
the complexity?



MATLAB Assignment (Optional)

1. [Wang2002] Prob. 6.12 (EBMA with integer accuracy)


2. [Wang2002] Prob. 6.13 (EBMA with half-pel accuracy)
3. [Wang2002] Prob. 6.15 (HBMA)
• Note: you can download sample video frames from the course webpage.
When applying your motion estimation algorithm, you should choose two
frames that have sufficient motion in between so that it is easy to observe
effect of motion estimation inaccuracy. If necessary, choose two frames
that are several frames apart. For example, foreman: frame 100 and frame
103.

• Try out many of the project ideas given in [Szeliski2010] 8.7.



Optional Material



Motion Estimation Through Phase Correlation

From Katsaggelos’s Coursera Course on Image Processing, Lecture on motion estimation.



What if different regions move differently?

• If we apply the FT to entire images, we may see multiple peaks
• Apply the phase correlation method at the block level, but the FT must be applied to enlarged blocks that cover both the original block and the new block positions
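A minimal global phase-correlation sketch in NumPy (the function name is mine; the block-level variant applies the same steps to enlarged, windowed blocks):

```python
import numpy as np

def phase_correlation_shift(f1, f2):
    """Estimate a global (circular) shift d such that f2 = np.roll(f1, d):
    the inverse FFT of the normalized cross-power spectrum peaks at d."""
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)
    cross = np.conj(F1) * F2
    cross /= np.abs(cross) + 1e-12        # keep only the phase
    corr = np.real(np.fft.ifft2(cross))   # near-delta at the true shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)
```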

Deformable Image Registration

• In medical applications, we often need to align two images of the same patient taken at different times, or an image and a reference (atlas).
• Find point-wise correspondence between two images, so that one can be warped to the other.
• Essentially we need to find the dense motion field between the two images.
• The registration often has to be done on a 3D volume, as corresponding pixels may not be on the same slice.

Diffeomorphism Mapping

• Diffeomorphic mapping: smooth and invertible, with a smooth inverse

From: http://campar.in.tum.de/twiki/pub/DefRegTutorial/WebHome/MICCAI_2010_Tutorial_Def_Reg_Darko.pdf
Many ways to impose regularization → different methods
From: http://campar.in.tum.de/twiki/pub/DefRegTutorial/WebHome/MICCAI_2010_Tutorial_Def_Reg_Darko.pdf

Example of Deformable Registration

From [Szeliski2010]

Demon’s Algorithms

• Basic idea
– Iteratively update the motion vector at each pixel based on the image gradient (the moving image is re-warped after each iteration)
– Smooth the updated motion field with a Gaussian kernel

Thirion, J-P. "Image matching as a diffusion process: an analogy with Maxwell's


demons." Medical image analysis 2.3 (1998): 243-260.
From: http://campar.in.tum.de/twiki/pub/DefRegTutorial/WebHome/MICCAI_2010_Tutorial_Def_Reg_Darko.pdf
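The two-step iteration above can be sketched in NumPy (a sketch of Thirion's force expression plus separable Gaussian smoothing; the names and smoothing details are mine, and this is not SimpleITK's implementation):

```python
import numpy as np

def gaussian_smooth(field, sigma=1.0):
    """Separable Gaussian smoothing of a 2-D array (np.convolve 'same' mode
    zero-pads at the borders; adequate for a sketch)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, field)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def demons_force(fixed, warped_moving):
    """Thirion's demons update: (m - f) grad(f) / (|grad f|^2 + (m - f)^2),
    computed per pixel; returns (row, column) displacement increments."""
    gy, gx = np.gradient(fixed)
    diff = warped_moving - fixed
    denom = gx ** 2 + gy ** 2 + diff ** 2
    safe = np.where(denom > 1e-12, denom, 1.0)   # avoid division by zero
    dv = np.where(denom > 1e-12, diff * gy / safe, 0.0)
    du = np.where(denom > 1e-12, diff * gx / safe, 0.0)
    return dv, du
```

One full iteration would warp the moving image by the current field, call `demons_force`, add the increments, and smooth each displacement component with `gaussian_smooth`.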

Python for Demons’ Registration Algorithm

• Package : SimpleITK
• Install : conda install -c simpleitk simpleitk=0.10.0
• Details: http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/66_Registration_Demons.html

