CV III-Unit Notes

Constrained structure and motion

• The most general algorithms for structure from motion make no prior assumptions
about the objects or scenes that they are reconstructing.
• Lines and planes can provide information complementary to interest points and also
serve as useful building blocks for 3D modeling and visualization.
• Many lines and planes in real-world scenes are either parallel or orthogonal to each other, and these regularities provide useful constraints for reconstruction.
Line-based techniques
• When lines are visible in three or more views, the trifocal tensor can be used to
transfer lines from one pair of images to another. The trifocal tensor can also be
computed on the basis of line matches alone.
• Another technique matches 2D lines based on the average of 15 × 15 pixel correlation scores evaluated at all pixels along their common line segment intersection.
• An alternative to grouping lines into coplanar subsets is to group lines by parallelism.
Whenever three or more 2D lines share a common vanishing point, there is a good
likelihood that they are parallel in 3D. By finding multiple vanishing points in an
image and establishing correspondences between such vanishing points in different
images, the relative rotations between the various images (and often the camera
intrinsics) can be directly estimated.
• Other techniques first find lines and group them by common vanishing points in each image. The vanishing points are then used to calibrate the camera, i.e., to perform a "metric upgrade". Lines corresponding to common vanishing points are then matched using both appearance and trifocal tensors. These lines are then used to infer planes and a block-structured model for the scene.
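The vanishing point shared by a set of presumed-parallel image lines can be estimated by least squares. Below is a minimal sketch, assuming each 2D line is given in homogeneous form l = (a, b, c) with ax + by + c = 0; the vanishing point v minimizes Σi (li · v)² subject to |v| = 1, which is the smallest right singular vector of the stacked line matrix. The function name and inputs are illustrative, not from any specific library.

```python
import numpy as np

def estimate_vanishing_point(lines):
    """Least squares vanishing point of a set of homogeneous 2D lines.

    lines: (N, 3) array, each row (a, b, c) with a*x + b*y + c = 0.
    Returns the homogeneous point v minimizing sum((l_i . v)^2), |v| = 1.
    """
    L = np.asarray(lines, dtype=float)
    # The smallest right singular vector of L is the best-fit common point.
    _, _, Vt = np.linalg.svd(L)
    return Vt[-1]

# Three nearly parallel image lines (slope ~1); their common point
# lies at (or near) infinity, i.e., the last coordinate is ~0.
lines = np.array([
    [1.0, -1.0, 0.0],   # y = x
    [1.0, -1.0, 5.0],   # y = x + 5
    [1.0, -1.01, 9.0],  # slightly perturbed
])
v = estimate_vanishing_point(lines)
print(v / np.linalg.norm(v))
```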
Plane-based techniques
• In scenes that are rich in planar structures, it is possible to directly estimate
homographies between different planes, using either feature-based or intensity-
based methods. In principle, this information can be used to simultaneously infer the
camera poses and the plane equations, i.e., to compute plane-based structure from
motion.
• A better approach is to hallucinate virtual point correspondences within the areas from which each homography was computed and to feed them into a standard structure from motion algorithm.
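A minimal sketch of this idea using OpenCV: estimate a homography between matched plane features, then generate virtual correspondences by mapping a regular grid of points through it. The grid spacing, RANSAC threshold, and synthetic data are illustrative choices, not prescribed by the text.

```python
import numpy as np
import cv2

# pts0, pts1: matched feature locations on one plane in two images,
# each of shape (N, 1, 2), float32 (as returned by OpenCV matchers).
# Synthetic data stands in for real matches here.
pts0 = (np.random.rand(30, 1, 2) * 640).astype(np.float32)
H_true = np.array([[1.0, 0.02, 5.0], [0.01, 1.0, -3.0], [1e-5, 0.0, 1.0]])
pts1 = cv2.perspectiveTransform(pts0, H_true.astype(np.float32))

# Robustly estimate the plane-induced homography.
H, inliers = cv2.findHomography(pts0, pts1, cv2.RANSAC, 3.0)

# "Hallucinate" virtual correspondences: sample a grid inside the
# plane region and transfer it through H. The (grid, grid1) pairs can
# then be fed into a standard structure from motion pipeline.
xs, ys = np.meshgrid(np.arange(0, 640, 40), np.arange(0, 480, 40))
grid = np.stack([xs, ys], axis=-1).reshape(-1, 1, 2).astype(np.float32)
grid1 = cv2.perspectiveTransform(grid, H)
```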
Dense Motion Estimation: Translational alignment
• The simplest way to establish an alignment between two images or image patches is to shift one image relative to the other. Given a template image I0(x) sampled at discrete pixel locations xi = (xi, yi), we wish to find where it is located in image I1(x). A least squares solution to this problem is to find the minimum of the sum of squared differences (SSD) function

E_SSD(u) = Σi [I1(xi + u) − I0(xi)]² = Σi ei²
• where u = (u, v) is the displacement and ei = I1(xi + u) − I0(xi) is called the residual
error. Here the assumption that corresponding pixel values remain the same in the
two images is often called the brightness constancy constraint.
• Color images can be processed by summing differences across all three color channels, although it is also possible to first transform the images into a different color space or to use only the luminance.
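As a concrete illustration of the SSD search just described, here is a minimal NumPy sketch that slides a template over an image and keeps the displacement with the lowest SSD. Integer-only displacements and exhaustive search are simplifying assumptions; practical implementations add sub-pixel refinement.

```python
import numpy as np

def ssd(patch0, patch1):
    """Sum of squared differences between two equally sized patches."""
    e = patch1.astype(float) - patch0.astype(float)  # residual error e_i
    return np.sum(e * e)

def align_translation(I0, I1):
    """Exhaustive search for the displacement u = (u, v) minimizing SSD.

    I0: template image; I1: larger search image.
    Returns the best (u, v) offset of I0 inside I1 and its SSD cost.
    """
    h, w = I0.shape
    H, W = I1.shape
    best_cost, best_u = np.inf, (0, 0)
    for v in range(H - h + 1):
        for u in range(W - w + 1):
            cost = ssd(I0, I1[v:v + h, u:u + w])
            if cost < best_cost:
                best_cost, best_u = cost, (u, v)
    return best_u, best_cost
```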
• Robust error metrics. We can make the above error metric more robust to outliers by replacing the squared error terms with a robust function ρ(ei):

E_SRD(u) = Σi ρ(I1(xi + u) − I0(xi)) = Σi ρ(ei)
• The robust norm ρ(e) is a function that grows less quickly than the quadratic penalty associated with least squares.
• One such function, sometimes used in motion estimation for video coding because of its speed, is the sum of absolute differences (SAD) metric

E_SAD(u) = Σi |I1(xi + u) − I0(xi)| = Σi |ei|
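Swapping the penalty only changes the per-pixel cost in the search above. As an illustrative sketch (the Huber norm here is one common robust choice, not the specific ρ of the text):

```python
import numpy as np

def sad(patch0, patch1):
    """Sum of absolute differences: rho(e) = |e|."""
    return np.sum(np.abs(patch1.astype(float) - patch0.astype(float)))

def huber_cost(patch0, patch1, delta=10.0):
    """Huber penalty: quadratic for small residuals, linear in the
    tails, so outliers contribute less than under least squares."""
    e = np.abs(patch1.astype(float) - patch0.astype(float))
    return np.sum(np.where(e <= delta,
                           0.5 * e ** 2,
                           delta * (e - 0.5 * delta)))
```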

Spatially varying weights.


• We may want to partially or completely downweight the contributions of certain pixels. For example, we may want to selectively "erase" some parts of an image from consideration when stitching a mosaic where unwanted foreground objects have been cut out. For applications such as background stabilization, we may want to downweight the middle part of the image, which often contains independently moving objects being tracked by the camera.
• All of these tasks can be accomplished by associating a spatially varying per-pixel weight with each of the two images being matched. The error metric then becomes the weighted (or windowed) SSD function

E_wSSD(u) = Σi w0(xi) w1(xi + u) [I1(xi + u) − I0(xi)]²

• where the weighting functions w0 and w1 are zero outside the image boundaries.
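A one-function sketch of the weighted SSD, assuming the weight maps are precomputed masks in [0, 1] (for example, zero over erased foreground regions):

```python
import numpy as np

def weighted_ssd(I0, I1_shifted, w0, w1_shifted):
    """Weighted (windowed) SSD between already aligned patches.

    w0, w1_shifted: per-pixel weights in [0, 1]; a zero weight removes
    that pixel (e.g., an erased or out-of-bounds region) from the error.
    """
    e = I1_shifted.astype(float) - I0.astype(float)
    return np.sum(w0 * w1_shifted * e * e)
```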
Bias and gain (exposure differences)
• Often the two images being aligned were not taken with the same exposure. A simple model of linear (affine) intensity variation between the two images is the bias and gain model

I1(x + u) = (1 + α) I0(x) + β

• where β is the bias and α is the gain. The least squares problem then becomes

E_BG(u) = Σi [I1(xi + u) − (1 + α) I0(xi) − β]²
• Note that for color images, it may be necessary to estimate a different bias and gain for each color channel to compensate for the automatic color correction performed by some digital cameras.
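Given an already aligned pair of patches, the gain and bias have a closed-form least squares solution. A minimal sketch; for color images it would be called once per channel, per the note above:

```python
import numpy as np

def estimate_bias_gain(I0, I1):
    """Least squares fit of the model I1 ≈ (1 + alpha) * I0 + beta.

    Assumes I0 and I1 are aligned patches; returns (alpha, beta).
    """
    x = I0.astype(float).ravel()
    y = I1.astype(float).ravel()
    # Design matrix [I0, 1] for the linear model y = g * x + beta.
    A = np.stack([x, np.ones_like(x)], axis=1)
    (g, beta), *_ = np.linalg.lstsq(A, y, rcond=None)
    return g - 1.0, beta  # gain is expressed as 1 + alpha
```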
Hierarchical motion estimation
• An image pyramid is constructed and a search over a smaller number of discrete pixels (corresponding to the same range of motion) is first performed at coarser levels.
• The motion estimate from one level of the pyramid is then used to initialize a smaller
local search at the next finer level. Alternatively, several seeds (good solutions) from
the coarse level can be used to initialize the fine-level search.
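A minimal coarse-to-fine sketch, assuming OpenCV's pyrDown for the pyramid; the level count and search radii are illustrative choices. A full search is run only at the coarsest level, and each finer level doubles the estimate and refines it locally.

```python
import cv2
import numpy as np

def hierarchical_align(I0, I1, levels=3, coarse_search=8, fine_search=2):
    """Coarse-to-fine estimation of a global translation from I0 to I1."""
    pyr0, pyr1 = [I0], [I1]
    for _ in range(levels - 1):
        pyr0.append(cv2.pyrDown(pyr0[-1]))
        pyr1.append(cv2.pyrDown(pyr1[-1]))

    u = np.array([0, 0])
    for lvl in range(levels - 1, -1, -1):
        A = pyr0[lvl].astype(np.float32)
        B = pyr1[lvl].astype(np.float32)
        r = coarse_search if lvl == levels - 1 else fine_search
        best_cost, best_d = np.inf, (0, 0)
        for dv in range(-r, r + 1):
            for du in range(-r, r + 1):
                # Warp B back by the candidate displacement; borders are
                # zero-filled, an acceptable approximation for a sketch.
                M = np.float32([[1, 0, -(u[0] + du)],
                                [0, 1, -(u[1] + dv)]])
                Bs = cv2.warpAffine(B, M, (B.shape[1], B.shape[0]))
                cost = np.sum((Bs - A) ** 2)
                if cost < best_cost:
                    best_cost, best_d = cost, (du, dv)
        u = u + np.array(best_d)
        if lvl > 0:
            u = u * 2  # propagate the motion estimate to the finer level
    return tuple(u)
```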

Parametric Motion
Parametric motion is used to model and analyze the movement of objects within images or video sequences. This can involve several applications and techniques:

1. Motion Estimation: One of the primary applications is to estimate the motion of objects
between consecutive frames of a video. Parametric models can describe how the position of
an object changes over time. Common methods include:
o Optical Flow: Estimates the motion of objects by analyzing the pattern of apparent
motion of brightness patterns in an image. Optical flow algorithms often use
parametric models to estimate velocity fields.

o Feature Tracking: Uses keypoints or features detected in successive frames. Parametric models, such as affine or homography transformations, are used to describe how these features move.

2. Object Tracking: In tracking, parametric motion models help predict the future position of an
object based on its past positions. Different models can be used depending on the
complexity of the motion:

o Constant Velocity Model: Assumes the object moves at a constant speed and
direction. Useful for simple tracking scenarios.

o Constant Acceleration Model: Assumes that the object's velocity changes at a constant rate. This can model more complex motions like accelerating or decelerating objects.

o Kalman Filter: A recursive algorithm that uses a parametric motion model to predict the state of a moving object and update the predictions based on new observations. It is particularly useful for tracking objects in noisy environments (a minimal constant-velocity sketch appears at the end of this section).

3. Camera Motion: Parametric models are also used to understand and compensate for the
motion of the camera itself. This is crucial in applications like:

o Visual SLAM (Simultaneous Localization and Mapping): Involves tracking the camera's movement and creating a map of the environment. Parametric motion models describe how the camera moves through space.

o Stabilization: Reduces the effect of camera shake by estimating and compensating for camera motion using parametric models.

4. 3D Reconstruction: In reconstructing 3D scenes from multiple 2D images, parametric models can help estimate the 3D motion of objects or the camera. Techniques such as structure-from-motion (SfM) use parametric motion models to relate the 3D scene structure to the observed 2D images.

5. Model-Based Tracking: When tracking specific objects, such as faces or vehicles, parametric
models can represent the shape and motion of these objects. For example, models based on
the appearance of a face (like Active Appearance Models) can be used to track facial
expressions and movements.

In summary, parametric motion models in computer vision provide a framework for understanding
and predicting the movement of objects and cameras, enabling a range of applications from tracking
and stabilization to 3D reconstruction and beyond.
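As referenced above, here is a minimal constant-velocity Kalman filter sketch in NumPy for tracking a 2D point. The noise covariances are illustrative placeholders that would be tuned per application.

```python
import numpy as np

class ConstantVelocityKalman:
    """Kalman filter with state [x, y, vx, vy] under a constant-velocity
    motion model; only positions (x, y) are observed."""

    def __init__(self, dt=1.0, process_var=1e-2, meas_var=1.0):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], float)   # motion model
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], float)   # observation model
        self.Q = process_var * np.eye(4)           # process noise
        self.R = meas_var * np.eye(2)              # measurement noise
        self.x = np.zeros(4)
        self.P = np.eye(4)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]  # predicted position

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x      # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)        # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Track a point moving right at ~2 px/frame with noisy detections.
kf = ConstantVelocityKalman()
for t in range(10):
    kf.predict()
    kf.update([2.0 * t + np.random.randn() * 0.5, 0.0])
print(kf.x)  # state approaches [~20, ~0, ~2, ~0]
```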

Spline-based Motion
Spline-based motion in computer vision refers to using spline functions to model and analyze motion
or trajectories of objects within images or video sequences. Splines are flexible mathematical
functions used to create smooth curves and surfaces, and they can be particularly useful for
describing complex, smooth motion paths.
Key Concepts:

1. Splines:

o Definition: A spline is a piecewise-defined polynomial function that can be used to approximate or interpolate data points. The most common types are cubic splines, B-splines, and Bézier curves.

o Cubic Splines: These are piecewise cubic polynomials that ensure smoothness at the
points where the pieces connect (known as knots). They are widely used in motion
modeling due to their smoothness and flexibility.

o B-Splines (Basis Splines): These provide a way to represent curves and surfaces with
a set of control points. B-splines are particularly useful for their local control and
smoothness properties.

o Bézier Curves: Defined by control points, these curves are used in graphics and
animation to model smooth paths.

2. Applications in Computer Vision:

o Object Tracking:

 Trajectory Modeling: Splines can model the trajectory of an object over time, providing a smooth path that fits the observed data points. This is useful for predicting future positions and understanding motion patterns.

 Interpolation: When tracking features or keypoints, splines can interpolate between detected positions to create a continuous motion path.

o Motion Estimation:

 Optical Flow: Splines can be used to model the flow of pixels between
frames, helping to estimate the motion field. For instance, cubic splines
might be used to model the displacement of points across a sequence of
images.

o Camera Calibration and Rectification:

 Image Warping: When correcting for lens distortion or rectifying images, splines can help model the transformation needed to align or correct the images, ensuring smooth transitions and reducing artifacts.

o 3D Reconstruction:

 Surface Reconstruction: Splines are used to reconstruct surfaces from sparse 3D points by fitting smooth curves or surfaces to the data. This can be crucial for creating detailed 3D models from multiple images.

o Animation and Synthesis:

 Path Animation: In computer graphics and vision-based animation, splines can be used to create smooth motion paths for animated objects, providing realistic and visually pleasing motion.

3. Advantages:
o Smoothness: Splines ensure smooth transitions between points, which is important
for accurately modeling continuous motion.

o Flexibility: They can represent a wide variety of shapes and motion patterns, making
them suitable for complex scenarios.

o Local Control: In B-splines, changes to control points affect only a local portion of the
curve, allowing for precise adjustments.

4. Challenges:

o Computational Complexity: While splines are powerful, they can be computationally intensive, especially when dealing with large datasets or high-dimensional spaces.

o Parameter Tuning: Choosing the right type of spline and tuning its parameters can
require careful consideration to balance smoothness with the fidelity of the
representation.

Example Workflow in Object Tracking:

1. Feature Detection: Detect keypoints or features in consecutive frames.

2. Trajectory Fitting: Use splines to fit a smooth curve through the detected keypoints,
representing the object's motion path.

3. Motion Prediction: Use the spline to predict future positions of the object, improving
tracking accuracy.

4. Motion Correction: Adjust the tracking results based on the spline model to account for any
deviations or noise.
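A minimal sketch of steps 2 and 3 of this workflow using SciPy's CubicSpline: fit a smooth trajectory through noisy keypoint detections, then evaluate it between and slightly beyond the observed frames. The synthetic data is illustrative, and the extrapolation is only a short-horizon prediction that degrades quickly.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Detected keypoint positions (one per frame), with frame index as the
# spline parameter. Synthetic data: a curved path plus detection noise.
t = np.arange(10, dtype=float)
x = 3.0 * t + np.random.randn(10) * 0.3
y = 0.5 * t ** 2 + np.random.randn(10) * 0.3

# Step 2: fit one cubic spline per coordinate.
sx, sy = CubicSpline(t, x), CubicSpline(t, y)

# Interpolate a dense, smooth motion path between detections.
t_dense = np.linspace(0, 9, 91)
path = np.stack([sx(t_dense), sy(t_dense)], axis=1)

# Step 3: short-horizon prediction by evaluating past the last frame.
print("predicted position at t=10:", sx(10.0), sy(10.0))
```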

Optical flow
• Optical flow algorithms compute an independent motion estimate for each pixel, i.e., the number of flow vectors computed is equal to the number of input pixels. The general optical flow analog to the SSD error above can thus be written as

E_SSD−OF({ui}) = Σi [I1(xi + ui) − I0(xi)]²
• It is also possible to combine ideas from local and global flow estimation into a single framework by using a locally aggregated (as opposed to single-pixel) Hessian as the brightness constancy term.
• Another extension to the basic optical flow model is to use a combination of global (parametric) and local motion models. For example, if we know that the motion is due to a camera moving in a static scene (rigid motion), we can re-formulate the problem as the estimation of a per-pixel depth along with the parameters of the global camera motion.
Assumptions:
• Brightness Constancy: The brightness of a point remains constant over time.
• Spatial Coherence: Neighboring pixels tend to have similar motion.
Applications:
• Object Tracking: Following moving objects in a scene.
• Video Compression: Reducing data by encoding motion instead of individual
frames.
• Scene Reconstruction: Understanding 3D structures from 2D video data.
• Robotics: Navigating and understanding the environment.
Algorithms:
• Lucas-Kanade Method: Assumes a constant flow in a small neighborhood and solves
for motion vectors using least squares.
• Horn-Schunck Method: A global approach that considers smoothness of the flow
field and minimizes a cost function.
• Farneback Method: A dense optical flow algorithm that computes flow at all points
using polynomial expansion.
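A minimal OpenCV sketch of two common recipes from this list: sparse Lucas-Kanade flow on detected corners, and dense Farneback flow for every pixel. It assumes two grayscale frames frame0.png and frame1.png exist on disk; filenames and parameter values are illustrative, not prescriptions.

```python
import cv2
import numpy as np

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Sparse flow (Lucas-Kanade): track corner features between frames.
pts0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01,
                               minDistance=7)
pts1, status, err = cv2.calcOpticalFlowPyrLK(prev, curr, pts0, None)
tracked = pts1[status.ravel() == 1]  # keep successfully tracked points

# Dense flow (Farneback): one (dx, dy) vector per pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    pyr_scale=0.5, levels=3, winsize=15,
                                    iterations=3, poly_n=5, poly_sigma=1.2,
                                    flags=0)
print(flow.shape)  # (H, W, 2)
```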
Types:
• Dense Optical Flow: Estimates flow for every pixel in the image, providing a
comprehensive view of motion.
• Sparse Optical Flow: Estimates flow for selected key points, often used for tracking
specific objects.
Challenges
• Occlusion: When objects overlap, the flow can become ambiguous.
• Lighting Changes: Variations in lighting can affect the brightness constancy
assumption.
• Large Motion: Rapid movements can lead to errors if the flow exceeds the assumption of small displacements.

Layered Motion
• In many situations, visual motion is caused by the movement of a small number of
objects at different depths in the scene. In such situations, the pixel motions can be
described more succinctly (and estimated more reliably) if pixels are grouped into
appropriate objects or layers.
• Layered motion representations not only lead to compact representations but also exploit the information available in multiple video frames and accurately model the appearance of pixels near motion discontinuities. This makes them particularly suited as a representation for image-based rendering.
• To compute a layered representation of a video sequence, first estimate affine motion models over a collection of non-overlapping patches and then cluster these estimates using k-means (see the sketch at the end of this section). The algorithm then alternates between assigning pixels to layers and recomputing motion estimates for each layer using the assigned pixels.
• Once the parametric motions and pixel-wise layer assignments have been computed for each frame independently, layers are constructed by warping and merging the various layer pieces from all of the frames together. Median filtering is used to produce sharp composite layers that are robust to small intensity variations, as well as to infer occlusion relationships between the layers.
• Typical results show both the initial and final layer assignments for one of the frames, as well as the composite flow and the alpha-matted layers with their corresponding flow vectors overlaid.
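As referenced above, here is a sketch of the initialization stage: fit an affine motion model to the dense flow inside each non-overlapping patch, then cluster the parameter vectors with k-means. It assumes a dense flow field is already available (e.g., from the Farneback call earlier) and uses scikit-learn's KMeans; both choices are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def affine_params(flow_patch, ys, xs):
    """Least squares affine model u(x, y) = A [x, y, 1] for one patch.

    flow_patch: (h, w, 2) flow vectors; ys, xs: pixel coordinates.
    Returns the 6 affine parameters (3 per flow component).
    """
    X = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)], axis=1)
    au, *_ = np.linalg.lstsq(X, flow_patch[..., 0].ravel(), rcond=None)
    av, *_ = np.linalg.lstsq(X, flow_patch[..., 1].ravel(), rcond=None)
    return np.concatenate([au, av])

def layer_init(flow, patch=16, n_layers=3):
    """Per-patch affine parameters, clustered into candidate layers."""
    H, W = flow.shape[:2]
    params = []
    for y0 in range(0, H - patch + 1, patch):
        for x0 in range(0, W - patch + 1, patch):
            ys, xs = np.mgrid[y0:y0 + patch, x0:x0 + patch]
            params.append(affine_params(
                flow[y0:y0 + patch, x0:x0 + patch], ys, xs))
    params = np.array(params)
    labels = KMeans(n_clusters=n_layers, n_init=10).fit_predict(params)
    return params, labels  # one layer hypothesis label per patch
```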
Frame interpolation
• Frame interpolation is a widely used application of motion estimation, often
implemented in hardware to match an incoming video to a monitor’s actual refresh
rate, where information in novel in-between frames needs to be interpolated from
preceding and subsequent frames. The best results can be obtained if an accurate
motion estimate can be computed at each unknown pixel’s location.
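A minimal sketch of flow-based frame interpolation: estimate flow between two frames, then warp each halfway toward the midpoint and blend. Occlusion handling and forward splatting are omitted; this simple backward-warp version assumes the flow field is roughly valid at the in-between positions.

```python
import cv2
import numpy as np

def interpolate_midframe(f0, f1):
    """Synthesize the frame halfway between grayscale frames f0 and f1."""
    flow = cv2.calcOpticalFlowFarneback(f0, f1, None, 0.5, 3, 15, 3, 5,
                                        1.2, 0)
    H, W = f0.shape
    xs, ys = np.meshgrid(np.arange(W, dtype=np.float32),
                         np.arange(H, dtype=np.float32))
    # Backward-warp each input frame halfway along the flow, then blend.
    h0 = cv2.remap(f1, xs + 0.5 * flow[..., 0], ys + 0.5 * flow[..., 1],
                   cv2.INTER_LINEAR)
    h1 = cv2.remap(f0, xs - 0.5 * flow[..., 0], ys - 0.5 * flow[..., 1],
                   cv2.INTER_LINEAR)
    return (0.5 * h0.astype(float) + 0.5 * h1.astype(float)).astype(f0.dtype)
```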
Transparent layers and reflections
• A special case of layered motion that occurs quite often is transparent motion, which
is usually caused by reflections seen in windows and picture frames.
• If the motions of the individual layers are known, the recovery of the individual layers is a simple constrained least squares problem, where the individual layer images are constrained to be positive and saturated pixels provide an inequality constraint on the summed values.
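A toy 1D sketch of this constrained least squares recovery, assuming one static layer, one layer translating by a known integer shift per frame (circular shifts for simplicity), and additive composition; SciPy's lsq_linear enforces the positivity constraint.

```python
import numpy as np
from scipy.optimize import lsq_linear

def recover_layers(frames, shifts):
    """Recover two additive 1D layers from composite frames.

    frames: (K, N) observed composites I_k = L1 + shift(L2, u_k).
    shifts: known integer shift of layer 2 in each frame.
    Solves min ||A [L1; L2] - b||^2 subject to layers >= 0.
    Note: the layers are only determined up to a constant offset traded
    between them; lsq_linear returns one feasible solution.
    """
    K, N = frames.shape
    A = np.zeros((K * N, 2 * N))
    for k, u in enumerate(shifts):
        for i in range(N):
            A[k * N + i, i] = 1.0                  # L1 contribution
            A[k * N + i, N + (i - u) % N] = 1.0    # circularly shifted L2
    res = lsq_linear(A, frames.ravel(), bounds=(0, np.inf))
    return res.x[:N], res.x[N:]

# Synthetic test: composite two random nonnegative layers and recover them.
rng = np.random.default_rng(0)
L1, L2 = rng.random(32), rng.random(32)
shifts = [0, 3, 7, 11]
frames = np.array([L1 + np.roll(L2, u) for u in shifts])
r1, r2 = recover_layers(frames, shifts)
```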
