Motion Representation CV
Definition:
Motion parallax is a depth cue that our brain uses to perceive the distance of objects when we
move. It occurs because objects that are closer to us move faster across our field of view
compared to objects that are farther away.
Key Concepts:
1. Relative Motion:
○ When you move, objects at different distances from you appear to move at
different speeds. Closer objects seem to move quickly, while distant objects
appear to move slowly.
2. Depth Perception:
○ This difference in speed helps our brain understand the relative distance of
objects. It's a crucial depth cue, especially in environments where binocular
depth cues (like stereopsis) are not available.
How It Works:
Imagine you’re looking out of a car window while driving. As you move forward, you'll notice:
● Nearby objects, such as fence posts or road signs, sweep past your window quickly.
● Distant objects, such as mountains or the horizon, appear to drift slowly or barely move at all.
This difference in motion provides important information about the relative distances of these
objects.
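To make the speed difference concrete, here is a small Python sketch. The car speed and distances are assumed values chosen only for illustration, and it uses the approximation that an object at distance d, viewed side-on by an observer moving at speed v, crosses the visual field at roughly v / d radians per second.

```python
import math

# Assumed scenario: a car moving at 20 m/s passes a road sign 5 m away
# and a mountain 5,000 m away.
speed = 20.0  # observer speed in m/s (assumed)

for name, distance in [("road sign", 5.0), ("mountain", 5000.0)]:
    # Side-on angular speed is approximately v / d radians per second.
    angular_speed_deg = math.degrees(speed / distance)
    print(f"{name:9s}: {angular_speed_deg:8.3f} deg/s")

# The nearby sign sweeps across the view about 1,000x faster than the
# mountain; that ratio of apparent speeds is the depth cue.
```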
Everyday Examples:
1. Driving a Car:
○ Objects close to the car (like road signs) appear to move faster than distant
objects (like buildings or mountains).
2. Walking and Looking Around:
○ When you walk and look around, nearby objects like a bench or a tree shift
position faster compared to distant objects like a house or the horizon.
Optical Flow in Motion Analysis
What is Optical Flow?
Optical Flow refers to the apparent motion of objects, surfaces, or edges in a visual scene
caused by the relative motion between the observer (camera) and the scene. It is a 2D vector
field that specifies the direction and speed of motion for each pixel in an image sequence.
Key Concepts:
1. Motion Detection:
○ Optical flow detects and analyzes the movement of objects between consecutive
frames.
2. Vector Field:
○ Represents the direction and speed (velocity) of motion for each pixel in the
image.
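As a concrete sketch of this vector field, the snippet below computes dense optical flow with OpenCV's Farneback method, one standard dense estimator. The filenames are placeholders for two consecutive frames, not files referenced by these notes.

```python
import cv2

# Load two consecutive frames as grayscale (placeholder filenames).
prev_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
next_gray = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# flow is an H x W x 2 array: flow[y, x] = (dx, dy), the estimated
# motion vector for the pixel at (x, y), i.e. the 2D vector field.
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, next_gray, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

# Convert each vector to speed (magnitude) and direction (angle).
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
print("mean pixel speed:", magnitude.mean(), "px/frame")
```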
Imagine you have a video of a ball rolling across a flat surface. The goal is to determine how the
ball moves between consecutive frames using optical flow.
Step-by-Step Explanation:
1. Initial Frame:
○ You take a picture (frame) of the ball at time t.
○ Let's say the ball is at position (x1, y1) in this frame.
2. Next Frame:
○ After a short interval, you take another picture (frame) at time t + Δt.
○ Now the ball has moved to a new position (x2, y2) in this frame.
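The sketch below mirrors these two steps with OpenCV's sparse Lucas-Kanade optical flow: given the ball's position in the first frame, it estimates where that point ends up in the second. The filenames and the starting coordinates (x1, y1) are illustrative assumptions.

```python
import cv2
import numpy as np

# Two consecutive frames (placeholder filenames), loaded as grayscale.
frame_t = cv2.imread("ball_t.png", cv2.IMREAD_GRAYSCALE)
frame_t_dt = cv2.imread("ball_t_plus_dt.png", cv2.IMREAD_GRAYSCALE)

# (x1, y1): the ball's position in the first frame. Assumed known here;
# in practice it would come from a detector or manual annotation.
p0 = np.array([[[120.0, 80.0]]], dtype=np.float32)

# Lucas-Kanade estimates the point's new position (x2, y2) at t + Δt.
p1, status, err = cv2.calcOpticalFlowPyrLK(frame_t, frame_t_dt, p0, None)

if status[0][0] == 1:  # the point was tracked successfully
    (x1, y1), (x2, y2) = p0[0][0], p1[0][0]
    print(f"displacement: dx = {x2 - x1:.1f}, dy = {y2 - y1:.1f}")
```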
Feature-Based Motion Estimation:
1. Feature Detection
● Identify key points in the image where significant information is present (e.g., edges,
corners, or blobs).
● Example:
○ In a video of a car moving across a road, the edges of the car or the corners of
windows can be features.
● Common Algorithms:
○ Harris Corner Detector: Finds corners in the image.
○ SIFT (Scale-Invariant Feature Transform): Detects distinctive features at
multiple scales.
○ SURF (Speeded-Up Robust Features): A faster alternative to SIFT.
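A minimal sketch of two of the detectors named above, assuming OpenCV (SIFT needs OpenCV 4.4+, or the contrib package in older builds); the filename is a placeholder:

```python
import cv2
import numpy as np

gray = cv2.imread("car_frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder file

# Harris corner detector: a response map that peaks at corner-like pixels.
harris = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
corners = np.argwhere(harris > 0.01 * harris.max())
print(f"Harris corners: {len(corners)}")

# SIFT: keypoints plus 128-dimensional descriptors that stay stable
# under scale and rotation changes.
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(gray, None)
print(f"SIFT keypoints: {len(keypoints)}")
```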
2. Feature Matching
● Match the detected features between consecutive frames; once features are matched,
compute their movement (displacement) across frames.
● Example:
○ If the corner of a moving car's window shifts from (x1, y1) in frame 1 to (x2, y2)
in frame 2, the displacement vector is: Δx = x2 − x1, Δy = y2 − y1
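A tiny worked example of that displacement computation, with assumed coordinates for the matched window corner:

```python
import numpy as np

p_frame1 = np.array([150.0, 90.0])  # (x1, y1): corner in frame 1 (assumed)
p_frame2 = np.array([162.0, 91.0])  # (x2, y2): same corner in frame 2 (assumed)

displacement = p_frame2 - p_frame1        # (Δx, Δy) = (12, 1)
speed_px = np.linalg.norm(displacement)   # pixels moved between frames

print(f"Δx = {displacement[0]}, Δy = {displacement[1]}, |d| = {speed_px:.2f} px")
# Dividing |d| by the frame interval Δt gives speed in pixels per second.
```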
Common Matching Techniques:
1. Block Matching
● Description: Divide the image into small rectangular blocks and match each block
across frames.
● Use Case: Common in video compression (e.g., MPEG).
● Example:
○ In a football game video, you can match a small block around the ball to track its
motion.
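To illustrate the idea, here is a toy exhaustive block matcher using a sum-of-absolute-differences (SAD) cost; real codecs use heavily optimized versions of this search.

```python
import numpy as np

def match_block(block, next_frame, top_left, search_radius=8):
    """Find where `block` (taken from the previous frame at `top_left`,
    given as (row, col)) moved to in `next_frame`, by exhaustive SAD
    search within `search_radius` pixels of its old position."""
    bh, bw = block.shape
    y0, x0 = top_left
    best_sad, best_pos = np.inf, top_left
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            y, x = y0 + dy, x0 + dx
            # Skip candidate positions that fall outside the frame.
            if y < 0 or x < 0 or y + bh > next_frame.shape[0] \
                    or x + bw > next_frame.shape[1]:
                continue
            candidate = next_frame[y:y + bh, x:x + bw]
            sad = np.abs(candidate.astype(int) - block.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_pos = sad, (y, x)
    # The motion vector is best_pos minus the original top_left.
    return best_pos
```

Roughly speaking, a codec runs this search for every block of the frame and encodes the resulting motion vectors (plus small residuals) instead of raw pixels.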
2. Feature-Based Matching (SIFT/SURF)
● Description: Detect and describe image features at different scales and orientations.
These descriptors are used for robust matching even with scale and rotation changes.
● Use Case: Applications requiring robustness to scale, lighting, or viewpoint changes
(e.g., object tracking in 3D scenes).
● Example:
○ Matching landmarks (like building edges) in aerial drone footage.
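A sketch of this kind of matching across two frames, using SIFT with OpenCV's brute-force matcher and the standard ratio test (SURF is patent-encumbered, so SIFT stands in for it here); the aerial filenames are placeholders:

```python
import cv2

img1 = cv2.imread("aerial_t.png", cv2.IMREAD_GRAYSCALE)          # placeholder
img2 = cv2.imread("aerial_t_plus_dt.png", cv2.IMREAD_GRAYSCALE)  # placeholder

# Detect keypoints and compute scale/rotation-invariant descriptors.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with the ratio test: keep a match only if it is
# clearly better than the second-best candidate.
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1, des2, k=2)
good = [pair[0] for pair in matches
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance]

# Each surviving match links a landmark's position in frame 1 to its
# position in frame 2, which gives its displacement.
for m in good[:5]:
    x1, y1 = kp1[m.queryIdx].pt
    x2, y2 = kp2[m.trainIdx].pt
    print(f"({x1:.0f}, {y1:.0f}) -> ({x2:.0f}, {y2:.0f})")
```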