Lecture 9.1 Motion & Video Analysis in Computer Vision 2025

Motion analysis involves techniques to detect, track, and interpret movement in video sequences, focusing on object dynamics and trajectory estimation. Key methods for motion detection include background subtraction, feature matching, template matching, and optical flow, each with specific applications and limitations. Advanced techniques like Gaussian Mixture Models and moving median approaches enhance detection robustness against environmental changes and noise.


CPCS432

Lecture 9.1
Motion & Video Analysis in Computer Vision
Applying Deep Learning Algorithms for Motion & Video Analysis

Dr. Arwa Basbrain


Motion Analysis

What is motion analysis?


Motion analysis refers to the set of techniques used to detect, track, and interpret movement in a sequence of images or video frames. It involves understanding the motion dynamics, estimating trajectories, and making predictions about object movements. The core goal is to extract information about the object's velocity and direction, and sometimes even its intention or interaction with other objects.

Motion Analysis

Types of Motion in Visual Scenes:

- Global Motion: Movement that affects the entire scene, often due to camera motion (e.g., panning, tilting, rotating). Global motion is typically consistent across all pixels and is represented by a coherent flow in one direction.

- Local Motion: The movement of individual objects or parts of objects within the scene. This movement is independent of global motion and is often the focus when tracking specific objects or analysing their behaviours.



Motion Detection

Motion detection involves identifying moving objects in a scene, typically from a sequence of images or video frames. Motion analysis builds upon this by determining the characteristics of movement, such as direction, speed, and trajectory. Together, these concepts allow computers to interpret dynamic scenes, making them essential for real-time applications.

There are several ways to detect motion in a video clip:


1. Background Subtraction
2. Feature Matching
3. Template Matching
4. Optical Flow

https://ijssaggu.github.io/mog/
Motion detection techniques
Background Subtraction
• Background subtraction is a method where a static background image is subtracted from each frame to isolate moving objects; it is effective when the background remains unchanged.
• This method can be implemented with basic image processing techniques.

https://www.coursera.org/learn/object-tracking-and-motion-computer-vision/lecture/MkMFh/detecting-motion
Motion detection techniques
Background Subtraction
Simple approach:
1. Estimate background for time t
2. Subtract estimated background from current input frame
3. Apply a threshold to the absolute difference to get the foreground mask
But of course, the question is: what's a good estimate of the background in a video sequence? The simplest choice assumes the background is whatever was in the previous frame, as in the sketch below.
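A minimal sketch of these three steps in Python with OpenCV, using the previous frame as the background estimate (the video filename and the threshold value of 25 are illustrative assumptions, not values from the lecture):

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")  # illustrative filename
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Steps 1-3: previous frame as the background estimate, subtract,
    # then threshold the absolute difference to get the foreground mask.
    diff = cv2.absdiff(gray, prev_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    cv2.imshow("foreground mask", mask)
    prev_gray = gray
    if cv2.waitKey(30) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```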

https://www.coursera.org/learn/object-tracking-and-motion-computer-vision/lecture/MkMFh/detecting-motion
Motion detection techniques
Background Subtraction (Frame Difference)
Frame Difference
Frame differencing is a basic method for detecting motion by comparing the current frame to the previous frame in a
video sequence.
• This technique isn’t true background subtraction; instead, it assumes the background is whatever was in the frame
immediately before. By subtracting (or differencing) the current image from the last one, frame differencing
highlights areas where there is movement, making it effective for capturing recent changes.
• This method is simple and useful for identifying motion in real-time applications, though it’s limited when dealing
with gradual or slow movements.
- Pixel values in the current frame are compared with those in the previous frame to identify differences. This provides a straightforward method for detecting motion or changes between consecutive frames.
- To determine whether a difference is substantial, a threshold is set: a pixel is considered changed if the difference in its value exceeds this threshold.
- The threshold is usually chosen based on the specific application and the level of sensitivity required.



Motion detection techniques
Background Subtraction (Frame Difference)

(Figure slides: frame-difference examples.)


Motion detection techniques
Background Subtraction (Frame Difference)

- When significant changes occur in the video sequence, such as moving objects like cars, these changes are
detected as motion.
- This is useful for applications like video surveillance or tracking moving objects.
- To make motion detection more robust, more sophisticated techniques can be used.
- For example, creating a background model based on the average of the first K frames helps filter out static
elements and provides a better reference for detecting real changes.
Limitations
- It tends to detect any change, including background noise, flickering, or minor variations in lighting.
- This can lead to false positives, such as leaves waving in the background being detected as motion.
- In practice, simple frame differencing can be a quick and easy way to detect motion, but it may require
additional processing and filtering to reduce false alarms and improve accuracy, especially in complex video
scenes.
Motion detection techniques
Background Subtraction (Frame Difference)

Background is estimated to be:

- Previous frame
- Histogram
- Average of previous K frames
- Median of previous K frames
- Moving median of previous K frames
- Gaussian Mixture Model (GMM)



Motion detection techniques
Background Subtraction (Frame Difference)
Frame Difference (Average)

- A background modelling technique using the average of the first K frames in a video sequence as the background image.
- This technique provides a better reference for detecting changes in subsequent frames than the simple frame difference
approach mentioned earlier.
- The first step in this approach is to compute a background image by averaging the pixel values over the first K frames of the video sequence. The resulting image represents the static background scene without moving objects or changes (see the sketch below).
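A sketch of this averaging step, assuming OpenCV and NumPy (K = 30 and the filename are illustrative assumptions):

```python
import cv2
import numpy as np

K = 30  # number of frames used to build the background model
cap = cv2.VideoCapture("traffic.mp4")

frames = []
for _ in range(K):
    ok, frame = cap.read()
    if not ok:
        break
    frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32))

# Mean over the frame axis gives the static background estimate;
# subsequent frames are compared against it pixel by pixel.
background = np.mean(np.stack(frames), axis=0).astype(np.uint8)
```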

Ref: Lecture 11: Object Tracking Sejong University


Motion detection techniques
Background Subtraction (Frame Difference)

- In the subsequent frames, each pixel's value is compared to the corresponding pixel in the background image. The idea
is to identify any substantial differences, which could indicate the presence of moving objects or changes in the scene.
Limitations of Average Background
- Sensitivity to Changes: While using the average background improves the simple frame difference technique, it is still
limited. It can be sensitive to changes in lighting, shadows, or variations in the scene. For example, if leaves are
waving in the background or other minor changes in lighting or the environment, these changes may be detected as
motion, leading to false positives.
- Static Pixel Model: The approach uses a static model for each pixel, based on the initial average. This means that any significant change in lighting or scene elements is never incorporated into the background model, so it cannot adapt over time.

Ref: Lecture 11: Object Tracking Sejong University


Motion detection techniques
Background Subtraction (Frame Difference)
Frame Difference (Median)

- The median is more robust because it represents the middle value of a data set, making it less sensitive to extreme values.
- Using the median as the background model can help mitigate the impact of outliers, such as changes in lighting or abrupt
variations in pixel values.
- In this technique, the first K frames of the video are used to compute the median value at each pixel.
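Under the same assumptions as the averaging sketch, only the statistic changes (this snippet reuses the `frames` list collected there):

```python
import numpy as np

# Median over the first K frames: the middle value at each pixel is
# less sensitive to outliers such as briefly passing objects.
background = np.median(np.stack(frames), axis=0).astype(np.uint8)
```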

Ref: Lecture 11: Object Tracking Sejong University


Motion detection techniques
Background Subtraction (Frame Difference)
Frame Difference (Median)

- The median is chosen because it represents the middle value in a dataset. Unlike the average, which can be sensitive to
outliers and extreme values, the median is more robust.
- It is less affected by individual pixel value variations and can better capture the central tendency of the background.
- Compared to the average, the median background modelling approach offers more stability. It is less prone to false
positives caused by minor variations in pixel values, changes in lighting, or waving leaves.
- Despite the improvement provided by the median, an ideal background model should be adaptive.
- An adaptive model has memory and can adapt to changing scenes over time. This adaptability is important for accurately
capturing and distinguishing background and foreground elements.
- In dynamic environments, where lighting, objects, or environmental conditions may vary, an adaptive model can provide
better performance.

Ref: Lecture 11: Object Tracking Sejong University


Motion detection techniques
Background Subtraction (Frame Difference)
Frame Difference (Moving Median)

- Moving Median is designed to create a more adaptive and robust background model for video processing.
- Unlike the initial background modelling approach that used the median of the first few frames, the moving median
computes the median using the most recent frames, allowing the background model to adapt slowly to changes in the
scene.
- The moving median technique is more adaptive compared to the static median. It can adjust to variations in the scene,
such as moving objects, changes in lighting, and other factors that affect the background. This adaptability helps
reduce false positives.
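A sliding-window sketch of the moving median, assuming OpenCV and NumPy (the window size K = 30, filename, and threshold are illustrative):

```python
import collections
import cv2
import numpy as np

K = 30
window = collections.deque(maxlen=K)  # keeps only the most recent K frames

cap = cv2.VideoCapture("traffic.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    window.append(gray.astype(np.float32))
    # Median over the most recent K frames: the model slowly adapts
    # to lighting changes and other variations in the scene.
    background = np.median(np.stack(window), axis=0).astype(np.uint8)
    _, mask = cv2.threshold(cv2.absdiff(gray, background), 25, 255,
                            cv2.THRESH_BINARY)
```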

(Figure: the median is taken over the most recent K frames.)
Ref: Lecture 11: Object Tracking Sejong University
Motion detection techniques
Background Subtraction (Frame Difference)
Gaussian Mixture Model (GMM)

- Simple change detection algorithms, like using the average or median can effectively detect changes in
pixel values over time.
- However, they may not be very resilient to uninteresting changes, such as changes caused by raindrops
or noise.
- To handle complex scenes and distinguish between interesting and uninteresting changes, more
sophisticated models are required.
- Gaussian mixture models (GMM) can be used to model the variation of intensities of colours at each
pixel in the image.
- An example scenario is counting passing cars or identifying cars violating traffic rules in a street scene. Complicating factors such as raindrops on a window, snow, and bad weather add complexity to the scene.
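In practice this per-pixel mixture model is available off the shelf; a minimal sketch using OpenCV's MOG2 background subtractor (the parameter values are illustrative, not from the lecture):

```python
import cv2

# MOG2 maintains a Gaussian mixture per pixel and updates it online,
# so the model adapts to gradual changes such as lighting.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                varThreshold=16,
                                                detectShadows=True)

cap = cv2.VideoCapture("traffic.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)  # 255 = foreground, 127 = shadow
```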
Ref: Lecture 11: Object Tracking Sejong University
Motion detection techniques
Background Subtraction (Frame Difference)

- Focus on the representation of a single pixel within the red window and monitor it over a certain period of time. This
implies tracking changes in the pixel's intensity values to detect interesting events.
- The histogram contains several peaks, and these variations in intensity can be attributed to three main factors:
- Static Background (Road): The intensity variations due to the static background, which may change over time due to
factors like illumination changes.
- Noise: Variations in intensity caused by noise, including image noise and fluctuations due to snow passing through the
pixel.
- Moving Objects (e.g., Cars): Occasional moving objects (like cars) pass through the pixel, resulting in distinctive peaks in the histogram.

Ref: Lecture 11: Object Tracking Sejong University


Motion detection techniques
Background Subtraction (Frame Difference)

(Figure: foreground masks for the same video using the previous frame, average of previous frames, median of previous frames, moving median of previous frames, and Gaussian Mixture Model (GMM) as the background estimate.)



Motion detection techniques
Feature Matching
Rather than focusing on every pixel, these methods track specific features that are more
likely to remain consistent over time.
It works similarly to image registration. You detect and extract features from an object
in one frame, then match those features in later frames. By doing so, the translation and
rotation of an object can be computed.

Examples include:
• SIFT (Scale-Invariant Feature Transform) and SURF (Speeded-Up Robust
Features): These are methods for detecting and matching key points in consecutive
frames, useful for tracking rigid or deformable objects.
• KLT (Kanade-Lucas-Tomasi) Tracker: A popular algorithm in object tracking that
selects and follows feature points with strong gradients over a sequence of frames.
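A feature-matching sketch between two consecutive frames, assuming OpenCV (ORB is used here as a freely available stand-in for SIFT/SURF; the file names are illustrative):

```python
import cv2

frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
frame2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors in both frames.
orb = cv2.ORB_create(nfeatures=500)
kp1, des1 = orb.detectAndCompute(frame1, None)
kp2, des2 = orb.detectAndCompute(frame2, None)

# Brute-force matching with cross-checking for more reliable pairs.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

# The displacement of matched keypoints approximates object motion.
for m in matches[:20]:
    (x1, y1) = kp1[m.queryIdx].pt
    (x2, y2) = kp2[m.trainIdx].pt
    print(f"({x1:.0f},{y1:.0f}) -> ({x2:.0f},{y2:.0f})")
```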
https://theailearner.com/2020/12/12/template-matching-using-opencv/
Motion detection techniques
Template Matching
In template matching, you select a portion of an image and search the following frames for that pattern of pixels. This method is especially useful for stabilizing jittery video, where orientation and lighting are consistent between frames. You determine the motion by keeping track of the template's location in each frame.
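A sketch using OpenCV's template matching (file names are illustrative; TM_CCOEFF_NORMED is one of several available similarity measures):

```python
import cv2

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

# Slide the template over the frame and score every location.
result = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

# The best-scoring location is the template's position in this frame;
# tracking it across frames gives the motion.
h, w = template.shape
print(f"best match at {max_loc}, score {max_val:.2f}, box {w}x{h}")
```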

https://www.coursera.org/learn/object-tracking-and-motion-computer-vision/lecture/MkMFh/detecting-motion
Motion detection techniques
Optical Flow
Optical flow is a powerful technique to determine motion. It uses the differences between subsequent video frames
and the gradient of those frames to estimate a velocity vector for every pixel.
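A dense optical-flow sketch using OpenCV's Farnebäck implementation (the parameter values follow common OpenCV examples and are illustrative):

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")  # illustrative filename
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # One (dx, dy) velocity vector per pixel, estimated from the
    # intensity differences and gradients of the two frames.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        pyr_scale=0.5, levels=3,
                                        winsize=15, iterations=3,
                                        poly_n=5, poly_sigma=1.2, flags=0)
    magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    prev_gray = gray
```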

Motion detection techniques
Optical Flow
• Thus, you don't need to first identify an object or a static background. You can then annotate the video by adding the velocity vectors to each frame.
• Objects moving right or left are easy to distinguish by the large arrows pointing in the direction of motion.
• Velocity arrows indicating motion towards or away from the camera are less obvious: objects moving away from the camera have outlines that shrink, so their edge velocities converge, while objects moving towards the camera have outlines that grow, so their edge velocities diverge.
Motion detection techniques
Optical Flow
• There is a key constraint with optical flow.
- The illumination of the scene must be approximately constant, because optical flow uses the difference in pixel intensities between frames; a shadow or change in lighting could appear as motion.
- This affects the other approaches to motion detection as well, but it is still possible to match features or a template with some changes in illumination.
Motion Estimation

Motion estimation is the process of determining the Motion Vectors (MV) of objects or regions in a scene between successive frames. A motion vector represents the change in the position of a pixel or an object between frames, describing both the magnitude and direction of movement. These motion vectors form a motion field that helps to describe the overall dynamics of a scene.

Motion Estimation
General Methodologies in Motion Estimation

Two categories of approaches:

- Feature-based: find corresponding features in two different images, then derive the entire motion field from the motion vectors at those corresponding features.
  – More often used for estimating global motion between the two images due to camera motion (or view-angle difference).
- Intensity-based: directly find the MV at every pixel, or the parameters that characterize the motion field, based on the constant-intensity assumption.
  – More often used for motion-compensated prediction and filtering, as required in video coding and frame interpolation.



Motion Estimation
General Methodologies in Motion Estimation

Three Problems in Motion Estimation


• How to represent the motion field?
• What criteria to use to estimate motion parameters?
• How to search motion parameters?

Ref: Yao Wang, Tandon School of Engineering, New York University, 2021. ECE-GY 6123: Image and Video Processing
Motion Estimation
Motion Representation

- Global: The entire motion field is represented by a few global parameters (affine, homography).
- Pixel-based: One MV at each pixel, with some smoothness constraint between adjacent MVs. Representation: an MV for each pixel.
- Block-based: The entire frame is divided into blocks, and the motion in each block is characterized by a few parameters (e.g., a constant MV). Representation: an MV for each block.
- Region-based: The entire frame is divided into regions, each region corresponding to an object or sub-object with consistent motion, represented by a few parameters. Representation: motion parameters for each region.
Ref: Yao Wang, Tandon School of Engineering, New York University, 2021. ECE-GY 6123: Image and Video Processing
Motion Estimation
Motion Representation

- Mesh-based: Cover an image with a mesh (triangular, rectangular, or irregular), define MVs at the mesh nodes, and interpolate the MVs at other pixels from the nodal MVs. The mesh is also known as a control grid, and since the interpolation is done with splines, this is also known as a spline-based method. It is mostly used for deformable registration in medical images.

Ref: Yao Wang, Tandon School of Engineering, New York University, 2021. ECE-GY 6123: Image and Video Processing
Motion Estimation
Motion Estimation Criterion

A motion estimation criterion is the mathematical and algorithmic framework used to evaluate and optimize the estimation of motion vectors between consecutive frames of a video or image sequence. These criteria form the basis for selecting the most accurate motion vectors, minimizing errors while capturing the actual displacement of objects or pixels.

Goals of Motion Estimation Criteria
– Accuracy: Ensure that the estimated motion vector matches the true motion as closely as possible.
– Efficiency: Achieve the goal with minimal computational resources, especially for real-time applications.
– Robustness: Handle variations in lighting, noise, occlusions, and complex motion patterns.
– Compactness: Enable efficient storage and transmission, especially in video compression.

Ref: Yao Wang, Tandon School of Engineering, New York University, 2021. ECE-GY 6123: Image and Video Processing
Motion Estimation
Motion Estimation Criterion
Common motion estimation criteria

A. Matching Criteria evaluate the similarity between blocks or pixels in two frames to find the best correspondence.

1. Sum of Absolute Differences (SAD): Measures the absolute difference in pixel intensities between a block B in the current frame and a candidate block, displaced by (u, v), in the reference frame.
Formula: $\mathrm{SAD}(u,v) = \sum_{(x,y)\in B} \left| I_t(x,y) - I_{t-1}(x+u,\, y+v) \right|$
Advantages: Simple to compute. Works well for translational motion.
Disadvantages: Sensitive to lighting variations and noise.

2. Sum of Squared Differences (SSD): Squares the differences between pixel intensities, emphasizing larger errors.
Formula: $\mathrm{SSD}(u,v) = \sum_{(x,y)\in B} \left( I_t(x,y) - I_{t-1}(x+u,\, y+v) \right)^2$
Advantages: Penalizes larger intensity differences more than SAD. Reduces the effect of minor errors.
Disadvantages: Computationally heavier than SAD.

3. Normalized Cross-Correlation (NCC): Measures the correlation between the pixel intensities of the blocks.
Formula: $\mathrm{NCC}(u,v) = \dfrac{\sum_{(x,y)\in B} \big(I_t(x,y) - \bar{I}_t\big)\big(I_{t-1}(x+u,y+v) - \bar{I}_{t-1}\big)}{\sqrt{\sum_{(x,y)\in B} \big(I_t(x,y) - \bar{I}_t\big)^2 \,\sum_{(x,y)\in B} \big(I_{t-1}(x+u,y+v) - \bar{I}_{t-1}\big)^2}}$
Advantages: Invariant to global intensity variations (lighting changes).
Disadvantages: Computationally expensive due to normalization.
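A sketch of SAD-based matching over a full search window, assuming NumPy (the block size, search range, and function name are illustrative):

```python
import numpy as np

def best_motion_vector(cur, ref, x, y, block=16, search=8):
    """Full-search block matching with the SAD criterion.

    Returns the displacement (u, v) of the candidate block in `ref`
    that best matches the block of `cur` with top-left corner (x, y).
    """
    cur_block = cur[y:y + block, x:x + block].astype(np.int32)
    best_sad, best_uv = None, (0, 0)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            ys, xs = y + v, x + u
            if (ys < 0 or xs < 0 or
                    ys + block > ref.shape[0] or xs + block > ref.shape[1]):
                continue  # candidate block falls outside the frame
            cand = ref[ys:ys + block, xs:xs + block].astype(np.int32)
            sad = int(np.abs(cur_block - cand).sum())
            if best_sad is None or sad < best_sad:
                best_sad, best_uv = sad, (u, v)
    return best_uv
```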
Motion Estimation
Motion Estimation Criterion
Common motion estimation criteria

B. Gradient-Based Criteria rely on optical flow principles and use image gradients to estimate motion.

1. Optical Flow Constraint (Brightness Constancy): Assumes that pixel intensity remains constant between frames:
$I(x, y, t) = I(x + u,\, y + v,\, t + 1)$
A first-order Taylor series expansion yields:
$I_x u + I_y v + I_t = 0$
where $I_x, I_y$ are the spatial gradients, $I_t$ is the temporal gradient, and $(u, v)$ is the motion vector being solved for.

2. Horn–Schunck Method: Adds a smoothness constraint to ensure neighbouring pixels have similar motion, minimizing
$E = \iint \left[ (I_x u + I_y v + I_t)^2 + \alpha^2 \left( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \right) \right] dx\, dy$
Advantages: Provides a dense motion field.
Disadvantages: Computationally intensive.
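A common way to solve the constraint for $(u, v)$ is the Lucas-Kanade method, which assumes constant motion within a small window around each feature; a sparse sketch using OpenCV's pyramidal implementation (the filename and parameter values are illustrative):

```python
import cv2

cap = cv2.VideoCapture("traffic.mp4")
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Pick corners with strong gradients, as in the KLT tracker.
points = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                 qualityLevel=0.3, minDistance=7)

ok, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Solves I_x*u + I_y*v + I_t = 0 over a window at each point.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                 points, None)
motion_vectors = (next_pts - points)[status.ravel() == 1]
```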
Motion Estimation
Motion Estimation Criterion

Common motion estimation criteria

C. Block Matching Algorithm (BMA): A popular approach in video compression standards like MPEG and H.264. It evaluates candidate motion vectors by comparing blocks in the current frame to those in the reference frame, using matching criteria such as SAD or SSD.



Motion Estimation
How to search motion parameters?


Search Patterns:
• Full Search: Compares all possible motion vectors in the search window (high accuracy but computationally expensive; see the sketch after this list).
• Diamond Search: Reduces search computations by focusing on high-similarity regions.
• Hierarchical Search: Uses a coarse-to-fine approach for efficiency.
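For illustration, a full search can reuse the earlier SAD block-matching sketch to build a block-level motion field (`best_motion_vector` is the illustrative helper defined above):

```python
import numpy as np

def motion_field(cur, ref, block=16, search=8):
    # One MV per block: exhaustive search over the whole window.
    h, w = cur.shape
    field = np.zeros((h // block, w // block, 2), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            field[by, bx] = best_motion_vector(cur, ref,
                                               bx * block, by * block,
                                               block=block, search=search)
    return field
```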



Motion Estimation
Deep Learning in Motion Estimation
Modern deep learning approaches have transformed motion estimation, introducing new methods that
overcome many of the limitations faced by traditional techniques.

1-Optical Flow Networks: Networks like FlowNet and PWC-Net use CNNs trained on large datasets to
estimate optical flow directly, achieving high accuracy and robustness to noise and complex motions.
These networks perform dense motion estimation by learning from real-world scenarios, making them
highly effective for applications like autonomous driving.
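As one concrete illustration, torchvision ships RAFT, a modern deep optical-flow network in the same family as FlowNet and PWC-Net; a minimal sketch (the tensors here are placeholders, and real frames should be preprocessed with the weights' transforms):

```python
import torch
from torchvision.models.optical_flow import raft_small, Raft_Small_Weights

weights = Raft_Small_Weights.DEFAULT
model = raft_small(weights=weights).eval()

# Placeholder frame pair: (N, 3, H, W), with H and W divisible by 8.
img1 = torch.randn(1, 3, 360, 640)
img2 = torch.randn(1, 3, 360, 640)

with torch.no_grad():
    # RAFT returns a list of iteratively refined flow estimates.
    flow_predictions = model(img1, img2)
flow = flow_predictions[-1]  # final dense flow field, (N, 2, H, W)
```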

2-Recurrent Neural Networks (RNNs): RNNs and LSTMs are used to capture temporal dependencies,
allowing better predictions of motion patterns over time. They are particularly useful for sequential data
and can help improve the accuracy of motion predictions by learning historical patterns.

3-Attention Mechanisms and Transformers: Attention-based networks can focus on specific parts of a
scene, enabling accurate motion estimation even in crowded or cluttered environments. These
mechanisms help networks prioritize areas of interest, which can improve efficiency and accuracy in
tracking applications.
Motion Estimation
Applications

Applications of Motion Estimation


• Video Compression: Motion estimation criteria optimize the encoding of motion vectors and residual
errors to achieve efficient compression.
• Object Tracking: Matching and gradient-based criteria are used to track objects across frames in surveillance and autonomous systems.
• 3D Reconstruction: Gradient-based methods like optical flow are key in extracting 3D scene
structure from motion.
• Video Stabilization: Matching criteria are employed to correct camera motion and stabilize video
sequences.



Motion Estimation
Challenges
Motion estimation is a complex problem with several challenges:
• Occlusion Handling: When objects in the foreground occlude background objects, estimating the motion vectors for
those occluded areas becomes difficult.
• Aperture Problem: The aperture problem occurs when the motion of an edge or a small region of an object is
ambiguous. For instance, only motion perpendicular to the edge is visible, making it challenging to estimate the
complete motion vector.
• Lighting and Environmental Changes: Variations in lighting, shadows, or reflections can create misleading motion
estimates, as brightness constancy may no longer hold.
• Complex Motions and Non-Rigid Deformations: Non-rigid objects, like people, can exhibit complex, non-linear
movements that do not align well with rigid body models, complicating the motion estimation process.
• Real-Time Constraints: Applications like video streaming, autonomous driving, and surveillance demand high-speed,
real-time motion estimation, placing pressure on both algorithm efficiency and computational resources.
