CNN For Object Tracking

The document discusses the use of Convolutional Neural Networks (CNNs) for object tracking, focusing on various approaches including Siamese networks and their evolution. It outlines the challenges in object tracking, such as appearance variations and the need for effective classifiers, and presents methods like fully-convolutional Siamese networks and region proposal networks for improved tracking performance. The document concludes with advancements in Siamese networks that enhance localization and efficiency in tracking tasks.

Uploaded by

Đặng Minh Hoàng


CNNs for Object Tracking

Seunghoon Hong
Course logistics (1)
● Assignment 2 is out
○ Deadline: Midnight June 7th
Course logistics (2)
● The instructions for the paper presentation will be released by the end of today
● Please read the instructions VERY CAREFULLY before you start
● Important deadlines (applied strictly; no late submission)
○ May 29: Paper bidding
○ June 7: Prepare presentation video and quiz
○ June 12: Watch presentations and solve quizzes
Recap: approaches in single object tracking
● Probabilistic tracking
○ Formulate the localization task as a sequential probabilistic inference problem
○ Given a probability of the initial target location, propagate it over the remaining frames

● Discriminative tracking
○ Classify the object from the distractors at every frame
○ Can be considered as sequential binary object detection (class = target, background)
Recap: Probabilistic tracking
● Sequential Bayesian filtering via Markov Chain Monte Carlo sampling
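The recursion behind sequential Bayesian filtering can be sketched as below. The symbols are assumptions introduced for this sketch and are not the lecture's notation: s_t is the target state (e.g., location) at frame t, and o_{1:t} are the observations up to frame t.

```latex
p(s_t \mid o_{1:t}) \;\propto\; p(o_t \mid s_t) \int p(s_t \mid s_{t-1})\, p(s_{t-1} \mid o_{1:t-1})\, \mathrm{d}s_{t-1}
```

Here p(o_t | s_t) is the observation likelihood and p(s_t | s_{t-1}) is the motion model. Sampling methods such as MCMC approximate the integral with a finite set of samples because it is rarely available in closed form.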
Recap: Discriminative tracking
Recap: Correlation filtering for discriminative tracking
● Solving a ridge regression via circulant matrices and discrete Fourier transform
○ Ridge regression has a closed-form solution
○ If X is a circulant matrix, the computation is extremely efficient because it reduces to
element-wise multiplication in the Fourier domain
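The equivalence between the closed-form ridge solution and its Fourier-domain shortcut can be checked numerically. The following is a minimal sketch in one dimension, assuming the data matrix X is built from all cyclic shifts of a single base sample x. The function names, the signal length, and the regularization weight `lam` are illustrative assumptions.

```python
import numpy as np

def circulant(x):
    """Build the circulant data matrix whose i-th row is x cyclically shifted by i."""
    return np.stack([np.roll(x, i) for i in range(len(x))])

def ridge_direct(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam I)^{-1} X^T y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def ridge_fft(x, y, lam):
    """Same solution when X = circulant(x), using only element-wise operations on DFTs."""
    x_hat = np.fft.fft(x)
    y_hat = np.fft.fft(y)
    w_hat = x_hat * y_hat / (np.abs(x_hat) ** 2 + lam)
    return np.real(np.fft.ifft(w_hat))

rng = np.random.default_rng(0)
x = rng.normal(size=16)  # base sample (e.g., a 1-D feature patch)
y = rng.normal(size=16)  # regression targets, one per cyclic shift
lam = 0.1                # ridge regularization weight (assumed value)

w_direct = ridge_direct(circulant(x), y, lam)  # O(n^3) matrix solve
w_fast = ridge_fft(x, y, lam)                  # O(n log n) via FFT
print(np.allclose(w_direct, w_fast))
```

Whether the complex conjugate appears on x_hat in the numerator depends on how the shifts are arranged in X. The form above matches the `np.roll`-based construction used in this sketch.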

Today’s agenda
● CNNs for (single) object tracking
● Approaches based on Siamese networks
○ Fully-Convolutional Siamese Networks for Object Tracking
○ High Performance Visual Tracking with Siamese Region Proposal Network
○ SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
○ Fast Online Object Tracking and Segmentation: A Unifying Approach
Revisiting discriminative tracking
● Objective in inference
○ Given the current model of the target, identify the target at the current frame

model candidates
Revisiting challenges in object tracking

● Modeling severe appearance variations of the target

○ Illumination change, occlusion, deformation, rotation, …

● Learning to recognize the target

○ The ground-truth for the target is given only at the initial frame → one-shot classification!

Different targets in every video!

Modeling target appearance
● Pre-trained CNN as a general feature extractor

Employing pre-trained features alone gives us a decent improvement!

Hong et al., Online Tracking By Learning Discriminative Saliency Map With Convolutional Neural Network
Modeling a discriminator for the target
● How can we learn weights for target classification?

● Online learning
○ Train a classifier on-the-fly using the ground-truth and tracking results (self-supervised)
○ Problems
■ The model can easily overfit
■ Online updates of the classifier are prone to drift (in case of temporary misclassification)
■ The pre-trained features may not be appropriate for tracking
(e.g., inaccurate localization due to translation invariance, never trained to model
temporal variations, etc.)

Hong et al., Online Tracking By Learning Discriminative Saliency Map With Convolutional Neural Network

Can we pre-train the classifier for tracking?

Pre-training the classifier for tracking?
● Actually, offline training followed by online deployment is standard practice for CNNs
○ E.g., image classification

[Figure: training phase vs. testing phase]

● However, in tracking, the classifier (w) cannot be transferred across videos

○ The definition of foreground/background is different in every video!
(i.e., the tracking target differs from video to video)

● Can we design a classifier transferable across different targets?

Exemplar-based classifier
● Reformulating the classifier parameterization

x: candidates

Frame #0 Frame #100

● Now, we can pre-train the classifier parameters ψ across different videos

○ The classifier weights w are determined adaptively depending on target z
○ If we train this model on various videos, it learns to encode an arbitrary target
such that its similarity with the ground-truth candidate (x*) is higher than with the rest (x)
○ It is transferable across different videos and targets!
● If ψ = Φ, we call this model a Siamese network
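The reparameterization described above can be written out as follows. This is a sketch under the assumption that the original classifier is linear on candidate features Φ(x). The score function f and the margin-style inequality are illustrative notation, not necessarily the lecture's.

```latex
f(x; w) = w^{\top} \Phi(x)
\quad\longrightarrow\quad
f(x, z) = \psi(z)^{\top} \Phi(x), \qquad w = \psi(z)

\text{training goal:}\quad f(x^{*}, z) > f(x, z) \quad \text{for all candidates } x \neq x^{*}
```

The per-video weights w are no longer learned directly. Only the encoders ψ and Φ are learned, and they are shared across all videos. Setting ψ = Φ ties the two encoders together, which gives the Siamese case.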
Fully-convolutional Siamese network
● Inputs: the target extracted from the initial frame, and the frame at time t
● The fully-convolutional Siamese network extracts features from both the target and the frame
● Treating the target feature as a filter, run convolution over the entire feature map of the frame
● The output is a score map for the target, computed densely at every location

Bertinetto et al., Fully-Convolutional Siamese Networks for Object Tracking
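The filtering step can be illustrated with plain NumPy. This is a minimal sketch that assumes the features have already been extracted. Random arrays stand in for the network outputs, and the function name, feature sizes, and planted location are illustrative assumptions.

```python
import numpy as np

def cross_correlate(search_feat, target_feat):
    """Slide target_feat (C, h, w) over search_feat (C, H, W).

    Returns a score map of shape (H - h + 1, W - w + 1), where each entry is
    the inner product between the target feature and one window of the frame.
    """
    C, H, W = search_feat.shape
    c, h, w = target_feat.shape
    assert C == c, "target and frame features must have the same channel count"
    score_map = np.empty((H - h + 1, W - w + 1))
    for i in range(score_map.shape[0]):
        for j in range(score_map.shape[1]):
            window = search_feat[:, i:i + h, j:j + w]
            score_map[i, j] = np.sum(window * target_feat)
    return score_map

rng = np.random.default_rng(0)
# Stand-in for the frame feature map: weak noise everywhere ...
search_feat = 0.1 * rng.normal(size=(4, 10, 10))
# ... plus a strong pattern planted at row 3, column 5.
search_feat[:, 3:6, 5:8] += 1.0
# Stand-in for the target feature: the planted pattern itself.
target_feat = search_feat[:, 3:6, 5:8].copy()

score_map = cross_correlate(search_feat, target_feat)
peak = tuple(int(v) for v in np.unravel_index(np.argmax(score_map), score_map.shape))
print(score_map.shape, peak)
```

Because the same filter is applied at every offset, all candidate windows are scored in a single pass. This is why the fully-convolutional formulation is efficient.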

Training
● Learning with videos (finally!)
○ Training dataset: ImageNet Video dataset
○ For each video, sample two frames separated by a sufficient time interval (T)
○ Use one frame to extract the target (z), and the other as the candidate frame (x)
○ Taking the ground-truth location of the target as c, build a soft ground-truth label map
frame #t

frame #(t+T)
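One way to turn the ground-truth location c into a training signal is sketched below. It assumes that positions within a radius of c are labeled positive (+1) and all others negative (-1), and that the score map is trained with a logistic loss averaged over positions. The radius, map size, and function names are illustrative assumptions and are not taken from the slides.

```python
import numpy as np

def make_label_map(size, center, radius):
    """Label positions within `radius` of `center` as +1 and the rest as -1."""
    rows, cols = np.indices((size, size))
    dist = np.sqrt((rows - center[0]) ** 2 + (cols - center[1]) ** 2)
    return np.where(dist <= radius, 1.0, -1.0)

def logistic_loss(score_map, label_map):
    """Mean over positions of log(1 + exp(-y * v)), computed stably."""
    return float(np.mean(np.logaddexp(0.0, -label_map * score_map)))

labels = make_label_map(size=17, center=(8, 8), radius=2)

# A score map that agrees with the labels gives a low loss;
# one that disagrees everywhere gives a high loss.
good_scores = 5.0 * labels
bad_scores = -5.0 * labels
print(logistic_loss(good_scores, labels), logistic_loss(bad_scores, labels))
```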
Inference
● Use the initial frame to extract the target (z), and keep it fixed for the remaining frames
○ Updating the target feature φ(z) online is straightforward, but it did not improve performance
● Handling scale variation
○ Construct an image pyramid of x at multiple scales, 1.025^{−2,−1,0,1,2}
○ Select the scale that gives the maximum score
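The scale search can be sketched as an argmax over a stack of score maps, one per pyramid level. In this sketch the score maps are hand-made toy arrays. In a real tracker each would come from running the network on a rescaled search image, and the function name is an assumption.

```python
import numpy as np

# Five scale factors 1.025^{-2}, ..., 1.025^{2}
SCALES = 1.025 ** np.array([-2, -1, 0, 1, 2])

def search_scales(score_maps):
    """Pick the scale and position with the highest score.

    score_maps has shape (S, H, W): one score map per pyramid scale.
    Returns (scale_index, row, col).
    """
    s, r, c = np.unravel_index(np.argmax(score_maps), score_maps.shape)
    return int(s), int(r), int(c)

# Toy score maps: the strongest response is at scale index 3, position (1, 2).
score_maps = np.zeros((len(SCALES), 4, 4))
score_maps[3, 1, 2] = 1.0

scale_idx, row, col = search_scales(score_maps)
best_scale = float(SCALES[scale_idx])
print(scale_idx, (row, col), best_scale)
```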
Result
● State-of-the-art performance despite the simplicity
● Super-fast! (real-time speed)
Summary: fully-convolutional Siamese network
● Discriminative tracking via an exemplar classifier
○ Use the target at the initial frame as a convolution filter = adaptable classifier
○ The entire model is pre-trained end-to-end and is transferable across videos
○ Can be deployed on videos with arbitrary targets at test time
● A fully-convolutional architecture enables the Siamese formulation
○ The target classifier and the frame-level feature extractor share the same parameters
○ Produces a score map via filtering, which allows very efficient evaluation of candidates
● Fast, and reasonably accurate
○ Real-time performance (60~80 fps)
Later innovations on top of Siamese-FC
● Accurate localization through a region-proposal network
● Efficient parameterization with a deep network
● Mask prediction for even more accurate localization
Efficient and accurate modeling of box configuration
● In Siamese-FC, only scale variation is modeled, via an image pyramid
● If we want to model variations over more scales and aspect ratios,
exhaustive search based on an image pyramid is not efficient
Siamese network with region proposal
● Efficient search over scale + aspect ratio through a region-proposal network
○ k: number of proposals (anchors)
○ The target (template) generates k filters, one for each bounding-box configuration

Li et al., High Performance Visual Tracking with Siamese Region Proposal Network
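The idea that the template generates filters for k anchors can be sketched with shapes alone. The code below assumes that the template feature is lifted to 2k classification kernels (background/foreground per anchor) and 4k regression kernels (box offsets per anchor), which are then correlated with the frame feature. The random lifting weights, the 1x1 lifting, the channel layout, and all sizes are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def lift(template, weights, n_kernels):
    """Map a (C, h, w) template to n_kernels kernels of shape (C, h, w)
    with a per-position linear map (a stand-in for a learned layer)."""
    C, h, w = template.shape
    lifted = np.einsum("oc,chw->ohw", weights, template)  # (n_kernels * C, h, w)
    return lifted.reshape(n_kernels, C, h, w)

def correlate(search, kernels):
    """Correlate search (C, H, W) with kernels (N, C, h, w) -> (N, H-h+1, W-w+1)."""
    N, C, h, w = kernels.shape
    H, W = search.shape[1:]
    out = np.empty((N, H - h + 1, W - w + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = search[:, i:i + h, j:j + w]
            out[:, i, j] = np.tensordot(kernels, window, axes=([1, 2, 3], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
k, C = 5, 8                               # k anchors, C feature channels (assumed)
template = rng.normal(size=(C, 4, 4))     # stand-in for the target feature
search = rng.normal(size=(C, 12, 12))     # stand-in for the frame feature

cls_kernels = lift(template, rng.normal(size=(2 * k * C, C)), 2 * k)
reg_kernels = lift(template, rng.normal(size=(4 * k * C, C)), 4 * k)

cls_map = correlate(search, cls_kernels)  # (2k, 9, 9): two scores per anchor
reg_map = correlate(search, reg_kernels)  # (4k, 9, 9): four offsets per anchor

# Assumed layout: channels grouped per anchor as (k, 2) and (k, 4).
foreground = cls_map.reshape(k, 2, 9, 9)[:, 1]
anchor, row, col = np.unravel_index(np.argmax(foreground), foreground.shape)
offsets = reg_map.reshape(k, 4, 9, 9)[anchor, :, row, col]
print(cls_map.shape, reg_map.shape, offsets.shape)
```

A single forward pass scores every anchor shape at every location, which replaces the exhaustive image-pyramid search.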
Result

Li et al., High Performance Visual Tracking with Siamese Region Proposal Network
Improving Siamese-RPN
● Efficient parameterization via depth-wise convolution
● Exploiting a very deep network with skip connections

Li et al., SiamRPN++: Evolution of Siamese Visual Tracking with Very Deep Networks
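The depth-wise variant can be sketched as follows: each channel of the template is correlated only with the matching channel of the frame feature, so the template does not need to be lifted to many kernels. Random arrays stand in for real features, and the function name and sizes are illustrative assumptions.

```python
import numpy as np

def depthwise_xcorr(search, template):
    """Correlate each channel of template (C, h, w) with the same channel of
    search (C, H, W). Returns a multi-channel response of shape (C, H-h+1, W-w+1)."""
    C, H, W = search.shape
    _, h, w = template.shape
    out = np.empty((C, H - h + 1, W - w + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            window = search[:, i:i + h, j:j + w]
            out[:, i, j] = np.sum(window * template, axis=(1, 2))
    return out

rng = np.random.default_rng(0)
search = rng.normal(size=(8, 12, 12))   # stand-in for the frame feature
template = rng.normal(size=(8, 4, 4))   # stand-in for the target feature

response = depthwise_xcorr(search, template)

# Summing the per-channel responses recovers the single-channel score
# that plain cross-correlation would give at the same position.
plain_score_00 = np.sum(search[:, 0:4, 0:4] * template)
print(response.shape, np.isclose(response[:, 0, 0].sum(), plain_score_00))
```

Keeping the channels separate leaves the mixing across channels to lightweight layers applied after the correlation, which is one way to read "efficient parameterization" here.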
Mask prediction for better localization
● Just an additional mask branch on top of Siamese-RPN!
● Every pixel predicts a binary mask

Wang et al., Fast Online Object Tracking and Segmentation: A Unifying Approach
Result
● Accuracy in terms of bounding box

Wang et al., Fast Online Object Tracking and Segmentation: A Unifying Approach
