Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

Li, Huafeng; Xu, Le; Zhang, Yafei; Tao, Dapeng; Yu, Zhengtao

Computer Science > Computer Vision and Pattern Recognition

arXiv:2307.03903 (cs)

[Submitted on 8 Jul 2023 (v1), last revised 11 Aug 2023 (this version, v3)]

Title:Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

Authors:Huafeng Li, Le Xu, Yafei Zhang, Dapeng Tao, Zhengtao Yu

View PDF

Abstract:In visible-infrared video person re-identification (re-ID), extracting features not affected by complex scenes (such as modality, camera views, pedestrian pose, background, etc.) changes, and mining and utilizing motion information are the keys to solving cross-modal pedestrian identity matching. To this end, the paper proposes a new visible-infrared video person re-ID method from a novel perspective, i.e., adversarial self-attack defense and spatial-temporal relation mining. In this work, the changes of views, posture, background and modal discrepancy are considered as the main factors that cause the perturbations of person identity features. Such interference information contained in the training samples is used as an adversarial perturbation. It performs adversarial attacks on the re-ID model during the training to make the model more robust to these unfavorable factors. The attack from the adversarial perturbation is introduced by activating the interference information contained in the input samples without generating adversarial samples, and it can be thus called adversarial self-attack. This design allows adversarial attack and defense to be integrated into one framework. This paper further proposes a spatial-temporal information-guided feature representation network to use the information in video sequences. The network cannot only extract the information contained in the video-frame sequences but also use the relation of the local information in space to guide the network to extract more robust features. The proposed method exhibits compelling performance on large-scale cross-modality video datasets. The source code of the proposed method will be released at this https URL.

Comments:	11 pages,8 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2307.03903 [cs.CV]
	(or arXiv:2307.03903v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2307.03903

Submission history

From: Huafeng Li [view email]
[v1] Sat, 8 Jul 2023 05:03:10 UTC (7,580 KB)
[v2] Mon, 17 Jul 2023 09:08:49 UTC (7,764 KB)
[v3] Fri, 11 Aug 2023 09:15:27 UTC (7,829 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Adversarial Self-Attack Defense and Spatial-Temporal Relation Mining for Visible-Infrared Video Person Re-Identification

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators