Dual Attention Mechanisms Based Auto-Encoder for Video Anomaly Detection
J Gu, J Zeng, G Ji - International Conference on Adaptive and Intelligent …, 2022 - Springer
International Conference on Adaptive and Intelligent Systems, 2022 • Springer
Abstract
Video anomaly detection is the identification of abnormal behaviors that do not conform to normal patterns. Reconstructing video frames with an auto-encoder is the current mainstream approach: frames whose reconstruction error exceeds a threshold are treated as anomalous. However, auto-encoders pay little attention to global information and channel dependence. Attention mechanisms let a neural network focus on the relevant elements of its input and have become an important network component. To attend to features along both the channel and spatial dimensions, we propose a dual attention mechanisms based auto-encoder (DAMAE) for video anomaly detection. After each down-sampling step, the feature map is processed by two kinds of attention. The feature map is divided into groups, and each group can autonomously enhance its learned representation and suppress possible noise. By fusing channel attention and spatial attention, DAMAE captures pixel-level pairwise relationships and channel dependence. Compared with a traditional auto-encoder, features carrying channel and spatial attention reconstruct the normal patterns of the video better at each up-sampling step. Experimental results show that our method outperforms other advanced methods, demonstrating its effectiveness.
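
The abstract gives no equations or layer details, so the following is only a rough sketch of the two ideas it describes: gating a grouped feature map with channel and spatial attention, and flagging frames whose reconstruction error exceeds a threshold. Everything here is an assumption made for illustration and is not the DAMAE architecture: the function names (`grouped_dual_attention`, `frame_errors`, `flag_anomalies`), the parameter-free sigmoid gates, the multiplicative fusion of the two gates, the group count, and the use of mean squared error. A real model would use learned weights.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def grouped_dual_attention(feat, groups=4):
    """Hypothetical, parameter-free sketch. feat has shape (C, H, W)."""
    c, h, w = feat.shape
    assert c % groups == 0, "channels must divide evenly into groups"
    g = feat.reshape(groups, c // groups, h, w)

    # Channel attention (assumed form): global average pool per channel,
    # then a sigmoid gate that rescales each channel.
    chan_gate = sigmoid(g.mean(axis=(2, 3), keepdims=True))

    # Spatial attention (assumed form): average over the group's channels,
    # normalise within the group, then a sigmoid gate per pixel.
    spat = g.mean(axis=1, keepdims=True)
    spat = (spat - spat.mean(axis=(2, 3), keepdims=True)) / (
        spat.std(axis=(2, 3), keepdims=True) + 1e-5
    )
    spat_gate = sigmoid(spat)

    # Fuse the two gates multiplicatively (an assumption) and restore the shape.
    return (g * chan_gate * spat_gate).reshape(c, h, w)


def frame_errors(frames, recons):
    """Mean squared reconstruction error per frame; inputs have shape (T, H, W)."""
    return ((frames - recons) ** 2).mean(axis=(1, 2))


def flag_anomalies(errors, threshold):
    """A frame is treated as anomalous when its error exceeds the threshold."""
    return errors > threshold


# Usage with toy data: only the middle frame is reconstructed badly.
frames = np.zeros((3, 4, 4))
recons = np.zeros((3, 4, 4))
recons[1] += 1.0
flags = flag_anomalies(frame_errors(frames, recons), threshold=0.5)
```

In this toy run only the second frame is flagged. How the threshold is chosen is not stated in the abstract, so the value 0.5 is arbitrary.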