Unlocking the potential of reverse distillation for anomaly detection

X Liu, J Wang, B Leng, S Zhang
Proceedings of the AAAI Conference on Artificial Intelligence, 2025 (ojs.aaai.org)
Abstract
Knowledge Distillation (KD) is a promising approach for unsupervised Anomaly Detection (AD). However, the student network's over-generalization often diminishes the crucial representation differences between teacher and student in anomalous regions, leading to detection failures. To address this problem, the widely accepted Reverse Distillation (RD) paradigm designs an asymmetric teacher-student network, using an encoder as the teacher and a decoder as the student. Yet the design of RD does not ensure that the teacher encoder effectively distinguishes between normal and abnormal features or that the student decoder generates anomaly-free features. Additionally, the absence of skip connections results in a loss of fine details during feature reconstruction. To address these issues, we propose RD with Expert, which introduces a novel Expert-Teacher-Student network for simultaneous distillation of both the teacher encoder and the student decoder. The added expert network enhances the student's ability to generate normal features and optimizes the teacher's differentiation between normal and abnormal features, reducing missed detections. Additionally, Guided Information Injection is designed to filter and transfer features from teacher to student, improving detail reconstruction and minimizing false positives. Experiments on several benchmarks show that our method outperforms existing unsupervised AD methods under the RD paradigm, fully unlocking RD’s potential.
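
The abstract rests on the idea that anomalies show up as representation differences between teacher and student features. A minimal sketch of that scoring step follows. It assumes cosine distance as the discrepancy measure, which the abstract does not specify, and it uses toy nested lists in place of real network outputs. The names `cosine_distance` and `anomaly_map` and all feature values are hypothetical, and this is not the authors' implementation.

```python
import math


def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cos(u, v) for two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return 1.0 - dot / (norm_u * norm_v + eps)


def anomaly_map(teacher_feats, student_feats):
    """Per-location discrepancy between teacher and student feature maps.

    Both inputs are H x W grids (nested lists) of C-dimensional vectors.
    Assumption: cosine distance is the discrepancy measure.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]


# Toy 1 x 2 feature maps with hypothetical values: the student matches the
# teacher at the first location and disagrees at the second.
teacher = [[[1.0, 0.0], [0.0, 1.0]]]
student = [[[1.0, 0.0], [1.0, 0.0]]]
scores = anomaly_map(teacher, student)
```

In this toy case the first location scores near zero and the second scores high. An over-generalizing student, as described in the abstract, would also reproduce the teacher's features in anomalous regions and flatten that contrast.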