3D Masked Autoencoders With Application To Anomaly Detection in Non-Contrast Enhanced Breast MRI
[email protected]
1 Helmholtz Munich, Germany
2 Technical University of Munich, Germany
3 IBM Research AI, Israel
4 Tel-Aviv University, Israel
5 King’s College London, United Kingdom
1 Introduction
Annotation of medical data requires expert knowledge or labor-intensive test-
ing methods, leading to high curation costs. Therefore, labeled medical imaging
datasets are typically several orders of magnitude smaller than datasets generally
encountered in computer vision. Deep learning networks require large amounts
of data to be trained, making deployment of models in the medical domain cum-
bersome [4]. Self-supervised learning aims at model development in the absence
of labeled examples and has the power to overcome those limiting factors [23].
A pretraining task is utilized to induce prior knowledge into the model, which
will then be fine-tuned for the respective downstream task of interest. One lead-
ing self-supervised approach is the masked autoencoder (MAE) [13], which was
developed on natural imaging data. MAE is a transformer-based autoencoder
(AE) that randomly removes a high fraction of its input patches and is trained
to recover the uncorrupted image as a self-supervised task.
In addition to constraints in data acquisition, medical datasets are also of-
ten highly imbalanced, featuring a skewed proportion of healthy and unhealthy
examples. Anomaly detection (AD) models are designed to identify rare, uncom-
mon elements that differ significantly from normal cases. In the medical domain,
such models are employed to distinguish abnormal patterns of unhealthy exam-
ples from normal patterns of healthy cases.
Self-supervised anomaly detection combines both training strategies, aiming
to identify abnormal cases without the requirement for labeled examples. This
can be achieved by reconstruction-based methods [3,10]. Models are trained to
recover their input while restrictions are applied, imposed either on the architecture
via information bottlenecks [5] or on the input via the addition of noise [15,27]
or the removal of image parts [28,29]. During
training, normal examples are shown to the model. In this way, the model is only
able to reconstruct image parts stemming from the normal distribution reasonably
well, while abnormal image parts result in higher reconstruction errors, which can
be used to generate anomaly maps at test time. Schwartz et al. [21] modified
MAE to perform anomaly detection on natural imaging data. Most AD models
in the area of medical imaging have been developed on MRI of the brain, e.g.
[1,2,15]. Further areas of application include chest X-ray, optical coherence
tomography (OCT), and mammography [25].
We aim at model development on breast MRI, the most sensitive breast cancer
imaging method [16], used for tumor staging as well as cancer screening.
DCE-MRI refers to the acquisition of images before, during and after
intravenous injection of contrast media, which increases the signal intensity of
neoangiogenically induced vascular changes and thereby allows for better detection of
lesions [26]. However, long scan times and high costs limit widespread use of
the technique, leading different studies to investigate the ability to abbreviate
contrast enhanced breast MRI protocols [16]. We demonstrate the capability of
self-supervised models for anomaly detection on non-contrast enhanced breast
MRI, which reduces the number of required image sequences dramatically and
therefore results in even faster image acquisition. Moreover, no intravenous
injection of contrast media, which can cause side effects [12], is needed.
Contribution In this work we remodel MAE, extending and further developing
the approach of Schwartz et al. [21] to enable self-supervised anomaly detection
on 3D multi-spectral medical imaging data. To do so, we advance the definition
of input patches and positional embedding of the ViT architecture and refine
the random masking strategy of He et al. [13]. We then train the model on
non-contrast enhanced breast MRI. During training only healthy, non-cancerous
breast MRIs are shown to the model, aiming to identify breast lesions as anoma-
lies during test time. To the best of our knowledge, we are the first to make the
following contributions:
2 Related Work
on one specific task is very unlikely to succeed reasonably well in another prob-
lem setting. In contrast, the ability of MAE-based anomaly detection to
succeed in modified settings has been demonstrated by [21], which achieves
state-of-the-art (SOTA) performance on few- and zero-shot problems.
Identification of lesions in breast MRI has only been performed by supervised
models so far. Maicas et al. [17] trained a deep Q-network for breast lesion
detection, Ayatollahi et al. modified RetinaNet and Herent et al. [14] utilized
a 2D ResNet50. Notably, all of those approaches were trained on DCE-MRI
data, relying on injection of contrast media. In contrast, we perform self-supervised
anomaly detection on non-contrast enhanced MRI.
3 Dataset
4 Method
A scheme of our model can be seen in Figure 1. We modified the ViT architecture
Fig. 1. DCE-MRI imaging vs. MAEMI. For DCE-MRI, several MRIs before, dur-
ing and after injection of contrast media are acquired. MAEMI uses different random
masks for generation of pseudo-healthy recovered images. Both methods construct er-
ror maps by calculating the mean squared error between each of the
post-contrast/reconstructed images and the pre-contrast/uncorrupted image.
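A minimal sketch of the 3D patchify and random-masking step illustrated in Figure 1 (PyTorch-style; the crop size, tensor layout and function names are assumptions, only the 8 × 8 × 2 patch size and 90% masking ratio correspond to the configuration reported below):

```python
# Minimal sketch: split a multi-spectral 3D MRI crop into ViT-patches and
# randomly mask a high fraction of them, as in MAE-style pretraining.
# The crop size below is hypothetical; only the 8x8x2 patch size and the
# 90% masking ratio are taken from the text.
import torch


def patchify_3d(vol, patch=(8, 8, 2)):
    """vol: (C, H, W, D) tensor -> (N, C*ph*pw*pd) flattened patches."""
    c, h, w, d = vol.shape
    ph, pw, pd = patch
    return (
        vol.reshape(c, h // ph, ph, w // pw, pw, d // pd, pd)
        .permute(1, 3, 5, 0, 2, 4, 6)      # (nH, nW, nD, C, ph, pw, pd)
        .reshape(-1, c * ph * pw * pd)     # (N, patch_dim)
    )


def random_mask(patches, mask_ratio=0.9):
    """Keep a random (1 - mask_ratio) fraction of the patches (visible tokens)."""
    n = patches.shape[0]
    keep_idx = torch.randperm(n)[: int(n * (1.0 - mask_ratio))]
    return patches[keep_idx], keep_idx     # visible tokens and their positions


# Example: two channels (fat-saturated and non-fat-saturated T1) of a
# hypothetical 256 x 256 x 32 crop.
vol = torch.randn(2, 256, 256, 32)
visible, keep_idx = random_mask(patchify_3d(vol), mask_ratio=0.9)
print(visible.shape)                       # torch.Size([1638, 256])
```

The kept indices would then receive the 3D positional embedding before the visible tokens are passed to the transformer encoder.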
were then summed up and convolved with the same minimum filter as before:
$$E = \frac{1}{2}\left(E_{\mathrm{NFS}} + E_{\mathrm{FS}}\right) \ast \min_{3\times3\times2} \tag{3}$$
for generation of a final MR image level anomaly map.
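A minimal sketch of this fusion step, assuming the two per-sequence error maps are available as NumPy arrays and realizing the minimum filter with scipy.ndimage (array names are hypothetical):

```python
# Minimal sketch of Eq. (3): average the error maps of the non-fat-saturated
# and fat-saturated sequences, then apply a 3x3x2 minimum filter to suppress
# isolated high-error voxels. Array names are assumptions.
import numpy as np
from scipy.ndimage import minimum_filter


def fuse_anomaly_maps(err_nfs: np.ndarray, err_fs: np.ndarray) -> np.ndarray:
    """err_nfs, err_fs: voxel-wise squared-error maps of shape (H, W, D)."""
    mean_err = 0.5 * (err_nfs + err_fs)
    return minimum_filter(mean_err, size=(3, 3, 2))


# Example with random stand-ins for the per-sequence error maps.
rng = np.random.default_rng(0)
E = fuse_anomaly_maps(rng.random((256, 256, 32)), rng.random((256, 256, 32)))
```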
Metrics We used voxel wise area under the receiver operating characteristics
curve (AUROC) and average precision (AP) as performance measures. Only
voxels lying inside the breast tissue segmentation mask were taken into account,
as injection of contrast media also leads to an uptake in tissue outside the breast
area; see Figure 5 in the Supplementary Material. For AP, however, one has to
consider the large imbalance between normal and abnormal tissue labels, which
leads to an expected small baseline performance. Moreover, ground truth annotations
were only given in the form of bounding boxes, which provide only a rough delineation
of the tumor tissue, so that several voxels labeled as true positive (TP) should in
fact be true negative (TN). This has a higher impact on AP than on AUROC, as TP labels
are involved in precision and recall but not in the false positive rate (FPR) of the
ROC, which also takes TN labels into account.
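A minimal sketch of this masked voxel-wise evaluation, assuming the anomaly map, the bounding-box labels and the breast segmentation are available as NumPy arrays (scikit-learn based; variable names are hypothetical):

```python
# Minimal sketch: compute voxel-wise AUROC and AP only for voxels lying inside
# the breast tissue segmentation mask. Variable names are assumptions.
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score


def masked_voxel_metrics(anomaly_map, gt_boxes, breast_mask):
    """anomaly_map: voxel-wise scores; gt_boxes: binary bounding-box labels;
    breast_mask: binary breast-tissue segmentation; all of shape (H, W, D)."""
    inside = breast_mask.astype(bool)
    scores = anomaly_map[inside].ravel()
    labels = gt_boxes[inside].ravel().astype(int)
    # Chance-level AP equals the fraction of positive voxels inside the mask.
    return roc_auc_score(labels, scores), average_precision_score(labels, scores)
```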
5 Results
ViT-patch size and masking ratio were varied for hyperparameter tuning.
The best performing model featured a masking ratio of 90% and a ViT-patch
size of 8 × 8 × 2; AUROC and AP results are shown in Table 1. Example results
are shown in Figure 2. Mean baseline performance of the AP measure, given by
the number of voxels inside the bounding box divided by the number of voxels
lying inside the breast tissue segmentation mask, was 0.046.
Ablation Studies We studied the influence of the patch size and masking ratio
on model performance. Figure 3 presents the dependency of AUROC and AP
on the masking ratio for a fixed ViT-patch size of 8 × 8 × 2. Dependency on
ViT-patch size for a fixed masking ratio of 90% is given in Table 2. The Nvidia
RTX A6000 used for training, featuring 48 GB of memory, only allowed for a
minimum size of 8 × 8 × 2. Therefore, the slice dimension of ViT-patches was fixed
at a value of 4 pixels, and only different axial sizes were probed.
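As a rough illustration of this memory constraint (the crop size and the fixed slice dimension below are assumptions, not values from our setup): the number of tokens grows inversely with the patch volume, while vanilla self-attention memory grows quadratically with the number of tokens.

```python
# Rough, hypothetical illustration: token count for a 256x256x32 crop at
# different axial ViT-patch sizes with the slice dimension held fixed.
# Vanilla self-attention stores on the order of N^2 entries per head and layer.
def num_tokens(vol=(256, 256, 32), patch=(8, 8, 2)):
    return (vol[0] // patch[0]) * (vol[1] // patch[1]) * (vol[2] // patch[2])

for axial in (16, 8, 4):                       # in-plane patch edge length
    n = num_tokens(patch=(axial, axial, 2))    # slice dimension fixed (2 here)
    print(f"{axial}x{axial}x2: {n} tokens, ~{n * n:.2e} attention entries")
```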
Fig. 2. Example results. The first two columns show the non-contrast enhanced images
used as an input to the anomaly detection model, and the last two columns present
subtraction images generated by DCE-MRI and anomaly detection maps generated by
MAEMI, respectively. For patients in rows A, B and C, anomaly maps show superior
performance over subtraction images. For patient D, both methods are able to identify
the pathology. For patient E, our model only detects the borders of the pathology,
while the subtraction image identifies the lesion.
Fig. 3. Ablation study on the masking ratio for a fixed ViT-patch size of 8 × 8 × 2.
High masking ratios lead to better performance, with an optimum reached at 90%.
Beyond that, performance declines steeply.
6 Discussion
Data Use Declaration All data used for this study is publicly available from
The Cancer Imaging Archive [19,7] under the CC BY-NC 4.0 license.
Supplementary Material
Fig. 4. Reconstruction examples. The left block shows axial slices of T1 non-fat satu-
rated MRI-patches and the right block T1 fat saturated slices. The first column shows
unaltered MRI-patches, the second column the masked model input and the third col-
umn the MRI-patches recovered by MAEMI. Examples represent a masking ratio of
90% (for the whole 3D patch) and a ViT-patch size of 8 × 8 × 2.
Fig. 5. Subtraction images and anomaly maps were multiplied with segmentation
masks to remove anomalies lying obviously outside of the breast tissue. This is mainly
needed as contrast agent is also taken up in organs outside the breast. The left column
shows the raw subtraction/anomaly map, and the right column the raw maps multi-
plied with the segmentation mask of the image in the upper left corner. Performance
metrics were only calculated for voxels lying inside the segmentation mask, limiting
the influence of trivial predictions, as voxels that represent air do not contain any
anomalies.
References
1. Baur, C., Denner, S., Wiestler, B., Navab, N., Albarqouni, S.: Autoencoders for
unsupervised anomaly segmentation in brain MR images: a comparative study.
Medical Image Analysis 69, 101952 (2021)
2. Bercea, C.I., Wiestler, B., Rueckert, D., Albarqouni, S.: Federated disentangled
representation learning for unsupervised brain anomaly detection. Nature Machine
Intelligence 4(8), 685–695 (2022)
3. Bergmann, P., Löwe, S., Fauser, M., Sattlegger, D., Steger, C.: Improving unsuper-
vised defect segmentation by applying structural similarity to autoencoders. arXiv
preprint arXiv:1807.02011 (2018)
4. Ching, T., Himmelstein, D.S., Beaulieu-Jones, B.K., Kalinin, A.A., Do, B.T., Way,
G.P., Ferrero, E., Agapow, P.M., Zietz, M., Hoffman, M.M., et al.: Opportunities
and obstacles for deep learning in biology and medicine. Journal of The Royal
Society Interface 15(141), 20170387 (2018)
5. Chow, J.K., Su, Z., Wu, J., Tan, P.S., Mao, X., Wang, Y.H.: Anomaly detection
of defects on concrete structures with the convolutional autoencoder. Advanced
Engineering Informatics 45, 101105 (2020)
6. Chris, L.: 3D-Breast-FGT-and-Blood-Vessel-Segmentation. https://github.com/
mazurowski-lab/3D-Breast-FGT-and-Blood-Vessel-Segmentation (2022)
7. Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S.,
Phillips, S., Maffitt, D., Pringle, M., et al.: The Cancer Imaging Archive (TCIA):
maintaining and operating a public information repository. Journal of digital imag-
ing 26, 1045–1057 (2013)
8. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner,
T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is
worth 16x16 words: Transformers for image recognition at scale. arXiv preprint
arXiv:2010.11929 (2020)
9. Feichtenhofer, C., Fan, H., Li, Y., He, K.: Masked autoencoders as spatiotemporal
learners. arXiv preprint arXiv:2205.09113 (2022)
10. Gong, D., Liu, L., Le, V., Saha, B., Mansour, M.R., Venkatesh, S., Hengel, A.v.d.:
Memorizing normality to detect anomaly: Memory-augmented deep autoencoder
for unsupervised anomaly detection. In: Proceedings of the IEEE/CVF Interna-
tional Conference on Computer Vision. pp. 1705–1714 (2019)
11. Goyal, P., Dollár, P., Girshick, R., Noordhuis, P., Wesolowski, L., Kyrola, A., Tul-
loch, A., Jia, Y., He, K.: Accurate, large minibatch sgd: Training imagenet in 1
hour. arXiv preprint arXiv:1706.02677 (2017)
12. Hasebroock, K.M., Serkova, N.J.: Toxicity of MRI and CT contrast agents. Expert
opinion on drug metabolism & toxicology 5(4), 403–416 (2009)
13. He, K., Chen, X., Xie, S., Li, Y., Dollár, P., Girshick, R.: Masked autoencoders are
scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition. pp. 16000–16009 (2022)
14. Herent, P., Schmauch, B., Jehanno, P., Dehaene, O., Saillard, C., Balleyguier, C.,
Arfi-Rouche, J., Jégou, S.: Detection and characterization of MRI breast lesions
using deep learning. Diagnostic and interventional imaging 100(4), 219–225 (2019)
15. Kascenas, A., Pugeault, N., O’Neil, A.Q.: Denoising autoencoders for unsupervised
anomaly detection in brain MRI. In: International Conference on Medical Imaging
with Deep Learning. pp. 653–664. PMLR (2022)
16. Leithner, D., Moy, L., Morris, E.A., Marino, M.A., Helbich, T.H., Pinker, K.: Ab-
breviated MRI of the breast: does it provide value? Journal of Magnetic Resonance
Imaging 49(7), e85–e100 (2019)
17. Maicas, G., Carneiro, G., Bradley, A.P., Nascimento, J.C., Reid, I.: Deep reinforce-
ment learning for active breast lesion detection from DCE-MRI. In: Medical Image
Computing and Computer Assisted Intervention- MICCAI 2017: 20th International
Conference, Quebec City, QC, Canada, September 11-13, 2017, Proceedings, Part
III. pp. 665–673. Springer (2017)
18. Prabhakar, C., Li, H.B., Yang, J., Shit, S., Wiestler, B., Menze, B.: ViT-AE++: Im-
proving Vision Transformer Autoencoder for Self-supervised Medical Image Rep-
resentations. arXiv preprint arXiv:2301.07382 (2023)
19. Saha, A., Harowicz, M., Grimm, L., Weng, J., Cain, E., Kim, C., Ghate, S., Walsh,
R., Mazurowski, M.: Dynamic contrast-enhanced magnetic resonance images of
breast cancer patients with tumor locations. The Cancer Imaging Archive (2021)
20. Saha, A., Harowicz, M.R., Grimm, L.J., Kim, C.E., Ghate, S.V., Walsh, R.,
Mazurowski, M.A.: A machine learning approach to radiogenomics of breast can-
cer: a study of 922 subjects and 529 DCE-MRI features. British journal of cancer
119(4), 508–516 (2018)
21. Schwartz, E., Arbelle, A., Karlinsky, L., Harary, S., Scheidegger, F., Doveh, S.,
Giryes, R.: MAEDAY: MAE for few and zero shot AnomalY-Detection. arXiv
preprint arXiv:2211.14307 (2022)
22. Somepalli, G., Wu, Y., Balaji, Y., Vinzamuri, B., Feizi, S.: Unsupervised anomaly
detection with adversarial mirrored autoencoders. In: Uncertainty in Artificial In-
telligence. pp. 365–375. PMLR (2021)
23. Sun, C., Shrivastava, A., Singh, S., Gupta, A.: Revisiting unreasonable effectiveness
of data in deep learning era. In: Proceedings of the IEEE international conference
on computer vision. pp. 843–852 (2017)
24. Tian, Y., Pang, G., Liu, Y., Wang, C., Chen, Y., Liu, F., Singh, R., Verjans,
J.W., Carneiro, G.: Unsupervised anomaly detection in medical images with
a memory-augmented multi-level cross-attentional masked autoencoder. arXiv
preprint arXiv:2203.11725 (2022)
25. Tschuchnig, M.E., Gadermayr, M.: Anomaly detection in medical imaging-a mini
review. In: Data Science–Analytics and Applications: Proceedings of the 4th In-
ternational Data Science Conference–iDSC2021. pp. 33–38. Springer (2022)
26. Turnbull, L.W.: Dynamic contrast-enhanced MRI in the diagnosis and management
of breast cancer. NMR in Biomedicine: An International Journal Devoted to the
Development and Application of Magnetic Resonance In Vivo 22(1), 28–39 (2009)
27. Wyatt, J., Leach, A., Schmon, S.M., Willcocks, C.G.: AnoDDPM: Anomaly De-
tection With Denoising Diffusion Probabilistic Models Using Simplex Noise. In:
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recog-
nition (CVPR) Workshops. pp. 650–656 (June 2022)
28. Yan, X., Zhang, H., Xu, X., Hu, X., Heng, P.A.: Learning semantic context from
normal samples for unsupervised anomaly detection. In: Proceedings of the AAAI
Conference on Artificial Intelligence. vol. 35, pp. 3110–3118 (2021)
29. Zavrtanik, V., Kristan, M., Skočaj, D.: Reconstruction by inpainting for visual
anomaly detection. Pattern Recognition 112, 107706 (2021)