Classification and Uncertainty Quantification of Corrupted Data using Semi-Supervised Autoencoders

Joppich, Philipp; Dorn, Sebastian; De Candido, Oliver; Utschick, Wolfgang; Knollmüller, Jakob

doi:10.3390/psf2022005012

arXiv:2105.13393 (cs)

[Submitted on 27 May 2021 (v1), last revised 20 Apr 2023 (this version, v2)]

Abstract: Parametric and non-parametric classifiers often have to deal with real-world data, where corruptions such as noise, occlusions, and blur are unavoidable and pose significant challenges. We present a probabilistic approach to classify strongly corrupted data and quantify uncertainty, even though the model has only been trained on uncorrupted data. The underlying architecture is a semi-supervised autoencoder trained on uncorrupted data. We use its decoder as a generative model for realistic data and extend it with convolutions, masking, and additive Gaussian noise to describe imperfections. This constitutes a statistical inference task: finding the optimal latent space activations of the underlying uncorrupted datum. We solve this problem approximately with Metric Gaussian Variational Inference (MGVI). The supervision of the autoencoder's latent space allows us to classify corrupted data directly, under uncertainty, from the statistically inferred latent space activations. Furthermore, we demonstrate that the model uncertainty strongly depends on whether the classification is correct or wrong, setting a basis for a statistical "lie detector" of the classification. Independent of that, we show that the generative model can optimally restore the uncorrupted datum by decoding the inferred latent space activations.
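
The corruption model described in the abstract (decoder output passed through a convolution, a mask, and additive Gaussian noise) can be illustrated with a small sketch. This is not the authors' implementation. The `toy_decoder` function is a hypothetical stand-in for the trained decoder, the 1-D signal, blur kernel, and mask values are invented for illustration, and the order "blur, then mask, then add noise" is an assumption. The sketch only writes down the forward model and the resulting Gaussian negative log-likelihood in the latent activations; it does not implement MGVI.

```python
import math
import random

def toy_decoder(z):
    """Hypothetical stand-in for a trained decoder: latent z -> 8-sample signal."""
    return [math.tanh(z[0] * math.sin(0.5 * i) + z[1] * math.cos(0.5 * i))
            for i in range(8)]

def blur(x, kernel):
    """Convolve x with an odd-length kernel (same output size, zero padding)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i - k + half
            if 0 <= j < len(x):
                acc += w * x[j]
        out.append(acc)
    return out

def forward(z, kernel, mask):
    """Noise-free part of the assumed model: decode, blur, then mask."""
    return [m * b for b, m in zip(blur(toy_decoder(z), kernel), mask)]

def corrupt(z, kernel, mask, sigma, rng):
    """Draw one corrupted datum by adding Gaussian noise to the forward model."""
    return [s + rng.gauss(0.0, sigma) for s in forward(z, kernel, mask)]

def neg_log_likelihood(z, data, kernel, mask, sigma):
    """Gaussian negative log-likelihood of data given z, up to a constant."""
    model = forward(z, kernel, mask)
    return sum((d - p) ** 2 for d, p in zip(data, model)) / (2.0 * sigma ** 2)

# Illustrative values only (not from the paper).
KERNEL = [0.25, 0.5, 0.25]
MASK = [1, 1, 1, 0, 0, 1, 1, 1]   # 0 marks occluded samples
Z_TRUE = [1.0, -0.5]

datum = corrupt(Z_TRUE, KERNEL, MASK, sigma=0.1, rng=random.Random(0))
```

Under these assumptions, inference amounts to finding latent activations `z` that make `neg_log_likelihood` (combined with a prior on `z`) small. A variational method such as MGVI would additionally approximate the posterior spread around that solution, which is where the uncertainty estimate comes from.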

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2105.13393 [cs.LG]
	(or arXiv:2105.13393v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2105.13393
Journal reference:	Physical Sciences Forum. 2022; 5(1):12
Related DOI:	https://doi.org/10.3390/psf2022005012

Submission history

From: Sebastian Dorn
[v1] Thu, 27 May 2021 18:47:55 UTC (158 KB)
[v2] Thu, 20 Apr 2023 20:03:19 UTC (1,554 KB)
