Pi-DUAL: Using Privileged Information to Distinguish Clean from Noisy Labels

Wang, Ke; Ortiz-Jimenez, Guillermo; Jenatton, Rodolphe; Collier, Mark; Kokiopoulou, Efi; Frossard, Pascal

Computer Science > Machine Learning

arXiv:2310.06600 (cs)

[Submitted on 10 Oct 2023 (v1), last revised 28 May 2024 (this version, v2)]

Abstract: Label noise is a pervasive problem in deep learning that often compromises the generalization performance of trained models. Recently, leveraging privileged information (PI) -- information available only during training but not at test time -- has emerged as an effective approach to mitigate this issue. Yet, existing PI-based methods have failed to consistently outperform their no-PI counterparts in terms of preventing overfitting to label noise. To address this deficiency, we introduce Pi-DUAL, an architecture designed to harness PI to distinguish clean from wrong labels. Pi-DUAL decomposes the output logits into a prediction term, based on conventional input features, and a noise-fitting term influenced solely by PI. A gating mechanism steered by PI adaptively shifts focus between these terms, allowing the model to implicitly separate the learning paths of clean and wrong labels. Empirically, Pi-DUAL achieves significant performance improvements on key PI benchmarks (e.g., +6.8% on ImageNet-PI), establishing a new state-of-the-art test set accuracy. Additionally, Pi-DUAL is a potent method for identifying noisy samples post-training, outperforming other strong methods at this task. Overall, Pi-DUAL is a simple, scalable and practical approach for mitigating the effects of label noise in a variety of real-world scenarios with PI.

Comments:	Accepted ICML 2024
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2310.06600 [cs.LG]
	(or arXiv:2310.06600v2 [cs.LG] for this version)
 	https://doi.org/10.48550/arXiv.2310.06600

Submission history

From: Ke Wang
[v1] Tue, 10 Oct 2023 13:08:50 UTC (14,712 KB)
[v2] Tue, 28 May 2024 13:15:02 UTC (9,328 KB)
