Metric-DST: Mitigating Selection Bias Through Diversity-Guided Semi-Supervised Metric Learning

Tepeli, Yasin I.; de Wolf, Mathijs; Gonçalves, Joana P.

arXiv:2411.18442 (cs)

[Submitted on 27 Nov 2024 (v1), last revised 28 Nov 2024 (this version, v2)]

Abstract: Selection bias poses a critical challenge for fairness in machine learning, as models trained on data that is less representative of the population might exhibit undesirable behavior for underrepresented profiles. Semi-supervised learning strategies like self-training can mitigate selection bias by incorporating unlabeled data into model training to gain further insight into the distribution of the population. However, conventional self-training seeks to include high-confidence data samples, which may reinforce existing model bias and compromise effectiveness. We propose Metric-DST, a diversity-guided self-training strategy that leverages metric learning and its implicit embedding space to counter confidence-based bias through the inclusion of more diverse samples. Metric-DST learned more robust models in the presence of selection bias for generated and real-world datasets with induced bias, as well as a molecular biology prediction task with intrinsic bias. The Metric-DST learning strategy offers a flexible and widely applicable solution to mitigate selection bias and enhance fairness of machine learning models.
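
The contrast the abstract draws between confidence-based and diversity-guided sample inclusion can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the `min_confidence` threshold, the toy data, and the use of greedy farthest-point sampling as the diversity criterion are all assumptions made for illustration, and the embeddings are taken as given rather than learned by a metric-learning model.

```python
import numpy as np

def select_by_confidence(confidence, k):
    """Conventional self-training: keep the k most confident unlabeled samples."""
    return np.argsort(-confidence, kind="stable")[:k]

def select_diverse(embeddings, confidence, k, min_confidence=0.6):
    """Hypothetical diversity-guided pick (an assumption, not the paper's rule):
    greedy farthest-point sampling among candidates above min_confidence."""
    candidates = np.flatnonzero(confidence >= min_confidence)
    if candidates.size == 0:
        return candidates
    # Seed with the most confident candidate.
    chosen = [candidates[np.argmax(confidence[candidates])]]
    # dist[i] = distance from candidate i to its nearest already-chosen sample.
    dist = np.linalg.norm(embeddings[candidates] - embeddings[chosen[0]], axis=1)
    while len(chosen) < min(k, candidates.size):
        if dist.max() == 0:  # remaining candidates coincide with chosen ones
            break
        nxt = candidates[np.argmax(dist)]
        chosen.append(nxt)
        new_dist = np.linalg.norm(embeddings[candidates] - embeddings[nxt], axis=1)
        dist = np.minimum(dist, new_dist)
    return np.array(chosen)

# Toy data (made up): three tightly clustered, very confident samples
# and two distant, less confident ones.
embeddings = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0], [9.0, 0.0]])
confidence = np.array([0.99, 0.98, 0.97, 0.70, 0.65])

top_confident = select_by_confidence(confidence, k=3)
diverse = select_diverse(embeddings, confidence, k=3)
print("confidence-based:", top_confident.tolist())
print("diversity-guided:", diverse.tolist())
```

On this toy data the confidence-based rule picks only the three clustered samples, while the diversity-aware rule spreads its picks across the embedding space, which is the kind of behavior the abstract attributes to including more diverse samples.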

Comments:	18 pages main manuscript (4 main figures), 7 pages of supplementary
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2411.18442 [cs.LG]
	(or arXiv:2411.18442v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2411.18442

Submission history

From: Joana Gonçalves
[v1] Wed, 27 Nov 2024 15:29:42 UTC (4,752 KB)
[v2] Thu, 28 Nov 2024 08:34:30 UTC (4,752 KB)
