Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices

Kostenok, Elizaveta; Cherniavskii, Daniil; Zaytsev, Alexey

arXiv:2308.11295 (cs)

[Submitted on 22 Aug 2023 (v1), last revised 17 Sep 2024 (this version, v3)]

Abstract: Transformer-based language models have set new benchmarks across a wide range of NLP tasks, yet reliably estimating the uncertainty of their predictions remains a significant challenge. Existing uncertainty estimation (UE) techniques often fall short in classification tasks, either offering minimal improvements over basic heuristics or relying on costly ensemble models. Moreover, attempts to leverage common embeddings for UE in linear probing scenarios have yielded only modest gains, indicating that alternative model components should be explored.
We tackle these limitations by harnessing the geometry of attention maps across multiple heads and layers to assess model confidence. Our approach extracts topological features from attention matrices, providing a low-dimensional, interpretable representation of the model's internal dynamics. Additionally, we introduce topological features to compare attention patterns across heads and layers. Our method significantly outperforms existing UE techniques on benchmarks for acceptability judgments and artificial text detection, offering a more efficient and interpretable solution for uncertainty estimation in large-scale language models.
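
The abstract does not say which topological features are extracted, so the following is only a minimal sketch of the general idea, under stated assumptions: a single attention matrix is treated as a weighted undirected graph over tokens, edges below a threshold are dropped, and the number of connected components (Betti-0) and independent cycles (Betti-1) is recorded at several thresholds. The function names `betti_numbers` and `topological_features`, the threshold values, and the toy matrix are all hypothetical and are not taken from the paper.

```python
import numpy as np


def betti_numbers(attention, threshold):
    """Return (Betti-0, Betti-1) of the thresholded attention graph.

    Assumption: tokens are vertices, and an undirected edge i-j is kept
    when the larger of the two directed attention weights is >= threshold.
    """
    n = attention.shape[0]
    sym = np.maximum(attention, attention.T)  # symmetrize directed weights
    parent = list(range(n))                   # union-find forest

    def find(x):
        # Find the root of x, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = 0
    components = n
    for i in range(n):
        for j in range(i + 1, n):  # self-attention (the diagonal) is ignored
            if sym[i, j] >= threshold:
                edges += 1
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j
                    components -= 1

    # For a graph: Betti-0 = #components, Betti-1 = #edges - #vertices + #components.
    return components, edges - n + components


def topological_features(attention, thresholds=(0.1, 0.25, 0.5)):
    """Concatenate (Betti-0, Betti-1) over several thresholds into one vector."""
    feats = []
    for t in thresholds:
        feats.extend(betti_numbers(attention, t))
    return np.array(feats)


# Hypothetical 4-token attention matrix (each row sums to 1).
toy_attention = np.array([
    [0.6, 0.3, 0.1, 0.0],
    [0.3, 0.4, 0.3, 0.0],
    [0.1, 0.3, 0.5, 0.1],
    [0.0, 0.0, 0.1, 0.9],
])
features = topological_features(toy_attention)
```

In a pipeline of the kind the abstract describes, a vector like `features` would be computed for each head and layer, concatenated, and passed to a small model that scores how confident the transformer's prediction is. That downstream step is likewise an assumption here, not a description of the authors' implementation.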

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2308.11295 [cs.LG]
	(or arXiv:2308.11295v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2308.11295

Submission history

From: Elizaveta Kostenok
[v1] Tue, 22 Aug 2023 09:17:45 UTC (474 KB)
[v2] Mon, 16 Sep 2024 15:41:59 UTC (873 KB)
[v3] Tue, 17 Sep 2024 09:44:27 UTC (873 KB)
