Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models

Colombo, Pierre; Pellegrain, Victor; Boudiaf, Malik; Storchan, Victor; Tami, Myriam; Ayed, Ismail Ben; Hudelot, Celine; Piantanida, Pablo

arXiv:2310.13998 (cs)

[Submitted on 21 Oct 2023]

Abstract: Proprietary and closed APIs are becoming increasingly common for processing natural language and are impacting the practical applications of natural language processing, including few-shot classification. Few-shot classification involves training a model to perform a new classification task with only a handful of labeled examples. This paper presents three contributions. First, we introduce a scenario where the embeddings of a pre-trained model are served through a gated API with compute-cost and data-privacy constraints. Second, we propose transductive inference, a learning paradigm that has been overlooked by the NLP community. Transductive inference, unlike traditional inductive learning, leverages the statistics of unlabeled data. We also introduce a new parameter-free transductive regularizer based on the Fisher-Rao loss, which can be used on top of the gated API embeddings. This method fully utilizes unlabeled data, does not share any labels with the third-party API provider, and could serve as a baseline for future research. Third, we propose an improved experimental setting and compile a benchmark of eight datasets involving multiclass classification in four different languages, with up to 151 classes. We evaluate our methods using eight backbone models, along with an episodic evaluation over 1,000 episodes, which demonstrates the superiority of transductive inference over the standard inductive setting.
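
To make the inductive/transductive distinction in the abstract concrete, the following is a minimal sketch and not the paper's method. It assumes the embeddings have already been fetched from the gated API and are held locally as plain lists of floats, so no label is ever sent to the provider. The transductive step shown is a generic soft k-means prototype refinement, swapped in for illustration in place of the Fisher-Rao regularizer, whose exact form the abstract does not give. All function names and parameters (`class_prototypes`, `inductive_predict`, `transductive_predict`, `steps`, `temperature`) are hypothetical.

```python
import math


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def class_prototypes(support, labels, n_classes):
    """One prototype per class: the mean of its labeled support embeddings.

    Assumes every class has at least one support example.
    """
    return [
        mean([x for x, y in zip(support, labels) if y == c])
        for c in range(n_classes)
    ]


def inductive_predict(protos, queries):
    """Inductive baseline: each query is classified on its own."""
    return [
        min(range(len(protos)), key=lambda c: sq_dist(q, protos[c]))
        for q in queries
    ]


def transductive_predict(support, labels, queries, n_classes,
                         steps=10, temperature=1.0):
    """Illustrative transductive inference via soft k-means.

    Prototypes are re-estimated from the labeled support set plus the
    soft-assigned, unlabeled query set, so the statistics of the
    unlabeled data shape the final decision boundary.
    """
    dim = len(support[0])
    protos = class_prototypes(support, labels, n_classes)
    for _ in range(steps):
        # Soft class assignment of every query under the current prototypes.
        probs = [
            softmax([-sq_dist(q, p) / temperature for p in protos])
            for q in queries
        ]
        new_protos = []
        for c in range(n_classes):
            sup_c = [x for x, y in zip(support, labels) if y == c]
            # Support points count with weight 1, queries with their soft weight.
            weight = len(sup_c) + sum(p[c] for p in probs)
            total = [
                sum(x[d] for x in sup_c)
                + sum(p[c] * q[d] for p, q in zip(probs, queries))
                for d in range(dim)
            ]
            new_protos.append([v / weight for v in total])
        protos = new_protos
    return inductive_predict(protos, queries)
```

In this sketch the two predictors can disagree when a support example is unrepresentative of its class: the inductive rule fixes the prototypes from the labeled examples alone, while the transductive rule lets clusters in the unlabeled queries pull the prototypes toward where the data actually lies.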

Comments:	EMNLP 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2310.13998 [cs.CL]
	(or arXiv:2310.13998v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.13998

Submission history

From: Pierre Colombo
[v1] Sat, 21 Oct 2023 12:47:10 UTC (5,991 KB)
