D2LLM: Decomposed and Distilled Large Language Models for Semantic Search

Liao, Zihan; Yu, Hang; Li, Jianguo; Wang, Jun; Zhang, Wei

arXiv:2406.17262 (cs)

[Submitted on 25 Jun 2024]

Abstract: The key challenge in semantic search is to create models that are both accurate and efficient in pinpointing relevant sentences for queries. While BERT-style bi-encoders excel in efficiency thanks to pre-computed embeddings, they often miss subtle nuances in search tasks. Conversely, GPT-style LLMs with cross-encoder designs capture these nuances but are computationally intensive, hindering real-time applications. In this paper, we present D2LLM (Decomposed and Distilled LLMs for semantic search), which combines the best of both worlds. We decompose a cross-encoder into an efficient bi-encoder integrated with Pooling by Multihead Attention and an Interaction Emulation Module, achieving nuanced understanding and pre-computability. Knowledge from the LLM is distilled into this model using contrastive, rank, and feature imitation techniques. Our experiments show that D2LLM surpasses five leading baselines in terms of all metrics across three tasks, particularly improving NLI task performance by at least 6.45%. The source code is available at this https URL.
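
The decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the single learnable seed query, the two-layer MLP used as a stand-in for the Interaction Emulation Module, and all shapes and random weights are assumptions made for illustration. The attention pooling is also simplified and omits the residual, normalization, and feed-forward layers a full version might include. The point it shows is that each text is pooled into a vector independently, so passage vectors can be pre-computed, while a lightweight module scores the pair afterwards.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def pma_pool(tokens, seed, w_q, w_k, w_v, num_heads):
    """Pool (seq_len, dim) token states into one (dim,) vector.

    A single learnable seed vector acts as the attention query and
    attends over all token states (simplified multihead attention pooling).
    """
    seq_len, dim = tokens.shape
    head_dim = dim // num_heads
    q = (seed @ w_q).reshape(num_heads, head_dim)              # (H, d)
    k = (tokens @ w_k).reshape(seq_len, num_heads, head_dim)   # (T, H, d)
    v = (tokens @ w_v).reshape(seq_len, num_heads, head_dim)   # (T, H, d)
    scores = np.einsum("hd,thd->ht", q, k) / np.sqrt(head_dim)  # (H, T)
    weights = softmax(scores, axis=-1)      # each head's weights sum to 1
    pooled = np.einsum("ht,thd->hd", weights, v)                # (H, d)
    return pooled.reshape(dim)


def interaction_score(query_vec, passage_vec, w1, b1, w2, b2):
    """Hypothetical stand-in for an interaction module: a two-layer MLP
    over the concatenated embeddings, returning a relevance probability."""
    pair = np.concatenate([query_vec, passage_vec])
    hidden = np.maximum(0.0, pair @ w1 + b1)   # ReLU
    logit = hidden @ w2 + b2
    return float(1.0 / (1.0 + np.exp(-logit)))  # sigmoid


# Toy setup with random weights (assumed sizes, untrained).
rng = np.random.default_rng(0)
dim, heads, hidden_size = 8, 2, 16
params = {name: rng.normal(size=(dim, dim)) for name in ("w_q", "w_k", "w_v")}
seed = rng.normal(size=dim)
w1, b1 = rng.normal(size=(2 * dim, hidden_size)), np.zeros(hidden_size)
w2, b2 = rng.normal(size=hidden_size), 0.0

# Passage vector: can be computed offline and cached.
passage_vec = pma_pool(rng.normal(size=(12, dim)), seed, num_heads=heads, **params)
# Query vector: computed at search time, then scored against cached passages.
query_vec = pma_pool(rng.normal(size=(5, dim)), seed, num_heads=heads, **params)
score = interaction_score(query_vec, passage_vec, w1, b1, w2, b2)
```

In this sketch only `interaction_score` sees both texts, which is what keeps the expensive encoding step independent per text.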

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2406.17262 [cs.CL]
	(or arXiv:2406.17262v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.17262

Submission history

From: Zihan Liao
[v1] Tue, 25 Jun 2024 04:03:04 UTC (1,419 KB)
