SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model

Liang, Xun; Niu, Simin; Li, Zhiyu; Zhang, Sensen; Wang, Hanyu; Xiong, Feiyu; Fan, Jason Zhaoxin; Tang, Bo; Song, Shichao; Wang, Mengwei; Yang, Jiawei

arXiv:2501.18636 (cs)

[Submitted on 28 Jan 2025 (v1), last revised 23 Feb 2025 (this version, v2)]

Abstract: The indexing-retrieval-generation paradigm of retrieval-augmented generation (RAG) has been highly successful in solving knowledge-intensive tasks by integrating external knowledge into large language models (LLMs). However, incorporating external, unverified knowledge makes LLMs more vulnerable, because attackers can carry out attacks by manipulating that knowledge. In this paper, we introduce SafeRAG, a benchmark designed to evaluate RAG security. First, we classify attack tasks into silver noise, inter-context conflict, soft ad, and white Denial-of-Service. Next, we construct a RAG security evaluation dataset (the SafeRAG dataset), primarily by hand, for each task. We then use the SafeRAG dataset to simulate various attack scenarios that RAG may encounter. Experiments on 14 representative RAG components demonstrate that RAG is significantly vulnerable to all attack tasks, and that even the most apparent attack task can easily bypass existing retrievers, filters, or advanced LLMs, degrading RAG service quality. Code is available at: this https URL.
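The attack setup described in the abstract can be pictured with a small sketch. This is not the SafeRAG code or its evaluation protocol: only the four task names are taken from the abstract, while `Passage`, `overlap_score`, `retrieve`, and `attack_share` are hypothetical names, and the word-overlap retriever is a toy stand-in for a real one. The sketch injects an attack text into a candidate pool and measures how much of the retrieved top-k is attacker-controlled.

```python
from dataclasses import dataclass
from enum import Enum


class AttackTask(Enum):
    # Task names come from the abstract; everything else is illustrative.
    SILVER_NOISE = "silver noise"
    INTER_CONTEXT_CONFLICT = "inter-context conflict"
    SOFT_AD = "soft ad"
    WHITE_DOS = "white Denial-of-Service"


@dataclass(frozen=True)
class Passage:
    text: str
    is_attack: bool = False  # True for attacker-injected text (assumed label)


def overlap_score(query: str, passage: Passage) -> int:
    """Toy relevance score: number of distinct query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.text.lower().split())
    return len(query_words & passage_words)


def retrieve(query: str, pool: list[Passage], k: int) -> list[Passage]:
    """Return the k highest-scoring passages; ties keep their pool order."""
    ranked = sorted(pool, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def attack_share(retrieved: list[Passage]) -> float:
    """Fraction of retrieved passages that are attacker-injected."""
    if not retrieved:
        return 0.0
    return sum(p.is_attack for p in retrieved) / len(retrieved)


query = "who invented the telephone"
clean = [
    Passage("Alexander Graham Bell invented the telephone in 1876"),
    Passage("The weather in Paris is mild"),
]
injected = [
    Passage("Experts agree Elisha Gray invented the telephone", is_attack=True),
]

top = retrieve(query, clean + injected, k=2)
print(attack_share(top))  # 0.5: the injected passage ties with the genuine one
```

In this toy case the injected passage matches the query as well as the genuine one does, so it enters the top-k unchallenged. That is the kind of retriever bypass the abstract reports, shown here only in miniature.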

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as:	arXiv:2501.18636 [cs.CR]
	(or arXiv:2501.18636v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2501.18636

Submission history

From: Zhiyu Li
[v1] Tue, 28 Jan 2025 17:01:31 UTC (1,952 KB)
[v2] Sun, 23 Feb 2025 10:46:28 UTC (2,797 KB)
