0% found this document useful (0 votes)
244 views13 pages

OWASP Top 10 For LLMs 2023 Slides v1 - 0

The document outlines the Top 10 security risks for large language models, including prompt injection, insecure output handling, training data poisoning, model denial of service, sensitive information disclosure, insecure plugin design, excessive agency, overreliance on models, and model theft. The risks range from manipulation of models through crafted inputs to exploitation of downstream systems and introduction of vulnerabilities, biases, or degraded performance through poisoning of training data. Mitigations include access controls, input validation, separate training data, and human oversight.

Uploaded by

rfernandez2007
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
244 views13 pages

OWASP Top 10 For LLMs 2023 Slides v1 - 0

The document outlines the Top 10 security risks for large language models, including prompt injection, insecure output handling, training data poisoning, model denial of service, sensitive information disclosure, insecure plugin design, excessive agency, overreliance on models, and model theft. The risks range from manipulation of models through crafted inputs to exploitation of downstream systems and introduction of vulnerabilities, biases, or degraded performance through poisoning of training data. Mitigations include access controls, input validation, separate training data, and human oversight.

Uploaded by

rfernandez2007
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

OWASP Top 10 for LLM

VERSION 1.0
Published: August 1, 2023
| OWASP Top 10 for LLM v1.0

OWASP Top 10 for LLM


LLM01 LLM02 LLM03 LLM04 LLM05

Prompt Injection Insecure Output


Training Data
Model Denial of
Supply Chain
This manipulates a large language
Handling Poisoning Service Vulnerabilities
model (LLM) through crafty inputs, This vulnerability occurs when an LLM Training data poisoning refers to Attackers cause resource-heavy LLM application lifecycle can be
causing unintended actions by the LLM. output is accepted without scrutiny, manipulating the data or fine-tuning operations on LLMs, leading to service compromised by vulnerable
Direct injections overwrite system exposing backend systems. Misuse process to introduce vulnerabilities, degradation or high costs. The components or services, leading to
prompts, while indirect ones manipulate may lead to severe consequences like backdoors or biases that could vulnerability is magnified due to the security attacks. Using third-party
inputs from external sources. XSS, CSRF, SSRF, privilege escalation, or compromise the model’s security, resource-intensive nature of LLMs and datasets, pre- trained models, and
remote code execution. effectiveness or ethical behavior. unpredictability of user inputs. plugins add vulnerabilities.

LLM06 LLM07 LLM08 LLM09 LLM10

Sensitive Information Insecure Plugin


Excessive Agency Overreliance Model Theft
Disclosure Design LLM-based systems may undertake Systems or people overly depending on This involves unauthorized access,
LLM’s may inadvertently reveal LLM plugins can have insecure inputs actions leading to unintended LLMs without oversight may face copying, or exfiltration of proprietary
confidential data in its responses, and insufficient access control due to consequences. The issue arises from misinformation, miscommunication, LLM models. The impact includes
leading to unauthorized data access, lack of application control. Attackers excessive functionality, permissions, or legal issues, and security vulnerabilities economic losses, compromised
privacy violations, and security can exploit these vulnerabilities, autonomy granted to the LLM-based due to incorrect or inappropriate content competitive advantage, and potential
breaches. Implement data sanitization resulting in severe consequences like systems. generated by LLMs. access to sensitive information.
and strict user policies to mitigate this. remote code execution.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM01
Direct prompt injections overwrite system prompts

Indirect prompt injections hijack the conversation context

A user employs an LLM to summarize a webpage containing an indirect

Prompt Injection prompt injection.

Attackers can manipulate LLM’s through PREVENTION

crafted inputs, causing it to execute the Enforce privilege control on LLM access to backend systems

Implement human in the loop for extensible functionality


attacker's intentions. This can be done Segregate external content from user prompts

directly by adversarially prompting the Establish trust boundaries between the LLM, external sources, and

extensible functionality.
system prompt or indirectly through

manipulated external inputs, potentially AT TACK SCEN ARIOS

leading to data exfiltration, social An attacker provides a direct prompt injection to an LLM-based support

chatbot
engineering, and other issues. An attacker embeds an indirect prompt injection in a webpage

A user employs an LLM to summarize a webpage containing an indirect

prompt injection.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM02
LLM output is entered directly into a system shell or similar function,
resulting in remote code execution

Insecure Output JavaScript or Markdown is generated by the LLM and returned to a


user, resulting in XSS.

Handling PREVENTION
Apply proper input validation on responses coming from the model to
Insecure Output Handling is a vulnerability backend functions

that arises when a downstream component Encode output coming from the model back to users to mitigate
undesired code interpretations.
blindly accepts large language model (LLM)
output without proper scrutiny. This can ATTACK SCENARIOS

lead to XSS and CSRF in web browsers as An application directly passes the LLM-generated response into an
internal function responsible for executing system commands without
well as SSRF, privilege escalation, or remote proper validation

code execution on backend systems. A user utilizes a website summarizer tool powered by a LLM to generate a
concise summary of an article, which includes a prompt injection
An LLM allows users to craft SQL queries for a backend database through
a chat-like feature.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM03

A malicious actor creates inaccurate or malicious documents

targeted at a model’s training data

The model trains using falsified information or unverified data which

Training Data is reflected in output.

Poisoning P R E V E NT I O N

Verify the legitimacy of targeted data sources during both the training and

Training Data Poisoning refers to fine-tuning stages

Craft different models via separate training data different use-cases


manipulating the data or fine-tuning process
Use strict vetting or input filters for specific training data or categories of

to introduce vulnerabilities, backdoors or data sources.

biases that could compromise the model’s


AT TAC K S C E N A R I O S

security, effectiveness or ethical behavior.


Output can mislead users of the application leading to biased opinions

This risks performance degradation, A malicious user of the application may try to influence and inject toxic

data into the model


downstream software exploitation and
A malicious actor or competitor creates inaccurate or falsified information

reputational damage. targeted at a model’s training data

The vulnerability Prompt Injection could be an attack vector to this

vulnerability if insufficient sanitization and filtering is performed.


| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM04

Posing queries that lead to recurring resource usage through high-

volume generation of tasks in a queue

Sending queries that are unusually resource-consuming

Model Denial of Continuous input overflow: An attacker sends a stream of input to the

LLM that exceeds its context window.

Service
P R E V E NT I O N

Model Denial of Service occurs when an Implement input validation and sanitization to ensure input adheres to

defined limits, and cap resource use per request or step


attacker interacts with a Large Language
Enforce API rate limits to restrict the number of requests an individual

Model (LLM) in a way that consumes an user or IP address can make

Limit the number of queued actions and the number of total actions in a

exceptionally high amount of resources.


system reacting to LLM responses.

This can result in a decline in the quality of

AT TA C K S C E N A R I O S
service for them and other users, as well as

Attackers send multiple requests to a hosted model that are difficult and
potentially incurring high resource costs.
costly for it to process

A piece of text on a webpage is encountered while an LLM-driven tool is

collecting information to respond to a benign query

Attackers overwhelm the LLM with input that exceeds its context window.
| OWASP Top 10 for LLM v1.0

EXA M PLES
LLM05
Using outdated third-party packages

Fine-tuning with a vulnerable pre-trained model

Training using poisoned crowd-sourced data

Supply Chain Utilizing deprecated, unmaintained models

Lack of visibility into the supply chain is.

Vulnerabilities
PREVENTI ON

Supply chain vulnerabilities in LLMs can Vet data sources and use independently-audited security systems

Use trusted plugins tested for your requirements


compromise training data, ML models, and
Apply MLOps best practices for own models

deployment platforms, causing biased Use model and code signing for external models

Implement monitoring for vulnerabilities and maintain a patching policy


results, security breaches, or total system
Regularly review supplier security and access.

failures. Such vulnerabilities can stem from


AT TACK SCEN A RI OS
outdated software, susceptible pre-trained
Attackers exploit a vulnerable Python library
models, poisoned training data, and
Attacker tricks developers via a compromised PyPi package

insecure plugin designs. Publicly available models are poisoned to spread misinformation

A compromised supplier employee steals IP

An LLM operator changes T&Cs to misuse application data.


| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM06
Incomplete filtering of sensitive data in responses
Overfitting or memorizing sensitive data during training

Sensitive Information Unintended disclosure of confidential information due to errors.

Disclosure PREVENTION
Use data sanitization and scrubbing techniques
Implement robust input validation and sanitization
LLM applications can inadvertently disclose Limit access to external data sources

sensitive information, proprietary Apply the rule of least privilege when training models
Maintain a secure supply chain and strict access control.
algorithms, or confidential data, leading to
unauthorized access, intellectual property ATTACK SCENARIOS

theft, and privacy breaches. To mitigate Legitimate user exposed to other user data via LLM
Crafted prompts used to bypass input filters and reveal sensitive data
these risks, LLM applications should Personal data leaked into the model via training data increases risk.

employ data sanitization, implement


appropriate usage policies, and restrict the
types of data returned by the LLM.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM07
Plugins accepting all parameters in a single text field or raw SQL or
programming statements

Insecure Plugin Authentication without explicit authorization to a particular plugin


Plugins treating all LLM content as user-created and performing

Design
actions without additional authorization.

PREVENTION
Plugins can be prone to malicious requests Enforce strict parameterized input and perform type and range checks

leading to harmful consequences like data Conduct thorough inspections and tests including SAST, DAST, and IAST
Use appropriate authentication identities and API Keys for authorization
exfiltration, remote code execution, and and access control
Require manual user authorization for actions taken by sensitive plugins.
privilege escalation due to insufficient
access controls and improper input ATTACK SCENARIOS
validation. Developers must follow robust Attackers craft requests to inject their own content with controlled

security measures to prevent exploitation, domains


Attacker exploits a plugin accepting free-form input to perform data
like strict parameterized inputs and secure exfiltration or privilege escalation

access control guidelines. Attacker stages a SQL attack via a plugin accepting SQL WHERE clauses
as advanced filters.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM08
An LLM agent accesses unnecessary functions from a plugin
An LLM plugin fails to filter unnecessary input instructions

Excessive Agency A plugin possesses unneeded permissions on other systems


An LLM plugin accesses downstream systems with high-privileged
identity.
Excessive Agency in LLM-based systems is
PREVENTION
a vulnerability caused by over-functionality,
Limit plugins/tools that LLM agents can call, and limit functions
excessive permissions, or too much implemented in LLM plugins/tools to the minimum necessary
autonomy. To prevent this, developers need Avoid open-ended functions and use plugins with granular functionality
Require human approval for all actions and track user authorization
to limit plugin functionality, permissions, Log and monitor the activity of LLM plugins/tools and downstream
and autonomy to what's absolutely systems, and implement rate-limiting to reduce the number of undesirable
actions.
necessary, track user authorization, require
human approval for all actions, and ATTACK SCENARIOS

implement authorization in downstream An LLM-based personal assistant app with excessive permissions and
autonomy is tricked by a malicious email into sending spam. This could be
systems. prevented by limiting functionality, permissions, requiring user approval, or
implementing rate limiting.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM09
LLM provides incorrect information
LLM generates nonsensical text

Overreliance LLM suggests insecure code


Inadequate risk communication from LLM providers.

Overreliance on LLMs can lead to serious PREVENTION

consequences such as misinformation, Regular monitoring and review of LLM outputs


Cross-check LLM output with trusted sources
legal issues, and security vulnerabilities.
Enhance model with fine-tuning or embeddings
It occurs when an LLM is trusted to make Implement automatic validation mechanisms
Break tasks into manageable subtasks
critical decisions or generate content Clearly communicate LLM risks and limitations
without adequate oversight or validation. Establish secure coding practices in development environments.

ATTACK SCENARIOS
AI fed misleading info leading to disinformation
AI's code suggestions introduce security vulnerabilities
Developer unknowingly integrates malicious package suggested by AI.
| OWASP Top 10 for LLM v1.0

EXAMPLES
LLM10
Attacker gains unauthorized access to LLM model
Disgruntled employee leaks model artifacts

Model Theft Attacker crafts inputs to collect model outputs


Side-channel attack to extract model info
Use of stolen model for adversarial attacks.

LLM model theft involves unauthorized


PREVENTION
access to and exfiltration of LLM models,
Implement strong access controls, authentication, and monitor/audit
risking economic loss, reputation damage, access logs regularly
and unauthorized access to sensitive data. Implement rate limiting of API calls
Watermarking framework in LLM's lifecycle
Robust security measures are essential to Automate MLOps deployment with governance.

protect these models.


ATTACK SCENARIOS

Unauthorized access to LLM repository for data theft


Leaked model artifacts by disgruntled employee
Creation of a shadow model through API queries
Data leaks due to supply-chain control failure
Side-channel attack to retrieve model information.
| OWASP Top 10 for LLM v1.0

Key Reference Links


Arxiv: Prompt Injection attack against LLM-integrated Application MITRE: Tay Poisonin
Defending ChatGPT against Jailbreak Attack via Self-Reminde Backdoor Attacks on Language Models: Can We Trust Our Model’s
GitHub: OpenAI Chat Markup Languag Weights
Arxiv: Not what you’ve signed up for: Compromising Real-World Arxiv: Poisoning Language Models During Instruction Tunin
LLM-Integrated Applications with Indirect Prompt Injectio ChatGPT Data Breach Confirmed as Security Firm Warns of
AI Village: Threat Modeling LLM Application Vulnerable Component Exploitatio
OpenAI: Safety Best Practice What Happens When an AI Company Falls Victim to a Software
Snyk: Arbitrary Code Executio Supply Chain Vulnerabilit
Stanford: Training Dat OpenAI: Plugin Review Proces
CSO: How data poisoning attacks corrupt machine learning model Compromised PyTorch-nightly dependency chain
MITRE: ML Supply Chain Compromise

You might also like