LLM Security
Neumann - Presenter A (Details)
Sysploit - Presenter B (Details)
What is AI and Generative AI?
● Backdoor Attacks
● Adversarial Attacks
● Model Inversion Attacks
● Distillation Attacks
● Hyperparameter Tampering
And so on…
Model Inversion Attacks
Model inversion attacks are a class of attacks that target machine
learning models with the aim of reverse engineering and reconstructing the
input data solely from the model's outputs.
This is particularly alarming for models trained on sensitive data, such as
personal health records or detailed financial information. In such scenarios,
a malicious party could use these attacks to infer private details about
individual data points.
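As an illustration, here is a minimal sketch of gradient-based inversion in PyTorch. It assumes white-box access to a trained classifier; the `trained_model` name and the MNIST-like input shape are placeholders. The attacker optimizes a blank input until the model confidently assigns it a chosen class, recovering a representative of that class's training data.

```python
# Minimal model-inversion sketch: recover a representative input for a
# target class by gradient ascent on the model's output confidence.
import torch
import torch.nn.functional as F

def invert_class(model, target_class, shape=(1, 1, 28, 28), steps=500, lr=0.1):
    model.eval()
    x = torch.zeros(shape, requires_grad=True)   # start from a blank input
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(x)
        # Push the model toward the target class (minimize its NLL).
        loss = F.cross_entropy(logits, torch.tensor([target_class]))
        loss.backward()
        optimizer.step()
        # Keep the candidate within a valid pixel range.
        with torch.no_grad():
            x.clamp_(0.0, 1.0)
    return x.detach()

# reconstruction = invert_class(trained_model, target_class=3)
```

In practice attackers add regularizers (e.g., total variation) so reconstructions look natural; this sketch keeps only the core optimization loop.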
Scenario: A large language model (LLM) is trained on a massive dataset of
text and code, potentially containing private information like user
comments, emails, or even code snippets.
A trained LLM used for creative tasks could be stolen. Attackers might
combine the stolen model with crawled auxiliary information to reconstruct
private user data that was used to train it. This would be a serious
privacy breach.
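To make the scenario concrete, the sketch below probes a language model for memorized text in the spirit of published extraction attacks (e.g., Carlini et al.). Here `gpt2` stands in for the stolen model, and the crawled prefixes are hypothetical; continuations that the model scores with unusually low perplexity are candidates for memorized training data.

```python
# Training-data extraction probe: prompt with crawled prefixes, sample
# continuations, and flag low-perplexity outputs as possibly memorized.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sample_and_score(prompt, max_new_tokens=50):
    """Generate a continuation, then compute its perplexity under the model.
    Unusually low perplexity suggests the text may be memorized."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, do_sample=True, top_k=40,
                                max_new_tokens=max_new_tokens,
                                pad_token_id=tokenizer.eos_token_id)
        loss = model(output, labels=output).loss  # mean NLL over the sequence
    return tokenizer.decode(output[0]), torch.exp(loss).item()

# Hypothetical crawled prefixes that might precede private data.
for prefix in ["My email address is", "Patient record:"]:
    text, ppl = sample_and_score(prefix)
    print(f"perplexity={ppl:.1f}  {text!r}")
```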
Mitigation Techniques