SANS Cloud Security Principles
SANS Cloud Security Principles
CLOUD SECURITY:
FIRST PRINCIPLES
AND FUTURE
OPPORTUNITIES
In partnership with:
Foreword
It is estimated that approximately half of enterprise workloads are in the public cloud
today.1 This is forecast to increase to more than 45% in three years1. Although this may
seem like a large number, cloud adoption has been slow and steady since the first
Whitepaper
public cloud services were made available some 20 years ago.
It’s been a journey of increased enterprise cloud adoption for many organizations. Every
security team and security professional also has been on a journey—a cloud security
journey to improve their capabilities, skills, and opportunities to keep up with the
changing business, technology, and threat landscape.
A DNS SECURITY
That’s why we are back, for the third year in a row, pairing leaders from the three major
cloud providers—Amazon Web Services (AWS), Google Cloud, and Microsoft Azure—with
independent technical experts from SANS Institute to give you insights to improve your
ARCHITECTURE
cloud security capabilities.
This book has chapters on your cloud security journey ranging from architecture to
threat detection to investigations. Along the way, the authors cover Secure by Design
AS SECOPS FORCE
principles, identity modernization, and how to update security best practices specifically
for the cloud.
MULTIPLIER
Additionally, no resource today would be timely without a discussion of AI hype,
challenges, and opportunities. This up-to-the-minute content includes an overview
of the typical generative AI (GenAI) application architecture, as well as the risks,
mitigations, and common security use cases for GenAI applications.
I hope you enjoy the content, and good luck on your cloud security journey!
Frank Kim
Written
Fellow andby John Pescatore
Curriculum Lead
SANS Institute
February 2023
1
“The race to cloud: Reaching the inflection point to long sought value,” www.
accenture.com/us-en/insights/cloud/cloud-outcomes-perspective
Table of Contents
Chapter 1: The Cloud Security Journey: Day One Chapter 4: Evolving Cloud Security with a Modern
Introduction 1 Approach
Prepared, Protected, and Ready to Proceed 6 Introduction: Why Bad Things Keep Happening
in Cloud 38
Old Things Done New Ways: Why Core Best
Chapter 2: Building Security from the Ground up
Practices Are Still Good, but Likely Need Updates 38
with Secure by Design
The Top 10 Things to Do for Sound Cloud Security 39
Introduction 7
Looking Ahead: Adaptation and Better Security
Understanding Secure by Design and
in the Cloud 47
Secure by Default 7
Embedding Secure by Design into Your
Security Strategy 8 Chapter 5: AI Security Challenges, Hype, and
Opportunities
Planning for the Short- and Long-Term 14
Introduction 48
Facilitating a Culture of Security 15
Terminology, Concepts, and Typical Architecture 48
Benefits 17
Risk Considerations for AI Applications 50
Key Considerations on the Secure by Design Path 18
Mitigation Strategies for Addressing GenAI Risks 52
Getting Started 19
Security Use Cases for GenAI Applications 53
Conclusion 21
Software Composition Analysis (SCA) 54
Static/Dynamic Application Security
Chapter 3: Identity Modernization
Testing (SAST/DAST) 54
Introduction 22
Policy as Code Development and Analysis 55
Challenges in Identity Management 23
Automated Abuse Case Testing 55
The Imperative for Identity Modernization 26
Conclusion 56
Zero Trust and Conditional Access Controls 30
Integrating Legacy Systems with Modern Identity
Solutions 32
Conclusion 36
i
Introduction
Let’s face it, the first day in any new position is inevitably challenging, stressful, even a
little scary. That’s the case for Katie M., who’s just stepped into an important new role
with Cyrene Life Sciences, a multinational pharmaceutical manufacturer headquartered
in the United States, where she’s now responsible for the security of all the systems,
applications, and data the company maintains in the cloud. As a senior-level security
professional with more than 15 years’ experience in the field—including, most recently,
five years as manager of Cyrene’s central security operations center (SOC)—Katie’s not
a newcomer to cloud computing. She can draw on a broad general understanding of
the cloud delivery model and its security implications, and she has more in-depth
knowledge of one of the “big three” cloud service providers (CSPs). Even so, she knows
she has a lot to learn, especially since the company is planning to increase both the
size and the complexity of its already extensive cloud operations, and she knows she
must hit the ground running on day one. That’s why three SANS Institute cloud security
experts have come together to lay out the most critical steps Katie—or any other
security professional in her position—will need to take on that all-important first day
on the job. These steps are presented in roughly the order she’ll need to take them,
but they’re all essential. And, she’ll need to address them all, at least at a basic level,
starting on day one.
• Identity and access management (IAM)—Ensuring that the right people have the
right access to the right resources for the right reasons
• Data security—Protecting the sensitive information the enterprise maintains
and manages, including personally identifiable information (PII), personal health
information (PHI), and intellectual property (IP)
• Asset management—Identifying, tracking, and securing all the enterprise’s
digital resources, including the types of compute services and cloud-native
services in use, the software and network infrastructure—and any on-premises
implementations that interact with the cloud—and whether network connectivity
exists between on-premises implementations and the cloud
• Regulatory compliance and corporate governance—Identifying any security
controls that have already been applied to the cloud environment based on
Cyrene’s regulatory, legal, and business requirements
Some of what Katie needs to do first is fairly straightforward. She needs to find out
what CSPs the company is currently working with, which ones host business-critical
applications, how large the overall cloud footprint is. For example, she needs to
know how many cloud accounts there are—and what identities have access to the
CSP. She also must determine whether there’s an associated organizational setup for
governance policy in place and identify any areas across the entire cloud footprint
where responsibility for compliance with industry standards may be shared with the
CSP. Most large enterprises have built their foundational cloud infrastructure mainly
on one of the big three—Amazon Web Services (AWS), Microsoft Azure, or Google
Cloud Platform (GCP)—and Cyrene is no exception. But the company does use other,
smaller providers, especially in locations where it needs local-language services.
And, crucially, the company’s CIO has expressed an interest in moving to a more
comprehensive multicloud environment, in the hope that the approach will better
address the company’s needs for scalability, cost savings, and application-specific
support. As Cyrene’s cloud infrastructure grows more complex and more heterogeneous,
securing that infrastructure—even with basic IAM capabilities like determining who
has rightful access to a given account—will inevitably become exponentially more
challenging. Understanding how identities can access the CSP will give Kate a gateway
into any identity-related threats that impact the cloud footprint where business-critical
applications are hosted.
Another important step for day one is to understand how data classification is being
conducted and, specifically, what form of tagging or labeling is being used. Defining
an enterprise-level data classification structure helps. This is the only way to identify
sensitive data, prioritize any investigation or security controls according to its associated
risks, and ensure that appropriate compliance and governance practices and protections
And, of course, Katie will need to understand which business-critical assets she has to
protect and where they’re located. For example, what geographical regions they’re in
and what types of computing are involved, everything from virtual machines (VMs) to
Kubernetes containers to serverless architecture.
These first steps will give Katie an overall understanding of the cloud-specific threats
and vulnerabilities Cyrene faces or is likely to face in the future, and the company’s
current-state ability to address those threats and vulnerabilities with the skills and
technologies it has in place. In addition, crucially, it will enable her to begin identifying
and prioritizing the gaps in those skills and technologies she’ll need to fill.
It will be important for her to evaluate the threat detection tooling of Cyrene’s CSPs,
identify how—and whether—the company is using those tools, and determine what
detection gaps need to be filled immediately. One considerable benefit of CSP-
provided threat detection tooling is that there is no need to manage log collection and
shipping because it’s integrated into the cloud platform itself. Each CSP approaches
the implementation differently, but the core concepts are essentially the same: Identify
which family of detections to turn on and in which account or region they should
operate. However, these detections inevitably have gaps that make it necessary for SOCs
to create and deploy custom detections from logs.
The logs that Katie will decide on first are those that can be collected without interfering
with Cyrene’s product teams:
• Management API logs—These are the most important logs, detailing user and
resource authentication, identifying interactions with the cloud’s management
API, and tracking changes to cloud-managed resources like access privileges. This
means that attacks that engage with the cloud API—for example, creating new
accounts or destroying cloud resources—can be detected in the logs.
• Cloud storage access logs—These logs record create/read/write/delete actions
performed on data inside a cloud storage container. They can be used for threat
detection to detect unusual activity with high-risk stores, especially for data
loss detection.
• Network logs—Network traffic data recorded in these logs provides metadata
about the network’s interactions with data (for example, timestamp, destination
and source port and IP address, amount of data exchanged, and network interface
involved). This network flow data is especially useful when security professionals
already know what they’re looking for because it makes it possible to find open
ports, uncover traffic patterns, and identify the parts of the network that have
interacted with a suspicious endpoint. Katie must determine whether the CSP’s
threat detection tools cover the network traffic detections needed.
Next, Katie will start looking at logs that may require interaction with product
teams to collect:
• Host and container logs—CSP’s hosts also provided detailed metrics that can be
forwarded into the cloud’s log management tool. This can be useful in detecting
server-side attacks.
• Database logs—These logs can capture queries and database management
activities. Although most database threats can be detected by capturing the logs
of the application in front of the database, there may be focused monitoring of
the most critical databases.
• Orchestration logs—Katie will likely find a combination of VMs and containers
running the applications at Cyrene. Orchestration logs provide insight into how
an entire cluster is operating and can be used to detect attackers’ manipulations.
However, CSPs are increasing their built-in detections of container orchestrators,
so identifying gaps will be necessary. Katie may want to focus detection on
deployment or manipulation outside the company’s established paved paths.
This is where the understanding that Katie has developed of Cyrene’s cloud and security
environment will come into play. Without it, she and her team are likely to waste
precious time and resources chasing down blind alleys. They’ll have a better sense of
what to look for because they’ll know what resources and services the company uses
and what “normal” looks like across its cloud environment. The key areas she’ll need to
consider on day one will be:
• Logging—Katie will need to find ways to leverage logs for investigative purposes,
which will require understanding the content of the logs, as well as how to
effectively search through them, whether with cloud-native tooling or with a
dedicated SIEM platform. These logs should enable Katie to identify a broad range
of concerning activity, like suspicious log-in attempts, abnormal service account
usage, and other indicators of compromise. The ability to take raw logs and extract
meaningful insights from them will allow the security organization to rapidly
triage and respond to security incidents and other events. These logs also can
be leveraged for threat hunting to identify potential vulnerabilities before they
can be exploited, or to catch threat activity before the threat actor can further
their attack.
Just as in the first two steps, Katie will need to use the knowledge she’s gained from
this assessment to determine what her current-state resources are in terms of people,
processes, and technology. She also must identify and prioritize the gaps in those
resources that need to be addressed most urgently. When hiring for DFIR roles, she’ll
have to look not only for people with incident response (IR) experience, but also for
individuals who have a basic understanding of the cloud computing delivery model.
The cloud presents new concepts and new challenges not faced by incident responders
of days past, and without that basic understanding of the cloud, general IR experience
simply won’t be enough.
i
Introduction
System design often prioritizes performance, functionality, and user experience over
security. This approach yields vulnerabilities that can be exploited in the product
and across the supply chain. Achieving a better outcome requires a significant shift
toward integrating security measures into every stage of development, from inception
through deployment.
As the threat landscape continues to evolve, the A total of 26,447 critical vulnerabilities were disclosed in 2023, surpassing
the previous year by more than 1,500.1
concept of Secure by Design (SbD) is gaining
importance in the effort to mitigate vulnerabilities Insecure design is ranked as the number four critical web application
security concern on the Open Web Application Security Project (OWASP)
early, minimize risks, and recognize security as a
Top 10.2
core business requirement. SbD aims to reduce
Supply chain vulnerabilities are ranked fifth on the OWASP Top 10 for Large
the burden of cybersecurity and break the cycle Language Model (LLM) Applications.3
of constantly creating and applying updates by
developing products that are foundationally secure.
The Cybersecurity and Infrastructure Security Agency (CISA), National Security Agency
(NSA), Federal Bureau of Investigation (FBI), and international partners including the Five
Eyes (FVEY) intelligence alliance have adopted the SbD mindset and are evangelizing
it to help encourage proactive security practices and avoid creating target-rich
environments for threat actors.
This chapter explores what SbD actually means and discusses its benefits, cultural
aspects, key considerations, and action items that can set you on the path to
successfully embedding SbD into your security strategy.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
1
“2023 Threat Landscape Year in Review: If Everything Is Critical, Nothing Is,” January 2024, https://blog.qualys.com/vulnerabilities-threat-
research/2023/12/19/2023-threat-landscape-year-in-review-part-one
2 “OWASP Top Ten,” https://owasp.org/www-project-top-ten/
3 “OWASP Top 10 for Large Language Model Applications,” https://owasp.org/www-project-top-10-for-large-language-model-applications/
4 “Secure by Design Pledge,” www.cisa.gov/securebydesign/pledge
Both ensure that security is inherent. Together, they work to establish a solid foundation
for proactive security, build trust with customers, and increase the level of difficulty for
threat actors seeking to exploit products and systems.
Secure by Design offers more flexibility to help protect resources and withstand threats
that originate outside of architectural components. It allows you to use products with
different options and settings, so the outcome aligns with your risk tolerance level.
With SbD, the security of architectural components that products are built around
cannot be altered without changing their fundamental design or setup. SbD principles
can be applied to components ranging from IT workloads to services, microservices,
libraries, and beyond.
Another way to think of SbD is to consider the topology of a space, such as a house. An
SbD setup should have only closed, finite rooms in the configuration space (house) that
do not allow access to an infinite space (outside of the house) except through well-
defined and carefully controlled ingress and egress points. This absence of configuration
space options facilitates added security. If you don’t make design principles accessible
to builders, then they’re creating IT workloads in a secure environment.
When software is in the cloud, SbD helps eliminate access points. Identity and access
management (IAM) is your first line of defense, as IAM misconfigurations can lead to
misconfigurations and unsecure usage elsewhere. An example of an SbD approach in an
IAM system for distinct principals (IAM users, federated users, IAM roles, or applications)
is to rely on testable outcomes that make them atomic. Because IAM is inherently based
on the “default deny” principle that either explicitly allows or implicitly denies access,
SbD helps you lay the foundation of a secure IAM setup for builders and operators
within the cloud environment as part of an overarching, centralized IAM system that is
accompanied by centralized logging. New design elements should automatically inherit
the secure setup; otherwise, they shouldn’t work.
Commonly used SDLC models such as waterfall, spiral, and agile don’t address software
security in detail, so secure development practices need to be added to set foundational
security. Additionally, in a cloud environment, infrastructure is also code that should fall
under the purview of the SDLC.
The National Institute of Standards and Technology (NIST) Secure Software Development
Framework (SSDF), also known as SP 800-218, can support efforts to strengthen the
security of your SDLC. The SSDF describes a set of high-level practices based on
established standards, guidance, and secure software development practice documents
from organizations such as SAFECode, BSA, and OWASP. The framework is divided into
four groups (see Figure 1) that are designed to help prepare the organization’s people,
processes, and technology to perform secure software development, protect software
from tampering and unauthorized access, produce well-secured software with minimal
security vulnerabilities in its releases, and respond to residual vulnerabilities. Although
it’s not a checklist—and the degree to which you choose to implement the practices
depends on your organization’s requirements—it can help you adopt well-understood
best practices and ensure team members across all phases of the development pipeline
assume responsibility for security.
Continuous integration and continuous delivery (CI/CD) pipelines that help automate
the software delivery process are substantial contributors to SbD environments, as
they include a comprehensive set of checks to be run—such as firewall settings, OS
configurations, libraries used, security-related reviews, and software components used—
before a target configuration is implemented.
The second area of automation includes detective systems, which can identify
noncompliant components or configurations. Misconfigurations generally shouldn’t
happen within SbD setups, as they are largely prevented through the design and
preventive controls in the implementation. Additionally, it is important to note that a
vulnerability in a system due to either the design or the implementation may not pose
an immediate problem, if the design includes defense-in-depth elements that protect
the overall system despite any individual flaws. Nevertheless, if a detective system finds
something that doesn’t adhere to the design, it’s a signal that the design needs to be
______________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
If the design needs to be open to enable some business processes, a layered defense
can address threats and either prevent a breach or limit potential damage.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
6
“Securing generative AI: What matters now,” May 2024, www.ibm.com/downloads/cas/2L73BYB4?mod=djemCybersecruityPro&tpl=cs
Integrating an AI/ML bill of materials (AI/ML BOM) and cryptography bill of materials
(CBOM) into BOM processes can help you catalog security-relevant information, and
gain visibility into model components and data sources. Additionally, frameworks and
standards such as the NIST AI Risk Management Framework (AI RMF 1.0), the HITRUST AI
Assurance Program, and ISO/IEC 42001 can facilitate the incorporation of trustworthiness
considerations into the design, development, and use of AI systems.
Threat modeling in the design phase fosters a culture of secure design and
implementation, which in today’s landscape includes infrastructure, configuration
management, and application code. Threat modeling exercises conducted during
design allow development and security teams to conceptualize and then document
potential threats before code has been written, and can save time and money by
avoiding rework or escalations later in the development process. Threat models should
be treated as living documents that can be integrated into your SDLC processes and
evolve with experience and learnings, as well as the overall product evolution and threat
landscape over time.
During a threat modeling exercise, mitigation activity should focus on both the design
and the available technology.
Rethinking threats to the login process itself can lead to either network-based control of
the communication as discussed, or to the login process being embedded in a different
environment, which helps to prevent the threat. The cloud- and IAM-based login design
can also provide you with benefits through the scalability or responsibility shift that
comes with the cloud. Keeping the IAM system secure is a critical task for the cloud
service provider (CSP) and part of its core business functions. The centralized logging
and monitoring capabilities provided by the CSP’s IAM system can help you ensure that
only authorized users have access to sensitive data and resources.
The SbD nature of the created space (through automation or systems on the borders)
keeps new resources of the same type in the safe state and free from inherited threats.
Resources with unwanted configurations, states, or dependencies cannot be deployed.
However, the design’s assumptions should be challenged as often as your commercial
requirements allow. Each vulnerability scan with findings should be tested for an impact
on design aspects. New business requirements and compliance considerations also may
necessitate changes or add to your design targets.
Guiding Principles
There are three guiding security design principles to consider when applying SbD to
threat modeling:
2. Monitor the logical borders created by closed spaces to gather baselines. This
helps create enforcement automation through filter and approval technology.
Typical examples of borders include firewalls, web application firewalls (WAFs),
landing zones, system-call jails for preventing unwanted interaction with the
operating system (such as creating users), IAM settings, closed-source libraries,
and software repositories.
3. Design your spaces to allow changes to flow within. You can achieve this by
defining behavior outcome at the design level and having your builders make
decisions along this path in the design space you create.
An SbD approach can help you meet requirements across people, process, and
technology. From a people perspective, you can provide mandatory, role-specific training
on secure coding best practices. From a technology perspective, your design can be set
You can get there by defining your compliance requirements from the domains to
the controls, and then iteratively working backward to meet them. Instead of linking
the controls directly to TOMs, such as architectural components, configurations,
processes, and procedures, the goal is to link them to design principles, which are
then implemented in TOMs outside of the target system. The TOMs are then part of
the surrounding design and unchangeable configurations, rather than part of the
workload itself.
Consider zero trust, for example. If your target system is in an environment that enables
communication only after context-based authentication and authorization verify the
identity of the actor attempting access, you can meet a set of technical compliance
requirements around access and user management and facilitate a finite and closed
space by allowing only approved connections.
On the other hand, supporting builders with a design phase in which the main security
and technical compliance aspects are handled can make them feel safe, because the
design helps prevent insecure configurations. This is typically the situation in cloud
systems, where builders only get access to approved cloud services over approved
authorization and authentication paths, with prepared logging and reporting (landing
zones). The what, who, and where is defined and controlled by preventive automation,
such as a centrally managed IAM system and infrastructure as code (IaC).
Landing zones with built-in policy guardrails may initially add friction to the developer
workflow and decrease agility during a period of adaptation. However, the freedom
provided within well-defined constraints will ultimately pay off in terms of both better
security outcomes and the agility needed to help you achieve business goals.
Another short-term aspect of the targeted closed space of an SbD setup to consider is
how to deal with new technology, testing, and exploration. Testing is critical to verify the
quality and reliability of applications. This requires teams to create test suites and mock
data to maximize code coverage in lower-level environments. Doing so creates a secure
space for development and testing, which is often managed by version control systems
and continuous delivery pipelines.
To make products and services more sustainable, the SbD approach requires a balance
between the usability of the product, fundamental parameters, and malleability of the
design. Designs should take long-term effects into consideration by calling out functions
and their outcome and risk surface but staying out of technical details.
The design should provide builders with clear guidance and eliminate risk decisions
that can be made without undergoing a security approval process. Because builders may
not always realize when they are making a risk-based decision during the development
process, the design should prevent them from having to make them in the first place.
Additionally, builders can sometimes address challenges with the borders of the created
space through the design. For example, it may be easier to design an intake process
for code libraries and other elements of the supply chain to get software components
from existing outside repositories rather than create new components that may also
introduce new risks. It’s important to note that this intake process would be subject to
threat modeling as part of your SbD approach.
The Owner
Business process owners define the target behavior and, therefore, the design space. In
many cases, business requirements are found to be broader than expected. Regulatory
considerations, commercial aspects ranging from costs to time to market, and long-term
strategies for asset development should be incorporated into the design so that its
parameters can be modeled as input data for technical integration.
Business and IT workload owners are also key players in the process of working
backward from the most secure design to a desired working point that balances
security needs with commercial aspects. Informed decisions should be made, with clear
documentation and defined risk takers.
The Supervisors
Supervisory functions, such as design review boards or auditors that define and
potentially manage the data exchange between systems, address the integration
feasibility of the design through technical or organizational measures. They implement
preventive controls to set the borders of the finite space defined by the design and can
use detective controls to guide the design into the future. Additionally, the automated
implementation of those controls, through configuration management systems, can
generate evidence and documentation to prove the compliance status of the overall IT
workload. In cloud environments, where all configurations are known, the automation of
evidence generation is particularly important. Because these controls are running on the
principal of SbD to create finite configuration spaces, the desired result is a technically
compliant state.
Benefits
A robust SbD approach establishes a solid foundation that reduces risks and yields
security benefits for your development teams—and your business.
Scalability
Operations inside an SbD setup allow you to quickly scale,
without reiterating security settings. This is particularly beneficial
in environments where the demand cannot be predicted
precisely up front. An SbD approach helps create well-architected
landing zones that are both scalable and secure. Automation
through code pipelines, including automatic code checks against
attempts to open the room, are a key element here. These
pipelines also add to detective controls to help identify attempts
(or the need) to cross the borders of the design. Systems that
execute these pipelines concentrate risk through the need to
be able to execute everything that is required. Therefore, they
should be outside of the target design in their own closed SbD
room. This provides a starting point your organization can use to
efficiently launch and deploy workloads and applications with
confidence in your security and infrastructure environment.
Repeatability
Having prepared spaces also allows you to repeat setups in an
agile way. With an SbD approach, you can build products and
services that are designed to be foundationally secure using
a repeatable mechanism (see Figure 3) that can anchor your
development life cycle.
Figure 3. SbD as the Anchor of Your Development Life Cycle
Sustainability
A solid SbD approach includes built-in feedback loops through detective controls that
facilitate sustainability by enabling you to analyze data and leverage insights to enhance
the security of your products, services, or processes. If the design considers future
developments in the technology—such as cryptography changes, for example—following
them should be possible by design. This leads to products and services with a longer
lifetime, with potentially fewer changes and iterations, and a stable interaction surface.
In a cloud environment, log data can be used to create detective controls. Detection
of anomalies, and failed attempts to configure things outside the closed space should
lead to documentation (through tickets) and can drive the creation of new controls that
can further advance detection. This creates a flywheel effect that can help you tighten
security and cut off open treats. The closed space remains closed while taking care of
the dynamics on its borders.
Manageability
Manageability functions such as logging, reporting, and gathering data for compliance
purposes can generally be built into the design and don’t need to be rethought.
Included preventive controls will automatically generate the data needed to
keep the IT workload under control. A predesigned operating setup for compute
instances, for example, can include backup and restore, logging, access management,
patch management, inventory management, and telemetry data functions that
are automatically rolled out. Nowadays, you can orchestrate these things through
automated systems and document them with detective controls.
Design Changes
Threats to new and existing technologies are constantly evolving. SbD preventive
controls, such as firewalls and runtime security agents, can help security teams respond
quickly to an intrusion. Consider a scenario in which your security team has discovered
Implementation Efforts
The path to an SbD approach includes an additional step between requirements and
practical implementation. You will need time to qualify design-level mitigations, and
traditional conversations and documentation may be required to establish the “why”
and “how” behind your approach.
In addition to setting aside time for the design phase, focus on standardizing
approaches that can be reused by others. You also should choose your tech stacks
carefully, keeping an eye on complexity and related security overhead. Consider
leveraging cloud services to address problems, instead of building everything
on your own.
The tool landscape also should be approached through an SbD lens. Free selection of
tools, such as programming languages, and methodologies in software development
can create more dependencies (and open spaces) than the risk appetite for the
deployment allows. Time to market and implementation considerations might outweigh
security concerns related to the selected language or programming framework with its
dependencies, which is exactly what you should consider with care.
Using services with a higher integration level, such as cloud services, can reduce
your implementation efforts. Critical capabilities, such as IAM and connectivity, are
already designed and have security built in, which helps provide you with a closed risk
environment by design.
Getting Started
Taking a new approach to security can be daunting, especially for organizations used to
focusing on “check the box” compliance exercises. Five key action items that can help
you avoid frustration and set you on the path to successfully implementing SbD include:
• Identify your core SbD pillars—Evaluate a matrix of your technical domains with
the security domains that are called out by your business processes. Technical
domain examples include logging and security domain integrity while design
examples include mandatory checksums and authenticated encryption. Each node
in the matrix will require a decision to be made regarding how to address the
associated security domains.
Identity Modernization
i
Introduction
Identity is a cornerstone of modern IT security, serving as the backbone for
authentication, authorization, and access control across cloud and hybrid environments.
Whether in a small business or a large enterprise, identity determines who has access
to what, ensuring data integrity and regulatory compliance. As organizations embrace
digital transformation and generative AI (GenAI) technologies, the traditional approaches
to identity management are no longer sufficient. Legacy systems, primarily built on
Active Directory, face significant challenges in adapting to modern security practices,
leading to vulnerabilities and operational complexities.
The concept of zero trust has emerged as a guiding principle, emphasizing that no
user or device should be trusted by default. This shift requires organizations to adopt
modern identity solutions that provide granular and dynamic control over access while
integrating with existing legacy infrastructure. Microsoft Entra ID (formerly Azure AD),
with its suite of advanced features, offers a pathway to identity modernization, enabling
organizations to implement Zero Trust Network Access (ZTNA) and protect both modern
and legacy systems.
This chapter explores the key aspects of identity modernization, focusing on the
challenges posed by legacy systems, the benefits of modern identity solutions, and
the critical role of conditional access and zero trust. We will delve into the unique
challenges of integrating legacy systems, such as outdated protocols, misconfigurations,
and technical debt, and examine how Microsoft Entra Private Access addresses these
issues. By combining advanced security features with flexible integration, Entra ID
provides a robust framework for organizations to secure their identity infrastructure and
reduce the risk of unauthorized access.
Active Directory (AD) has been a foundational component of many enterprise networks
for more than 25 years, serving as a central repository for authentication, authorization,
and policy enforcement. However, the legacy nature of AD brings several challenges
that can hinder identity modernization and pose significant security risks. In this
section, we explore these challenges in detail, focusing on the outdated nature of AD,
misconfigurations, technical debt, stale objects, and the lack of regular assessments
or monitoring.
Legacy protocols like NTLM, now officially deprecated and older versions of Kerberos,
while once standard, are now considered security risks. Their continued use is often due
to compatibility requirements with legacy systems, creating vulnerabilities that can be
exploited by malicious actors. The risk is further compounded when outdated systems
cannot be upgraded due to software or hardware constraints, leaving organizations with
limited options for mitigation.
Small and medium-sized enterprises (SMEs) particularly struggle to deal with legacy
systems, often relying heavily on on-premises AD. This reliance is exacerbated by a
rapidly dwindling technical workforce capable of managing these systems securely.
The talent shortage means fewer skilled professionals are available to implement and
maintain robust security measures. Financial constraints further limit the ability of
SMEs to invest in necessary upgrades or additional security tools, making them more
vulnerable to cyber threats. As a result, SMEs face significant challenges in maintaining
secure and compliant IT environments, often having to balance operational needs
against security risks in ways that larger organizations might not.
Misconfiguration
Even in environments where systems are generally kept current and patched,
misconfigurations can still pose significant risks. AD is a complex system with many
moving parts and ensuring that every component is configured correctly requires
expertise and constant attention. Misconfigurations can lead to unintended access,
security gaps, or vulnerabilities that malicious actors can exploit.
A common issue is the failure to apply current recommended practices, often due
to a lack of personnel with the necessary knowledge or a lack of time to implement
changes. For example, administrators may leave default settings intact, such as allowing
anonymous LDAP binds or enabling unconstrained delegation, which can lead to
privilege escalation and unauthorized access.
Technical Debt
Technical debt accumulates as systems evolve through mergers, acquisitions, and
other organizational changes. This debt can manifest as complex configurations, such
as multiple domain trusts, which should ideally undergo consolidation. However, for
various reasons, consolidation often does not happen, leaving organizations with
complicated and potentially insecure AD environments.
Implementing robust policies and procedures for the life cycle management of AD
objects is crucial. This includes:
By raising the profile of poor life cycle management through these practices,
organizations can significantly reduce the security risks associated with AD. Effective
management of object life cycles ensures that stale and potentially harmful accounts
are quickly identified and remediated, maintaining a more secure and resilient IT
environment.
The final challenge with legacy AD systems is the lack of regular assessments or
monitoring. Continuous monitoring and regular security assessments are critical for
maintaining a secure identity management environment. However, many organizations
fall short in this area, either due to resource constraints or a lack of prioritization.
Security
Security is the cornerstone of identity modernization, and Microsoft Entra ID delivers a
robust framework for protecting against suspicious activities and unauthorized access.
The secure access management system is designed to counter emerging threats,
especially in the context of GenAI technologies like AI copilots and complex cloud
applications. Entra ID effectively assigns permissions and monitors access, ensuring that
identities are secure and supervised.
A critical aspect of modern identity management is the on-behalf-of (OBO) flow, where
a web API uses a different identity to call another web API. This OAuth-based delegation
requires careful handling to prevent security risks. Entra ID controls permissions and
Compliance
Compliance is a major concern for organizations operating in regulated industries or
working with government entities. Microsoft Entra ID is designed to meet stringent
compliance standards, such as the Federal Risk and Authorization Management Program
(FedRAMP), the General Data Protection Regulation (GDPR), the Health Insurance
Portability and Accountability Act (HIPAA), and others, ensuring that organizations can
maintain regulatory requirements. This compliance focus is particularly important for
industries like healthcare, finance, and government, where data security and privacy
are paramount.
Entra ID’s compliance features help organizations achieve and maintain certification
with various standards, reducing the risk of noncompliance and the associated
penalties. The solution’s built-in controls and audit capabilities allow organizations
to demonstrate compliance with ease, providing peace of mind in a complex
regulatory landscape.
Passwordless Authentication
Passwordless authentication is a significant advancement in identity management,
offering both convenience and enhanced security. Microsoft Entra ID supports
passwordless authentication methods, utilizing FIDO (Fast IDentity Online), which
incorporates the latest web authentication (WebAuthn) standard. This approach
eliminates the need for traditional passwords, reducing the risk of phishing and
credential theft.
However, SMEs often face significant challenges when it comes to adopting new
technologies like passwordless authentication. Limited capability to adapt to these
advancements can slow down adoption rates, primarily due to:
Innovation
Microsoft Entra ID is built on open standards, fostering innovation and facilitating
trustworthy interactions. By adopting open standards, the platform encourages
interoperability and reduces business inefficiencies associated with proprietary identity
management systems. This focus on innovation allows organizations to benefit from
Governance
Finally, Entra ID includes robust identity governance capabilities, ensuring that proper
access controls are in place. The platform offers life cycle workflows for onboarding and
offboarding users, managing role changes, and ensuring that access is appropriately
managed throughout an employee’s tenure. These governance features are essential for
maintaining a secure and compliant identity management environment.
Risk-Based Policies
Conditional Access introduces the ability to define risk-based policies that respond
to varying levels of risk. By integrating with Microsoft Entra ID, Conditional Access can
calculate a risk score for each user or sign-in attempt, allowing organizations to enforce
MFA or other security measures in high-risk scenarios. This approach helps mitigate
threats by adapting to the specific context of each access request, ensuring that users
and devices are authenticated based on their risk profile.
Risk-based policies are especially useful in environments with high user turnover
or remote work scenarios, where the risk of unauthorized access may be higher. By
This flexibility enables organizations to implement policies that align with their security
goals while minimizing the impact on user experience. By allowing more stringent
controls for high-risk situations and relaxing them for trusted scenarios, Conditional
Access supports a balanced approach to security and usability.
CAE provides an added layer of security by detecting anomalies and unusual behavior
in real time. This feature is critical in environments where users move between
different locations or devices, as it helps prevent unauthorized access due to changing
circumstances.
Risk controls are particularly valuable in preventing phishing attacks and other types
of social engineering. By adjusting security requirements based on real-time risk
assessments, organizations can better protect their resources without relying solely on
static security measures.
Baseline Security
Microsoft provides a baseline set of Conditional Access policies to customers, which
has been shown to reduce compromises by up to 80% when turned on. These baseline
policies offer a starting point for organizations to implement basic security measures,
including MFA, device compliance checks, and other common controls. By adopting
these baseline policies, organizations can quickly improve their security posture and
reduce the risk of security incidents.
The baseline policies serve as a foundation for building more complex and tailored
Conditional Access rules. They allow organizations to adopt a zero trust mindset without
extensive customization, providing an immediate boost to security while allowing for
future adjustments as needed.
Entra Private Access enables organizations to quickly and securely connect remote users
from any device and network to private applications—whether on premises, across
clouds, or anywhere in between. This seamless connection eliminates the excessive
access and lateral threat movement commonly associated with legacy VPNs.
The ability to enforce per-app access controls based on Conditional Access policies is
a game-changer for legacy applications. It means organizations can apply the same
level of security and oversight to legacy systems as they do to modern cloud-based
applications. This uniform approach to security helps reduce the risk of unauthorized
access and lateral threat movement.
Integrating legacy systems with modern identity solutions like Microsoft Entra Private
Access is a critical step toward achieving a zero trust strategy. By providing secure access
to private apps, adaptive controls, and granular app segmentation, Entra ID bridges the
gap between legacy and modern security. This integration not only protects existing
investments in legacy infrastructure but also paves the way for a more secure and
efficient future.
Organizations that embrace modern identity solutions while addressing the challenges
of legacy systems can significantly improve their security posture. By focusing on
identity as the foundation of network security, implementing ZTNA, and leveraging
Conditional Access, businesses can ensure that they are equipped to navigate the
evolving threat landscape.
The traditional approaches to identity management, often rooted in legacy systems like
Active Directory, can no longer keep pace with the demands of modern security. As a
result, organizations must embrace modern identity solutions that prioritize zero trust
principles, adaptive security, and seamless integration with existing infrastructure.
The emphasis on risk-based policies, flexible access rules, and innovative security
controls ensures that organizations can maintain a high level of security without
compromising user experience. This balance between security and usability is crucial as
organizations strive to enhance productivity and foster a seamless digital environment.
In this chapter, we’ve explored the unique challenges of legacy systems and the
solutions offered by modern identity platforms. By adopting a zero trust mindset
and leveraging Conditional Access, organizations can effectively secure their identity
infrastructure while adapting to the evolving threat landscape.
Technology has the potential to address many of the challenges associated with
legacy authentication systems. Advanced solutions like Microsoft Entra ID offer robust,
modern authentication methods, including passwordless options that significantly
enhance security and user experience. By adopting these technologies, organizations
can reduce the risks associated with outdated protocols and improve compliance with
regulatory standards.
However, although technology provides powerful tools to mitigate the risks of legacy
authentication, the importance of a well-thought-out architecture and careful
implementation cannot be overstated. Simply adopting new technologies without
considering the broader context of the organization’s IT environment can lead to new
vulnerabilities and operational challenges.
i
Introduction: Why Bad Things Keep Happening in Cloud
Over the course of the past several years, we’ve seen some alarming, but ultimately
expected trends from attackers. They’re continuing to target end users and credentials,
as well as known vulnerabilities in web apps and network services, but … there’s
something new. It’s becoming obvious that attackers and more sophisticated campaigns
are focusing on cloud deployments more than ever.
The evidence for this is mounting. In its 2024 Threat Detection As cloud usage grows and evolves, attackers are paying
Report, Red Canary research found that cloud account compromise more attention.
was the fourth most prevalent MITRE ATT&CK® technique used by
threat actors in 2023 (a 16x increase from 2022).1 The “Five Eyes”
nations (US, UK, Australia, New Zealand, and Canada) also released a joint report
detailing evolving tactics for cloud attacks and compromise seen with the APT29 threat
actor operating out of Russia.2 This group, which goes by many names, is believed to be
a state-sponsored team that is responsible for the recent “Midnight Blizzard” attacks
against Microsoft and others. In short, the attackers have found that cloud deployments
are targets ripe for the picking, and we’re likely to see that trend continue in the next
several years.
To that end, organizations need to ask themselves whether their models of cloud
security are in line with current best practices. There are many advantages to cloud
native security controls and processes, and some of the traditional models of security
controls and practices may not be the most efficient or effective. It’s never a bad time
to take stock of security strategies and consider how we might improve and elevate our
capabilities, particularly with the advantages that leading cloud environments offer.
Old Things Done New Ways: Why Core Best Practices Are Still Good, but
Likely Need Updates
For most organizations born before cloud, a natural path to the cloud includes adapting
tried-and-true controls and processes to cloud environments. For some scenarios, this
actually can work well. For example, a “lift and shift” effort will likely mean you’re moving
legacy infrastructure and applications to a cloud hosting environment, with traditional
databases, operating systems, and workload models. Existing patching and configuration
management models will probably be easily adapted into cloud environments with
minor updates (image management, cloud service offerings and permissions, etc.).
Things will ultimately work well enough, even if not every benefit of the cloud will be
realized in the short term.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
A second question could be: “Are we doing things as efficiently and effectively as we
could?” And lastly: “Are we missing something important by not doing security the
cloud-centric way?”
Let’s explore some of the most important cloud controls and categories of security and
look at classic models of implementation and management, more cloud native models,
and how they differ.
___________________________________________________________________________________________________________________________________________________________________________________________________________________________________
3 “Leading through change: 5 steps for executives on the cloud transformation path,” March 2024,
https://cloud.google.com/transform/leading-through-change-5-steps-for-executives-on-the-cloud-transformation-path
4 “Cloud compromises: Lessons learned from Mandiant investigations in 2023,” www.youtube.com/watch?v=Fg13kGsN9ok
1. Central DevOps and cloud engineering—This team should manage the DevOps
pipeline (code, builds, validation, and deployment). Security tools, like static
code assessment and dynamic web scanning, should ideally be integrated with
automation. This should be a multidisciplinary team that includes developers
and infrastructure specialists who have adapted their skills to infrastructure as
code (IaC) and more software-defined environments.
3. Identity and access management (IAM) —The most mature governance models
include a separate IAM team that manages directory service integration,
federation, and single sign-on (SSO), as well as policy and role definitions within
SaaS, PaaS, and IaaS environments. If this is not a definitive team, there should
be at least a small number of IT operations and/or DevOps engineers focused on
this for a significant amount of time.
To ensure cohesion across teams, there should exist a Cloud Governance Committee
(or Cloud Center of Excellence)5 that includes representatives from all of these areas,
as well as dotted-line representation from legal, compliance, audit, and technology
leadership. Without a dedicated focus on cloud governance and oversight of what the
organization’s standards and processes are, it’s highly likely that “shadow cloud” will
crop up as different teams start deploying resources and “experimenting” (not always
with risk in mind!) with various cloud services. Executive support for a central committee
and coordination process is crucial. Communicating the appropriate cloud deployment
processes and approach also is important so everyone knows how best to develop and
deploy cloud assets to approved providers.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
• Determine the business needs for a cloud service—This should include costs,
savings, and pros and cons of insourcing and outsourcing. When evaluating
the different types of CSP offerings, keep in mind the types of responsibility for
different computing elements, as this will significantly impact the security and
governance of cloud services for most organizations. Any outsourced elements will
be much more difficult to evaluate on a regular basis in most cases, and this also
may impact compliance posture.
• Determine policy and compliance—This should be done with input from legal
and audit teams. For this reason, most large organizations will undoubtedly want
to ensure that legal and audit teams are represented within the governance
model chosen.
Once these factors have been collectively brought together as a formal requirements
definition for cloud services, providers can be evaluated based on these needs.
Contracts, control responsibilities, and auditing can then be hashed out accordingly,
which leads to a larger cloud security discussion.
However, a challenge that may impact security operations and monitoring is the pace
of deployments in DevOps pipelines. With developers making frequent changes and
new workloads starting frequently, it may be easier for security operations to keep track
Newer, more modern models of workload configuration and patching should shift
toward tearing down workloads that don’t meet desired patching and configuration
requirements and replacing them with new workloads based on updated images. Some
of the assessments also need to shift from assessing running workloads to including the
assessment step into a build pipeline.
Today, in the cloud, privileged identity management is often much more integrated
into the cloud platform itself and was designed for automation from the start. Cloud
privileged identity management also comprises identity relationship and entitlements
mapping and risk analysis, cloud IAM and configuration through cloud security
posture management (CSPM) solutions, privileged user management, and just-in-time
access management, as well as SSO and federation for identities. Cloud infrastructure
entitlement management (CIEM), a whole new technology category, has emerged. And
here’s the great news: This is usually built in, and organizations just need to manage and
control policies and IAM groupings and federation effectively to significantly enhance
controls over privileged access.
To mitigate this issue, security and operations teams should review all security groups
and cloud firewall rule sets to ensure only the network ports, protocols, and addresses
needed are permitted to communicate. Rule sets should never allow access from
anywhere to administrative services running on ports 22 (Secure Shell) or 3389 (Remote
Desktop Protocol).
In the cloud, building a more resilient and available infrastructure is actually much
simpler. Leading providers offer a wide range of cloud regions and zones, fully
automated high availability HA and failover controls within load balancing and other
components, and a highly redundant and resilient cloud storage infrastructure (as
beginning points). Most provider SLAs are also equivalent or better than many hosting
providers’ data centers.
The following are examples of traditional best practices for secrets management that
should be considered when designing a secrets management strategy:
What’s great about the transition to cloud is that none of these data protection controls
go out the window at all. In fact, database and workload volume encryption are all
readily available, with workload volume encryption enabled by default. DLP is also more
accessible than ever for many organizations.
Cloud providers have the capability to implement encryption at scale fairly easily, and
accordingly, all data at rest within Google Cloud is encrypted by default. For some
organizations, this automatic encryption will prove sufficient for protecting data at rest
for both workloads shifted into the cloud and new cloud native application stacks.
For others, customer-generated encryption keys will be preferred or required for
compliance, and this is also easily managed through Cloud KMS. Google has added
additional encryption solutions for data processing in Compute Engine and BigQuery as
well. The Cloud External Key Manager service provides segregation for encryption keys
with external key storage in a third-party environment for Google Cloud Platform data,
and encryption policies for access and use still apply here.8
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
6 https://cloud.google.com/security/products/secret-manager
7 https://cloud.google.com/security/products/dlp?hl=en
8 https://cloud.google.com/kms/docs/ekm
• All data discovery is automated for BigQuery databases. This is important, as large
scale data storage environments like BigQuery are constantly changing and may
be difficult to monitor without automation.
• DLP can be implemented across a variety of storage types, including BigQuery,
Cloud Storage, and Datastore. Providing DLP across a range of different data
storage types in a cloud environment is critical for security professionals needing
comprehensive data protection coverage.
• Google Cloud DLP can help organizations flexibly protect data with policies to
classify, mask, tokenize, and transform data as desired.
Cloud DLP, for many organizations, may prove simpler to implement and maintain
than traditional on-premises DLP options—and at lower cost. Even organizations that
traditionally didn’t implement DLP due to cost and complexity can now affordably
discover, track, and protect data with this advanced capability.
Vulnerability management for cloud workloads and services is definitely an area that
needs to become more dynamic. Cloud workloads change much more rapidly, and
container and other images tend to now include a vast array of third-party packages
that may be vulnerable. Although some of the traditional vulnerability management
solutions have adapted relatively well to cloud, organizations are wise to look into
cloud-native scanning and repository analysis engines that can aid in providing
more effective continuous monitoring and vulnerability management of all types of
cloud assets.
The reality is that cloud log data and other events are being produced in enormous
quantities, and security teams need to recognize specific indicators quickly, see patterns
of events occurring, and spot events happening in the cloud environments where the
events are occurring. Sending logs and cloud telemetry and observability data to a
cloud SIEM makes a lot more sense today. Machine learning (ML) and AI also can easily
augment massive event data processing technology to build more intelligence detection
and alerting tactics. Google Security Operations is an excellent example of a massive
scale event management engine that leverages AI and ML capabilities.9
Automation has become another major focus area for cloud computing forensics
and incident response. Consider the following activities as potential opportunities to
implement automation:
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
9 https://chronicle.security/
Although a newer concept, we also definitely see significant growth in ML and AI, both
for business use cases and security analytics.
In all, these types of security controls and services are simply a natural evolution
that reflects the nature of PaaS and IaaS software-defined cloud platforms and
infrastructure. Security operations in large, distributed cloud environments will need
to adapt to accommodate more dynamic deployments and changes, new services and
workloads, and a significantly greater reliance on automation. In the next year and
beyond, it’s likely that all these trends will grow and mature significantly.
Given the attack trends we see, however, organizations should be honest in assessing
their current controls and processes for cloud security and risk management. If you’re
still relying on those from your on-premises data center environments, you’re likely out
of touch with cloud security best practices today.
i
Introduction
Nearly two decades ago, the public cloud introduced a powerful tool with countless
opportunities and underestimated risks. Today, that hot new tool is generative AI
(GenAI). Although GenAI enables organizations to solve new problems and reduce the
resources necessary to do so, it also enables attackers to leverage new attack vectors.
This is often because organizations do not understand the intricate details of how GenAI
works. At the same time, the security industry sees promise in GenAI to help improve
their operations and tooling. However, although GenAI is highly promising in many cases,
it is useless or counterproductive in some others.
To understand the security risks associated with hosting or building a GenAI application,
let us first dissect how one is typically built.
• Large language model (LLM)—These models serve as the core engine for
generating output. An example is a model that generates text.
• Vector database—A database employed to store specialized knowledge bases
tailored for the model’s use, often referred to as VectorDB.
• Retrieval-augmented generation (RAG)—Combines the retrieval of relevant
documents from a large library (a VectorDB) with the generation capabilities of
LLM models to produce more accurate and contextually informed responses.
• Embedding models—These models facilitate the retrieval of pertinent data from
the VectorDB.
• Agents—These are programs that enable communication between a GenAI
application and the external world.
Designing GenAI applications begins with models capable of autonomously creating new
content, such as text or images, from extensive training data. Notable examples include
LLMs, like GPT-3, GPT-4, and BERT, which are trained on vast datasets to understand
and generate human-like language. These models can serve as foundational elements
for various GenAI applications. They can be hosted either locally or in the cloud using
providers like OpenAI, Hugging Face AWS, Azure, or Google Cloud.
The next critical component is a specialized dataset tailored to the application’s specific
task. For example, a chatbot designed to address medical queries about patients’
symptoms and conditions necessitates access to patient data stored in a database. RAG
GenAI applications typically use vector databases for this purpose, efficiently storing,
retrieving, and manipulating the necessary vector data. This is similar to how search and
machine learning (ML) applications operate.
Finally, the application may require access to external components, such as a search
engine. This access is facilitated through agents. An “agent” refers to a program or
system capable of interacting with external services and taking actions to achieve
specific goals. Through agents, GenAI applications can access various tools such as
Google search, a terminal, or external APIs, allowing interaction with other parts of
the application or the environment. For example, there are agents that can read logs
and device conclusions, find answers using Google search, and execute commands
(see Figure 1).
1 www.langchain.com/
2 https://huggingface.co/
• Data risks
• LLM risks
• Application risks
Data Risks
As discussed earlier, the specific requirements of a GenAI application dictate the level
and nature of the access it needs. A GenAI application requiring access to corporate
resources can provide substantial advantages, but it also poses certain risks.
Data Poisoning
In addition to concerns about exposing data to external parties, unauthorized access to
the data stored in a VectorDB and used by the models also presents risks. Unauthorized
changes to the data could alter the behavior of the models, resulting in malformed
outputs. This is especially critical if a decision-making process relies on the output of
a GenAI application. An example of this is a GenAI application that helps a purchasing
committee to compare different RFPs. Unauthorized access to the data (the RFPs) could
lead the GenAI application to favor a specific candidate.
LLM Risks
LLMs are central to GenAI applications, and decisions about their usage, deployment,
and communication methods entail significant risks. Key vulnerabilities include
tampering with prompt components (leading to instructions poisoning and prompt
injection) and deploying malicious or untrusted third-party models, which can result
in data manipulation, compromised response integrity, sensitive data leaks, and
unintended behavior.
• Instructions
• Data fetched from a VectorDB
• User’s query or request
The combination of these elements creates the prompt, which determines the responses
generated by the LLM.
If any of these components are tampered with, it can affect the expected output. We
have addressed the aspects of data poisoning in previous sections. However, if an
attacker gains access to any of the three mentioned components, they not only can
manipulate the data but also the instructions or prompt sent to the model. This could
lead to data exfiltration, manipulation of results, or misuse of the access granted to the
GenAI application.
For example, consider an assistant with access to corporate emails. An attacker could
send a carefully crafted email intended for consumption by the AI assistant to trigger a
specific action. By tricking the AI assistant, the attacker could compel it to expose data
from other accessible components, such as a user’s calendar, thereby gaining insights
into someone’s schedule.
Malicious Models
The proliferation of third-party models shared on platforms like Hugging Face brings
a significant risk of encountering malicious entities. Deploying these models exposes
organizations to various threats, including data breaches, altered outputs, compromised
systems, reputational harm, and compliance violations. Malicious models may illicitly
access sensitive data, produce misleading results, or harbor vulnerabilities that enable
unauthorized access.
This risk was exemplified in a recent incident reported by “The Hacker News” in March
2024. The discovery of more than 100 malicious AI and ML models on platforms like
Hugging Face underscored the severity of the threat.3
Untrusted Models
Malicious models demonstrate how LLM marketplaces are yet another supply chain.
It is unlikely that organizations are going to extensively vet all the third-party models
they use. This is especially unlikely in the early days of GenAI as organizations are
just starting to understand its fundamental concepts. As a result, they are once again
trusting contributions from strangers on the internet to meet critical business needs.
Even if the developers of these models mean well, they can still inadvertently introduce
risk and bias.
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
3 “Over 100 Malicious AI/ML Models Found on Hugging Face Platform,” The Hacker News, March 2024, https://thehackernews.com/2024/03/over-100-
malicious-aiml-models-found-on.html
Application Risks
In addition to the GenAI-specific security risks already discussed, we still have the
traditional risk associated with running a multicomponent application.
Access Keys
Authentication keys are necessary for communication between the various components
used by GenAI applications. The security of these components depends on safeguarding
access keys from unauthorized exposure. The compromise of VectorDB access keys
carries significant consequences as they are crucial for maintaining the integrity of the
GenAI application’s knowledge base. Breach of these keys could corrupt the knowledge
base, resulting in inaccurate outcomes or compromised data integrity. Such incidents
not only undermine the reliability of the GenAI application but also jeopardize the
trustworthiness of its outputs.
Furthermore, the keys associated with model providers, such as OpenAI, are critical assets
that require protection. Breaching these keys could provide attackers with unauthorized
access to the service, posing a threat to the confidentiality, integrity, and availability of the
GenAI models. They also could use the organization’s premium plans with the service for
malicious purposes while leaving the organization on the hook for the bill.
Additionally, agents rely on keys to access services such as search engines and CSP
APIs. Moreover, using CSP GenAI services, such as AWS Bedrock, involves the use
of CSP keys across the application. Exposing CSP access keys poses a severe risk,
potentially compromising the entire cloud account. This could lead to unauthorized
access and misuse of cloud resources, which, in turn could result in data breaches or
financial losses.
HashiCorp Vault offers a robust solution for secret management, with features like
dynamic secrets, encryption as a service, and access control policies. Similarly, AWS
Secret Manager and AWS SSM Parameter Store are AWS-native services tailored for
securely storing and managing secrets, offering integration with other AWS services and
robust security features.
Azure Key Vault is Microsoft’s cloud-based service for securely storing and managing
cryptographic keys, certificates, and secrets. Its features include hardware security
module (HSM) protection and role-based access control (RBAC), ensuring the
confidentiality and integrity of stored keys.
Using key vaults, developers can centralize the management of keys and secrets to
reduce the risk of unauthorized access and improve operational efficiency. Additionally,
keys stored in key vaults can be dynamically retrieved at runtime, allowing for easy
integration into applications without the need for hardcoding.
Applications running in the cloud can access that cloud provider’s AI service without
needing long-lived credentials. For example, an application running in the Amazon
Elastic Kubernetes Service (EKS) can use temporary, automatically rotating credentials to
assume an identity and access management (IAM) role with the necessary permissions
to use Amazon Bedrock. This is much better than using long-lived access key pairs for
an IAM user. Multicloud environments complicate this further as Microsoft Azure cannot
rotate an AWS IAM principal’s credentials without extensive permissions. This can be
resolved with the Workload Identity Federation tool, which is both powerful and often
hard to use. For more information, refer to a previously published webcast from SANS on
this subject.4
_________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________
It is unclear how GenAI can improve this process. One area to consider is false-positive
reduction. SAST has earned a bad reputation for generating a massive number of false
positives that take time to triage. It is conceivable that the user could use an LLM to ask
questions about a finding using natural language. However, because these checks are
fairly rudimentary, we do not think this would be particularly useful.
DAST tools work very differently, interacting with a live application to look for suspicious
responses. However, the same principles apply. Like SAST, DAST does straightforward
checks that do not appear to be augmented with natural language. For example, DAST
These challenges are opportunities for GenAI. Most security professionals would prefer
to define their security requirements using natural language, not brackets and braces.
GenAI can help them translate the former into the latter. Similarly, the user could
provide GenAI with an existing policy and have it explain what it accomplishes in a way
they can understand. At the same time, the user should not take these results at face
value because of the prevalence of GenAI hallucinations.
We believe GenAI has the potential to revolutionize this area. Automated test cases
are arguably the hardest to write, the most contextual, and the least commonly
implemented controls we have mentioned. In our opinion, even implementing IaC is
orders of magnitude easier than developing automated tests for abuse cases. These
test cases are more or less defined using application code, and very few security
professionals are comfortable developing reliable applications using Java(Script),
Python, and similar languages. As such, security professionals can benefit even more by
working with GenAI to develop them.
Like with IaC, this approach has issues. Thankfully, these are the same issues that exist
with all code. Developers frequently fail to account for edge-cases in their code. They
also may uncritically copy and paste bad code, and it does not really matter if this
Conclusion
As organizations increasingly rely on GenAI applications, securing these systems will
become critically important. The security community must proactively prepare by gaining
a deep understanding of GenAI technologies and anticipating the associated security
risks. By starting early, we can develop robust strategies to protect these applications
and ensure their safe and effective use.
GenAI is not magic. It is not the solution to all security woes. It is a tool. Like any other
tool, it can be useful or detrimental. The security industry must think critically about this
tool’s costs and benefits. There is no doubt that GenAI is a remarkable innovation that
will change the industry in ways that we do not yet fully comprehend. We encourage you
to improve your comprehension by continuing to research this topic and using GenAI
platforms with caution. This is the same advice we would have given you in 2006 if you
were looking to explore the cloud, and it is the same advice we will give you in 2042 for
whatever revolutionary technology is the zeitgeist of that year.
i
i
i
i
i
i
Access more free educational
content from SANS at:
Checkout upcoming
and on demand webcasts:
SANS Webcasts
Cloud
Security: First
Principles
and Future
OpportunitiesChapter 5: i