ACACES 2019 – Processor Architecture Security, Part 3
Jakub Szefer
Assistant Professor
Dept. of Electrical Engineering
Yale University
(These slides include some prior slides by Jakub Szefer and Shuwen Deng from HOST 2019 Tutorial)
ACACES 2019 – July 14th - 20th, 2019
Slides and information available at: https://caslab.csl.yale.edu/tutorials/acaces2019/
ACACES Course on Processor Architecture Security
© Jakub Szefer 2019
Logical Isolation and Memory Hierarchy
Most units in the memory hierarchy have been shown to be vulnerable to timing attacks:
• Caches
• Cache Replacement Logic
• Load, Store, and Other Buffers
• TLBs
• Directories
• Prefetchers
• Coherence Bus and Coherence State
• Memory Controller and Interconnect
Securing the Memory Hierarchy
• To prevent timing attacks, “secure” versions of different units in the memory hierarchy have been
proposed and evaluated
• Most defenses leverage ideas of partitioning and randomization as means
of defeating the attacks
• Of course, the different units can always be turned off to eliminate the attacks
  • E.g., disable caches to remove cache timing attacks
  • This can have a large impact on performance
• Some defenses use fuzzy time or add random delays
• Attacker can always get a good timing source, so fuzzy time does not work well
• Random delays simply create more noise, but don’t address root causes of the timing attacks
• Most researchers have focused on secure caches (18 different designs to date!)
• Less studied are TLBs, Buffers, Directories
• Most are related to caches, so secure cache ideas are applied to these
• Software defenses are possible (e.g., page coloring or “constant time” software)
  • But they require software writers to consider timing attacks, and to consider all possible
    attacks; if a new attack is demonstrated, previously written “secure” software may no
    longer be secure (a small constant-time comparison sketch follows below)
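To illustrate the “constant time” software idea, here is a minimal Python sketch contrasting a leaky early-exit comparison with a constant-time one; the function names are illustrative, and real code should rely on a vetted library routine such as hmac.compare_digest, shown below:

```python
import hmac

def leaky_equals(secret: bytes, guess: bytes) -> bool:
    # Early-exit comparison: the time depends on how many leading bytes match,
    # which an attacker can measure to recover the secret byte by byte.
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False
    return True

def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    # hmac.compare_digest compares the inputs without an early exit on the
    # first mismatch, removing the data-dependent timing of the loop above.
    return hmac.compare_digest(secret, guess)
```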
• Numerous academic proposals have presented different secure cache architectures that aim to
defend against different cache-based side channels.
• To date there are 18 secure cache proposals
• They share many similar, key techniques
[Figure: state diagram of a cache access — hit vs. miss latency, memory reuse conditions, and transitions labeled {force evict}, {bypass}, {replace}, and {return data}]
• Interference happens within the victim’s process itself (internal interference)
• Hit-based vulnerabilities rely on observing fast (hit) accesses
Partitioning
• Goal: limit the victim and the attacker to only be able to access a limited set of cache blocks
  (a way-partitioning sketch follows after this list)
• Partition among security levels: High (higher security level) and Low (lower security level);
  even more partitions are possible
• Type: static partitioning vs. dynamic partitioning
• Partitioning based on:
  • Whether the memory access is the victim’s or the attacker’s
  • Where the access is to (e.g., to a sensitive or non-sensitive memory region)
  • Whether the access is due to speculation or an out-of-order load or store,
    or is a normal operation
• Partitioning granularity:
• Cache sets
• Cache lines
• Cache ways
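A minimal sketch of the partitioning idea, here static way-partitioning between a High and a Low domain; the class, geometry, and replacement choice are illustrative and not taken from any particular proposal:

```python
import random

class WayPartitionedCache:
    """Sketch of static way-partitioning: each security domain gets fixed ways,
    so one domain's misses can never evict the other domain's lines."""
    def __init__(self, num_sets=64, num_ways=8, high_ways=4, line_size=64):
        self.num_sets, self.line_size = num_sets, line_size
        # Ways [0, high_ways) belong to High; the remaining ways belong to Low.
        self.ways = {"High": list(range(0, high_ways)),
                     "Low":  list(range(high_ways, num_ways))}
        self.tags = [[None] * num_ways for _ in range(num_sets)]

    def access(self, domain, addr):
        set_idx = (addr // self.line_size) % self.num_sets
        tag = addr // (self.line_size * self.num_sets)
        if any(self.tags[set_idx][w] == (domain, tag) for w in self.ways[domain]):
            return "hit"
        # Miss: the victim line is chosen only among the domain's own ways.
        victim = random.choice(self.ways[domain])
        self.tags[set_idx][victim] = (domain, tag)
        return "miss"
```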
Randomization
• Randomization aims to inherently de-correlate the relationship between the address and the
  observed timing
[Figure: information about the address of the victim’s security-critical data can be inferred from the observed timing of cache hits or misses and from the observed timing of flush or cache-coherence operations; randomization breaks this correlation]
• Randomization approaches:
• Randomize the address to cache set mapping
• Random fill
• Random eviction
• Random delay
• Goal: reduce the mutual information obtainable from the observed timing to 0
• Some limitations: requires a fast and secure random number generator (the ability to predict the
  random behavior will defeat these techniques); may need OS support or an interface to specify the
  range of memory locations being randomized; … (a randomized-mapping sketch follows below)
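A minimal sketch of randomizing the address-to-set mapping with a keyed index function, in the spirit of (but not implementing) designs such as CEASER or SCATTER cache; the hash choice and geometry are illustrative:

```python
import hashlib

NUM_SETS = 1024      # illustrative cache geometry
LINE_SIZE = 64

def conventional_set_index(paddr: int) -> int:
    # Conventional mapping: low-order line-address bits select the set, so an
    # attacker can directly construct an eviction set for any target address.
    return (paddr // LINE_SIZE) % NUM_SETS

def randomized_set_index(paddr: int, key: bytes) -> int:
    # Keyed mapping: the set index depends on a secret key, de-correlating the
    # address from the set (and hence from the observed timing); periodic
    # re-keying limits how long a learned eviction set stays useful.
    line_addr = (paddr // LINE_SIZE).to_bytes(8, "little")
    digest = hashlib.blake2b(line_addr, key=key, digest_size=4).digest()
    return int.from_bytes(digest, "little") % NUM_SETS
```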
Differentiating Sensitive Data
• Allows the victim or management software to explicitly label a certain range of the victim’s
  data which they consider sensitive
• Can use new cache-specific instructions to protect the data and limit internal interference
  among the victim’s own data
• E.g., it is possible to disable victim’s own flushing of victim’s labeled data, and therefore
prevent vulnerabilities that leverage flushing
• Has advantage in preventing internal interference
• Allows the designer to have stronger control over security critical data
• How to identify sensitive data, and whether this identification process is reliable, are open
  research questions
• Independent of whether a cache uses partitioning or randomization
• Partitioning-based caches
• Static Partition cache, SecVerilog cache, SecDCP cache, Non-Monopolizable (NoMo) cache,
SHARP cache, Sanctum cache, MI6 cache, Invisispec cache, CATalyst cache, DAWG cache,
RIC cache, Partition Locked cache
• Randomization-based caches
• SHARP cache, Random Permutation cache, Newcache, Random Fill cache, CEASER cache,
SCATTER cache, Non-deterministic cache
[Figure: set-associative cache organization (sets × ways), with lines labeled L (Low) and H (High)]
SecVerilog Cache
Zhang, D., Askarov, A., and Myers, A. C., "Language-based control and mitigation of timing channels", 2012.
Non-Monopolizable (NoMo) Cache
• Dynamically partitioned
• Process-reserved ways and unreserved ways
• N: number of ways, M: number of SMT threads, Y: the number of ways each thread exclusively
  reserves, with Y ∈ [0, floor(N/M)]. E.g.:
  • NoMo-0: traditional set-associative cache
  • NoMo-floor(N/M): partitions the ways evenly among the threads, with no non-reserved ways
  • NoMo-1: each thread exclusively reserves one way (a small sketch follows below)
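A worked example of the NoMo-Y way reservation; the helper and geometry below are illustrative:

```python
def nomo_partition(N: int, M: int, Y: int):
    """NoMo-Y sketch: with N ways and M SMT threads, each thread exclusively
    reserves Y ways (0 <= Y <= floor(N/M)); the remaining ways stay shared."""
    assert 0 <= Y <= N // M
    reserved = {t: list(range(t * Y, (t + 1) * Y)) for t in range(M)}
    shared = list(range(M * Y, N))
    return reserved, shared

# Example: 8-way cache, 2 threads, NoMo-2 ->
# thread 0 reserves ways [0, 1], thread 1 reserves ways [2, 3], ways 4..7 stay shared.
print(nomo_partition(8, 2, 2))
```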
Protection against the four attack classes (✓ = protected, ~ = partially protected, X = not
protected; column labels, i.e. the compared cache designs, not preserved):
  external miss-based attacks:  ✓  ✓  ~  ✓
  internal miss-based attacks:  X  X  X  X
  external hit-based attacks:   X  ✓  ✓  X
  internal hit-based attacks:   X  X  X  X
SHARP Cache
• Replacement policy (a sketch follows below)
  • Cache hits are allowed among different processes
  • On a cache miss, the data to be evicted is chosen in the following order:
    1. Data not belonging to any current process
    2. Data belonging to the same process
    3. Random data in the cache set, plus an interrupt generated to the OS
  • Eviction between different processes becomes random
• Disallows flush (clflush) in the R or X model
• Invalidation using cache coherence is still possible
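A minimal sketch of the eviction-priority order above; the data layout and helper names are illustrative:

```python
import random

def choose_victim(cache_set, requesting_pid, live_pids):
    """Pick the line to evict from one cache set, following the priority order
    above. Each line is a dict such as {"pid": owner_pid, "tag": tag}."""
    # 1. Prefer data not belonging to any currently running process.
    dead = [ln for ln in cache_set if ln["pid"] not in live_pids]
    if dead:
        return random.choice(dead), None
    # 2. Otherwise prefer data belonging to the requesting process itself.
    own = [ln for ln in cache_set if ln["pid"] == requesting_pid]
    if own:
        return random.choice(own), None
    # 3. Otherwise evict a random line and signal an interrupt to the OS.
    return random.choice(cache_set), "interrupt_to_os"
```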
Sanctum Cache
Costan, V., Lebedev, I., and Devadas, S., "Sanctum: Minimal hardware extensions for strong software isolation", 2016.
• Sanctum
• Open-source minimal secure processor
• Provides strong, provable isolation of software modules running concurrently and
  sharing resources
  • Isolates enclaves (Trusted Software Module equivalent) from each other and from the OS
• Sanctum cache is a modified cache
• Their changes cover L1 cache, TLB, and last-level cache (LLC)
• L1 cache and TLB
• Security monitor (software) flushes core-private cache lines to achieve isolation
• LLC
  • Page-coloring-based cache partitioning ensures per-core isolation between the OS
    and enclaves
  • Assigns each enclave (and the OS) to different DRAM address regions (a page-coloring
    sketch follows below)
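A minimal sketch of page-coloring-based LLC partitioning: the page “color” is formed by the physical-page-number bits that overlap the LLC set-index bits, and a security monitor only hands out page frames whose color belongs to the requesting enclave (or the OS). The geometry below is illustrative:

```python
PAGE_SIZE = 4096     # illustrative parameters
LINE_SIZE = 64
LLC_SETS  = 8192     # 13 set-index bits

def page_color(frame_number: int) -> int:
    index_bits  = LLC_SETS.bit_length() - 1                   # 13
    offset_bits = (PAGE_SIZE // LINE_SIZE).bit_length() - 1   # 6 index bits fall inside a page
    color_bits  = index_bits - offset_bits                    # 7 bits -> 128 colors
    return frame_number & ((1 << color_bits) - 1)

def frame_allowed(frame_number: int, allowed_colors: set) -> bool:
    # A security monitor assigning disjoint color sets to the OS and each
    # enclave guarantees their data never map to the same LLC sets.
    return page_color(frame_number) in allowed_colors
```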
CATalyst Cache
• Targets the LLC
• Uses Cache Allocation Technology (CAT) from Intel to do coarse-grained partitioning
  • Available on some Intel processors
  • Allocates up to 4 different Classes of Service (CoS) for separate cache ways
  • Replacement of cache blocks is only allowed within a certain CoS (a way-mask sketch
    follows below)
• Partitions the cache into secure and non-secure parts
  • Uses software to do fine-grained partitioning
  • Secure pages are not shared by more than one VM
  • A pseudo-locking mechanism pins certain page frames (they are immediately brought back
    after eviction)
  • Malicious code cannot evict secure pages
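A minimal sketch of way-mask-based Classes of Service in the spirit of Intel CAT; the masks and geometry are illustrative:

```python
NUM_WAYS = 16        # illustrative 16-way LLC

cos_way_masks = {
    0: 0xFFF0,       # CoS 0 (e.g., non-secure workloads): may allocate into ways 4..15
    1: 0x000F,       # CoS 1 (e.g., the "secure" partition): may allocate into ways 0..3
}

def allowed_victim_ways(cos_id: int):
    # Replacement for an access in a given CoS may only pick ways set in its mask,
    # so it can never evict lines another CoS placed elsewhere (hits anywhere still work).
    mask = cos_way_masks[cos_id]
    return [w for w in range(NUM_WAYS) if (mask >> w) & 1]

print(allowed_victim_ways(1))   # -> [0, 1, 2, 3]
```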
Replacement conditions (fragment; D: line brought in, R: line to be replaced):
1. D.L = 0; R.L = 1
2. D.L = 1; R.L = 1; D.ID != R.ID
Protection against the four attack classes (column labels not preserved):
  external miss-based attacks:  ✓  ✓  ✓  ✓  ✓
  internal miss-based attacks:  X  X  ✓  ✓  X
  external hit-based attacks:   X  ✓  ✓  X  X
  internal hit-based attacks:   X  X  ✓  X  X
• Uses a number of assumptions, such as pre-loading
Random Permutation (RP) Cache
Wang, Z., and Lee, R.B., "New cache designs for thwarting software cache-based side channel attacks”, 2007.
• Uses randomization
• De-correlates memory addresses from the observed cache access timing
• Each cache line is extended with a process ID field and a protection bit (P)
  [Cache line format: P | ID | original cache line]
• Replacement policy (a sketch follows below)
  • Cache hits
    • Only when both the process ID and the address are the same
  • Cache misses (D: line brought in; R: line to be replaced)
    • D and R belong to the same process but have different protection bits:
      • An arbitrary line of a random cache set S’ is evicted
      • D is accessed without caching
    • D and R belong to different processes:
      • D is stored in an evicted cache block of S’
      • The mapping of S and S’ is swapped
    • Other cases:
      • The normal replacement policy is used
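A condensed sketch of the miss-handling cases above; the function and return values are illustrative, not the RP cache hardware interface:

```python
import random

def rp_miss_action(d_pid, d_protected, r_pid, r_protected, num_sets):
    """Decide how to handle a miss where the incoming line D (owner d_pid, protection
    bit d_protected) would normally replace candidate R (owner r_pid, r_protected)."""
    if d_pid == r_pid and d_protected != r_protected:
        # Same process, different protection bits: evict an arbitrary line of a
        # random set S' and service D without caching it.
        return ("evict_random_set", random.randrange(num_sets), "access_without_caching")
    if d_pid != r_pid:
        # Different processes: store D in an evicted block of a random set S'
        # and swap the set mapping of S and S' for D's process.
        return ("store_in_random_set", random.randrange(num_sets), "swap_set_mapping")
    # Same process, same protection bit: use the normal replacement policy.
    return ("normal_replacement", None, None)
```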
Non-Deterministic Cache
• Uses cache access decay to randomize the relation between accesses and observed timing
  (a sketch follows below)
  • Counters control the decay of a cache block
    • A local counter records the interval of its data’s activeness
    • Increased on each global counter clock tick
    • When it reaches a predefined value, the corresponding cache line is invalidated
  • The non-deterministic cache randomly sets each local counter’s initial value
    • Can lead to different cache hit and miss statistics
• May have larger performance degradation compared with other data-targeted secure caches
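A minimal sketch of the decay-counter idea with a random initial counter value; the limit and class below are illustrative:

```python
import random

class DecayLine:
    """One cache line with a decay counter, as described above."""
    DECAY_LIMIT = 16                  # illustrative predefined value

    def __init__(self, tag):
        self.tag = tag
        self.valid = True
        # Non-deterministic variant: the local counter starts at a random value,
        # so the line's remaining lifetime (and hit/miss statistics) vary per run.
        self.counter = random.randrange(self.DECAY_LIMIT)

    def touch(self):
        # An access marks the data as active again: restart its interval.
        self.counter = 0

    def global_tick(self):
        # The local counter is increased on each global counter clock tick;
        # reaching the predefined value invalidates the corresponding line.
        if self.valid:
            self.counter += 1
            if self.counter >= self.DECAY_LIMIT:
                self.valid = False
```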
Protection against the four attack classes (column labels not preserved; marks as on the original slide):
  external miss-based attacks:  ✓  ✓  X  ✓  ✓  O
  internal miss-based attacks:  X  ✓  X  ✓  ✓  O
  external hit-based attacks:   ✓  ✓  ✓  X  ~  O
  internal hit-based attacks:   X  X  ✓  X  X  O
• Speculation-related cache
• MI6
• Secure Enclaves in a Speculative Out-of-Order Processor
• Isolation of enclaves (Trusted Software Module equivalent) from each other and OS
• Combination of:
• Sanctum cache’s security feature
• Disabling speculation during the speculative execution of memory related operations
• InvisiSpec: speculation-related cache (a conceptual sketch follows below)
  • A speculative buffer (SB) stores unsafe speculative loads (USLs) before they modify
    the cache state
  • If the data in the SB mismatches the up-to-date value in the cache, the load is squashed
  • If the core receives a possible invalidation from the OS before the memory consistency
    model check, no comparison is needed
  • Targets Spectre-like attacks
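A minimal conceptual sketch of the speculative-buffer idea (not the actual InvisiSpec hardware): unsafe speculative loads fill a per-core buffer instead of the cache, and are validated or squashed once they become safe:

```python
class SpeculativeBuffer:
    """Conceptual sketch: speculative loads leave no footprint in the cache."""
    def __init__(self):
        self.entries = {}                 # addr -> value observed speculatively

    def speculative_load(self, addr, memory):
        # The value is kept in the SB; cache state is not modified, so the
        # load cannot be observed through cache timing if it is later squashed.
        self.entries[addr] = memory[addr]
        return self.entries[addr]

    def make_visible(self, addr, memory):
        # Called once the load is no longer speculative: compare the buffered
        # value with the up-to-date value; a mismatch forces a squash.
        if self.entries.pop(addr) != memory[addr]:
            return "squash_and_reexecute"
        return "commit_and_fill_cache"
```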
Protection against the four attack classes (column labels not preserved):
  external miss-based attacks:  ✓  ✓  X  ✓  ✓  ✓
  internal miss-based attacks:  X  ✓  X  ✓  X  X
  external hit-based attacks:   ✓  ✓  X  ✓  ✓  ✓
  internal hit-based attacks:   X  ✓  X  ✓  X  X
[Table: reported performance, power, and area overheads of the 18 secure cache designs (SP*, SecVerilog, SecDCP, NoMo, SHARP, Sanctum, MI6, InvisiSpec, CATalyst, DAWG, RIC, PL, RP, Newcache, Random Fill, CEASER, SCATTER, Non-Det.); where reported, performance impacts are mostly small, roughly from under 1% up to about 12.5% slowdown, with a few configurations showing improvements; only a few designs report power or area numbers; per-design values are not preserved in these notes]
Buffers
• Various buffers store data or memory translations based on the history of the code executed
  on the processor
• Hits and misses in the buffers can potentially be measured and result in timing attacks
• This is different from recent MDS attacks, which abuse the buffers in another way: MDS attacks
leverage the fact that data from the buffers is sometimes forwarded without proper address
checking during transient execution
• Towards secure buffers
• No specific academic proposal (yet)
• Partitioning – can partition the buffers, already some are per hardware thread
• Randomization – can randomly evict data from the buffers or randomly bring in data, though
  this may not always be possible
• Add new instructions to conditionally disable some of the buffers
TLBs
• Timing variations due to hits and misses exist in TLBs and can be leveraged to build
  practical timing-based attacks:
• TLB timing attacks are triggered by memory translation requests,
not by direct accesses to data
• TLBs have more complicated logic, compared to caches,
for supporting various memory page sizes
• Further, defending against cache attacks does not protect against TLB attacks
[Figure: random-fill mechanism details — a normal demand fill path vs. a random fill path, with an RNG for random-fill generation, a random-fill (TLB) buffer, sbase/ssize range registers, the DCache, and the Page Table Walker; panels (a) and (b)]
Directories
• Directories are used for cache coherence to keep track of the state of the data in the caches
• By forcing directory conflicts, an attacker can evict victim directory entries, which in turn
triggers the eviction of victim cache lines from private caches
• SecDir re-allocates directory structure to create per-core private directory areas, used in a
  victim-cache manner, called Victim Directories; the partitioned nature of Victim Directories
  prevents directory interference across cores, defeating directory side-channel attacks
Reported overheads:
  Secure TLBs [S. Deng, et al., 2019]: for the SR TLB, IPC 1.4%, MPKI 9% (SPEC2006)
  SecDir [M. Yan, et al., 2019]: a few % (some benchmarks faster, some slower) (SPEC2006)
• In response to timing attacks on caches, and other parts of the processor’s memory
hierarchy, many secure designs have been proposed
• Caches are the most researched; from them we learned about two main defense techniques:
• Partitioning
• Randomization
• The techniques can be applied to other parts of the processor: Buffers, TLBs, and Directories
• Other parts of memory hierarchy are still vulnerable: memory bus contention, for example
Related reading…
https://caslab.csl.yale.edu/books/