Write-Through Policy
• if the computer crashes or the power goes out, data
can still be recovered without issue.
• to keep data safe, this policy has to perform every write
operation twice
• the program or application that is being used must wait
until the data has been written to both the cache and
storage device before it can proceed.
• this comes at the cost of system performance but is
highly recommended for sensitive data that cannot be
lost.
ADVANTAGE
• it ensures information will be stored safely without risk
of data loss.
Write-Back Policy
• writes data only to the cache during processing.
• there are only certain times or conditions where the
information will also be written to the primary storage
device
• since modified data may exist only in the cache until it
is written back, this policy has a much higher probability
of data loss if something were to go wrong
• at the same time, since it no longer has to write
information to both cache and storage device, system
performance gains are noticeable when compared to
the write-through policy.
• data recoverability is exchanged for system
performance, making this ideal for applications or
programs that require low latency and high throughput.
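The trade-off between the two policies can be sketched in code. This is a minimal illustration (the class and method names are hypothetical, not a real cache API), assuming a plain dict stands in for the slower storage device:

```python
# Sketch of the two write policies; "backing" stands in for primary storage.

class WriteThroughCache:
    def __init__(self, backing):
        self.cache = {}
        self.backing = backing          # slower primary storage (a dict here)

    def write(self, addr, value):
        self.cache[addr] = value        # write #1: the cache
        self.backing[addr] = value      # write #2: storage, before returning


class WriteBackCache:
    def __init__(self, backing):
        self.cache = {}
        self.dirty = set()              # lines modified but not yet stored
        self.backing = backing

    def write(self, addr, value):
        self.cache[addr] = value        # only the cache is updated now
        self.dirty.add(addr)           # remember to store it later

    def flush(self):
        for addr in self.dirty:         # the "certain times or conditions"
            self.backing[addr] = self.cache[addr]
        self.dirty.clear()


storage = {}
wb = WriteBackCache(storage)
wb.write(0x10, 42)
# storage is still empty here -- this data would be lost on a crash
wb.flush()
# after the flush, storage holds the value
```

The write-back version returns from `write` without touching `backing`, which is where both the performance gain and the data-loss risk come from.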
Line Size
• retrieve not only desired word but a number of
adjacent words as well.
• increased block size will increase hit ratio at first
• hit ratio will decrease as block becomes even
bigger ---- probability of using newly fetched
information becomes less than probability of
reusing the information it replaced
• larger blocks ---- reduce number of blocks that
fit in cache; data overwritten shortly after being
fetched; each additional word is less local so
less likely to be needed
• no definitive optimum value has been found.
• 8 to 64 bytes seems reasonable
• For HPC systems, 64- and 128-byte lines are
most common
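The effect of line size on hit ratio can be made concrete with a toy direct-mapped cache simulator (the capacity, trace, and function name here are illustrative choices, not from the text):

```python
# Toy direct-mapped cache: count hits for a trace of byte addresses,
# fetching a whole line on each miss.

def hit_ratio(trace, cache_bytes, line_bytes):
    num_lines = cache_bytes // line_bytes
    lines = [None] * num_lines          # tag of the line resident in each slot
    hits = 0
    for addr in trace:
        tag = addr // line_bytes        # which memory block this byte is in
        idx = tag % num_lines           # direct-mapped placement
        if lines[idx] == tag:
            hits += 1
        else:
            lines[idx] = tag            # miss: fetch the whole line
    return hits / len(trace)

# A purely sequential trace rewards larger lines (spatial locality):
trace = list(range(4096))
for size in (8, 16, 32, 64):
    print(size, hit_ratio(trace, cache_bytes=256, line_bytes=size))
```

For this sequential trace the hit ratio only rises with line size; the downturn the notes describe appears with reuse-heavy traces, where very large lines leave too few distinct lines in a fixed-size cache and recently fetched data is overwritten before it is reused.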
Number of Caches
Multilevel Caches
• High logic density enables caches on chip.
faster than bus access
frees bus for other transfers
• Pentium 4
L1 caches
8 KB
64-byte lines
Four-way set associative
PENTIUM 4 CACHE
• Pentium 4
L2 cache
Feeding both L1 caches
256 KB
128-byte lines
8-way set associative
L3 cache on chip
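The L2-feeds-L1 relationship can be sketched as a simple look-up chain (dicts stand in for the actual hardware; the function name is hypothetical):

```python
# On a load, check L1 first, then L2, then main memory,
# filling the higher levels on the way back.

def load(addr, l1, l2, memory):
    if addr in l1:                      # on-chip L1 hit: no bus access needed
        return l1[addr]
    if addr in l2:                      # L2 feeds the L1 caches
        l1[addr] = l2[addr]
        return l1[addr]
    value = memory[addr]                # miss in both: go to main memory
    l2[addr] = value                    # fill L2 ...
    l1[addr] = value                    # ... and L1 on the way back
    return value
```

Because L1 and L2 sit on chip, a hit at either level avoids the system bus entirely, which is the "frees bus for other transfers" point above.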
PENTIUM 4 CORE PROCESSOR
• Fetch/Decode Unit
Fetches instructions from L2 cache
Decode into micro-ops
Store micro-ops in L1 cache
• Out of order execution logic
Schedules micro-ops
Based on data dependence and resources
May speculatively execute
• Execution units
Execute micro-ops
Data from L1 cache
Results in registers
• Memory subsystem
Interfaces to L2 cache and system bus
PENTIUM 4 DESIGN REASONING
• Decodes instructions into RISC like micro-ops
before L1 cache
• Micro-ops fixed length
Enables superscalar pipelining and scheduling
• Pentium instructions long & complex
• Performance improved by separating decoding
from scheduling & pipelining
• Data cache is write back
Can be configured to write through
• L1 cache controlled by 2 bits in control register CR0
CD = cache disable
NW = not write-through
2 instructions: INVD invalidates (flushes) the cache;
WBINVD writes back, then invalidates
• L2 and L3 8-way set associative
Line size 128 bytes
ARM CACHE ORGANIZATION