Mod-5 Microcontrollers

MC notes

MODULE-5

The Memory Hierarchy and Cache Memory

 The diagram above shows the memory hierarchy. At the processor core, the innermost memory is the register file, which is tightly coupled to the processor and provides the fastest memory access.
 At the primary level, memory components are connected to the processor core through the on-chip interface.
 The primary level, also known as main memory, includes volatile components such as SRAM and DRAM, as well as non-volatile components such as flash memory.
 The next level holds storage devices such as disk drives or removable memory. This secondary storage holds the large programs that do not fit into main memory. As shown in the figure, the memory hierarchy also includes the level 1 (L1) cache.
 A cache decreases the time required to access both instructions and data. The write buffer is a small FIFO buffer that supports writes to main memory.
 A level 2 (L2) cache can also be located between the L1 cache and slower main memory. The L1 and L2 caches are also known as the primary and secondary caches, respectively.
 Figure 2 below shows the relationship of the cache with main memory and the processor.

 The upper half of the figure shows a system without a cache: main memory is accessed directly by the processor.
 The lower half of the figure shows a system with a cache. The cache memory is much faster than main memory, so it responds quickly to requests from the processor.
 Small blocks of data are transferred between slower main memory and the faster cache memory. These blocks of data are known as cache lines.
 The write buffer acts as a temporary holding area: the cache transfers a block to the write buffer, which then drains it to main memory.
CACHES AND MEMORY MANAGEMENT UNITS

If a cached core supports virtual memory, the cache can be located between the core and the
memory management unit (MMU), or between the MMU and physical memory.

A logical cache stores data in a virtual address space. A logical cache is located between
the processor and the MMU. The processor can access data in a logical cache directly,
without going through the MMU. A logical cache is also known as a virtual cache.

A physical cache stores memory using physical addresses. A physical cache is located
between the MMU and main memory. For the processor to access memory, the MMU
must first translate the virtual address to a physical address before the cache memory can
provide data to the core.
CACHE ARCHITECTURE

ARM uses two bus architectures in its cached cores, the Von Neumann and the Harvard. In
processor cores using the Von Neumann architecture, there is a single cache used for instruction
and data. In processor cores using the Harvard architecture, there are two caches: an instruction
cache (I- cache) and a data cache (D-cache).

Basic Architecture of a Cache Memory and Basic Operation of a Cache Controller

• A simple cache memory is shown on the right side of Figure 12.4. It has three main parts: a
directory store, a data section, and status information. All three parts of the cache memory are
present for each cache line.
• The cache must know where the information stored in a cache line originates from in main
memory. It uses a directory store to hold the address identifying where the cache line was copied
from main memory. The directory entry is known as a cache-tag.
• A cache memory must also store the data read from main memory. This information is held in the
data section. The size of a cache is defined as the actual code or data the cache can store from
main memory. Not included in the cache size is the cache memory required to support cache- tags
or status bits. There are also status bits in cache memory to maintain state information. Two
common status bits are the valid bit and dirty bit.
• A valid bit marks a cache line as active, meaning it contains live data originally taken from
main memory and is currently available to the processor core on demand.
• A dirty bit defines whether or not a cache line contains data that is different from the value it
represents in main memory.

• The cache controller is hardware that copies code or data from main memory to cache memory
automatically. It performs this task automatically to conceal cache operation from the software it
supports.

• The cache controller intercepts read and write memory requests before passing them on to the
memory controller. It processes a request by dividing the address of the request into three fields:
the tag field, the set index field, and the data index field. These three fields are shown in the figure
above.

• The controller uses the set index to determine which cache set the address may reside in. The
selected cache line contains the cache-tag and status bits, which the controller uses to determine
whether the requested data is actually stored there.

• The controller then checks the valid bit to determine if the cache line is active, and compares the
cache-tag to the tag field of the requested address. If both the status check and comparison succeed,
it is a cache hit. If either the status check or comparison fails, it is a cache miss.

• On a cache miss, the controller copies an entire cache line from main memory to cache memory
and provides the requested code or data to the processor. The copying of a cache line from main
memory to cache memory is known as a cache line fill.

• On a cache hit, the controller supplies the code or data directly from cache memory to
the processor
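The hit-or-miss check just described can be sketched in a few lines of Python. The geometry here is an assumed example (a 4 KB direct-mapped cache with 32-byte lines on a 32-bit address, giving a 5-bit data index, a 7-bit set index, and a 20-bit tag), not a specific ARM configuration.

```python
LINE_SIZE = 32                     # bytes per cache line -> 5-bit data index
NUM_LINES = 4096 // LINE_SIZE      # 4 KB cache / 32 B lines = 128 lines

def split_address(addr):
    """Divide an address into the tag, set index, and data index fields."""
    data_index = addr & (LINE_SIZE - 1)        # byte within the cache line
    set_index = (addr >> 5) & (NUM_LINES - 1)  # which cache line to inspect
    tag = addr >> 12                           # remaining high-order bits
    return tag, set_index, data_index

def lookup(cache, addr):
    """True on a cache hit: line is valid AND cache-tag matches tag field."""
    tag, set_index, _ = split_address(addr)
    line = cache[set_index]
    return line['valid'] and line['tag'] == tag

# Empty cache, then simulate one cache line fill for address 0x20824.
cache = [{'valid': False, 'tag': 0} for _ in range(NUM_LINES)]
tag, s, _ = split_address(0x20824)
cache[s] = {'valid': True, 'tag': tag}

print(lookup(cache, 0x20824))  # True: valid line, matching tag -> hit
print(lookup(cache, 0x30824))  # False: same set, different tag -> miss
```

The second lookup is exactly the failure case described above: the valid-bit check passes, but the tag comparison fails, so it is a cache miss.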
The Relationship between Cache and Main Memory

• The figure above shows where portions of main memory are temporarily stored in cache memory. The figure represents the simplest form of cache, known as a direct-mapped cache.

• In a direct-mapped cache each addressed location in main memory maps to a single location in cache
memory. Since main memory is much larger than cache memory, there are many addresses in main
memory that map to the same single location in cache memory. The figure shows this relationship for the
class of addresses ending in 0x824.

• The three bit fields introduced in Figure above are also shown in this figure. The set index selects the
one location in cache where all values in memory with an ending address of 0x824 are stored.

• The data index selects the word/halfword/byte in the cache line, in this case the second word in the
cache line. The tag field is the portion of the address that is compared to the cache-tag value found in the
directory store.
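As a numerical illustration of this mapping (assuming a 4 KB direct-mapped cache with 32-byte lines, so the set index is bits [11:5] of the address), every address whose low 12 bits are 0x824 selects the same cache line:

```python
def set_index(addr):
    # Bits [11:5] of the address: a 7-bit index into 128 cache lines.
    return (addr >> 5) & 0x7F

# Addresses ending in 0x824 in different 4 KB regions of main memory:
for addr in (0x824, 0x1824, 0x2824, 0xFF824):
    print(hex(addr), "-> cache line", set_index(addr))
# All four compete for the same single cache line; only the tag differs.
```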

• During a cache line fill the cache controller may forward the loading data to the core at the same time it is copying it to cache; this is known as data streaming.

• Direct-mapped caches are subject to high levels of thrashing.


The figure overlays a simple, contrived software procedure to demonstrate thrashing.

• The first time through the loop, routine A is placed in the cache as it executes.

• When the procedure calls routine B, it evicts routine A a cache line at a time as it is
loaded into cache and executed.

• On the second time through the loop, routine A replaces routine B, and then routine B
replaces routine A.

• Repeated cache misses result in continuous eviction of the routine that is not running. This
is cache thrashing.
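The loop above can be modeled with a toy direct-mapped cache in which routines A and B occupy the same cache lines; the sizes below are made-up values chosen so the two routines collide exactly.

```python
NUM_LINES = 8                       # toy direct-mapped cache, 8 lines

def count_misses(trace, num_lines):
    cache = [None] * num_lines      # tag currently held by each line
    misses = 0
    for block in trace:
        line, tag = block % num_lines, block // num_lines
        if cache[line] != tag:      # miss: evict whatever was resident
            cache[line] = tag
            misses += 1
    return misses

routine_a = list(range(0, 8))       # 8 blocks of code for routine A
routine_b = list(range(64, 72))     # routine B maps onto the same 8 lines
trace = (routine_a + routine_b) * 4 # call A then B, four times round the loop

print(count_misses(trace, NUM_LINES), "misses in", len(trace), "accesses")
# Every access misses: each routine evicts the other, a line at a time.
```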
Set Associativity

• Set associativity divides the cache memory into smaller equal units, called ways. The cache in
the figure above is still 4 KB; however, the set index now addresses more than one cache line.
• The set of cache lines pointed to by the set index is set associative.
• The mapping function allows a block of main memory to reside in any way of a
specific set.
• As an example, divide the cache into 64 sets, with two blocks per set.
• Memory blocks 0, 64, 128, and so on map to set 0, and each can occupy either of the two
positions in that set.
• The memory address is divided into three fields:

- The 6-bit set index field determines the set number.

- The high-order 22-bit tag field is compared to the tag fields of the two blocks in the selected set.
• The mapping of main memory to a cache changes in a four-way set associative cache.
• Figure below shows the differences. Any single location in main memory now maps to four
different locations in the cache.
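For the 64-set, two-block example above, the set a memory block lands in can be sketched as block_number mod 64; which of the two ways it occupies is then up to the replacement policy.

```python
NUM_SETS = 64

def target_set(block_number):
    # A memory block may reside in either way of exactly one set.
    return block_number % NUM_SETS

for block in (0, 64, 128, 65):
    print("memory block", block, "-> set", target_set(block))
# Blocks 0, 64, and 128 all map to set 0; block 65 maps to set 1.
```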
Increasing Set Associativity
As the associativity of a cache controller goes up, the probability of thrashing goes
down.
 One method used by hardware designers to increase the set associativity of a cache
includes a content addressable memory (CAM).
 A CAM uses a set of comparators to compare the input tag address with a cache-tag
stored in each valid cache line.
 A CAM works in the opposite way a RAM works. Where a RAM produces data when
given an address value, a CAM produces an address if a given data value exists in the
memory. Using a CAM allows many more cache-tags to be compared simultaneously,
thereby increasing the number of cache lines that can be included in a set.
 Figure 12.9 shows a block diagram of an ARM940T cache. The cache controller uses the
address tag as the input to the CAM and the output selects the way containing the valid
cache line.

 The tag portion of the requested address is used as an input to the four CAMs that
simultaneously compare the input tag with all cache-tags stored in the 64 ways. If there is
a match, cache data is provided by the cache memory. If no match occurs, a miss signal
is generated by the memory controller.
 The controller enables one of four CAMs using the set index bits. The indexed CAM
then selects a cache line in cache memory and the data index portion of the core address
selects the requested word, halfword, or byte within the cache line.
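The data-in, address-out behavior of a CAM can be mimicked with a dictionary keyed by cache-tag: one lookup conceptually compares the input tag against every stored entry at once. The tag and line-number values below are made-up.

```python
# A tiny CAM model: cache-tag -> cache line number (arbitrary example values).
cam = {0x1A2B: 0, 0x3C4D: 1, 0x5E6F: 2}

def cam_lookup(tag):
    # RAM: address in, data out.  CAM: data (a tag) in, location out.
    return cam.get(tag)            # None models the miss signal

print(cam_lookup(0x3C4D))  # 1: a valid line holds this cache-tag
print(cam_lookup(0x7777))  # None: no cache-tag matches -> cache miss
```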
Write Buffers
• A write buffer is a very small, fast FIFO memory buffer that temporarily holds data that the
processor would normally write to main memory.
• In a system without a write buffer, the processor writes directly to main memory. In a
system with a write buffer, data is written at high speed to the FIFO and then emptied to
slower main memory.
• The write buffer reduces the processor time taken to write small blocks of sequential data to main memory.

• The efficiency of the write buffer depends on the ratio of main memory writes to the number
of instructions executed.

• A write buffer also improves cache performance; the improvement occurs during cache line
evictions.

• Data written to the write buffer is not available for reading until it has exited the write buffer
to main memory. The same holds true for an evicted cache line: it too cannot be read while it
is in the write buffer.
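A write buffer can be sketched as a small FIFO in front of a model of slow main memory; the four-entry depth and the drain-on-full behavior are assumptions for illustration. As stated above, an entry has not reached main memory until it has drained.

```python
from collections import deque

class WriteBuffer:
    """FIFO that accepts fast writes and drains them to 'main memory'."""
    def __init__(self, depth=4):
        self.fifo = deque()
        self.depth = depth
        self.memory = {}                  # stands in for slow main memory

    def write(self, addr, value):
        if len(self.fifo) == self.depth:  # full: processor would stall here
            self.drain_one()
        self.fifo.append((addr, value))   # fast FIFO write

    def drain_one(self):
        addr, value = self.fifo.popleft()
        self.memory[addr] = value         # slow main-memory write

    def drain_all(self):
        while self.fifo:
            self.drain_one()

wb = WriteBuffer()
for i in range(6):                        # six sequential word writes
    wb.write(0x1000 + 4 * i, i)
wb.drain_all()
print(len(wb.memory))                     # all six writes reached memory
```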

Measuring Cache Efficiency


There are two terms used to characterize the cache efficiency of a program: the cache hit rate and the
cache miss rate. The hit rate is the number of cache hits divided by the total number of memory
requests over a given time interval. The value is expressed as a percentage:

hit rate = (cache hits / total memory requests) × 100

The miss rate is similar in form: the total cache misses divided by the total number of memory
requests over the same interval, expressed as a percentage:

miss rate = (cache misses / total memory requests) × 100 = 100 − hit rate

Two other terms used in cache performance measurement are the hit time, the time it takes to access
a memory location in the cache, and the miss penalty, the time it takes to load a cache line from
main memory into cache.
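Plugging made-up example counts into the definitions above (the hit time and miss penalty values are likewise assumed):

```python
hits, misses = 950, 50
requests = hits + misses

hit_rate = 100.0 * hits / requests        # cache hits / memory requests, in %
miss_rate = 100.0 * misses / requests     # cache misses / memory requests, in %
print(hit_rate, miss_rate)                # 95.0 5.0 (they always sum to 100)

# Average access time combines the hit time and the miss penalty:
hit_time, miss_penalty = 1, 20            # cycles, assumed values
average = hit_time + (misses / requests) * miss_penalty
print(average)                            # 2.0 cycles
```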
CACHE POLICY
There are three policies that determine the operation of a cache: The write policy, the replacement policy,
and the allocation policy.

1) Write Policy—Writeback or Writethrough

Writethrough: When the cache controller uses a writethrough policy, it writes to both cache and
main memory when there is a cache hit on write, ensuring that the cache and main memory stay
coherent at all times.

Writeback: With a writeback policy, the cache controller writes to cache memory and does not
update main memory; this is also known as copyback. When a cache controller in writeback mode
writes a value to cache memory, it sets the dirty bit true. If the core accesses the cache line at a
later time, it knows by the state of the dirty bit that the cache line contains data not in main memory.
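The contrast between the two policies on a cache write hit can be sketched with a single cache line (the address and values are arbitrary):

```python
def writethrough(line, memory, addr, value):
    line['data'] = value          # update cache...
    memory[addr] = value          # ...and main memory: always coherent

def writeback(line, memory, addr, value):
    line['data'] = value          # update cache only...
    line['dirty'] = True          # ...and mark the line dirty; memory is stale

memory = {0x100: 0}
line = {'data': 0, 'dirty': False}

writethrough(line, memory, 0x100, 7)
print(memory[0x100], line['dirty'])   # 7 False: memory updated, line clean

writeback(line, memory, 0x100, 9)
print(memory[0x100], line['dirty'])   # 7 True: memory still holds 7, line dirty
```

The dirty line would have to be written back to main memory when it is later evicted.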

2) Cache Line Replacement Policies


• On a cache miss, the cache controller must select a cache line from the available set in
cache memory to store the new information from main memory. The cache line selected for
replacement is known as a victim.

• If the victim contains valid, dirty data, the controller must write the dirty data from the
cache memory to main memory before it copies new data into the victim cache line. The
process of selecting and replacing a victim cache line is known as eviction.

• The strategy implemented in a cache controller to select the next victim is called
its replacement policy.

• ARM cached cores support two replacement policies, either pseudorandom or round- robin.
 Round-robin or cyclic replacement simply selects the next cache line in a set to
replace. The selection algorithm uses a sequential, incrementing victim counter that
increments each time the cache controller allocates a cache line. When the victim
counter reaches a maximum value, it is reset to a defined base value.

 Pseudorandom replacement randomly selects the next cache line in a set to


replace. The selection algorithm uses a nonsequential incrementing victim counter. In
a pseudorandom replacement algorithm the controller increments the victim counter
by randomly selecting an increment value and adding this value to the victim counter.
When the victim counter reaches a maximum value, it is reset to a defined base value.
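The two victim counters can be sketched for an assumed four-way set; the modulo stands in for "reset to a defined base value", and the random seed is fixed only to make the illustration reproducible.

```python
import random

WAYS = 4                                  # assumed four-way set

def round_robin(counter):
    return (counter + 1) % WAYS           # sequential increment, wraps to 0

def pseudorandom(counter, rng):
    step = rng.randrange(1, WAYS)         # randomly selected increment value
    return (counter + step) % WAYS

victim, rr_seq = 0, []
for _ in range(6):
    victim = round_robin(victim)
    rr_seq.append(victim)
print(rr_seq)                             # [1, 2, 3, 0, 1, 2]: strictly cyclic

rng = random.Random(42)
victim, pr_seq = 0, []
for _ in range(6):
    victim = pseudorandom(victim, rng)
    pr_seq.append(victim)
print(pr_seq)                             # a nonsequential walk over the 4 ways
```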
3) Allocation Policy on a Cache Miss
• There are two strategies ARM caches may use to allocate a cache line after the occurrence of a
cache miss. The first strategy is known as read-allocate, and the second strategy is known as read-
write-allocate.
• A read allocate on cache miss policy allocates a cache line only during a read from main memory.
If the victim cache line contains valid data, then it is written to main memory before the cache line is
filled with new data.
• A read-write allocate on cache miss policy allocates a cache line for either a read or a write to
memory: any load or store operation made to main memory that misses in the cache allocates a
cache line.

Coprocessor 15 and Caches


There are several coprocessor 15 registers used to specifically configure and control ARM cached
cores.

The primary CP15 registers c7 and c9 control the setup and operation of the cache. The secondary
CP15:c7 registers are write-only and are used to clean and flush the cache. The CP15:c9 register
defines the victim pointer base address, which determines the number of lines of code or data
locked in the cache.
