Design of Cache Memory Mapping Techniques For Low Power Processor
ABSTRACT
The use of cache memory allows memory accesses to be serviced at a faster rate. The main purpose of cache memory is to provide faster memory access, so that data can be read quickly, while the bulk of storage remains in less expensive types of semiconductor memory with large capacity. The hierarchy therefore combines a main memory that is large but slow with a cache memory that is smaller but faster. The cache memory contains copies of instructions and data from main memory. When the processor needs to read from or write to a main memory location, it first checks whether the cache memory contains the required data. This paper presents a two-level cache in which the second cache level is split, achieving faster access time and lower power consumption. The main focus of this work is reduced access time and power consumption.
Keywords: direct mapping technique, fully associative mapping technique, set associative mapping technique.
1. INTRODUCTION
Cache systems are on-chip memory elements in which the data that will be needed can be stored. The cache controller determines whether an access misses in the cache; when the data required by the microprocessor is found, a hit is said to occur. The common reason for storing data in a cache is to achieve faster read times, but energy consumption is one of the drawbacks of on-chip storage. As memory moves further away from the CPU, both the access time and the size of the storage unit increase. Cache memory is an additional, fast memory unit placed between the processing unit and the physical memory. The most frequently used instructions and data, i.e. the information that will need to be accessed again, are stored in the cache. The internal registers and caches located near the CPU can be accessed much faster than physical memory and external disk storage devices. A cache that is accessed faster can also be considered more power efficient. To address this challenge and to meet sustainable computing goals, several energy-efficient techniques have been proposed for cache architectures. Static and dynamic power consumption together make up the total power consumption. On a hit, the processor reads from or writes immediately to the cache, which is much smaller and faster for reading and writing the required data. Modern desktop and server CPUs use three independent caches: an instruction cache to speed up instruction fetches, a data cache to speed up data fetches and stores, and a translation lookaside buffer (TLB) to speed up virtual-to-physical address translation. Using two levels of memory to reduce the average access time works in principle over the course of execution of a program.
One obvious advantage of a logical cache is that its access speed is faster than that of a physical cache, because the cache can respond before the memory management unit has to translate the virtual address.
2. CACHE MAPPING TECHNIQUES
2.1 Direct mapping
In a direct-mapped cache, each block has only one place it can go: when the CPU needs a certain block from main memory, there is exactly one cache line where it could possibly reside. If the block is not in the cache, it must be fetched from the lower level of memory. To find where a block resides in the cache, the block frame address is taken modulo the number of blocks in the cache.
The advantage of the direct mapping technique is that its mapping function is trivial. When data has to be stored in the cache, the target cache line is immediately known, so the access time for reading data from a cache line is fast. The optimum case is sequential processing of addresses. The drawback of direct mapping is that a line may be replaced even when the cache is not yet full, so the implicit replacement scheme may be inadequate. The worst case is a reference stream whose addresses always map to the same line number.
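As an illustration of this index computation, the following minimal C sketch models a direct-mapped lookup. The geometry constants (64 lines, 32-byte blocks) and all identifiers are illustrative assumptions, not the paper's implementation.

    #include <stdbool.h>
    #include <stdint.h>

    #define DM_LINES    64   /* assumed number of cache lines           */
    #define OFFSET_BITS  5   /* assumed 32-byte blocks -> 5 offset bits */

    typedef struct {
        bool     valid;      /* line holds valid data                   */
        uint32_t tag;        /* high-order bits of the block address    */
    } dm_line;

    static dm_line dm_cache[DM_LINES];

    /* A block can live in exactly one line: block address modulo the
     * number of lines. A single tag comparison decides hit or miss.   */
    bool dm_lookup(uint32_t addr)
    {
        uint32_t block = addr >> OFFSET_BITS;  /* block frame address   */
        uint32_t index = block % DM_LINES;     /* the one possible line */
        uint32_t tag   = block / DM_LINES;     /* remaining high bits   */
        return dm_cache[index].valid && dm_cache[index].tag == tag;
    }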
2.2 Fully associative mapping
In a fully associative cache, a block can be placed anywhere in the cache, since there are no restrictions on where it has to reside. When the CPU needs a certain memory block, every block in the cache must be checked to determine whether the required information is present. A block is only evicted from a fully associative cache when the cache is full.
The advantage of a fully associative cache is this placement flexibility: when data has to be written into the cache, a cache line is replaced only if the cache is full. The optimum case is a widely spread access pattern. The disadvantage is that whenever data has to be searched for, every cache line must be checked. An extended tag field is used, and an additional tag comparator is required per line. The replacement scheme used with this type of mapping can still be inadequate, and in the worst case the area is noticeably larger than that of a cache searched sequentially.
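By contrast, a fully associative lookup must compare the tag of every line, as in this minimal C sketch; the geometry and identifiers are again illustrative assumptions. In hardware all of these comparisons run in parallel, which is exactly where the extra comparator area and power go.

    #include <stdbool.h>
    #include <stdint.h>

    #define FA_LINES    64   /* assumed number of cache lines          */
    #define OFFSET_BITS  5   /* assumed 32-byte blocks                 */

    typedef struct {
        bool     valid;
        uint32_t tag;        /* extended tag: the whole block address  */
    } fa_line;

    static fa_line fa_cache[FA_LINES];

    /* No index bits: the block may sit in any line, so every valid
     * tag must be compared before a miss can be declared.            */
    bool fa_lookup(uint32_t addr)
    {
        uint32_t tag = addr >> OFFSET_BITS;  /* block address as tag   */
        for (int i = 0; i < FA_LINES; i++)
            if (fa_cache[i].valid && fa_cache[i].tag == tag)
                return true;
        return false;
    }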
2.3 Set associative mapping
In a set associative cache, a certain set of places is allocated for each memory block. A set consists of a group of two or more blocks in the cache. When a block is brought into cache memory from main memory, it is first mapped to a set, and within that set it is free to go anywhere. Set associative mapping thus combines features of both direct-mapped and fully associative mapping: the block is directly mapped to a set, and placement is fully associative within that set. To determine the set in which a block should be placed, the block frame address is taken modulo the number of sets in the cache.
A cache is said to be n-way set associative if there are n blocks in each set. The power consumed by a set associative cache increases with the degree of associativity, since the tags of all n ways must be compared in parallel on every access.
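The following minimal C sketch combines the two previous ones: direct mapping chooses the set, and the search is fully associative within it. The geometry (16 sets, 4 ways) is an illustrative assumption.

    #include <stdbool.h>
    #include <stdint.h>

    #define SETS        16   /* assumed number of sets                 */
    #define WAYS         4   /* 4-way set associative                  */
    #define OFFSET_BITS  5

    typedef struct {
        bool     valid;
        uint32_t tag;
    } sa_line;

    static sa_line sa_cache[SETS][WAYS];

    /* Direct mapping selects the set; the block is then fully
     * associative within that set, so only WAYS tags are compared.   */
    bool sa_lookup(uint32_t addr)
    {
        uint32_t block = addr >> OFFSET_BITS;
        uint32_t set   = block % SETS;   /* block address modulo #sets */
        uint32_t tag   = block / SETS;
        for (int w = 0; w < WAYS; w++)
            if (sa_cache[set][w].valid && sa_cache[set][w].tag == tag)
                return true;
        return false;
    }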
4. PROPOSED WORK
The proposed methodology consists of a two-level cache memory in which the second-level cache, originally fully associative, is split into two equal halves, one using the direct mapping technique and the other the fully associative mapping technique. Dividing a single cache level in this way leads to faster access and lower power consumption compared to the existing system, so that both the power efficiency and the performance of the cache are improved, as shown in Figure-6.
The cache should achieve low delay so that the cache system has quick access when checking each memory address. The total power consumption of processors is increasing tremendously and is reaching the power wall imposed by the thermal limitations of cooling solutions and power delivery. To reduce the power consumption of the cache memory, we can choose the mapping technique with the lowest power consumption. To reduce the miss rate, a fully associative victim cache is placed at the last level; in this way a reduced miss rate, and above all reduced delay and lower power consumption, are achieved in this work. A behavioural sketch of the resulting lookup order is given below.
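This C sketch models only the lookup order of the proposed hierarchy described above. All sizes, the direct-mapped modelling of L1, and every identifier are illustrative assumptions; the paper's circuit-level design (Figure-6) is not reproduced here.

    #include <stdbool.h>
    #include <stdint.h>

    #define OFFSET_BITS   5
    #define L1_LINES     32   /* assumed L1 size; L1 modelled as direct-mapped */
    #define L2_DM_LINES  64   /* direct-mapped half of the split L2            */
    #define L2_FA_LINES  64   /* fully associative half of the split L2        */
    #define VICTIM_LINES  4   /* small fully associative victim cache          */

    typedef struct { bool valid; uint32_t tag; } line;

    static line l1[L1_LINES], l2_dm[L2_DM_LINES];
    static line l2_fa[L2_FA_LINES], victim[VICTIM_LINES];

    static bool hit_direct(const line *c, uint32_t n, uint32_t block)
    {
        return c[block % n].valid && c[block % n].tag == block / n;
    }

    static bool hit_assoc(const line *c, uint32_t n, uint32_t block)
    {
        for (uint32_t i = 0; i < n; i++)
            if (c[i].valid && c[i].tag == block)
                return true;
        return false;
    }

    /* Lookup order: L1, then the direct-mapped and fully associative
     * halves of the split L2, then the victim cache; main memory is
     * accessed only if all levels miss.                               */
    bool hierarchy_lookup(uint32_t addr)
    {
        uint32_t block = addr >> OFFSET_BITS;
        if (hit_direct(l1, L1_LINES, block))         return true;
        if (hit_direct(l2_dm, L2_DM_LINES, block))   return true;
        if (hit_assoc(l2_fa, L2_FA_LINES, block))    return true;
        if (hit_assoc(victim, VICTIM_LINES, block))  return true;
        return false;  /* miss everywhere: fetch from main memory */
    }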
Case-1
In this case, the L1 cache level is active and the L2 cache level is inactive. When the word line of the L1 cache is accessed, the corresponding output is shown in Figure-7; similarly, when the word line of the L2 cache is not accessed, the corresponding output is shown in Figure-8.
Case-2
In this case, the L1 cache level is inactive and the L2 cache level is active. When the word line of the L1 cache is not accessed, the corresponding output is shown in Figure-9; similarly, when the word line of the L2 cache is accessed, the corresponding output is shown in Figure-10.
Table-2. Performance analysis of proposed cache.

Parameter              Proposed system
                       180nm        130nm
Average power          1.3042 mW    4.7816 mW
Average delay          1.6207 ps    9.3500 ps
Power delay product    2.1137 fJ    44.7830 fJ
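The last row is the power delay product, i.e. the product of the two rows above it; for example, at 180nm, 1.3042 mW x 1.6207 ps ≈ 2.1137 fJ.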
5. CONCLUSIONS
Cache memory can be implemented in different ways, but the basic concepts behind the caching technique remain identical. A challenge in cache design is to ensure that the required data and instructions are present in the cache lines. The overall power consumption of processors is increasing tremendously and is reaching the power wall imposed by the thermal limitations of cooling solutions and power delivery. Thus, while technology scaling allows higher performance to be achieved, managing the power consumption of processors has become a vital necessity. In this paper, a 3.36% reduction in delay, a 0.45% reduction in power consumption, and a 3.65% reduction in power delay product are achieved at the cost of miss rate.
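These percentages follow from comparing the proposed results in Table-2 with the corresponding existing-system figures at the 130nm node: delay (9.6750 - 9.3500)/9.6750 ≈ 3.36%, power (4.8034 - 4.7816)/4.8034 ≈ 0.45%, and power delay product (46.4773 - 44.7830)/46.4773 ≈ 3.65%.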
Table-3. Performance analysis of existing cache.

Parameter              Existing system
                       180nm        130nm
Average power          1.3044 mW    4.8034 mW
Average delay          3.3162 ps    9.6750 ps
Power delay product    4.3256 fJ    46.4773 fJ