On-Chip MRAM as a High-Bandwidth, Low-Latency Replacement for DRAM Physical Memories
Cost of Writes: Writes to MRAM memory consume more power than reads because of the larger current needed to change the polarity of the MRAM cell. Hence, for a low power design, it might be better to consider an …

Conclusions:
In this paper, we have introduced and examined an emerging memory technology, MRAM, which promises to enable large, high-bandwidth memories. MRAM can be integrated onto the microprocessor die, avoiding the pin-bandwidth limitations found in conventional off-chip memory systems. We have developed a model for simulating MRAM banks and used it to examine the trade-offs between line size and bank count to derive the MRAM organization with the best performance.
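The flavor of this trade-off can be shown with a back-of-the-envelope calculation. The sketch below is only an illustration, not our simulator: the constants (ROW_ACCESS_NS, TRANSFER_NS_PER_BYTE, PENDING_ACCESSES) are assumed values rather than measured MRAM parameters, and bank conflicts are modeled with a simple uniform-traffic approximation. Larger lines raise transfer time; more banks shrink the chance that concurrent misses collide.

```python
# Illustrative sweep of MRAM line size vs. bank count.
# All parameters are assumed values, not measured MRAM numbers.

ROW_ACCESS_NS = 10.0          # assumed fixed bank access time
TRANSFER_NS_PER_BYTE = 0.25   # assumed on-chip transfer cost
PENDING_ACCESSES = 8          # assumed concurrent misses in flight

def avg_latency(line_bytes, num_banks):
    """Crude expected latency for one miss: access + transfer, plus a
    serialization penalty when another access collides on the same bank."""
    transfer = line_bytes * TRANSFER_NS_PER_BYTE
    # Probability that at least one other pending access maps to the
    # same bank (uniform traffic assumption).
    p_conflict = 1.0 - (1.0 - 1.0 / num_banks) ** (PENDING_ACCESSES - 1)
    return ROW_ACCESS_NS + transfer + p_conflict * (ROW_ACCESS_NS + transfer)

for line in (32, 64, 128, 256):
    for banks in (4, 16, 64):
        print(f"line={line:4d}B banks={banks:3d} -> {avg_latency(line, banks):6.1f} ns")
```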
We break down the components of latency in the memory system and examine the potential of page placement to improve performance. Finally, we have compared MRAM with conventional SDRAM memory systems and with another emerging technology, chip-stacked SDRAM, to evaluate its potential as a replacement for main memory. Our results show that MRAM systems perform 15% better than conventional SDRAM systems and 30% better than stacked SDRAM systems.
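As a hedged illustration of what such a latency breakdown looks like, the snippet below tallies assumed per-component cycle counts; the categories and numbers are placeholders, not the measured breakdown from our experiments.

```python
# Toy decomposition of a main-memory access into latency components.
# Cycle counts are assumptions for illustration only.

components_cycles = {
    "L1/L2 lookup and miss detection": 12,
    "on-chip network to MRAM partition": 8,
    "MRAM bank access": 30,
    "data return and cache fill": 10,
}

total = sum(components_cycles.values())
for name, cycles in components_cycles.items():
    print(f"{name:36s} {cycles:3d} cycles ({100 * cycles / total:4.1f}%)")
print(f"{'total miss latency':36s} {total:3d} cycles")
```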
An important feature of our memory architecture is that the L2 cache and MRAM banks are partitioned. This architecture reduces conflict misses in the L2 cache and provides high bandwidth when multiple L2 partitions are accessed simultaneously. We studied MRAM systems with perfect L2 caches and perfect networks to understand where performance was being lost. We found that the penalty of cache conflicts in the L2 cache and the network latency had widely varying effects among the benchmarks. However, these results did show that page allocation policies in the operating system have great potential to improve MRAM performance.
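One plausible way to realize such a partitioned hierarchy is simple address interleaving, sketched below. The line size, partition count, and bit-slicing choice are assumptions for illustration, not necessarily the mapping used in our design; the point is that consecutive lines spread across partitions, which is what lets several L2 partitions be accessed at once.

```python
# A plausible address-interleaving scheme for a partitioned L2/MRAM
# hierarchy; field widths below are assumed, not the paper's mapping.

LINE_BYTES = 128        # assumed L2 line size
NUM_PARTITIONS = 16     # assumed number of L2/MRAM partitions

def partition_of(phys_addr: int) -> int:
    """Select a partition from the low line-address bits so that
    consecutive lines land on different partitions (bank-level
    parallelism for streaming access patterns)."""
    line_addr = phys_addr // LINE_BYTES
    return line_addr % NUM_PARTITIONS

# Consecutive lines map to consecutive partitions:
for addr in range(0, 5 * LINE_BYTES, LINE_BYTES):
    print(f"addr {addr:#08x} -> partition {partition_of(addr)}")
```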
Our work suggests several opportunities for future MRAM research. First, our partitioned MRAM memory system allows page placement policies for a uniprocessor to consider a new variable: proximity to the processor. Allowing pages to migrate dynamically between MRAM partitions may provide additional performance benefit.
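A minimal sketch of such a proximity-driven migration policy appears below, assuming a hypothetical access-count heuristic; the PagePlacer class, its counters, and MIGRATE_THRESHOLD are illustrative inventions, not a policy we evaluated.

```python
# Sketch of proximity-driven page migration between MRAM partitions.
# The access-counting heuristic and threshold are hypothetical.

from collections import Counter

MIGRATE_THRESHOLD = 1000   # assumed touches before migration pays off

class PagePlacer:
    def __init__(self):
        self.location = {}        # page -> current MRAM partition
        self.touches = Counter()  # page -> accesses since last migration

    def access(self, page, nearest_partition):
        """Record one access; move the page to the partition nearest the
        processor once it has proven hot enough to repay the copy cost."""
        self.location.setdefault(page, nearest_partition)
        self.touches[page] += 1
        if (self.touches[page] >= MIGRATE_THRESHOLD
                and self.location[page] != nearest_partition):
            self.location[page] = nearest_partition
            self.touches[page] = 0
        return self.location[page]

placer = PagePlacer()
placer.location[0x42] = 7          # page initially placed far away
for _ in range(MIGRATE_THRESHOLD):
    placer.access(0x42, nearest_partition=3)
print(placer.location[0x42])       # -> 3 after enough touches
```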
Second, the energy use of MRAM must be characterized and compared to that of alternative memory technologies. Applications may have quite different energy use given that the energy required to write an MRAM cell is greater than that to read it. In addition, the L2 cache line size has a strong effect on the amount of data written to the MRAM and may be an important factor in tuning systems to use less energy.
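The interaction between write energy and line size can be illustrated with a toy per-miss energy model, sketched below; the per-bit read/write energies and the dirty-line fraction are assumed values, not characterized MRAM numbers. Because every dirty-line writeback pays the higher write energy across the whole line, doubling the line size roughly doubles the write energy per miss.

```python
# Toy MRAM energy model per L2 miss. Per-bit energies and the
# writeback rate are assumptions, not measured values.

E_READ_PJ_PER_BIT = 1.0    # assumed MRAM read energy
E_WRITE_PJ_PER_BIT = 4.0   # assumed: writes cost more (larger current)

def miss_energy_pj(line_bytes, dirty_fraction):
    """Expected MRAM energy per L2 miss: read the new line, plus a
    full-line writeback when the evicted line is dirty."""
    bits = line_bytes * 8
    return bits * E_READ_PJ_PER_BIT + dirty_fraction * bits * E_WRITE_PJ_PER_BIT

for line in (32, 64, 128, 256):
    print(f"line={line:4d}B -> {miss_energy_pj(line, dirty_fraction=0.3):8.1f} pJ per miss")
```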
Third, since MRAM memory is non-volatile, its impact on system reliability relative to conventional memory should be measured. Finally, our uniprocessor simulation does not take full advantage of the large bandwidth inherent in the partitioned MRAM. We expect that chip multiprocessors will have additional performance gains beyond the uniprocessor model studied in this paper.