N06 Memory Organization
1 Organizing Bits
Up to this point, we’ve taken a whirlwind tour of ECE, starting with
the basics, and working our way towards understanding a computer. We
began with the fundamentals of voltage, current, and resistance. We then
described how semiconductor devices like diodes and transistors give us
the ability to control voltage and current in circuits. Transistors enable
switching circuits, digital circuits that operate on high and low voltage
values, to which we assign an abstract binary interpretation of 1 and 0,
respectively.
Transistors can be organized into logic gate circuits that operate on
binary values. In your labs, you have harnessed these logic gates to build
circuits that add (and subtract) and to build circuits that store information
(latches). In class, we’ve talked about circuits that control the function of
other digital circuits (multiplexers, decoders), and organized latches, 1-bit
units of memory, into higher order structures (registers and register files).
We’re almost ready to see how all these come together to make a
functioning computer. First, we need to develop an addressable memory
array. Then, we need to understand the different forms that memory takes,
from register files to SRAM, DRAM, and non-volatile memory! These
resources are managed to balance speed and size constraints.
2 Memory
Instead of arranging our N latches into a line like in registers, let’s organize
the N latches into a square matrix. This matrix would have √N rows and
√N columns. As an example, you could arrange 256 latches into a 16 x
16 square. Extending 16 row wires and 16 column wires across the array
would be sufficient to individually select each row and column.
At every point where two wires cross in this array, we can place a
latch with a simple control structure. A single Data line is shared
between all of the array’s crossing points. An AND gate ensures that a
latch is only selected when both its row line AND its column line are
high. Two more control wires are shared between all cross-points: a
Write Enable and a Read Enable.
• If the Write Enable is high, the latch’s Enable pin is high, so the bit
on the Data line is stored in the latch.
• If the Read Enable is high, the latch’s Q output is connected to the
Data line through a pass transistor.¹
Figure 2: Using two 4-bit decoders to select one bit in a 256-latch memory
array. The 8-bit address uses the first four bits to specify the row and
the last four bits to specify the column. Including the Data Line, Read
Enable, and Write Enable, this entire array can be represented as a 256-bit
memory block with 11 inputs.
¹ Note: This NMOS pass transistor can pass a 1 or a 0 if the source and bulk are not connected
to each other. A more robust implementation might be a CMOS transmission gate, using a PMOS
transistor to pass 1’s and an NMOS transistor to pass 0’s.
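To make this concrete, here is a minimal Python sketch of the array in Figure 2. It is a behavioral model rather than hardware: the name BitArray256 is invented for illustration, and the two 4-bit decoders and the per-cell AND gates are folded into ordinary index arithmetic.

class BitArray256:
    """A 16 x 16 array of 1-bit 'latches' selected by an 8-bit address."""

    def __init__(self):
        self.cells = [[0] * 16 for _ in range(16)]   # 16 rows x 16 columns

    @staticmethod
    def _decode(address):
        # The first (upper) four address bits select the row and the last
        # (lower) four select the column, like the two 4-bit decoders.
        return (address >> 4) & 0xF, address & 0xF

    def access(self, address, write_enable, read_enable, data=0):
        row, col = self._decode(address)
        if write_enable:            # Write Enable: store the Data-line bit
            self.cells[row][col] = data & 1
        if read_enable:             # Read Enable: drive Q onto the Data line
            return self.cells[row][col]
        return None

mem = BitArray256()
mem.access(0xA3, write_enable=True, read_enable=False, data=1)
assert mem.access(0xA3, write_enable=False, read_enable=True) == 1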
Let’s assess how far we’ve come. We managed to create a unit of
memory with 256 individually addressable bits that we can control using
just 11 wires.
The main memory of a computer is short-term storage arranged in an
array, often called Random Access Memory (RAM). Each byte in the
array can be accessed for a read or write operation at any time. Stacking
eight of our 256-bit arrays in parallel, one for each bit of a byte, gives us
a 256-byte RAM (sketched in code after the list below). It requires the
following input/output wires:
• 8 Data (In/Out) Lines, one for each of the eight 1-bit arrays
• 8 Address bits, shared across all bit arrays
• 1 Write Enable, shared across all bit arrays
• 1 Read Enable, shared across all bit arrays
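As a rough sketch of this byte-wide organization, the hypothetical RAM256 class below stacks eight of the 1-bit arrays from the earlier BitArray256 sketch, one plane per Data line, all sharing the same 8-bit address and enables.

class RAM256:
    """256-byte RAM: eight 1-bit planes sharing one 8-bit address."""

    def __init__(self):
        self.planes = [BitArray256() for _ in range(8)]  # one per Data line

    def write(self, address, byte):
        for i, plane in enumerate(self.planes):          # Write Enable high
            plane.access(address, True, False, (byte >> i) & 1)

    def read(self, address):
        byte = 0
        for i, plane in enumerate(self.planes):          # Read Enable high
            byte |= plane.access(address, False, True) << i
        return byte

ram = RAM256()
ram.write(0x42, 0xB7)
assert ram.read(0x42) == 0xB7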
3 Memory Management
We’ve now looked at two structures for organizing memory bits on a com-
puter: registers and RAM. There is an inherent tradeoff between the two
with regard to cost and speed.
3.1 Registers and RAM
Registers act like “hardware variables.” They are relatively low density,
and they operate at the speed of the processor, i.e. fast. Arithmetic
operations are usually performed on values stored in registers.
On the other hand, RAM is higher density, both in how the bits are
stored and in how they are addressed. RAM has higher latency than
registers, but it is also far more scalable. RAM is used for “bulk storage”:
values are loaded from it into the registers for processing and then stored
back into RAM.
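This load/compute/store pattern can be sketched in a few lines of Python, with plain lists standing in for the register file and RAM; the names and values are illustrative, not a real instruction set.

ram = [0] * 256                 # slow, large bulk storage
registers = [0] * 8             # small, fast register file
ram[10], ram[11] = 7, 5

registers[0] = ram[10]          # load RAM[10] into r0
registers[1] = ram[11]          # load RAM[11] into r1
registers[2] = registers[0] + registers[1]   # arithmetic on registers
ram[12] = registers[2]          # store the result back into RAM
assert ram[12] == 12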
3.2 Non-Volatile Memory
RAM and registers act like the latches we studied earlier in this
class: when power is removed, the memory is lost, i.e. they are volatile
memory. However, when you turn your computer off, you can still access
your files when you power it back on, thanks to non-volatile memory.
Hard drives (magnetic disks) and flash drives (solid-state drives, SSDs)
are the main examples of non-volatile memory. They are slower than
RAM, and much slower than registers, but the trade-off is that they
continue to store bits even after they are powered off. Repeated writes
and rewrites will eventually wear out these memory systems. For example,
SSD blocks need to be erased before they are rewritten, and a block wears
out after roughly 100,000 writes.
Careful management of these disk resources ensures that particular
blocks do not wear out too quickly from repeated use. In SSD systems,
this is handled by the flash translation layer.
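As a heavily simplified illustration of that idea, the sketch below remaps each rewritten logical block to the least-worn free physical block. The policy and the names (write_block, erase_counts) are assumptions made for illustration; a real flash translation layer is far more sophisticated.

NUM_BLOCKS = 8
logical_to_physical = {}             # the translation table
erase_counts = [0] * NUM_BLOCKS      # wear per physical block

def write_block(logical, data, storage):
    used = set(logical_to_physical.values())
    free = [p for p in range(NUM_BLOCKS) if p not in used]
    target = min(free, key=lambda p: erase_counts[p])   # least-worn block
    if logical in logical_to_physical:
        old = logical_to_physical[logical]
        erase_counts[old] += 1       # old copy must be erased before reuse
    logical_to_physical[logical] = target
    storage[target] = data

storage = [None] * NUM_BLOCKS
for i in range(20):                  # rewrite the same logical block 20 times
    write_block(0, "version %d" % i, storage)
print(erase_counts)                  # erasures are spread across blocks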
3.3 The Memory Hierarchy
Together, these layers of memory can behave like a single memory that
is both large and fast. How is this possible? The idea is that, if the
system can keep what is needed for the current activity in the lowest-
latency memory, the system will perform at that speed, even though a
large volume of data may be stored elsewhere. Although a price must
occasionally be paid to move data to and from the slower, lower levels of
memory, if these events are rare, their cost will be amortized to something
negligible by the huge number of low-latency accesses at the top.
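A back-of-the-envelope calculation shows the amortization at work. The hit rate and latencies below are made-up illustrative numbers, not measurements of any real system.

hit_rate = 0.99                # fraction of accesses served by fast memory
t_fast, t_slow = 1.0, 100.0    # access times in ns (illustrative values)

# Misses pay for the fast lookup plus the trip to the slower level.
t_avg = hit_rate * t_fast + (1 - hit_rate) * (t_fast + t_slow)
print("average access time: %.2f ns" % t_avg)   # 2.00 ns, not 100 ns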
Think about your everyday life. You’ve got a lot of stuff! What you
are using right now is on your desk. The things you expect to use sometime
today are in your backpack. The things you aren’t using today, but used
recently or are likely to use in the near future, are back in the dorm. And
the things you don’t expect to use this semester might be “back home.”
It doesn’t matter how long it takes you to pack your backpack in the
morning, because it’s just a tiny fraction of the day. And it doesn’t matter
how long a trip it is to get to and from “home”; it is amortized across a
semester of activity, not just one day.
A computer system doesn’t really know what memory will be required
next, but it uses two simple ideas to approximate an understanding of the
current working set and, thereby, maximize its chances of having what is
needed high up in the memory hierarchy, where it can be accessed with
little latency, when it is needed:
Figure 5: The Memory Hierarchy. Faster, more expensive memory is at
the top, while slower, cheaper, denser memory is at the bottom. Data
is dynamically allocated across this hierarchy by a Memory Management
Unit according to need.
• Temporal Locality: Recently used data is likely to be used again
soon. And data that hasn’t been used for a long time isn’t likely
to be used for a long time. For example, if you’re reading your 18-
100 notes now, you’re likely to continue reading them for the next
5 minutes. That high school handbook? Well, you’re probably not
looking back at it for a while!
• Spatial Locality: Not only is the exact data that has recently
been used likely to be used again in the near future; memory near
recently accessed memory is also likely to be used in the near future.
For example, if you’re reading page 9 of this document, you’re likely
to need page 10 or page 8 as well (see the cache sketch after this
list).
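The toy cache below shows both ideas at work. The block size, capacity, and names are arbitrary choices for illustration: keeping recently used blocks resident exploits temporal locality, and fetching a whole block at a time brings in a used address’s neighbors, exploiting spatial locality.

from collections import OrderedDict

BLOCK_SIZE = 4       # addresses fetched together (spatial locality)
CAPACITY = 8         # number of blocks the cache can hold

cache = OrderedDict()    # block number -> block data, oldest first

def access(address, memory):
    block = address // BLOCK_SIZE
    if block in cache:
        cache.move_to_end(block)        # a hit refreshes recency
    else:
        if len(cache) >= CAPACITY:
            cache.popitem(last=False)   # evict least recently used block
        start = block * BLOCK_SIZE      # fetch the whole block at once
        cache[block] = memory[start:start + BLOCK_SIZE]
    return cache[block][address % BLOCK_SIZE]

memory = list(range(64))
assert access(9, memory) == 9     # miss: loads addresses 8 through 11
assert access(10, memory) == 10   # hit, thanks to spatial locality
assert access(9, memory) == 9     # hit, thanks to temporal locality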
4 Glossary
Cache: A small segment of faster, short-term working memory. In a
computer’s hardware memory hierarchy, it is usually SRAM.
Dynamic RAM (DRAM): Stores one bit per cell as charge on a
capacitor gated by a single transistor. This simple structure means that
DRAM has a higher storage density than SRAM. A DRAM cell’s state
needs to be refreshed periodically because the charge on the capacitor
leaks away over time. This repeated refreshing means that DRAM requires
more power and has greater latency than SRAM.
Static RAM (SRAM): Stores one bit per cell, consisting of six transistors
arranged in a flip-flop. It is relatively low latency and uses relatively
little power, but it has a relatively low storage density.
Working Set: The resources that are presently being used by the active
task. In the context of a memory hierarchy, the core data that needs to
be immediately available to prevent the execution of the task from being
delayed.