Lec10 Sram1
Elad Alon
Electrical Engineering and Computer Sciences
University of California, Berkeley
http://www-inst.eecs.berkeley.edu/~cs150
Announcements
• Homework #4 due Thursday
Project CPU Pipelining Review
3-stage pipeline: I (instruction fetch), X (execute), M (access data memory)
• Pipeline rules:
– Writes/reads to/from DMem use leading edge of “M”
– Writes to RegFile use trailing edge of “M”
– Instruction Decode and Register File access is up to you.
• 1 Load Delay Slot, 1 Branch Delay Slot
– No Stalling may be used to accommodate pipeline hazards (in
final version).
• Other:
– Target frequency to be announced later (50-100MHz)
– Minimize cost
– Posedge clocking only
Fall 2011 EECS150 Lecture 10 Page 3
Memory-Block Basics
• Uses:
Whenever a large collection of state elements is required.
– data & program storage
– general purpose registers
– data buffering
– table lookups
– CL (combinational logic) implementation
M x N memory: M words, each N bits wide, addressed with log2(M) address bits.
Memory Components Types:
• Volatile:
– Random Access Memory (RAM):
• SRAM "static" (focus today)
• DRAM "dynamic" (focus in ~2 weeks)
• Non-volatile:
– Read Only Memory (ROM):
• Mask ROM "mask programmable"
• EPROM "erasable programmable"
• EEPROM "electrically erasable programmable"
• FLASH memory - similar to EEPROM with programmer integrated on chip
Address Decoding
SRAM Internals
[Figure: SRAM array with word lines WL1, WL2, ..., WLi]
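The address decoding step, which turns a binary row address into exactly one active word line, can be sketched as below. This is a minimal behavioral sketch, not the circuit from the slide: the module name row_decoder, the parameter ADDR_BITS, and the enable input en are all assumptions made for illustration.

```verilog
// Hypothetical row decoder: binary address in, one-hot word lines out.
// Names and the enable input are illustrative, not taken from the lecture.
module row_decoder #(parameter ADDR_BITS = 3) (
    input  wire [ADDR_BITS-1:0]      addr,  // binary row address
    input  wire                      en,    // when low, no word line is active
    output wire [(1<<ADDR_BITS)-1:0] wl     // wl[i] high selects row i
);
    // Shift a single 1 into the bit position named by addr.
    assign wl = en ? ({{((1<<ADDR_BITS)-1){1'b0}}, 1'b1} << addr)
                   : {(1<<ADDR_BITS){1'b0}};
endmodule
```

With ADDR_BITS = 3 the decoder drives 8 word lines, and at most one of them is high at any time.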
SRAM Cell Details
• Most common is 6 transistors (6T) cell:
[Figure: 6T cell with word line WL and complementary bit lines BL, BL_B; cell array with word lines WL0, WL2, WL3 and bit-line pairs BL/BL_B]
SRAM Cell Array: Write
For a read operation, both column bit lines are first driven to a high
voltage (the supply), then released. When its word line is activated, the
cell pulls down one bit line or the other.
Column Multiplexing:
• Permits input/output data widths different from row width.
• Enables physical aspect ratio closer to a square
– Why is this important?
[Figure: example array organizations, 1024x1 and 256x4]
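Column multiplexing can be sketched behaviorally as follows: a logical 1024x1 memory is stored as 256 rows of 4 bits, so the physical array is much closer to square than 1024 rows of 1 bit. The module name, the port names, and the choice of which address bits select the row versus the column are assumptions for illustration.

```verilog
// Hypothetical 1024x1 memory stored as a 256-row by 4-column array.
module colmux_1024x1 (
    input  wire       clk,
    input  wire       we,
    input  wire [9:0] addr,
    input  wire       din,
    output wire       dout
);
    reg [3:0] rows [0:255];        // physical array: 256 word lines, 4 columns

    wire [7:0] row = addr[9:2];    // assumed: upper bits feed the row decoder
    wire [1:0] col = addr[1:0];    // assumed: lower bits drive the column mux

    // Write: only the selected column of the selected row changes.
    always @(posedge clk)
        if (we) rows[row][col] <= din;

    // Read: the whole row appears on the bit lines, the mux picks one column.
    wire [3:0] row_data = rows[row];
    assign dout = row_data[col];
endmodule
```

The row decoder here has 256 outputs instead of 1024, and the data width seen by the user (1 bit) differs from the row width (4 bits).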
Logical View: Cascading Memory-Blocks
How to make larger memory blocks out of smaller ones.
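A minimal sketch of cascading, assuming a hypothetical 64x8 building block: two blocks side by side widen the word to 16 bits, and a second pair selected by the top address bit doubles the depth to 128. The module and port names are illustrative only.

```verilog
// Hypothetical 64x8 building block: synchronous write, asynchronous read.
module ram64x8 (
    input  wire       clk,
    input  wire       we,
    input  wire [5:0] a,
    input  wire [7:0] di,
    output wire [7:0] dout
);
    reg [7:0] mem [0:63];
    always @(posedge clk) if (we) mem[a] <= di;
    assign dout = mem[a];
endmodule

// 128x16 memory built from four 64x8 blocks.
module ram128x16 (
    input  wire        clk,
    input  wire        we,
    input  wire [6:0]  a,
    input  wire [15:0] di,
    output wire [15:0] dout
);
    wire        bank = a[6];   // top address bit picks the 64-deep bank
    wire [15:0] d0, d1;

    // Widening: blocks in a bank share the address; each holds one byte.
    ram64x8 b0_lo (.clk(clk), .we(we & ~bank), .a(a[5:0]), .di(di[7:0]),  .dout(d0[7:0]));
    ram64x8 b0_hi (.clk(clk), .we(we & ~bank), .a(a[5:0]), .di(di[15:8]), .dout(d0[15:8]));
    // Deepening: the write enable is gated per bank, the outputs are muxed.
    ram64x8 b1_lo (.clk(clk), .we(we &  bank), .a(a[5:0]), .di(di[7:0]),  .dout(d1[7:0]));
    ram64x8 b1_hi (.clk(clk), .we(we &  bank), .a(a[5:0]), .di(di[15:8]), .dout(d1[15:8]));

    assign dout = bank ? d1 : d0;
endmodule
```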
Multi-ported Memory
• Motivation:
– Consider CPU core register file:
• 1 read or write per cycle limits processor performance.
• Complicates pipelining. Difficult for different instructions to
simultaneously read or write regfile.
• Common arrangement in pipelined CPUs is 2 read ports and 1 write
port.
– I/O data buffering:
• Dual-porting allows both sides to simultaneously access memory
[Figure: dual-port memory with port a (Aa, Dina, WEa, Douta) and port b (Ab, Dinb, WEb, Doutb); data buffer between a disk or network interface and the CPU]
Dual-ported Memory Internals
• Add decoder, another set of read/write logic, bit lines, word lines.
• Example cell: SRAM
[Figure: dual-port SRAM cell with word lines WL1 and WL2]
Adding Ports to Primitive Memory Blocks
How to add a write port to a simple dual port memory.
Example: given 1Kx8 SDP, want 1 read & 2 write ports.
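One known way to get a second write port, which is not necessarily the construction drawn on the slide, is to give each write port its own SDP bank and keep a small "live value table" that records which bank was written most recently for each address. The sketch below assumes a hypothetical sdp_1kx8 block with an asynchronous read, and it assumes that port 1 wins when both ports write the same address in the same cycle.

```verilog
// Hypothetical simple dual-port (SDP) 1Kx8 block: one write port, one read port.
module sdp_1kx8 (
    input  wire       clk,
    input  wire       we,
    input  wire [9:0] wa,
    input  wire [7:0] wd,
    input  wire [9:0] ra,
    output wire [7:0] rd
);
    reg [7:0] mem [0:1023];
    always @(posedge clk) if (we) mem[wa] <= wd;
    assign rd = mem[ra];   // asynchronous read keeps the sketch short
endmodule

// 1 read port, 2 write ports, built from two SDP banks plus a live value table.
module mem_1r2w (
    input  wire       clk,
    input  wire       we0, input wire [9:0] wa0, input wire [7:0] wd0,
    input  wire       we1, input wire [9:0] wa1, input wire [7:0] wd1,
    input  wire [9:0] ra,
    output wire [7:0] rd
);
    wire [7:0] rd0, rd1;
    sdp_1kx8 bank0 (.clk(clk), .we(we0), .wa(wa0), .wd(wd0), .ra(ra), .rd(rd0));
    sdp_1kx8 bank1 (.clk(clk), .we(we1), .wa(wa1), .wd(wd1), .ra(ra), .rd(rd1));

    // lvt[i] = 1 means bank1 holds the newest value for address i.
    // The table itself needs 2 write ports, so it is assumed to live in flip-flops.
    reg [1023:0] lvt;
    always @(posedge clk) begin
        if (we0) lvt[wa0] <= 1'b0;
        if (we1) lvt[wa1] <= 1'b1;  // assigned last: port 1 wins a collision
    end

    assign rd = lvt[ra] ? rd1 : rd0;
endmodule
```

The cost is two full data banks plus 1K bits of table state, which is why the table is usually much narrower than the data it steers.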
Virtex-5 LX110T memory blocks:
• Distributed RAM using LUTs among the CLBs.
• Block RAMs in four columns.
SLICEL vs SLICEM ...
[Figure: SLICEL and SLICEM diagrams]
A SLICEM 6-LUT…
Example Distributed RAM (LUT RAM)
Example configuration: single-port 256b x 1, registered output.
Example Dual Port Configurations
Block RAM Timing
Inferring RAMs in Verilog
// 64X1 RAM implementation using distributed RAM
endmodule
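The body of the slide's 64x1 example did not survive extraction. A minimal sketch of the usual pattern is given below, with a hypothetical module name and port list: the write is synchronous and the read is asynchronous, and because the read address is not registered, synthesis tools typically map such a memory to LUT-based distributed RAM rather than block RAM. The exact inference rules depend on the tool.

```verilog
// Hypothetical 64x1 single-port RAM: synchronous write, asynchronous read.
// Module and port names are illustrative, not recovered from the slide.
module ram64x1_dist (
    input  wire       clk,
    input  wire       we,
    input  wire [5:0] a,
    input  wire       di,
    output wire       dout
);
    reg ram [63:0];            // 64 words of 1 bit

    always @(posedge clk)
        if (we) ram[a] <= di;  // write on the rising clock edge

    assign dout = ram[a];      // combinational read: address is not registered
endmodule
```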
Block RAM Inference
//
// Single-Port RAM with Synchronous Read
//
module v_rams_07 (clk, we, a, di, do);
input clk;
input we;
input [5:0] a;
input [15:0] di;
output [15:0] do;
reg [15:0] ram [63:0];
reg [5:0] read_a;
always @(posedge clk) begin
  if (we)
    ram[a] <= di;
  // Synchronous read (registered read address) infers Block RAM
  read_a <= a;
end
assign do = ram[read_a];
endmodule
Dual-Port Block RAM
module test (data0,data1,waddr0,waddr1,we0,we1,clk0, clk1, q0, q1);
assign q0 = mem[reg_waddr0];
assign q1 = mem[reg_waddr1];
endmodule
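Only the port list and the two read assignments of this module survived extraction. The sketch below shows one way such a dual-port memory is commonly written, reusing the port names and the reg_waddr0/reg_waddr1 read-address registers visible above. The module name, the data and address widths, and the contents of the two clocked blocks are assumptions, and whether a given tool maps two writing always blocks onto a true dual-port block RAM is tool-dependent.

```verilog
// Sketch only: widths and always-block bodies are assumed, not from the slide.
module dp_bram_sketch (data0, data1, waddr0, waddr1, we0, we1, clk0, clk1, q0, q1);
    parameter D_WIDTH = 8;   // assumed data width
    parameter A_WIDTH = 8;   // assumed address width

    input  [D_WIDTH-1:0] data0, data1;
    input  [A_WIDTH-1:0] waddr0, waddr1;
    input                we0, we1, clk0, clk1;
    output [D_WIDTH-1:0] q0, q1;

    reg [D_WIDTH-1:0] mem [0:(1<<A_WIDTH)-1];
    reg [A_WIDTH-1:0] reg_waddr0, reg_waddr1;

    // Port 0: write on clk0 and register the address used for reading.
    always @(posedge clk0) begin
        if (we0) mem[waddr0] <= data0;
        reg_waddr0 <= waddr0;
    end

    // Port 1: same structure on its own clock.
    always @(posedge clk1) begin
        if (we1) mem[waddr1] <= data1;
        reg_waddr1 <= waddr1;
    end

    // Registered read addresses make both reads synchronous.
    assign q0 = mem[reg_waddr0];
    assign q1 = mem[reg_waddr1];
endmodule
```

Each port uses a single address for both reading and writing, so a port reads back the location it last addressed one clock edge earlier.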
XUP Board External SRAM
“ZBT” synchronous SRAM, 9 Mb on 32-bit data bus, with four “parity” bits
256K x 36 bits
(located under the removable LCD)