Lecture20 Memory PDF
edu/~eecs151
EECS151/251A L20 MEMORY AND CLOCK Nikolić, Shao Fall 2019 © UCB 1
Review
• SRAM and regfile cells can have multiple R/W ports
• Memory decoding is done hierarchically
• Wire-limited in large arrays
• Multiple cache levels make memory appear both fast and big
• Direct mapped and set-associative cache
ASIC Memories
ASIC Memory Compilers
• Memory compiler produces front-end views (similar to standard cells, but really large ones)
FPGA Memories
Verilog RAM Specification
//
// Single-Port RAM with Asynchronous Read
//
module ramBlock (clk, we, a, di, do);
  input clk;
  input we;                    // write enable
  input [19:0] a;              // address (2^20 locations)
  input [7:0] di;              // data in
  output [7:0] do;             // data out
  reg [7:0] ram [1048575:0];   // 1Meg words x 8 bits
  always @(posedge clk) begin  // Synchronous write
    if (we)
      ram[a] <= di;
  end
  assign do = ram[a];          // Asynchronous read
endmodule
// "data.dat" contains the initial RAM contents; it is put into the
// bitfile and loaded at configuration time.
// (Remake the bitfile to change the contents.)
initial
begin
  $readmemb("data.dat", mem);
end
always @(posedge CLK)
  read_addr <= ADDR;
endmodule
First-in-first-out (FIFO) Memory
• Used to implement queues.
• These find common use in processor and communication circuits.
• Generally, used to “decouple” actions of producer and consumer:
  starting state: c b a
  after write: d c b a
  after read: d c b
• Producer can perform many writes without consumer performing any reads (or vice versa). However, because of finite buffer size, the number of reads and writes must be equal on average.
• Typical uses:
– Interfacing I/O devices. Example: network interface. Data bursts from network, then processor bursts to memory buffer (or reads one word at a time from interface). Operations not synchronized.
– Example: Audio output. Processor produces output samples in bursts (during process swap-in time). Audio DAC clocks it out at constant sample rate.
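The queue behavior described above can be sketched as a small single-clock FIFO. This is a minimal illustration, not the course's reference design: the module name `sync_fifo`, its port names, and the parameters `WIDTH` and `ADDR_BITS` are all assumptions made for this example. It uses an extra wrap bit on each pointer to distinguish full from empty, which is one common technique among several.

```verilog
// Minimal single-clock FIFO sketch. All names are illustrative assumptions.
module sync_fifo #(
  parameter WIDTH     = 8,   // bits per word
  parameter ADDR_BITS = 4    // depth = 2**ADDR_BITS words
) (
  input              clk,
  input              rst,    // synchronous, active-high reset
  input              wr_en,  // producer requests a write
  input  [WIDTH-1:0] din,
  input              rd_en,  // consumer requests a read
  output [WIDTH-1:0] dout,   // word at the head of the queue
  output             full,
  output             empty
);
  reg [WIDTH-1:0] mem [0:(1<<ADDR_BITS)-1];

  // Pointers carry one extra (wrap) bit so full and empty can be told apart.
  reg [ADDR_BITS:0] wr_ptr, rd_ptr;

  assign empty = (wr_ptr == rd_ptr);
  assign full  = (wr_ptr[ADDR_BITS] != rd_ptr[ADDR_BITS]) &&
                 (wr_ptr[ADDR_BITS-1:0] == rd_ptr[ADDR_BITS-1:0]);

  // Asynchronous read of the oldest word.
  assign dout = mem[rd_ptr[ADDR_BITS-1:0]];

  always @(posedge clk) begin
    if (rst) begin
      wr_ptr <= 0;
      rd_ptr <= 0;
    end else begin
      if (wr_en && !full) begin          // writes are ignored when full
        mem[wr_ptr[ADDR_BITS-1:0]] <= din;
        wr_ptr <= wr_ptr + 1'b1;
      end
      if (rd_en && !empty)               // reads are ignored when empty
        rd_ptr <= rd_ptr + 1'b1;
    end
  end
endmodule
```

With the default parameters, the producer can burst up to 16 writes before `full` asserts, and the consumer can drain them later at its own pace, which is the decoupling the slide describes.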
FIFO Interfaces
A SLICEM 6-LUT ...
[Figure: SLICEM 6-LUT with callouts for the normal 6-LUT inputs, the normal 5/6-LUT outputs, and the memory data input.]
DRAM
3-Transistor DRAM Cell
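[Figure: 3T DRAM cell schematic with transistors M1, M2, M3, storage node X, storage capacitance CS, word lines WWL and RWL, and bit lines BL1 and BL2. Waveform labels: WWL, RWL, X (VDD - VT), BL1 (VDD), BL2 (VDD, ΔV).]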
Example Clock System
Clock Distribution
H-tree
[Figure: H-tree distributing CLK] [Restle98]
[Figure: clock drivers feeding the global clock (GCLK)]
What Happens When We Un-Gate Clock?
Crossing Clock Domains
• Two domains at different frequencies exchange wdata, rdata
• FIFO with two clocks
http://www.sunburst-design.com/papers/CummingsSNUG2002SJ_FIFO1.pdf
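One ingredient of a two-clock FIFO, as described in the Cummings paper linked above, is passing a pointer from one clock domain to the other as a Gray code through a two-flop synchronizer. The sketch below shows only that step for the write pointer. The module name `ptr_sync`, its ports, and `PTR_BITS` are assumptions made for this illustration, and reset logic is omitted for brevity.

```verilog
// Sketch: pass a FIFO write pointer into the read clock domain.
// Module and signal names are illustrative assumptions; no reset is shown.
module ptr_sync #(
  parameter PTR_BITS = 5
) (
  input                     wclk,
  input                     rclk,
  input      [PTR_BITS-1:0] wptr_bin,       // binary write pointer (wclk domain)
  output reg [PTR_BITS-1:0] wptr_gray_rclk  // Gray pointer seen in the rclk domain
);
  // Binary-to-Gray conversion: successive values differ in exactly one bit,
  // so a sample taken mid-transition is either the old or the new value.
  wire [PTR_BITS-1:0] gray_next = wptr_bin ^ (wptr_bin >> 1);

  // Register the Gray value in the source domain so the crossing signal
  // comes straight from flip-flops, not from combinational logic.
  reg [PTR_BITS-1:0] wptr_gray;
  always @(posedge wclk)
    wptr_gray <= gray_next;

  // Two-flop synchronizer in the destination domain.
  reg [PTR_BITS-1:0] sync1;
  always @(posedge rclk) begin
    sync1          <= wptr_gray;
    wptr_gray_rclk <= sync1;
  end
endmodule
```

The read side would compare its own Gray-coded pointer against `wptr_gray_rclk` to derive an empty flag; a mirrored instance would carry the read pointer into the write domain for the full flag.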
Summary
• Memory compilers generate SRAM blocks
• Several options for memory on FPGAs: Distributed, BlockRAM, UltraRAM
• Clock generation and distribution is a major part of digital system design
• We just touched on it