ddr_controller_spec
Gisselquist Technology, LLC
WB DDR3 SDRAM
CONTROLLER
SPECIFICATION
August 2, 2016
Gisselquist Technology, LLC Specification 2016/08/02
Revision History
Rev. Date Author Description
0.0 8/02/2016 D. Gisselquist (Pre-release) Initial Version
Contents
1 Introduction
2 Architecture
2.1 Data Structures
2.2 Strategies
2.2.1 Bank
2.2.2 Refresh
3 Operation
4 Clocks
5 Wishbone Datasheet
6 I/O Ports
Preface
Now, just why am I building this? Because Wishbone has been so good to me? Because I’ve never used
AXI? Because I dislike not being able to see what goes on within a memory controller, and having
no insight into why its performance is as fast (or slow) as it is? Because Xilinx allows you to
open only 4 banks at a time? Or is it because, when I went to purchase my first high speed FPGA circuit
board, the vendor offered me the opportunity to purchase a DMA controller with it? As a micro
businessman, I really can’t afford to use someone else’s stuff. Time is cheap; money isn’t nearly so
cheap.
Hence, I offer my work to you as well. I hope you find it useful. Of course, the normal caveats
apply: I am available for hire, and I would be happy to modify this core, or even the license
it is distributed under, for an appropriate incentive.
1. Introduction
The purpose of this core is to provide a GPL Wishbone core capable of commanding a DDR3
memory at full speed. A particular design goal is that each consecutive read or write should take
only one additional clock.
The DDR3 memory specification is dated August 2009, and memory chips have been
built to it ever since. However, because DDR3 SDRAMs are rather complex and require a lot
of work to manage, controllers for them remain primarily proprietary.
Currently, there are no DDR3 controllers present on OpenCores. Sure, there’s a project named
“DDR3 SDRAM controller”, yet it has no data files present with it. This leaves FPGA engineers
with the choice of building a controller for a very complex interface, or using a proprietary core from
Xilinx’s Memory Interface Generator, which offers no insight into how it works, and then
retooling their bus from Wishbone to AXI.
This core is designed to meet that need: it is both open (GPL) and Wishbone compliant.
Further, this core offers 32-bit granularity to an interface that would otherwise offer only 128-bit
granularity. This core also offers complete pipelined performance. Because of this pipelined
performance, the core is well suited to filling cache lines. Because the core also supports non-pipelined
access, it is also appropriate for random access from a CPU, whether through a write-through cache
or from a CPU working without a cache.
2. Architecture
2.2 Strategies
2.2.1 Bank
Currently, banks are activated (opened) when needed and precharged (closed) only upon a refresh
request. Further, upon any read or write from one bank, the next bank is activated as well, under
the assumption that it will be needed soon. This is necessary to allow pipelined access
with no stalls through the memory controller.
This means that, upon any bank miss, a bank precharge command followed by a bank activate
command will be necessary.
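The bank strategy can be sketched as a small behavioral model. This is only an illustration in Python, not the core's logic: the class name, the command tuples, and the rule for choosing the row of the speculatively opened bank (same row, or the following row when wrapping from the last bank back to bank 0) are all assumptions made for the example. The model also assumes that the speculative activate follows the same precharge-then-activate rule as a real access.

```python
N_BANKS = 8  # DDR3 devices have eight banks


class BankTracker:
    """Hypothetical model: tracks which row, if any, is open in each bank."""

    def __init__(self):
        # None means the bank is precharged (closed).
        self.open_row = [None] * N_BANKS

    def _open(self, bank, row):
        """Return the commands needed so that `row` is open in `bank`."""
        if self.open_row[bank] == row:
            return []  # bank hit: nothing to do
        cmds = []
        if self.open_row[bank] is not None:
            # Bank miss: another row is open, so close it first.
            cmds.append(("PRECHARGE", bank))
        cmds.append(("ACTIVATE", bank, row))
        self.open_row[bank] = row
        return cmds

    def access(self, bank, row):
        """Commands issued around a read or write to (bank, row)."""
        cmds = self._open(bank, row)
        # Speculatively open the next bank as well (row choice is an assumption).
        next_bank = (bank + 1) % N_BANKS
        next_row = row + 1 if next_bank == 0 else row
        cmds += self._open(next_bank, next_row)
        return cmds

    def refresh(self):
        """A refresh request is the only time every bank gets closed."""
        self.open_row = [None] * N_BANKS
        return [("PRECHARGE_ALL",), ("REFRESH",)]
```

With this model, a first access to bank 0 activates banks 0 and 1, so a following access to bank 1 in the same row finds its bank already open and only has to activate bank 2.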
2.2.2 Refresh
The current build pauses all operations to issue four consecutive refreshes roughly once every
four refresh intervals, and then allows operations to resume. This pause takes place regardless of
what else is going on, and includes a mandatory wait for any writes to finish, followed by a
precharge command, whether or not one is required.
This is non-optimal, and ripe for later optimization. A better strategy might be to issue a single
refresh after each refresh period if the bus is free, to issue a precharge only if the bus
is busy, and to wait before that precharge only if a write is in progress.
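To make the difference between the two strategies concrete, here is a minimal sketch in Python. It is a literal reading of the two paragraphs above and nothing more: the function names and step labels are hypothetical, and the core itself implements only the first strategy.

```python
def current_refresh_plan():
    """Current build: always wait for writes, always precharge, then refresh x4."""
    return ["wait_for_writes", "precharge"] + ["refresh"] * 4


def proposed_refresh_plan(bus_busy, write_busy):
    """Possible later optimization: one refresh per period, with no needless steps."""
    steps = []
    if bus_busy:
        if write_busy:
            # Only wait when a write is actually in progress.
            steps.append("wait_for_writes")
        # Only precharge when the bus was in use.
        steps.append("precharge")
    steps.append("refresh")
    return steps
```

On an idle bus the proposed plan reduces to a lone refresh, whereas the current plan always spends time on the wait and the precharge.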
3. Operation
When accessed from within an FPGA, this core should be simple to use: Raise the i_wb_cyc line
at the beginning of every transaction. Set i_wb_stb (transaction strobe), i_wb_we (write enable, true
if writing and false otherwise), i_wb_addr (address of value), and i_wb_data for every transaction. You
may move to the next transaction any time i_wb_stb is true on the same clock that o_wb_stall is
false. Transactions will be pipelined internally. When o_wb_ack is true, a transaction has completed.
If that transaction was a read, o_wb_data will also hold the data read from the
memory device.
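The handshake rules above can be illustrated with a small cycle-by-cycle model. This is a sketch in Python rather than HDL, and it makes several assumptions that are not part of this specification: the slave is modeled with a fixed latency of three clocks, the stall pattern is supplied by hand, and the function name run_bus is hypothetical. The only rules taken from the text are that a request is accepted when i_wb_stb is true while o_wb_stall is false, and that each accepted request is later answered by one o_wb_ack.

```python
from collections import deque


def run_bus(requests, stall_on, latency=3):
    """Simulate pipelined transactions with i_wb_cyc held high throughout.

    requests: list of (we, addr, data) tuples issued in order.
    stall_on: set of clock numbers on which the slave raises o_wb_stall.
    latency:  assumed clocks from acceptance to o_wb_ack (must be >= 1).
    Returns a list of (clock, o_wb_data) pairs, one per ack (None for writes).
    """
    memory = {}
    in_flight = deque()  # accepted requests waiting for their ack
    acks = []
    next_req = 0
    cycle = 0
    while next_req < len(requests) or in_flight:
        # Slave side: ack the oldest request once its latency has elapsed.
        if in_flight and in_flight[0][0] == cycle:
            _, we, addr, data = in_flight.popleft()
            if we:
                memory[addr] = data
                acks.append((cycle, None))
            else:
                acks.append((cycle, memory.get(addr, 0)))
        # Master side: hold the request until strobe is high and stall is low.
        i_wb_stb = next_req < len(requests)
        o_wb_stall = cycle in stall_on
        if i_wb_stb and not o_wb_stall:
            we, addr, data = requests[next_req]
            in_flight.append((cycle + latency, we, addr, data))
            next_req += 1  # free to present the next transaction
        cycle += 1
    return acks
```

Note that the master never waits for an ack before presenting its next request. It only holds a request in place while the stall line is high, which is what lets several transactions be in flight at once.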
4. Clocks
This design is centered around a DDR3-1600 chip. Running this chip at speed requires a
200 MHz clock. Xilinx recommends a 160 MHz clock for their design, so this design should work at
slower rates; I just don’t know how much slower the clock can run before the design stops working.
If you wish to slow down the design, adjust the parameter CKREFI4 to be the number of clocks
expected in four times 7.8 µs.
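As a worked example of that arithmetic, the following Python sketch computes a candidate CKREFI4 value for a given clock rate. The helper name and its keyword arguments are hypothetical, and rounding down (so that the refresh burst never comes late) is an assumption of this sketch; check the value against the core's source before relying on it.

```python
def ckrefi4(clock_hz, refresh_interval_ns=7800, refreshes=4):
    """Number of whole clocks in `refreshes` refresh intervals of 7.8 us each."""
    # Integer arithmetic avoids floating point rounding surprises.
    return (refreshes * refresh_interval_ns * clock_hz) // 1_000_000_000


print(ckrefi4(200_000_000))  # 6240 clocks at the design's 200 MHz
print(ckrefi4(160_000_000))  # 4992 clocks at 160 MHz
```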
5. Wishbone Datasheet
Tbl. 5.1 is required by the Wishbone specification, and so it is included here:

Description                  Specification
Revision level of wishbone   WB B4 spec
Type of interface            Slave, Read/Write, pipeline mode supported
Port size                    32-bit
Port granularity             32-bit
Maximum Operand Size         32-bit
Data transfer ordering       (Irrelevant)
Clock constraints            Designed for 200 MHz, DDR3-1600
Signal Names                 Signal Name   Wishbone Equivalent
                             i_wb_clk      CLK_I
                             i_wb_cyc      CYC_I
                             i_wb_stb      STB_I
                             i_wb_we       WE_I
                             i_wb_addr     ADR_I
                             i_wb_data     DAT_I
                             o_wb_ack      ACK_O
                             o_wb_stall    STALL_O
                             o_wb_data     DAT_O

The big thing to notice is that all accesses to the DDR3 SDRAM memory are via 32-bit reads and
writes to this interface.
You may also wish to note that the memory interface supports pipelined reading and writing, to speed
up any transfers. As a result, the memory interface speed should approach one transfer per clock
once the pipeline is loaded, although there will be delays loading the pipeline. Other than during
refresh cycles, once the pipeline is loaded it will sustain one transfer per clock for as long
as it is fed at that speed.
Further, the Wishbone specification this core communicates with has been simplified in this
manner: the STB_I signal has been constrained so that it will only be true if CYC_I is also true.
To use this core in an environment without this requirement, simply create i_wb_stb by
ANDing STB_I with CYC_I before sending the strobe into the core.
6. I/O Ports
The wishbone ports to this core were discussed in the last chapter, and shown in Tbl. 5.1. The rest
of the I/O ports to this core are listed in Tbl. 6.1.