Memory Controller For A 6502 CPU in VHDL: Michel Wilson, 1047981


Memory controller for a 6502 CPU in VHDL

Final report for in3019p ‘Bachelor project’


TU Delft, faculty of EEMCS

Michel Wilson, 1047981


michel@crondor.net

May 2006

Project client Embedded Software Lab, TU Delft


Client contact A.J.C. van Gemund
Coordinator B.R. Sodoyer
Abstract

The 6502 soft core implemented in VHDL on an FPGA development board as used by the
Embedded Software Lab of the Delft Technical University currently only uses on-chip RAM. This
leads to a shortage of memory in certain situations. To solve this problem, a memory controller
for the soft core which enables access to the external RAM chips available on the development
board has to be developed. This project describes the design and the development of such a
controller, and an accompanying program to test the working of the controller and the memory.
Contents

1 Introduction
    1.1 About the project
    1.2 About this document

2 Action plan
    2.1 Introduction
        2.1.1 Motivation
        2.1.2 Approval and adjustment
    2.2 Project description
        2.2.1 Project goals
        2.2.2 Milestones and deliverables
        2.2.3 Requirements and constraints
    2.3 Project phases
        2.3.1 Requirements analysis
        2.3.2 Design decisions
        2.3.3 Development
        2.3.4 Testing
        2.3.5 Final product

3 Architecture and design decisions
    3.1 Introduction
    3.2 Existing situation
    3.3 New situation
    3.4 Memory controller
        3.4.1 Address space distribution
        3.4.2 Addressing modes
        3.4.3 Timing
    3.5 Address translator
    3.6 Memory test algorithm
        3.6.1 Addressing test
        3.6.2 Data test

4 Code documentation
    4.1 VHDL code
        4.1.1 Toplevel
        4.1.2 Memory management unit
        4.1.3 Memory controller
        4.1.4 I/O controller
    4.2 Memory test program

5 Conclusion and recommendations

A Working with the FPGA under Linux
    A.1 USB to Serial Converter
    A.2 Xilinx software

B External memory timing diagrams

Bibliography
Chapter 1

Introduction

1.1 About the project


The Embedded Software Lab of the Delft Technical University, which is part of the Software
Engineering Department, uses a 6502 soft core, implemented in VHDL on a Digilent Spartan-3
Development Board. A limitation of the soft core is the limited amount of available memory.
The current design only uses the RAM available on the FPGA chip. The aim of the project is to
design and develop a memory controller for the soft core, which enables access to the external
RAM chips on the development board. A secondary goal of the project is to create a program for
the 6502 which can assist in testing the integrity of the memory controller, and the external RAM.
The project is completed in connection with the requirements of the course in3019p, ‘Bachelor
Project’.

1.2 About this document


In this document, the final result of the project is presented, along with the road towards this final
result. In three chapters, the documents which were produced during the project are included.
The action plan describes the goal of the project, and its phases; the architecture and design decisions
describes the technical implementation of the controller on a high level; and finally, the technical
documentation describes the source code of the final product in detail. The final chapter contains
some concluding remarks about the project. Lastly, some remarks and instructions about working
with the FPGA board and the Xilinx software under Linux are included in the appendix.

Chapter 2

Action plan

2.1 Introduction
2.1.1 Motivation
Currently, a 6502 soft core is used in the Embedded Software lab, which is implemented in VHDL
on a Digilent Spartan-3 Development Board, using the Xilinx ISE WebPACK design software ([3]).
The soft core is used for research, and for the course in4073 (Embedded Real-Time Systems). All
programs using this core are severely constrained by the limited amount of available memory in
the FPGA chip on the Development Board. The primary goal of the project is therefore to increase
the amount of memory available to the soft core, using the external memory available on the
development board.

2.1.2 Approval and adjustment


The action plan should be approved by the client. During the work on the project, progress should
be reported. Any adjustments to the action plan will be contained in the progress reports. Upon
approval of the progress reports, the original action plan is implicitly adjusted and approved as
well. The actual action plan thus consists of the original document and all progress reports.

2.2 Project description


2.2.1 Project goals
The single goal of the project is increasing available memory of the soft core. To attain this goal
a memory management unit (MMU) should be developed which will act as an interface between
the soft core and the external RAM which is available on the development board. The MMU
should be a reusable component. As the available external memory is larger than the address
space of the current soft core, some form of bank switching should be implemented.
Secondly, a test program should be built, demonstrating the functionality of the MMU and
validating that the component is indeed working. The test program should provide a reliable
indication if the MMU and memory are working or faulty.

2.2.2 Milestones and deliverables


The first milestone in the project is the specification of the architecture of the system. This
milestone is reached after the various design decisions are made. The deliverable is the VHDL
entity description, accompanied by a document detailing the design decisions made and the
rationale behind them. In this document, the methodology of the test program should also be
described, together with an explanation why this is a valid testing method.
The second and final milestone is reached upon completion of the project. The main deliverable
is the VHDL source code for the memory management unit. It should consist of a general
entity which communicates with the external memory, and a translation unit which maps the
memory into the address space of the soft core, including the handling of bank switching. Clear
documentation of the usage, but also of the inner workings, of these entities should be provided.

The other deliverable is the C source code of the test program, also including documentation,
explaining how to use the program, and explaining why the test procedure is deemed to be an
accurate indication of the working or failure of the MMU and/or the memory.

2.2.3 Requirements and constraints


The main requirement for the project is the reliability. This requirement has priority over all other
requirements. If some implementation of another requirement endangers the reliability of the
project the offending requirement should be implemented in a different fashion, not endangering
the primary requirement, or dropped altogether. The project’s conformance to this requirement
can be checked using the test program. This also implies that the reliability (correctness) of the test
program is a main requirement. The conformance to this requirement should be demonstrated
by an in-depth analysis of the method the test program uses to perform its function.
Further requirements of the memory management unit are adequate performance, and small
resource usage. A suitable balance of these two has to be found, but in general it should be
sufficient to be able to always either read or write a memory location without introducing wait
cycles in the processor. The total resource usage of the memory management unit should be kept
within reasonable bounds, but it should not be necessary to use extreme measures to obtain the
lowest possible resource usage.

2.3 Project phases


2.3.1 Requirements analysis
The requirements and constraints for the design are specified, in consultation with the client. Most
requirements/constraints are already specified in this document, so this is not really a separate
phase.

2.3.2 Design decisions


Decisions are made on specific implementation details, and documented including the rationale
behind them. In particular, the (external) interfaces of the VHDL entities can be specified in this
phase. Also, decisions are made on how to interface with the external memory.

2.3.3 Development
The actual implementation takes place in this phase. If any design decisions from the previous
phase are found to be infeasible, an alternative decision is to be sought, in consultation with the
client. The documentation of the previous phase should be updated in this case.

2.3.4 Testing
All implemented systems are tested, to confirm that they indeed perform according to the
specification. Any inconsistencies are to be corrected.

2.3.5 Final product


The final product consists of the deliverables as mentioned above, and a report on the development
process, including all documents which are produced.

Chapter 3

Architecture and design decisions

3.1 Introduction
In this chapter, the proposed architecture of the memory management unit extension for the
6502 softcore will be described, along with the design decisions which were made to reach this
proposal. Also, the choice of the testing algorithms will be discussed. The VHDL code was
developed using the Xilinx ISE WebPACK design software, version 7.1i, under Linux ([3]).
As there is a clear distinction between interfacing with the external memory, and interfacing
with the rest of the processor, the memory controller will be divided into a part which accesses
the external memory, and a part which translates the memory address. These components will
be discussed in detail, below. Lastly, the test program will be discussed.

3.2 Existing situation


In the existing situation, the system consists of the CPU itself, and the memory management unit.
Both are connected to the central clock. From the central clock, the cpu_halt signal is derived,
which has half the clock speed. This signal controls the activation of the CPU and the MMU.
When the CPU is not enabled, the MMU is enabled, and vice versa.
The interface between the CPU and the MMU consists of an address bus, read/write signals,
and data busses in both directions. The MMU partitions the address space in blocks of 4kB, into
which separate components are mapped. For the mapping in the existing situation, see table 3.1.
Essentially, the MMU functions as a kind of multiplexer.

[Block diagram: CPU, MMU, RAM blocks, I/O and ROM; signals: address (bits 15..12 and 11..0), read flag, write flag, data]

Figure 3.1: Block diagram of the existing situation

[Block diagram: CPU, MMU, RAM blocks, I/O, ROM, memory controller and external memory; signals: address (bits 15..12 and 11..0), read flag, write flag, data]

Figure 3.2: New situation

3.3 New situation


In the new situation, an extra component is added to the MMU, which can address the external
memory chips. This component is mapped to several address blocks, from 5000h to CFFFh. The
new mapping is also displayed in table 3.1.
As the external memory is bigger than the available address space, bank switching is used.
To select the bank to map to the internal address space, a register is added to the I/O component,
at address 500h, which is subsequently mapped to E500h. Writing to this register switches to
the specified portion of the external memory. Reading the register gives the currently mapped
memory bank.
The setup of the MMU dictates the signals of the memory controller on the side of the MMU.
The same inputs and outputs are used as on the other components; see table 3.2 for the details. The
only difference is in the size of the address signal. The existing components have an address
signal of 12 bits, since they are mapped into a region of 4kB. As the memory controller is mapped
into 8 such regions, 15 bits are needed for this. Furthermore, the memory controller needs to
know the current page, from the I/O controller. For simplicity, the 5 bits of the page address are
appended to the address signal, resulting in an address signal of 20 bits.
The setup of the memory chips dictates the signals of the memory controller on the other side.
The board contains two memory chips. There is an enable signal for each chip, a signal to select
the upper or lower byte, and the data lines. Shared among the chips is the output enable, the
write signal, and the address. See table 3.3 for more details.

3.4 Memory controller


The task of the memory controller is interfacing with the external memory. It takes an address,
a direction (read or write), and the actual data, and reads/writes it to the external memory. All
the details of how the memory should be accessed, like timing, are taken care of by the memory
controller. More information about the memory chips can be found in the datasheet (see [1]).

Address Existing situation New situation
0000h internal RAM internal RAM
1000h internal RAM internal RAM
2000h internal RAM internal RAM
3000h internal RAM internal RAM
4000h internal RAM internal RAM
5000h unused external RAM
6000h unused external RAM
7000h unused external RAM
8000h unused external RAM
9000h unused external RAM
A000h unused external RAM
B000h unused external RAM
C000h unused external RAM
D000h unused unused
E000h I/O I/O
F000h ROM ROM

Table 3.1: Address space of the 6502 softcore
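
The new mapping in table 3.1 can be restated as a small C function, which may be useful when writing test software on a host. This is only a sketch of the table: the names region_t and region_of are hypothetical and do not appear in the project sources.

```c
#include <assert.h>
#include <stdint.h>

/* Regions of the 6502 address space in the new situation (table 3.1).
 * The type and function names are illustrative assumptions. */
typedef enum {
    REGION_INTERNAL_RAM,
    REGION_EXTERNAL_RAM,
    REGION_UNUSED,
    REGION_IO,
    REGION_ROM
} region_t;

/* Classify a CPU address by address bits 15..12, the same bits the MMU
 * uses to select a component for each 4 kB block. */
region_t region_of(uint16_t cpu_addr)
{
    unsigned block = cpu_addr >> 12;   /* 0x0 .. 0xF */

    assert(block <= 0xF);
    if (block <= 0x4) return REGION_INTERNAL_RAM;   /* 0000h-4FFFh */
    if (block <= 0xC) return REGION_EXTERNAL_RAM;   /* 5000h-CFFFh */
    if (block == 0xD) return REGION_UNUSED;         /* D000h-DFFFh */
    if (block == 0xE) return REGION_IO;             /* E000h-EFFFh */
    return REGION_ROM;                              /* F000h-FFFFh */
}
```

For example, region_of(0xE500) yields REGION_IO, which is where the bank select register is mapped.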

Signal Function
CLK Central 50MHz clock
halt Enable/disable the controller, see also timing section
read_enable Read the selected memory address
write_enable Write the selected memory address
address (19 downto 0) Memory address
data_in (7 downto 0) Data to memory
data_out (7 downto 0) Data from memory

Table 3.2: Signals for the memory controller, MMU side

Signal                        Function
mem_enable_1, mem_enable_2    Chip enable signals
mem_upper_1, mem_upper_2      Enabling/disabling of the upper output byte of the memory chips
mem_lower_1, mem_lower_2      Enabling/disabling of the lower output byte of the memory chips
mem_out_enable                Output enable
mem_wr_enable                 Write enable
mem_address (17 downto 0)     Memory address
mem_data_1 (15 downto 0),     Data lines
mem_data_2 (15 downto 0)

Table 3.3: Signals for the memory controller, external side

3.4.1 Address space distribution
As there are two different memory chips, the addresses can be distributed in different ways. The
two choices are to distribute the data linearly over the two chips, or to stripe the data across the
two chips with some block size. The striping approach can be used to increase performance when
accessing the memory sequentially. In this case however, the memory speed is high enough to
achieve what we want, so striping is not necessary. Therefore, the data is distributed linearly over
the two chips, using the MSB to select the chip to use.
As the chips have a 16 bits wide data path, the LSB is used to drive the signals which select
the upper/lower byte, and to select the required data signals. The MSB is used to select one of the
two memory chips. A 4-way multiplexer is thus needed, using the MSB and the LSB to select one
out of four 8 bit words from the data lines.
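
The split of the 20-bit address described above can be written out as a short C model. It is a sketch for illustration only: the struct and function names are assumptions, and the bit positions follow the description in this section (MSB selects the chip, LSB selects the byte, the remaining 18 bits address a 16-bit word).

```c
#include <assert.h>
#include <stdint.h>

/* Where a byte lives in the two 16-bit wide memory chips.
 * The type and field names are hypothetical. */
typedef struct {
    unsigned chip;    /* 0 = first chip, 1 = second chip (address bit 19) */
    unsigned upper;   /* 0 = lower byte, 1 = upper byte  (address bit 0)  */
    uint32_t word;    /* 18-bit word address             (bits 18..1)     */
} sram_location_t;

/* Decode a 20-bit controller address into chip, byte lane and word address. */
sram_location_t decode_address(uint32_t addr)
{
    sram_location_t loc;

    assert(addr < (1UL << 20));          /* only 20 address bits exist */
    loc.chip  = (addr >> 19) & 1u;       /* MSB: linear split over the chips */
    loc.upper = addr & 1u;               /* LSB: byte within the 16-bit word */
    loc.word  = (addr >> 1) & 0x3FFFFu;  /* remaining 18 bits */
    return loc;
}
```

With this split, consecutive byte addresses alternate between the lower and upper byte of the same word, and the second chip is only used for the upper half of the address range.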

3.4.2 Addressing modes


The memory chips can be addressed in several modes. Due to the fact that the CPU uses 8 bits,
and the memory uses 16 bits, using the mem_upper/mem_lower signals to control the memory is
necessary when writing. Write cycle 4 is thus the most logical choice (see figure B.2 on page 25).
When reading, these signals are not really needed, because we are using a multiplexer anyway, to
select the right data bits. Since there is no reason (and no way) to switch off the memory between
reads, read cycle 1 is the only choice for reading (see figure B.1 on page 25).

3.4.3 Timing
In the existing system, the CPU and the memory system are enabled every other clock tick, using
the halt signal. For a memory read, this means we have one clock cycle of 20ns to do all the
work. For a write, we can also use the clock cycle in which the halt signal is high.
The memory has an access time of 10ns, making the above possible. The address is loaded at
the start of the clock cycle, and the multiplexer controlling the output is set. After 10ns, the data
becomes available. The multiplexer output should thus be connected directly to the data output,
with no latches or registers in between.
Writing is slightly more complicated. The data has to be set up 8ns before the falling edge
of the signals controlling the write (i.e. the lower/upper signals, the enable signal, and the write
enable signal). The data is written on the falling edge of one of the write controlling signals. This
means that two clock cycles are needed to write a byte of data. When halt is high, before the
write, the setup is done: the lower/upper signals, and the enable signal are directly connected to
the address bits driving them. The memory address is directly connected to the other address
bits. When halt goes low, the write signal is asserted, and the data inputs are connected to the
right output. The write is committed at the end of the cycle, when halt goes high again, by
de-asserting the write signal. An example write waveform can be seen in figure 3.3.

3.5 Address translator


The address translator translates addresses from the 6502 address space to addresses in the mem-
ory address space. With some thought, very little hardware is needed to make this translation.
First, a location in the address space of the 6502 must be chosen where we can place a bank of
external memory. The size of the bank is determined by the size of the address space available.
In this case, the space from 5000h to CFFFh is unused, so this location is used. The size of the
address space should be a power of 2, otherwise, not all of the memory can be used. The chosen
address space results in a bank size of 32kB. The 15 lowest address bits determine which byte we
want to access in the current bank, and can be directly connected to the controller. The MMU
should connect the various signals from the memory controller to the CPU when the address lies
in the chosen range.

[Timing diagram showing CLK, halt, write enable, address, data in, mem enable 1, mem lower 1, mem upper 1, mem address, mem wr enable and mem data 1]

Figure 3.3: An example write: an upper byte is written to memory chip 1. For conciseness, only
the signals for chip 1 are shown. Things to note are that the enable, lower, upper and address
signals are connected directly to the inputs. The write is controlled by the write enable signal, on
the clock, and the data is written to the outputs on the clock as well.

Five address bits remain; these are connected to a register which can be used to switch memory
banks. This register, in turn, is memory mapped to an available address using the (existing) I/O
mechanism. See also figure 3.4.

 19       15 14                    0
+-----------+-----------------------+
|   bank    |        address        |
+-----------+-----------------------+

Figure 3.4: Address composition
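
The address composition can be modelled in a few lines of C. This is a sketch under the assumptions stated in this section (a 5-bit bank number and the 15 lowest CPU address bits passed through unchanged); the function name external_address is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compose the 20-bit external memory address from the bank register and a
 * CPU address inside the 5000h-CFFFh window. The function name is an
 * illustrative assumption. */
uint32_t external_address(uint8_t bank, uint16_t cpu_addr)
{
    assert(bank < 32);                                  /* 5-bit bank register */
    assert(cpu_addr >= 0x5000 && cpu_addr <= 0xCFFF);   /* external RAM window */

    /* Bits 19..15 come from the bank register, bits 14..0 from the CPU. */
    return ((uint32_t)bank << 15) | (cpu_addr & 0x7FFFu);
}
```

Because only the 15 lowest bits are used, CPU addresses 8000h to CFFFh land on offsets 0000h to 4FFFh within the bank, and 5000h to 7FFFh land on offsets 5000h to 7FFFh. The window is therefore rotated inside the bank, but every byte of the 32kB bank is still reachable exactly once.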

3.6 Memory test algorithm


When testing the memory, two distinct tests must be performed. First, the addressing mechanism
has to be tested. And secondly, the data cells themselves have to be tested. Of course, the first test
is always slightly coupled with the second test, meaning that some data errors are also detected
in the addressing test. The ideas for the memory tests were taken from the memtest86 program
(see [2]).

3.6.1 Addressing test


The most basic (and very effective) addressing test is writing the address of the memory location
to each location. This ensures that every individual cell is addressed correctly. In our case, it is
slightly more complicated, because the addresses are larger than the data width of the memory.
The solution is to perform the test in three phases. First, bits 0-7 are written and read back. Then,
bits 8-14 are used, and lastly, bits 15-19 are used. This makes sure that all addresses are correct.
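
The three phases can be sketched in C against a simulated memory. This is not the project's test program: it runs on a host, treats the external RAM as one flat 20-bit address space instead of switching banks, and the names sim_mem, sim_addr_mask and address_test are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE (1UL << 20)                 /* assumed: 32 banks of 32 kB */

static uint8_t  sim_mem[MEM_SIZE];           /* host-side stand-in for the SRAM */
static uint32_t sim_addr_mask = 0xFFFFF;     /* hypothetical fault injection:
                                                clear a bit to model an address
                                                line that is stuck at 0 */

static void mem_write(uint32_t addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    sim_mem[addr & sim_addr_mask] = value;
}

static uint8_t mem_read(uint32_t addr)
{
    assert(addr < MEM_SIZE);
    return sim_mem[addr & sim_addr_mask];
}

/* Write part of each address to its own location and read it back, once
 * for bits 0-7, once for bits 8-14 and once for bits 15-19.
 * Returns the number of mismatches found. */
unsigned long address_test(void)
{
    static const unsigned shift[3] = { 0, 8, 15 };
    static const uint8_t  mask[3]  = { 0xFF, 0x7F, 0x1F };
    unsigned long errors = 0;
    uint32_t addr;
    int phase;

    for (phase = 0; phase < 3; phase++) {
        for (addr = 0; addr < MEM_SIZE; addr++)
            mem_write(addr, (uint8_t)((addr >> shift[phase]) & mask[phase]));
        for (addr = 0; addr < MEM_SIZE; addr++)
            if (mem_read(addr) != (uint8_t)((addr >> shift[phase]) & mask[phase]))
                errors++;
    }
    return errors;
}
```

In this model, a stuck address line makes two addresses share one cell, so the later write overwrites the earlier one and the phase covering that address bit reports mismatches.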

3.6.2 Data test


For data testing, the so-called moving inversion algorithm can be used. The basic idea is that
the memory is filled with a certain pattern. Then, the pattern is read, inverted and written back,
starting from the lowest address. When the highest address is reached, the test is done again, but
then starting from the highest address. This is a good way to detect problems due to adjacency
effects, i.e. writing of one memory location affects another location. Depending on the speed of
the test, and the time which we want to spend on testing, one or multiple patterns can be used.
The first candidate is a pattern of just zeroes. For more testing, a ’walking bit’ pattern sequence
can be used, in which each of the bits is tested, by setting only that bit to one and running the
whole test. This implies that the whole test has to be run eight times in sequence, which could
take a long time. The pattern sequence used is stored in an array. When a shorter test time is
required, the sequence can be trimmed.
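
A minimal C sketch of the moving inversion test follows, again against a simulated memory on a host rather than the real board. The pattern array holds the zero pattern followed by the eight walking-bit patterns described above; the names sim_mem, fault_addr, fault_bits, moving_inversions and data_test are assumptions for this example only.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE (1UL << 20)               /* assumed: 1 MB of external RAM */

static uint8_t  sim_mem[MEM_SIZE];         /* host-side stand-in for the SRAM */
static uint32_t fault_addr = MEM_SIZE;     /* hypothetical fault injection:
                                              no faulty cell by default */
static uint8_t  fault_bits = 0;            /* bits of that cell that read as 0 */

static void mem_write(uint32_t addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    sim_mem[addr] = value;
}

static uint8_t mem_read(uint32_t addr)
{
    uint8_t value = sim_mem[addr];

    if (addr == fault_addr)
        value &= (uint8_t)~fault_bits;     /* model stuck-at-0 data bits */
    return value;
}

/* One moving inversion run for a single pattern; returns the mismatch count. */
static unsigned long moving_inversions(uint8_t pattern)
{
    const uint8_t inverted = (uint8_t)~pattern;
    unsigned long errors = 0;
    uint32_t addr;

    for (addr = 0; addr < MEM_SIZE; addr++)        /* fill with the pattern */
        mem_write(addr, pattern);

    for (addr = 0; addr < MEM_SIZE; addr++) {      /* upward: check, invert */
        if (mem_read(addr) != pattern)
            errors++;
        mem_write(addr, inverted);
    }

    for (addr = MEM_SIZE; addr-- > 0; ) {          /* downward: check, invert */
        if (mem_read(addr) != inverted)
            errors++;
        mem_write(addr, pattern);
    }
    return errors;
}

/* Run the whole pattern sequence; trim the array for a shorter test. */
unsigned long data_test(void)
{
    static const uint8_t patterns[] = {
        0x00, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80
    };
    unsigned long errors = 0;
    unsigned i;

    for (i = 0; i < sizeof patterns / sizeof patterns[0]; i++)
        errors += moving_inversions(patterns[i]);
    return errors;
}
```

Keeping the patterns in one array matches the approach described above: shortening the test is a matter of removing entries from the array.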

Chapter 4

Code documentation

4.1 VHDL code


4.1.1 Toplevel
At the toplevel, the only change is in the connections from the MMU to the external memory
chips. First, external signals are added to connect to the memory chips:
    SRAM_ADDR : out std_logic_vector(17 downto 0);
    SRAM1_CE  : out std_logic;
    SRAM1_LB  : out std_logic;
    SRAM1_UB  : out std_logic;
    SRAM1_IO  : inout std_logic_vector(15 downto 0);
    SRAM2_CE  : out std_logic;
    SRAM2_LB  : out std_logic;
    SRAM2_UB  : out std_logic;
    SRAM2_IO  : inout std_logic_vector(15 downto 0);
    SRAM_OE   : out std_logic;
    SRAM_WE   : out std_logic;
And then, these signals are connected to the MMU:
    SRAM1_CE  => SRAM1_CE,
    SRAM1_LB  => SRAM1_LB,
    SRAM1_UB  => SRAM1_UB,
    SRAM1_IO  => SRAM1_IO,
    SRAM2_CE  => SRAM2_CE,
    SRAM2_LB  => SRAM2_LB,
    SRAM2_UB  => SRAM2_UB,
    SRAM2_IO  => SRAM2_IO,
    SRAM_OE   => SRAM_OE,
    SRAM_WE   => SRAM_WE,
    SRAM_ADDR => SRAM_ADDR,

4.1.2 Memory management unit


The memory management unit was changed to include the memory controller. Data lines to
the external memory chips are added, and connected to the memory controller, and the memory
controller itself is added to the internal address, data and control lines of the MMU. Also, the
page select register in the I/O controller is connected to the memory controller.

Entity description
The data lines to connect to the external memory chips are added to the entity description of the
MMU:
    SRAM_ADDR : out std_logic_vector(17 downto 0);
    SRAM1_CE  : out std_logic;
    SRAM1_LB  : out std_logic;
    SRAM1_UB  : out std_logic;
    SRAM1_IO  : inout std_logic_vector(15 downto 0);
    SRAM2_CE  : out std_logic;
    SRAM2_LB  : out std_logic;
    SRAM2_UB  : out std_logic;
    SRAM2_IO  : inout std_logic_vector(15 downto 0);
    SRAM_OE   : out std_logic;
    SRAM_WE   : out std_logic;

Internal signals
Internal signals are added to connect the external memory to the data bus, and to the control
signals:
    signal data_mem_to_cpu_ext: std_logic_vector(7 downto 0);
    signal read_flag_ext : std_logic;
    signal write_flag_ext : std_logic;
And an internal signal is added to connect the page select register in the I/O controller to the
memory controller:
    signal extram_page : std_logic_vector(4 downto 0) := (others => '0');

Memory controller component declaration


The memory controller is declared, and connected to the external memory signals, and to the
internal data, address and control lines. The lower 15 bits of the address are connected to the
internal address bus, and the upper 5 bits are connected to the page select register:
    extram: entity memctrl(arch)
        port map (CLK => CLK,
                  mem_enable_1 => SRAM1_CE,
                  mem_lower_1 => SRAM1_LB,
                  mem_upper_1 => SRAM1_UB,
                  mem_data_1 => SRAM1_IO,

                  mem_enable_2 => SRAM2_CE,
                  mem_lower_2 => SRAM2_LB,
                  mem_upper_2 => SRAM2_UB,
                  mem_data_2 => SRAM2_IO,

                  mem_out_enable => SRAM_OE,
                  mem_wr_enable => SRAM_WE,
                  mem_address => SRAM_ADDR,

                  read_enable => read_flag_ext,
                  write_enable => write_flag_ext,
                  halt => halt,
                  address(14 downto 0) => address_bus(14 downto 0),
                  address(19 downto 15) => extram_page,
                  data_in => data_cpu_to_mem,
                  data_out => data_mem_to_cpu_ext);

I/O controller component declaration


The component declaration of the I/O controller is changed slightly, to accommodate the page select
register:
    extram_page => extram_page);

Read/write flags, databus
For each 4K aperture in which the external memory is mapped, the read and write flags need to
be set:
    with address_bus(15 downto 12) select
        read_flag_ext <= read_flag when "0101", -- 5000h
                         read_flag when "0110",
                         read_flag when "0111",
                         read_flag when "1000",
                         read_flag when "1001",
                         read_flag when "1010",
                         read_flag when "1011",
                         read_flag when "1100", -- C000h
                         '0' when others;
    with address_bus(15 downto 12) select
        write_flag_ext <= write_flag when "0101", -- 5000h
                          write_flag when "0110",
                          write_flag when "0111",
                          write_flag when "1000",
                          write_flag when "1001",
                          write_flag when "1010",
                          write_flag when "1011",
                          write_flag when "1100", -- C000h
                          '0' when others;
Also, the data output of the external memory has to be connected to the central data bus when
one of the apertures in which the external memory is mapped is selected:
    with address_bus(15 downto 12) select
        data_mem_to_cpu <= data_mem_to_cpu_0xxx when "0000",
                           data_mem_to_cpu_1xxx when "0001",
                           data_mem_to_cpu_2xxx when "0010",
                           data_mem_to_cpu_3xxx when "0011",
                           data_mem_to_cpu_4xxx when "0100",
                           data_mem_to_cpu_ext when "0101", -- 5000h
                           data_mem_to_cpu_ext when "0110",
                           data_mem_to_cpu_ext when "0111",
                           data_mem_to_cpu_ext when "1000",
                           data_mem_to_cpu_ext when "1001",
                           data_mem_to_cpu_ext when "1010",
                           data_mem_to_cpu_ext when "1011",
                           data_mem_to_cpu_ext when "1100", -- C000h

4.1.3 Memory controller


The memory controller is the central unit in the design. It is located in memctrl.vhd.
1 library ieee;
2
3 use ieee.std_logic_1164.all;

Entity description
All the signals for connecting the controller are defined. The CLK signal must be connected to the
central clock.
The halt signal must be connected to the central halt signal of the MMU. When this signal is
low, data is read from the memory, or the signals for a write are set up. When the signal is high, a
write is committed to memory. Note that a memory write thus needs two clock cycles, one during
which the halt signal is low, and one during which the halt signal is high.

5 entity memctrl is
6 port(CLK : in std_logic;
7 halt : in std_logic;
All signals prefixed with mem_ are used to interface with the external memory on the board.
9 mem_enable_1 : out std_logic; -- enable memory chip 1
10 mem_lower_1 : out std_logic; -- enable lower byte mem chip 1
11 mem_upper_1 : out std_logic; -- enable upper byte mem chip 1
12 mem_data_1 : inout std_logic_vector(15 downto 0);
13 mem_enable_2 : out std_logic; -- enable memory chip 2
14 mem_lower_2 : out std_logic; -- enable lower byte mem chip 2
15 mem_upper_2 : out std_logic; -- enable upper byte mem chip 2
16 mem_data_2 : inout std_logic_vector(15 downto 0);
17 mem_out_enable : out std_logic; -- output enable
18 mem_wr_enable : out std_logic; -- write enable
19 mem_address : out std_logic_vector(17 downto 0);
The last block of signals is used to interface with the controller.
21 read_enable : in std_logic; -- read memory location
22 write_enable : in std_logic; -- write memory location
23 address : in std_logic_vector(19 downto 0);
24 data_in : in std_logic_vector(7 downto 0);
25 data_out : out std_logic_vector(7 downto 0)
26 );
27 end entity memctrl;

Internal signals
The chipselect and byteselect signals are used to select the correct chip, and either the lower
or the upper byte of that chip, for a certain memory address.
29 architecture arch of memctrl is
30 signal chipselect, byteselect : std_logic;

Memory chip control signals


The outputs of the RAM chips are always enabled (line 33). The chipselect and byteselect
signals are used to drive the signals for enabling/disabling the lower and upper bytes for both
chips (lines 35–38), and for enabling the chips themselves (lines 39–40).
31 begin
32
33 mem_out_enable <= '0';
34
35 mem_lower_1 <= '0' when chipselect = '0' and byteselect = '0' else '1';
36 mem_upper_1 <= '0' when chipselect = '0' and byteselect = '1' else '1';
37 mem_lower_2 <= '0' when chipselect = '1' and byteselect = '0' else '1';
38 mem_upper_2 <= '0' when chipselect = '1' and byteselect = '1' else '1';
39 mem_enable_1 <= '0' when chipselect = '0' else '1';
40 mem_enable_2 <= '0' when chipselect = '1' else '1';
The lower bit of the address is used to select either the upper or the lower byte (line 44). The
upper bit is used to select either the first or the second memory chip (line 43). The rest of the
address bits are used to drive the address lines of the memory chips (line 42).
42 mem_address <= address(18 downto 1);
43 chipselect <= address(19);
44 byteselect <= address(0);

Write signal
The write signal is inverted and passed through to the memory chips when the controller is not
in the halt cycle. When the controller is in the halt cycle, the external write signal is always high.
This makes sure that there is always a rising edge at the end of a write on one of the control
signals, triggering the memory chip to write the data. See also figure 3.3 on page 10, and the
accompanying discussion on timing on page 9.
45 process (CLK)
46 begin
47   if rising_edge(CLK) then
48     if halt = '0' then
49       mem_wr_enable <= not write_enable;
50     else
51       mem_wr_enable <= '1';
52     end if;
53   end if;
54 end process;
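
The behaviour of this process can be sketched as a next-state function. The C model below is an illustration only; the function name is hypothetical, and it assumes that mem_wr_enable is active low, as the inversion in line 49 suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the registered write strobe: returns the value that
 * mem_wr_enable (active low) takes after one rising clock edge. */
static uint8_t next_mem_wr_enable(uint8_t halt, uint8_t write_enable)
{
    assert(halt <= 1 && write_enable <= 1);
    if (halt == 0) {
        return write_enable ? 0u : 1u;   /* inverted pass-through */
    }
    return 1u;                           /* forced high during the halt cycle */
}
```

A write cycle (halt = 0, write_enable = 1) followed by a halt cycle therefore yields the sequence 0, 1 on mem_wr_enable, which is the rising edge that ends the write.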

Input data multiplexing


The input data is multiplexed on the rising edges of the clock, when the controller is not halted.
By default, the data lines to both chips are switched to high-impedance mode. During a write, only
the 8 bits selected by the combination of the lowest and highest address bits are driven with the
input data.
56 process (CLK)
57 begin
58   if rising_edge(CLK) and halt = '0' then
59     mem_data_1 <= (others => 'Z');
60     mem_data_2 <= (others => 'Z');
61     if write_enable = '1' then
62       if chipselect = '0' then
63         if byteselect = '0' then
64           mem_data_1(7 downto 0) <= data_in;
65         else
66           mem_data_1(15 downto 8) <= data_in;
67         end if;
68       else
69         if byteselect = '0' then
70           mem_data_2(7 downto 0) <= data_in;
71         else
72           mem_data_2(15 downto 8) <= data_in;
73         end if;
74       end if;
75     end if;
76   end if;
77 end process;

Output data multiplexing


Finally, the correct data bytes are connected to the data output, using the chipselect and
byteselect signals. The selected_chip signal is used to make the notation more concise.
79 data_out_mux: block
80   signal selected_chip : std_logic_vector(15 downto 0);
81 begin
82   selected_chip <= mem_data_1 when chipselect = '0'
83     else mem_data_2;
84   data_out <= selected_chip(7 downto 0) when byteselect = '0'
85     else selected_chip(15 downto 8);
86 end block;
87
88 end architecture arch;
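
The two-stage selection in the data_out_mux block can be mirrored in software. The following C function is a sketch for illustration; its name is hypothetical, while the parameter names follow the VHDL signals.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of data_out_mux: first pick the 16-bit word of the
 * selected chip, then pick the lower or upper byte of that word. */
static uint8_t select_data_out(uint16_t mem_data_1, uint16_t mem_data_2,
                               uint8_t chipselect, uint8_t byteselect)
{
    uint16_t selected_chip;

    assert(chipselect <= 1 && byteselect <= 1);
    selected_chip = (chipselect == 0) ? mem_data_1 : mem_data_2;
    return (byteselect == 0) ? (uint8_t)(selected_chip & 0xFFu)
                             : (uint8_t)(selected_chip >> 8);
}
```

With the words 0x1234 on the first chip and 0xABCD on the second, the four select combinations return 0x34, 0x12, 0xCD and 0xAB respectively.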

4.1.4 I/O controller


A register has been added to the I/O controller to select the page of the external memory which is
mapped to the memory space of the CPU.

Entity description
One signal is added, to set the page of the external memory:
35 extram_page : out std_logic_vector(4 downto 0)

Reading/writing the register


The register is placed at address 500h in the address space of the I/O controller. When this address
is selected, the contents of the register are read or written, according to the read/write flags:
292 when x"500" => sig_extram_page <= data_in(4 downto 0);
445 when x"500" => data_out <= "000" & sig_extram_page;
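
This section does not spell out how the 5-bit page register and the CPU address are combined into the 20-bit address of the memory controller. The C sketch below shows one plausible mapping and is an assumption, not the documented design: it takes the window 0x5000 to 0xcfff from the loops of the test program in section 4.2, and simply places the page number above the 15-bit offset within that window. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_START 0x5000u   /* assumed: lower bound used by the test program */
#define WINDOW_END   0xd000u   /* assumed: exclusive upper bound                */
#define WINDOW_SIZE  (WINDOW_END - WINDOW_START)   /* 0x8000 bytes = 32 KiB     */

/* Assumed mapping: the 5-bit page forms address bits 19..15, and the offset
 * inside the CPU window forms address bits 14..0. */
static uint32_t external_address(uint8_t page, uint16_t cpu_address)
{
    assert(page <= 0x1f);
    assert(cpu_address >= WINDOW_START && cpu_address < WINDOW_END);
    return ((uint32_t)page << 15) | (uint32_t)(cpu_address - WINDOW_START);
}
```

Under this assumption, 32 pages of 32 KiB cover exactly the 2^20 bytes that a 20-bit address can reach, which is consistent with the width of the address port of the controller.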

4.2 Memory test program


The memory test program is fairly simple, and consists of one C file, memtest.c. The program
can be compiled with the supplied Makefile and uploaded to the CPU. When executed, it runs
the complete test sequence once.
The file starts with the standard includes:
#include <peekpoke.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

#include "fpga.h"
Three macros are defined to execute a memory test function on each memory page. Macros are
used to keep the code readable. Initially, functions taking a function pointer as an argument
were used. Unfortunately, this did not work on the 6502, for reasons that were not determined:
buggy behaviour was encountered.
The first macro is used to call the address test functions. These need the current page as
argument. A message is passed to the macro to display status information. The message must
contain a %02x format specifier, to display the current page.
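/* Relies on the variables page and s being in scope at the point of use.
   The page is selected by writing the page register at 0xe500. */
#define run_addr_check(function, message) { \
    for(page = 0; page <= 0x1f; page++) { \
        POKE(0xe500, page); \
        sprintf(s, message, page); \
        fpga_puts(s); \
        function(page); \
    } \
}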
The second macro is used to call the functions for the moving inverse test algorithm. These
functions need the current pattern as an argument. The status message contains two %d format
specifiers for displaying the number of the current pattern and the total number of patterns, and
a %02x format specifier to display the current page.

#define run_inv(function, pattern, message) { \
    for(page = 0; page <= 0x1f; page++) { \
        POKE(0xe500, page); \
        sprintf(s, message, pattern+1, N_PATTERNS, page); \
        fpga_puts(s); \
        function(patterns[pattern]); \
    } \
}
Lastly, a macro is defined to call a moving inverse test function while accessing the pages in
reverse order. This is needed for the second inverting pass of the algorithm. The arguments of
the macro are the same as those of the run_inv macro.
/* page is an unsigned char: decrementing it below 0 wraps around to 255,
   which makes the condition page <= 0x1f false and ends the loop. */
#define run_inv_reverse(function, pattern, message) { \
    for(page = 0x1f; page <= 0x1f; page--) { \
        POKE(0xe500, page); \
        sprintf(s, message, pattern+1, N_PATTERNS, page); \
        fpga_puts(s); \
        function(patterns[pattern]); \
    } \
}
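
The loop condition of the reverse macro looks as if it can never become false, but it terminates because page is an unsigned char. The following stand-alone C sketch (the function name is hypothetical) counts the iterations of the same loop header to make this visible.

```c
#include <assert.h>
#include <limits.h>

/* Counts the iterations of the reverse page loop. The loop ends when the
 * decrement of page 0 wraps around to UCHAR_MAX, which is above 0x1f. */
static unsigned int count_reverse_pages(void)
{
    unsigned char page;
    unsigned int iterations = 0;

    for (page = 0x1f; page <= 0x1f; page--) {
        iterations++;
    }
    assert(page == UCHAR_MAX);   /* the wrapped value that ended the loop */
    return iterations;
}
```

The loop visits pages 0x1f down to 0, so the function returns 32. With a signed loop variable the more usual condition page >= 0 could be used instead.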
Function declarations are included for all test functions in the program.
void addr_write_1(unsigned char page);
void addr_check_1(unsigned char page);
void addr_write_2(unsigned char page);
void addr_check_2(unsigned char page);
void addr_write_3(unsigned char page);
void addr_check_3(unsigned char page);
void mov_inv_write(unsigned char pattern);
void mov_inv1(unsigned char pattern);
void mov_inv2(unsigned char pattern);
void mov_inv3(unsigned char pattern);
The variable s serves as a global buffer for storing formatted output messages, which can then
be output to the serial port using fpga_puts.
char s[100];
The array patterns contains the patterns used in the moving inverse test. It includes the zero
pattern, eight patterns with a single bit set to one in each of the eight possible positions, and two
patterns with alternating ones and zeroes. N_PATTERNS is set to the total number of patterns.
#define N_PATTERNS 11
/* unsigned char matches the parameter type of the mov_inv functions */
unsigned char patterns[] = { 0x00, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0xaa, 0x55 };
The main function calls each of the tests in sequence. First, all the address check tests are run.
Then, for each available pattern, the moving inversion tests are run.
int main() {
    unsigned char page;
    int i;

    run_addr_check(addr_write_1, "Check 1 write, page 0x%02x of 0x1f\r\n");
    run_addr_check(addr_check_1, "Check 1 read, page 0x%02x of 0x1f\r\n");
    run_addr_check(addr_write_2, "Check 2 write, page 0x%02x of 0x1f\r\n");
    run_addr_check(addr_check_2, "Check 2 read, page 0x%02x of 0x1f\r\n");
    run_addr_check(addr_write_3, "Check 3 write, page 0x%02x of 0x1f\r\n");
    run_addr_check(addr_check_3, "Check 3 read, page 0x%02x of 0x1f\r\n");

    for(i=0; i<N_PATTERNS; i++) {
        run_inv(mov_inv_write, i, "Moving inversion, %d of %d, writing page 0x%02x of 0x1f.\r\n");
        run_inv(mov_inv1, i, "Moving inversion, %d of %d, inversion #1, page 0x%02x of 0x1f.\r\n");
        run_inv_reverse(mov_inv2, i, "Moving inversion, %d of %d, inversion #2, page 0x%02x of 0x1f.\r\n");
        run_inv(mov_inv3, i, "Moving inversion, %d of %d, inversion #3, page 0x%02x of 0x1f.\r\n");
    }
    exit(0);
}
The first address write and check functions write, respectively check, the lower eight bits of
the memory address, for each location in the page.
void addr_write_1(unsigned char page) {
    unsigned int mem;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        POKE(mem, mem);
    }
}

void addr_check_1(unsigned char page) {
    unsigned int mem;
    unsigned char val;

    for(mem = 0x5000; mem < 0xd000; mem++) {
        val = PEEK(mem);
        /* != binds more tightly than &, so the mask needs parentheses */
        if(val != (mem & 0x00ff)) {
            sprintf(s, "Mismatch at addr 0x%x.\r\n", mem);
            fpga_puts(s);
        }
    }
}
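
In C, the != operator binds more tightly than &, so a comparison against a masked value needs explicit parentheses around the mask expression. The sketch below, with hypothetical function names, contrasts how the expression val != mem & 0x00ff is parsed with the comparison that the check is meant to perform.

```c
#include <assert.h>

/* How C parses "val != mem & 0x00ff": the comparison is evaluated first. */
static int check_as_parsed(unsigned char val, unsigned int mem)
{
    return (val != mem) & 0x00ff;
}

/* The intended check: compare val with the lower eight bits of mem. */
static int check_intended(unsigned char val, unsigned int mem)
{
    return val != (mem & 0x00ff);
}
```

For mem = 0x5034 and a correctly stored value of 0x34, the first form reports a mismatch although the memory is fine, because a byte can never equal an address of 0x5000 or higher. The second form reports a mismatch only when the stored byte is wrong.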
The second address write and check functions write, respectively check, bits 8 through 15 (the
upper eight bits) of the memory address, for each location in the page.
void addr_write_2(unsigned char page) {
    unsigned int mem;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        POKE(mem, (mem >> 8));
    }
}

void addr_check_2(unsigned char page) {
    unsigned int mem;
    unsigned char val;

    for(mem = 0x5000; mem < 0xd000; mem++) {
        val = PEEK(mem);
        if(val != (mem >> 8)) {
            sprintf(s, "Mismatch at addr 0x%04x: expecting 0x%02x, got 0x%02x.\r\n", mem,
                    mem >> 8, val);
            fpga_puts(s);
        }
    }
}
And the third address write and check functions write, respectively check, the current page
number, for each location in the page.
void addr_write_3(unsigned char page) {
    unsigned int mem;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        POKE(mem, page);
    }
}

void addr_check_3(unsigned char page) {
    unsigned int mem;
    unsigned char val;

    for(mem = 0x5000; mem < 0xd000; mem++) {
        val = PEEK(mem);
        if(val != page) {
            sprintf(s, "Mismatch at addr 0x%04x: expecting 0x%02x, got 0x%02x.\r\n", mem,
                    page, val);
            fpga_puts(s);
        }
    }
}
The setup function of the moving inverse algorithm writes the current pattern to each memory
location.
void mov_inv_write(unsigned char pattern) {
    unsigned int mem;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        POKE(mem, pattern);
    }
}
The first function of the moving inverse algorithm reads and checks the pattern from memory,
and writes back the inverse of the pattern.
void mov_inv1(unsigned char pattern) {
    unsigned int mem;
    unsigned char val, inv;
    inv = ~pattern;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        val = PEEK(mem);
        if(val != pattern) {
            /* three format specifiers, so mem must be passed as well */
            sprintf(s, "Mismatch at addr 0x%04x: expecting 0x%02x, got 0x%02x.\r\n",
                    mem, pattern, val);
            fpga_puts(s);
        }
        POKE(mem, inv);
    }
}
The second function of the moving inverse algorithm, which is executed from the highest
location downwards, reads and checks the inverted pattern, and writes back the original pattern.
void mov_inv2(unsigned char pattern) {
    unsigned int mem;
    unsigned char val, inv;
    inv = ~pattern;
    for(mem = 0xcfff; mem >= 0x5000; mem--) {
        val = PEEK(mem);
        if(val != inv) {
            /* pass mem for the address specifier, and print the message */
            sprintf(s, "Mismatch at addr 0x%04x: expecting 0x%02x, got 0x%02x.\r\n",
                    mem, inv, val);
            fpga_puts(s);
        }
        POKE(mem, pattern);
    }
}
The last function of the moving inverse algorithm reads and checks the original pattern,
written by the second function.

void mov_inv3(unsigned char pattern) {
    unsigned int mem;
    unsigned char val;
    for(mem = 0x5000; mem < 0xd000; mem++) {
        val = PEEK(mem);
        if(val != pattern) {
            /* pass mem for the address specifier, and print the message */
            sprintf(s, "Mismatch at addr 0x%04x: expecting 0x%02x, got 0x%02x.\r\n",
                    mem, pattern, val);
            fpga_puts(s);
        }
    }
}
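
The four passes described above can be tried out on a host computer without the FPGA. The following C function is a minimal sketch under that assumption: it is a hypothetical model that works on an ordinary buffer instead of the paged external memory, and it counts mismatches instead of printing them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-side model of the moving inverse test: runs the setup
 * pass and the three checking passes on a buffer, and returns the number of
 * mismatches that were found. */
static unsigned long moving_inversions(volatile unsigned char *buf, size_t len,
                                       unsigned char pattern)
{
    unsigned char inv = (unsigned char)~pattern;
    unsigned long errors = 0;
    size_t i;

    assert(buf != NULL);

    for (i = 0; i < len; i++) {      /* setup: fill with the pattern */
        buf[i] = pattern;
    }
    for (i = 0; i < len; i++) {      /* pass 1, upwards: check, write inverse */
        if (buf[i] != pattern) errors++;
        buf[i] = inv;
    }
    for (i = len; i-- > 0; ) {       /* pass 2, downwards: check inverse, restore */
        if (buf[i] != inv) errors++;
        buf[i] = pattern;
    }
    for (i = 0; i < len; i++) {      /* pass 3, upwards: check the pattern */
        if (buf[i] != pattern) errors++;
    }
    return errors;
}
```

On working memory the function returns 0 and leaves the buffer filled with the pattern. Unlike this sketch, the test program completes each pass over all 32 pages before starting the next pass.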

Chapter 5

Conclusion and recommendations

The memory controller which was developed fulfils the requirements set for the project. It gives
a significant increase in available memory, using the external RAM on the development board.
The controller is developed as a separate unit, and is reusable in a similar microcontroller. The test
program also fulfils its requirements; during development several errors in the implementation
of the memory controller were discovered and corrected using the program.
The speed of the memory controller is such that no extra wait cycles have to be introduced.
Theoretically, the speed could be improved slightly, so that writes and reads both take only
one clock tick. In practice this might prove to be difficult, and in this case it was not needed.
The reusability of the controller is somewhat impeded by the interface dictated by the existing
CPU. There is no real solution for this problem, and it is questionable whether reusability should
be an important goal when developing this kind of component.
The structure of the development process has some room for improvement: there was not
much direction from the side of the client in terms of development process. The cause of this
seems to be the conflicting interests of the client and the coordinator; the client was focused mainly
on the end product, whereas the main interest of the coordinator was in a proper development
process.

Appendix A

Working with the FPGA under Linux

As I developed and tested the FPGA software under Linux, I have added some notes on how to
get things working. Hopefully, they may be of use to someone.

A.1 USB to Serial Converter


The 6502 soft core uses the RS-232 serial port on the FPGA board to connect to a host computer.
Fewer and fewer computers, especially laptops, still have such a port, however. As a
workaround, a USB to serial converter can be used. Getting the USB to serial converters which
are used at the Embedded Software Lab to work under Linux is fairly simple, assuming a 2.6
kernel is used. The following option needs to be enabled:
Symbol: USB_SERIAL_PL2303 [=m]
Prompt: USB Prolific 2303 Single Port Serial Driver
Defined at drivers/usb/serial/Kconfig:406
Depends on: USB!=n && USB_SERIAL
Location:
-> Device Drivers
-> USB support
-> USB Serial Converter support
-> USB Serial Converter support (USB_SERIAL [=m])
Compile the new kernel and install the modules in the usual way. After rebooting, the USB
subsystem should automatically load this module when the cable is plugged into a USB port.
The port can then be accessed as /dev/ttyUSB0. For this to work fully automatically, the hotplug
subsystem should be working. In most modern Linux distributions, this should already be the
case.

A.2 Xilinx software


Installing the Xilinx WebPACK is relatively straightforward. For this project, an old version
(7.1i) was used, because the newer version did not work out of the box with the existing VHDL
code for the 6502 soft core. For the (graphical) installer to work, the X server has to accept
connections over TCP/IP, so make sure the -nolisten tcp option is not enabled for your X server.
When the X server does not accept TCP/IP connections, the installer will lock up for a few
minutes and then exit with a very unclear error message. The rest of the installation is
straightforward. At the end, some error messages are displayed about the kernel modules, which
can be ignored.
Getting the kernel module to work (windrvr6, which is needed for programming the FPGA
using the impact tool) takes some manual work. Xilinx only provides a version for the 2.4 kernel
series, and the 2.6 kernel series uses a very different module format. Unfortunately, the link on
their site to the sources of the 2.6 module was dead at the time this document was written. The
sources can be downloaded from http://www.jungo.com/download/WD702LN.tgz. Un-tar them,
and run make in the WinDriver/redist directory. Install the module using make install, and
then insert it using modprobe. Additionally, the modules parport, parport_pc and lp are needed
for impact to program the FPGA.

When using udev for device management, the device node for the windrvr6 driver is not
automatically created. Add mknod /dev/windrvr6 c 253 0 to your startup scripts to take care
of this. And, of course, the permissions on all device nodes involved should be correct for the
whole thing to work.

Appendix B

External memory timing diagrams

The two timing diagrams below correspond to the read and write cycles used to access the external
memory. They are copied from [1].

[Timing diagram: ADDRESS and DOUT, with the timing parameters tRC, tAA and tOHA; DOUT goes
from PREVIOUS DATA VALID to DATA VALID.]

Figure B.1: Read cycle 1

[Timing diagram: ADDRESS, OE, CE (low), WE, UB/LB, DOUT and DIN over two write cycles, with
the timing parameters tWC, tSA, tHA, tPBW, tHZWE, tLZWE, tSD and tHD.]

Figure B.2: Write cycle 4

Bibliography

[1] Integrated Silicon Solution, Inc. IS61LV25616AL data sheet.
http://www.issi.com/pdf/61LV25616AL.pdf, February 2006.
[2] Memtest86 website. http://www.memtest86.com/.
[3] Xilinx ISE WebPACK design software.
http://www.xilinx.com/ise/logic_design_prod/webpack.htm.

