ETH Library

High Frame-Rate and Low-Latency Control of Digital
Micromirror Devices (DMD) using
FPGA for Ultracold Atom-based
Quantum Experiments

Student Paper

Author(s):
Yin, Xiaorui

Publication date:
2021

Permanent link:
https://doi.org/10.3929/ethz-b-000510370

Rights / license:
In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection.
For more information, please consult the Terms of use.
High Frame-Rate and Low-Latency
Control of Digital Micromirror Devices
(DMD) using FPGA for Ultracold
Atom-based Quantum Experiments
Semester Project

Xiaorui Yin
[email protected]

Quantum Optics Group & Photonics Laboratory


ETH Zürich

Supervisors:
Dr. Kadir Akin (Engineering Unit in Quantum Center)
Alexander Baumgärtner (Quantum Optics Group)
Prof. Dr. Lukas Novotny (Photonics Laboratory)

May, 2021
Acknowledgements

I am very grateful to those people who helped me in the past three and a half
months. My deepest gratitude goes first to my supervisors Dr Kadir Akin and
Alexander Baumgärtner, for their careful and selfless guidance of my semester
project, which greatly improved my FPGA programming skill and taught me a
lot of specific problem-solving skills. I am also deeply indebted to all the other
supervisors, Prof. Lukas Novotny, Prof. Tilman Esslinger, and Jeffrey Mohan,
for their direct and indirect help to me.

Abstract

Digital Micromirror Devices (DMD) are used in ultracold atom-based quantum
experiments to create arbitrary potentials on the target atom plane. For this
purpose, the DMD must support a sufficiently high frame rate. In this project,
I present an FPGA-based solution for operating a DMD at a high frame rate. The
DMD and the controller board from Texas Instruments can project pictures with
a refresh rate of up to 22 kHz, but with the default implementation the frame
rate is far lower. To improve the performance, my key idea is to preload the
image data into a DDR2 memory so that the FPGA can fetch the data much faster
than it can receive it from the USB port. I show in this work that the frame
rate can be increased to 22 kHz.

Contents

Acknowledgements

Abstract

1 Introduction

2 Digital Micromirror Device (DMD)

3 Application FPGA Design
3.1 Overview
3.2 USB Interface
3.3 DDR2 Memory Controller
3.4 DMD Trigger Control Module
3.5 Remaining Modules: Applications FPGA IO, Pattern Generation and DMD Control
3.6 Simulation
3.7 Debugging
3.8 Additional Information

4 Results
4.1 Testing
4.2 Latency Analysis

5 Conclusion and Outlook

Bibliography

Chapter 1

Introduction

Digital Micromirror Devices (DMD) are an essential component of the Digital
Light Processing (DLP) technology developed by Texas Instruments. DLP
technology is widely applied in industry and in laboratories. For example,
most high-quality commercial projectors and head-up displays employ DLP
technology to ensure image quality and reliability. In the laboratory, DLP
technology is often used as a spatial light modulator (SLM), an electronic
device that can modulate both the intensity and the phase of a light beam.
Physicists can therefore use an SLM to create arbitrary potentials, such as
Bragg potentials[1]. The DMD is not the only tool for controlling a light
beam; liquid crystal devices are also capable of beam shaping. In comparison,
however, the DMD is the better choice when a truly static image is
required[2, 3].
The physicists in the quantum optics laboratory also want to use the DMD
as an SLM in their ultracold atom-based quantum experiments. To meet the
requirements of the experiments, the DMD should support a very high frame
rate (above 20 kHz) and be controllable by an external trigger signal. There
are two types of DMD on the market: (1) the basic version from Texas
Instruments, and (2) improved versions from third-party design houses
authorized by Texas Instruments. The second option offers a higher frame rate
than the basic version and a better user experience (e.g. software
implementation), but customization is not possible and the price is relatively
high. Thus, in this project, I improve the basic version of the DMD to enable
the high frame rate and the trigger feature by programming the FPGA that
controls the DMD.

Chapter 2

Digital Micromirror Device (DMD)

A digital micromirror device is an optical micro-electro-mechanical system
(MEMS) that contains an array of highly reflective aluminium micromirrors
[4, 5]. For XGA resolution, the size of the array is 1024x768, i.e. 1024
columns x 768 rows of micromirrors. Each micromirror is actuated
electrostatically and has two stable states: ON and OFF. When the micromirror
is ON, it tilts by +12 degrees and the light is directed to the output lens.
Otherwise, with a tilt of -12 degrees, the light is absorbed. The mechanical
state of the micromirror is determined by the digital state stored in the
CMOS memory under the micromirror. By convention, a digital one represents
the ON state. To project a colour picture, one can use LEDs as the light
source, or place a set of colour filters between the DMD and the light
source. If grayscale is required, methods such as time averaging and spatial
averaging can be used to implement it.

Figure 2.1: The DMD and the Micromirror Array [4]. The left figure shows the
DMD with the micromirror array in the centre. The right figure shows a closer
look at the micromirror array; some micromirrors are turned off.


(a) Micromirror (b) CMOS memory

Figure 2.2: The Micromirror (a) and the CMOS memory (b) [4]

As mentioned above, there are two types of states in the DMD system: the
mechanical state of the micromirror and the digital state in the CMOS memory.
The operation that loads the image data into the CMOS memory is called LOAD.
The LOAD operation does not update the mechanical state; to transfer the
digital state to the mechanical state, another operation called RESET is
needed. The name RESET can be confusing because it suggests that the operation
sets everything to its default state. The name is defined by Texas
Instruments; ACTIVATE might be a better name for it. The RESET operation can
be performed in multiple modes with the block as the basic unit. A block is a
set of consecutive rows. For XGA resolution, for example, there are 16 blocks
of 48 rows each. One block is updated in single block mode, and two or four
blocks are updated at the same time in dual and quad block modes,
respectively. Updating one or several blocks instead of the entire array is
useful when new data has arrived for only some of the blocks. Figure 2.3 shows
the quad RESET operation. After receiving a RESET pulse, the target blocks
update their micromirrors according to the values in their CMOS memory.
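
The difference between LOAD and RESET can be illustrated with a small Python
model. This is only a sketch for illustration: the class name DmdModel and its
methods are hypothetical and not part of any Texas Instruments API, and the
model ignores all timing.

```python
ROWS, COLS = 768, 1024          # XGA resolution
ROWS_PER_BLOCK = 48             # 16 blocks of 48 rows each


class DmdModel:
    """Toy model: CMOS memory (digital state) vs. mirrors (mechanical state)."""

    def __init__(self):
        self.memory = [[0] * COLS for _ in range(ROWS)]
        self.mirrors = [[0] * COLS for _ in range(ROWS)]

    def load_row(self, row, bits):
        """LOAD: write one row into the CMOS memory; the mirrors do not move."""
        self.memory[row] = list(bits)

    def reset_blocks(self, first_block, num_blocks=1):
        """RESET: copy the memory of 1, 2 or 4 blocks to the mirrors."""
        if num_blocks not in (1, 2, 4):
            raise ValueError("only single, dual and quad block modes are modelled")
        start = first_block * ROWS_PER_BLOCK
        for row in range(start, start + num_blocks * ROWS_PER_BLOCK):
            self.mirrors[row] = list(self.memory[row])


dmd = DmdModel()
dmd.load_row(0, [1] * COLS)         # the digital state changes ...
mirror_before = dmd.mirrors[0][0]   # ... but the mirror has not moved yet
dmd.reset_blocks(0, num_blocks=4)   # quad RESET of blocks 0 to 3
mirror_after = dmd.mirrors[0][0]    # the mirror now follows the memory
```

In this model a mirror changes only when its block is reset, which is why a
LOAD without a following RESET has no visible effect.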

Figure 2.3: QUAD reset mode [4]

The devices used in this project are the DLPLCR70EVM (0.7 XGA DMD) and the
DLPLCRC410EVM. The DLPLCRC410EVM is a controller board that handles data
pre-processing and control. Figure 2.5 shows a block diagram of the controller
board. There are two FPGAs on the controller board. The Applications FPGA
(Xilinx Virtex 5 LX50) serves as an interface between the user interface (PC)
and the DLPC410 controller. It receives data from the PC via a USB 2.0 port
(Cypress CY7C68013A) and then processes the data according to the requirements
of the DLPC410 controller. Users can customize the Applications FPGA to
implement user-defined functionality. The DLPC410 controller FPGA (Xilinx
Virtex 5 LX30) receives data from the Applications FPGA and forwards it to
the DMD. This FPGA is programmed by Texas Instruments and may not be modified.
Besides these two FPGAs, a 64-bit DDR2 SO-DIMM connector is integrated on the
back of the controller board; it can be used by the Applications FPGA.

Figure 2.4: DMD (upper) and Controller Board (lower) [5]



Figure 2.5: Block Diagram of the Controller Board [5]

Texas Instruments also provides GUI software (Discovery D4100 Explorer) that
can be downloaded from the product website. There are two versions, 104A and
133; version 104A is recommended. The GUI does not support the updated
Windows 10 OS (2021 update). To install the GUI without post-installation
errors, one should use an older version of Windows 10. With the GUI, the user
can easily load images and issue commands such as reset mode, delays and
clear. The pictures in figure 2.6 were taken in the laboratory. The ETH logo
picture and the LOAD and RESET commands were sent to the DMD system through
the GUI, and the DMD then showed the ETH logo by turning off the micromirrors
at the position of the logo.

(a) GUI: issue load and reset commands for the ETH logo.

(b) ETH Logo on the DMD: the micromirrors showing the ETH logo are turned off.
In other words, the light reflected by these micromirrors will not enter our eyes.

Figure 2.6: GUI and ETH Logo



In this section, I briefly describe the use of the DMD for generating Bragg
potentials[1]. As figure 2.7(a) shows, the imaging system includes the DMD
and two telescopes. The telescopes, which consist of many objective lenses,
provide a large demagnification factor that determines the final image size
in the atom plane. The DMD is placed in the Fourier plane and projects a
sinusoidal pattern, which is the Fourier transform of two delta functions
convolved with a Gaussian envelope. The pattern loaded into the micromirror
array is referred to as an image in this report; it is not an image itself,
but it is used to generate an image with the laser beam. The generated
momentum and energy transfer in the atom plane are proportional to the frame
rate of the DMD. According to calculations, the Bragg potentials require a
maximum frame rate of 4.6 kHz.

(a) Two telescopes imaging system: DMD (left) and atom plane (right) with
some objective lenses in between

(b) Test setup including laser beam path: The DMD is placed in the upper
right corner. Below the DMD is the atom plane with a camera behind it. The
laser beam is generated by the laser source on the left, and the laser reaches
the atom plane after passing through different lenses.

Figure 2.7: DMD in quantum experiment[1]


Chapter 3

Application FPGA Design

An FPGA (Field Programmable Gate Array) is an integrated circuit that can be
programmed to implement a desired application in a short period of time.
Compared with an ASIC (Application-Specific Integrated Circuit), an FPGA has
clear advantages in terms of cost and development time. Generally speaking,
FPGA development consists of the following steps.

1. System design: According to the task requirements, divide the system
into a number of modules.
2. RTL (Register Transfer Level) design using HDL (Hardware Description
Language): Implement the modules in an HDL (VHDL or Verilog). Since digital
circuits are usually synchronous circuits that use registers to store the
system state, they can be regarded as register-level systems that move from
one state to another at each clock edge.
3. Simulation: Simulate the design with a simulator to verify the correctness
of the circuit.
4. Synthesis: Map the RTL description to a netlist that matches the FPGA
hardware resources.
5. Implementation: Configure the FPGA hardware resources according to
the netlist and constraints.

The development tools used in this project are Xilinx ISE 14.7 and ChipScope
Pro Analyzer. Both can be downloaded from the Xilinx archive download page [6].
Note that the software should be run on a Linux virtual machine (Oracle
VirtualBox is recommended).

3.1 Overview

An example design of the Applications FPGA can be found on Texas Instruments'
website[7]. This example design can generate test patterns and work with the
GUI. When test pattern generation is enabled, the DMD continuously displays
dummy patterns such as all zeros, all ones or a diagonal line. When the
controller board is connected to the PC via USB, the GUI automatically detects
the DMD and the operation mode is switched to GUI mode. In GUI mode, the
16-bit image data is transmitted through the USB interface with a 48 MHz
clock. The frame rate can be calculated as follows:
\[ f_{\mathrm{usb}} = \frac{48\,\mathrm{MHz}}{1024 \times 768\,\mathrm{bits} / 16\,\mathrm{bits}} = 976.56\,\mathrm{Hz} \tag{3.1} \]

In practice, the frame rate is much lower than 976.56 Hz. The data does not
arrive continuously, and the pauses between consecutive data words are very
long. It takes approximately two seconds to project an image.
The bottleneck of the system is clearly the USB communication. Using the
DDR2 SO-DIMM overcomes this bottleneck and greatly accelerates the system.
With a 150 MHz DDR2 memory, the theoretical frame rate is 24414 Hz, as
calculated below. In practice, the DDR2 memory also has bottlenecks, as
explained in this chapter. Nevertheless, we show that a frame rate of up to
22 kHz can be reached in practice using the DDR2 memory.
\[ f_{\mathrm{ddr2}} = \frac{150\,\mathrm{MHz}}{1024 \times 768\,\mathrm{bits} / 128\,\mathrm{bits}} = 24414.06\,\mathrm{Hz} \tag{3.2} \]
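
Both estimates follow the same rule: the clock frequency divided by the number
of bus words per frame. The short Python sketch below reproduces equations
(3.1) and (3.2); the function name frame_rate_hz is only illustrative, and the
calculation assumes one bus word per clock cycle without any pauses.

```python
PIXELS = 1024 * 768  # bits per XGA frame (one bit per micromirror)


def frame_rate_hz(clock_hz, bus_width_bits, frame_bits=PIXELS):
    """Upper bound on the frame rate when one bus word moves per clock cycle."""
    cycles_per_frame = frame_bits / bus_width_bits
    return clock_hz / cycles_per_frame


f_usb = frame_rate_hz(48e6, 16)     # USB path, equation (3.1)
f_ddr2 = frame_rate_hz(150e6, 128)  # DDR2 path, equation (3.2)
print(f"USB:  {f_usb:.2f} Hz")
print(f"DDR2: {f_ddr2:.2f} Hz")
```

The factor of 25 between the two results comes from the eight times wider bus
combined with the roughly three times faster clock.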

In fact, Texas Instruments also published an example design including the
DDR2 interface several years ago, but this design has since been removed from
the website. However, we found it in a private repository [8]. Although the
DDR2 interface is instantiated in this design, it is not used for image data;
it is used only for built-in self-test (BIST), a common design-for-test (DFT)
technique in IC design. Since the GUI only works with the newest version of
the Applications FPGA design, my approach is to implement the DDR2 interface
and the trigger control on top of the newest version of the example design.
Figure 3.1 shows the architecture of the proposed FPGA design. First, the
image data and the control data are transferred from the PC to the USB
interface. The image data is stored in the DDR2 memory, and the control data
is stored in the control register array. Then, when a trigger pulse arrives,
the DMD Trigger Control module reads the image data from the DDR2 memory and
generates signals such as the row address (rowad) and the block address
(blkad) for the current output image data. Finally, the APPSFPGA IO module
outputs the data to the DLPC410 controller using a high-speed transmission
method called SERDES. The original GUI-based design is also preserved; the
user can select the operation mode with the mem_en signal.

Figure 3.1: Block Diagram of the Applications FPGA



3.2 USB Interface

The USB_IO module is responsible for receiving the USB data. Its most
important IO ports are listed in the following table.

Port Name            Type     Description
ifclk                input    USB GPIF input clock
system_clk           input    system clock
mem_clk              input    DDR2 memory clock
bidir                inout    16bits USB data
fifo_wen             input    USB control signal: write enable
fifo_ren             input    USB control signal: read enable
fifo_regn            input    USB control signal: burst enable
mem_wr_data          output   128bits output of the image data FIFO to the memory controller
mem_wr_data_valid    output   valid signal of the image data
mem_wr_valid         output   valid signal for the memory controller to read data
mem_get_data         input    read enable signal of the image data FIFO

Table 3.1: I/O list of USB_IO module

There is a Cypress FX2 chip on the controller board for the PC-FPGA
communication[9]. The FX2 controller simplifies the design on the FPGA side:
the USB interface only needs to read the data from the FIFO. In GUI mode, the
GUI takes over everything. That means the GUI sends not only the image data
but also data such as the row address (rowad) and the block address (blkad).
All data shares the same bus, bidir, and is distinguished by the control
signals. The demultiplexing is as follows:

fifo_wen    fifo_ren    fifo_regn    bidir
0           1           1            image data
1           1           0            register address
0           1           0            register data

Table 3.2: USB data demultiplexing
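
The demultiplexing in Table 3.2 amounts to a lookup on the three control
signals. The following Python sketch mirrors the table; the names BusContent
and classify are hypothetical, and treating every unlisted combination as "no
valid transfer" is an assumption of this sketch, not something stated by the
FX2 documentation.

```python
from enum import Enum


class BusContent(Enum):
    IMAGE_DATA = "image data"
    REGISTER_ADDRESS = "register address"
    REGISTER_DATA = "register data"
    NONE = "no valid transfer"


# (fifo_wen, fifo_ren, fifo_regn) -> meaning of bidir, taken from Table 3.2
_DEMUX = {
    (0, 1, 1): BusContent.IMAGE_DATA,
    (1, 1, 0): BusContent.REGISTER_ADDRESS,
    (0, 1, 0): BusContent.REGISTER_DATA,
}


def classify(fifo_wen, fifo_ren, fifo_regn):
    """Return what the 16-bit bidir bus carries for the given control signals."""
    # Combinations not listed in Table 3.2 are treated as "no transfer";
    # that default is an assumption of this sketch.
    return _DEMUX.get((fifo_wen, fifo_ren, fifo_regn), BusContent.NONE)
```

For example, classify(0, 1, 1) selects the image data path, while
classify(1, 1, 0) marks the bus value as a register address.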

The USB data and control signals are clocked by the 48 MHz USB clock ifclk,
but the frequency of the system clock is 200 MHz, which leads to a
slow-to-fast clock domain crossing problem. A simple solution is to use the
Xilinx FIFO IP [10], so that the write and read operations can be performed
independently with the USB clock and the memory clock. Another advantage of
the FIFO is that it is also capable of data width conversion. The memory data
bus is 128 bits wide and the USB data bus is 16 bits wide. The FIFO therefore
concatenates eight USB input words to form one memory word.
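
The width conversion can be pictured as follows. This Python sketch only
illustrates the idea of packing eight 16-bit words into one 128-bit word; the
function name pack_words is hypothetical, and placing the first USB word in
the most significant position is an assumption, since the actual ordering is
fixed by the FIFO IP configuration.

```python
def pack_words(usb_words, ratio=8, width=16):
    """Concatenate `ratio` USB words of `width` bits into one wide word.

    Assumption: the first USB word ends up in the most significant position.
    """
    if len(usb_words) % ratio:
        raise ValueError("number of USB words must be a multiple of the ratio")
    mask = (1 << width) - 1
    wide = []
    for i in range(0, len(usb_words), ratio):
        value = 0
        for word in usb_words[i:i + ratio]:
            value = (value << width) | (word & mask)
        wide.append(value)
    return wide


memory_words = pack_words([0x0001, 0x0002, 0x0003, 0x0004,
                           0x0005, 0x0006, 0x0007, 0x0008])
```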

3.3 DDR2 Memory Controller

The memory used in this design is a 2GB DDR2 SO-DIMM MT16HTF25664HY-667E1.
The memory controller is based on the Xilinx memory interface generator
(MIG) IP[11]. The 667 in the product code refers to the maximum transfer rate
of 667 MT/s. However, the MIG interface works with a 150 MHz DDR (Double Data
Rate) clock, which allows at most 300 million 64-bit transfers per second.
The MIG core generates a 128-bit interface for the application, working at a
maximum 150 MHz single-ended clock.
The memory interface has many IO ports. The ports beginning with ddr2* are
physical ports connected to the DDR2 memory and are controlled by the MIG IP.
As users, we only need to pay attention to the user ports beginning with
app*. The data bus of the memory interface is 64 bits wide (at 400M/s
transfer rate), whereas the user data bus is 128 bits wide. This is because
the memory uses DDR (double data rate) transfers, which means that data is
sent on both the falling and the rising clock edges. Therefore, one user
input word is split into two memory input words.
Figures 3.2 and 3.3 show the block diagram and the waveform of the write
operation. There are two FIFOs in the memory interface: the address FIFO and
the data FIFO. It is important to make sure that neither FIFO is full before
writing data to them. This can be done by inspecting the FIFO almost-full
signals (app_af_afull, app_wdf_afull). The memory interface is configured with
a burst length (BL) of four, which means that only one address is needed to
write four consecutive data words. In the memory controller, the
burst_indicator signal indicates whether an address is needed for the current
write operation. Its initial value is zero, and it flips after each write
operation. The address FIFO write enable signal (app_af_wren) is set high
only if the burst indicator is zero.
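
The role of the burst indicator can be sketched in a few lines of Python. The
function write_user_words is a hypothetical behavioural model, not the VHDL
implementation; the address step of four between bursts is an assumption that
matches the two burst bits of the address format and the addresses quoted in
figure 3.11.

```python
def write_user_words(words, start_addr=0, addr_step=4):
    """Return the address-FIFO and data-FIFO contents for 128-bit user words.

    Assumption: consecutive bursts are `addr_step` addresses apart.
    """
    addr_fifo, data_fifo = [], []
    burst_indicator = 0              # initial value is zero
    addr = start_addr
    for word in words:
        if burst_indicator == 0:     # app_af_wren is high only in this case
            addr_fifo.append(addr)
            addr += addr_step
        data_fifo.append(word)       # every word goes to the data FIFO
        burst_indicator ^= 1         # flips after each write operation
    return addr_fifo, data_fifo


addrs, data = write_user_words(list(range(6)))
```

Six user words therefore produce three addresses, one for every two 128-bit
words, which corresponds to four 64-bit memory words per burst.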

Figure 3.2: Write Block Diagram[11]

Figure 3.3: Write Waveform[11]



Reading data from the memory is similar: one address corresponds to four
consecutive data words (two 128-bit user words). This is shown in figures 3.4
and 3.5.

Figure 3.4: Read Block Diagram[11]

Figure 3.5: Read Waveform[11]

The memory address (app_af_addr 31bits) contains the bank address, the
row address, and the column address. The number of addresses needed for one
XGA image can be calculated as follows:
\[ \#\mathrm{addresses} = \frac{1024 \times 768\,\mathrm{bits}}{2 \times 128\,\mathrm{bits}} = 3072_{d} = 110000000000_{b}\ (12\,\mathrm{bits}) \tag{3.3} \]

Therefore, 14 bits (2 bits more to be compatible with other resolutions) out
of 31 bits are used for data selection (wr_offset_addr & rd_offset_addr). The
memory address is configured in this way:

\[ \underbrace{\mathrm{XXXXXXXXXXXXXXX}}_{\text{image selection, 15 bits}}\ \underbrace{\mathrm{XXXXXXXXXXXXXX}}_{\text{data selection, 14 bits}}\ \underbrace{\mathrm{XX}}_{\text{burst, 2 bits}} \tag{3.4} \]
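
The address layout can be checked with a short Python sketch. The function
memory_address is hypothetical, and keeping the two burst bits at zero is an
assumption of this sketch. With these assumptions the last burst of the first
image lands at address 12284, which agrees with the address quoted in figure
3.11.

```python
IMAGE_BITS, OFFSET_BITS, BURST_BITS = 15, 14, 2   # 31 bits in total


def memory_address(image_index, offset):
    """Pack image selection and data selection into a 31-bit address."""
    if not 0 <= image_index < (1 << IMAGE_BITS):
        raise ValueError("image index out of range")
    if not 0 <= offset < (1 << OFFSET_BITS):
        raise ValueError("offset out of range")
    # The two burst bits are left at zero here; this is an assumption.
    return (image_index << (OFFSET_BITS + BURST_BITS)) | (offset << BURST_BITS)


ADDRESSES_PER_IMAGE = 1024 * 768 // (2 * 128)       # 3072, equation (3.3)
last = memory_address(0, ADDRESSES_PER_IMAGE - 1)   # last burst of image 0
first_of_next = memory_address(1, 0)                # first burst of image 1
```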

The memory interface is controlled by a finite state machine (FSM, figure
3.6) with three possible states: S0 (IDLE, default), S1 (WRITE) and S2 (READ).
When starting the system, the user must provide the parameter num_patterns.
The internal signal write_pattern_count counts how many images have been
written to the memory. If write_pattern_count = num_patterns, all images have
been written and mem_preload_done becomes high. Accordingly, if the memory
initialization is done and the memory preload process is not yet done, the
FSM moves to S1. In state S1, the controller monitors the write valid signal
(wr_valid) and the write ready signal (wr_ready). The write valid signal is
high when the memory write FIFO in the USB interface is not empty, and the
write ready signal is high when neither the address FIFO nor the data FIFO of
the memory interface is full. Once both signals are high, the controller
starts to read data from the memory write FIFO by setting mem_get_data high.
The data comes with a data valid signal (wr_data_valid), and the controller
writes the data to the memory while this signal is high. The FSM moves to S2
if the memory preload process is done and it receives a read request pulse
(rd_en) issued by the DMD Trigger Control Module. The memory output data is
stored in the memory read FIFO for clock domain crossing.
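
The transitions described above can be summarised in a behavioural Python
sketch. The transitions out of IDLE follow the text; the conditions for
returning to IDLE are not spelled out in the text and are assumptions of this
sketch, as is the hypothetical read_finished flag.

```python
IDLE, WRITE, READ = "S0", "S1", "S2"


def next_state(state, init_done, preload_done, rd_en, read_finished=False):
    """One transition of the three-state memory controller FSM."""
    if state == IDLE:
        if init_done and not preload_done:
            return WRITE
        if preload_done and rd_en:
            return READ
        return IDLE
    if state == WRITE:
        # Assumption: return to IDLE once all images have been written.
        return IDLE if preload_done else WRITE
    if state == READ:
        # Assumption: return to IDLE after one image has been read.
        return IDLE if read_finished else READ
    raise ValueError(f"unknown state {state!r}")


def write_enable(wr_valid, wr_ready):
    """mem_get_data: fetch from the memory write FIFO only if both are high."""
    return bool(wr_valid and wr_ready)
```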

Figure 3.6: Finite state machine of the memory controller with three states and
transition conditions

3.4 DMD Trigger Control Module

In the quantum experiment, the DMD needs to LOAD and RESET the next image
stored in the memory whenever it receives an external trigger pulse. This is
implemented by the DMD Trigger Control Module (DMD_trigger_control.vhd). This
module also contains an FSM, with two possible states: S0 (IDLE, default) and
S1 (OUTPUT DATA). The state changes to S1 only when the memory preload
process is done, the DMD initialization process is done and a trigger pulse
is detected. Since the trigger signal is external, it is first registered to
avoid timing violations, and the register output is then delayed by one cycle
to generate a pulse.
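
A cycle-by-cycle Python sketch of such a pulse generator is given below. It
assumes that the pulse is formed as "registered value high and delayed copy
low", i.e. a rising-edge detector; the function name trigger_pulses and the
exact combination of the two register outputs are assumptions of this sketch.

```python
def trigger_pulses(samples):
    """Return a one-cycle pulse for every rising edge of the registered trigger."""
    sync = 0      # first register: samples the external trigger
    delayed = 0   # second register: `sync` delayed by one clock cycle
    pulses = []
    for level in samples:
        # Pulse when the registered value is high but its delayed copy is low.
        pulses.append(int(sync == 1 and delayed == 0))
        # Both registers update at the clock edge.
        delayed, sync = sync, level
    return pulses


pulses = trigger_pulses([0, 1, 1, 1, 0, 0, 1, 0])
```

A trigger that stays high for several cycles still produces only a single
pulse, so each external trigger starts exactly one image output.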

Figure 3.7: Finite state machine of the DMD Trigger Controller with two states
and transition conditions

Image data loading is done through two channels, A and B (four channels A, B,
C, D for FHD). The pixels are loaded alternately by channels A and B, as
A0 B0 A1 B1 A2 B2... In addition to the channel data, the DMD needs further
signals to place the data correctly: row address, row mode, block address,
block mode etc. The row address is managed by two nested counters. The outer
counter is the row position counter (cnts_row_pos_cnt), which corresponds to
the row on the DMD (768 rows for XGA resolution). It increments only when the
inner counter (active_cnt) reaches eight, because the DMD needs eight 128-bit
memory words to fill one row. There are different row modes; table 3.1 (Row
Modes and Row Addresses) shows the configurations. The DLPC410 controller
sends the data row by row. A row cycle is defined as the number of cycles
needed for the DMD to load one row of data, which is eight system clock
cycles for the DLP7000 (XGA). Within one row cycle, the data must be
continuous and valid, and the address must be given at the beginning. For
example, if in cycle one the address mode is 01 and valid is high, the DMD
increments its internal row counter by one and loads eight consecutive data
words regardless of their validity. If the data is not continuous, the DMD
receives unwanted data and discards the last few data words. This "single row
write operation" causes a problem: because the system clock (FIFO read clock)
is faster than the memory clock (FIFO write clock), the system always has to
wait for the memory to write data to the FIFO. My solution is to use a
programmable FIFO empty signal (prog_empty), configured to indicate whether
at least eight data words are in the FIFO. When the system wants to send new
row data, it first checks whether there is enough data in the FIFO. Figure
3.8 shows the simulation result of row data loading; there are clearly one or
two idle cycles between consecutive single row write operations.
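
The effect of this gating can be explored with a toy simulation in Python. It
is a strongly simplified sketch, not the actual design: the function
simulate_rows is hypothetical, the writer is modelled as delivering 150 words
per 200 system cycles, and FIFO latencies are ignored, so the idle cycle
counts it produces need not match those in figure 3.8.

```python
def simulate_rows(num_rows, words_per_row=8, mem_mhz=150, sys_mhz=200):
    """Return the idle system cycles before each row when reads wait for 8 words."""
    level = 0              # words currently in the read FIFO
    credit = 0             # accumulates write progress per system cycle
    written = 0            # words written by the memory side so far
    total = num_rows * words_per_row
    remaining_in_row = 0   # words still to send in the current row
    rows_done = 0
    idle = 0
    idle_cycles = []
    while rows_done < num_rows:
        # Memory side: on average mem_mhz / sys_mhz words per system cycle.
        if written < total:
            credit += mem_mhz
            if credit >= sys_mhz:
                credit -= sys_mhz
                level += 1
                written += 1
        # System side: start a row only if a full row is available.
        if remaining_in_row == 0:
            if level >= words_per_row:
                remaining_in_row = words_per_row
                idle_cycles.append(idle)
                idle = 0
            else:
                idle += 1
                continue
        level -= 1                     # send one word per system cycle
        remaining_in_row -= 1
        if remaining_in_row == 0:
            rows_done += 1
    return idle_cycles


gaps = simulate_rows(768)
```

Because a row is started only when eight words are available, every row is
sent without interruption, and the slower memory clock shows up as idle
cycles between rows instead of gaps inside a row.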

Figure 3.8: Simulation of row data loading. Data 1-8 are placed in the first
row with row mode "11". For the remaining rows, row mode "01" indicates
placing the data in the next row.

Table 3.1: Row Modes and Row Addresses[12]

Since the DMD rewrites all pixels after every trigger pulse, we are only
interested in the global RESET mode. Therefore, the block operation is No-Op
during data loading and in the IDLE state. After the loading is finished, the
global reset request is sent to the DMD with a lifetime of one row cycle. For
the relationship between block operation type and block mode, refer to Table
3.2 (Block Modes and Block Addresses).

Table 3.2: Block Modes and Block Addresses[12]

3.5 Remaining Modules: Applications FPGA IO, Pattern Generation and DMD Control

The sections above describe how the image data is loaded into the memory and
how the controller reads the image data and sends it out. Some further
modules are needed to make the system work. The following modules are taken
from Texas Instruments with some adjustments.

The Applications FPGA IO module (appsfpga_io.vhd) provides all internal
clocks for the system except the USB clock (ifclk). The internal clocks are
generated by phase-locked loops (PLL), which can produce clocks with
different phases and frequencies derived from a reference input clock. For
example, with a system base clock of 50 MHz, one can set the PLL as follows
to generate a 150 MHz memory clock: multiplier=3, divider=1, phase=0. In this
module, the image data from the DMD Trigger Control Module is transmitted to
the DLPC410 controller with LVDS SerDes (low-voltage differential signalling
serializer and deserializer), which enables high-speed signal transmission
with high noise immunity. The 64-bit single-channel data is divided into four
16-bit pieces, and each piece is sent out on one edge of a DDR clock running
at twice the system clock frequency. The output clock is therefore 400 MHz
(for the 200 MHz system clock), and data appears on both the falling and the
rising edges. The throughput is the same on both sides of the SerDes. This
allows the DMD to be driven with a much higher clock frequency (400 MHz).
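
The clock and throughput figures can be verified with simple arithmetic. The
Python sketch below uses two hypothetical helper functions and treats the PLL
as ideal; it only restates the numbers given in the text.

```python
def pll_output_mhz(reference_mhz, multiplier, divider):
    """Output frequency of an idealised PLL."""
    return reference_mhz * multiplier / divider


def throughput_gbps(bits_per_transfer, clock_mhz, transfers_per_cycle):
    """Data throughput in Gbit/s."""
    return bits_per_transfer * clock_mhz * transfers_per_cycle / 1000


mem_clk = pll_output_mhz(50, multiplier=3, divider=1)        # memory clock in MHz
parallel = throughput_gbps(64, 200, transfers_per_cycle=1)   # 64 bits per system clock
serial = throughput_gbps(16, 400, transfers_per_cycle=2)     # 16 bits on both clock edges
```

Both sides of the SerDes carry 12.8 Gbit/s per channel: the serial side is
four times narrower but makes four transfers per system clock cycle.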

Figure 3.9: Waveform of the LVDS SerDes. In one clock cycle of clk1x, the clk2x
has four edges. The parallel signals d1 - d4 at the positive edge of clk1x appear
one by one at each edge of clk2x

The Pattern Generation (pgen_a.vhd) and DMD Control (DMD_control.vhd) modules
are adopted from the example design without any modification. They serve the
test pattern generation and the GUI. The user needs to select the operation
mode at the beginning by setting the signals usb_tpg_en and mem_en.

3.6 Simulation

Simulation is an important process in FPGA development. The designer needs to
make sure that the FPGA design works as expected without errors. With the
help of a simulator, the design can be verified without programming the FPGA.
The Xilinx ISE design suite includes the simulator ISim. ModelSim is
preferable if a license is available; I used ISim in this project.
In modern IC design, many submodules can be designed independently, and the
top module normally only connects its submodules without any additional
logic. This so-called top-down design method makes it possible to simulate
every submodule independently. In this project, I wrote testbench files for
all the modules that were modified or added. In a testbench file, the
designer instantiates the module, provides some input (stimulus), and then
observes whether the output meets the expectation.

The USB data bus (bidir) is a bidirectional port of type inout. Therefore, it
must first be buffered with the Xilinx primitive IOBUF. During the
simulation, we found that the IOBUF does not work properly. After discussing
it with my supervisor Dr. Kadir Akin, we believe this may be a problem with
the simulator (ISim) itself. To continue the simulation, the IOBUF is
bypassed and the bidir bus is changed to a normal input. This bypass is only
made for simulation; the IOBUF is still used for synthesis.
The simulations for most modules are similar, but the memory simulation
requires extra steps. In addition to creating the testbench and the required
signals, the designer also needs to instantiate the DDR2 models, which act as
the physical memory. The number of DDR2 models is determined by the memory
parameters, and the model can be found in the MIG IP folder. During the
memory simulation, we encountered a peculiar problem. When the project
language of ISE is set to VHDL, ISim returns many warnings and the memory
initialization does not complete. Our solution is to simulate the memory in a
separate ISE project with Verilog. The memory initialization process normally
takes a lot of time. To be more efficient, the memory is first replaced with
a large FIFO. After confirming that the design is correct with the FIFO, it
is simulated again with the memory.

3.7 Debugging

Good simulation results do not guarantee correct FPGA behaviour. Many factors
affect the actual results, of which timing is particularly important. We use
the ChipScope ILA (Integrated Logic Analyzer) to monitor the internal FPGA
signals. This tool allows us to investigate problems and the actual behaviour
of the firmware in real time after programming the FPGA.
In a design with multiple clock domains, clock domain crossing (CDC) signals
must be synchronized to meet the timing constraints. Initially, only the data
buses were synchronized, not the one-bit signals, and during debugging the
unsynchronized CDC signals led to failures. One-bit CDC signals can be
synchronized with a flip-flop synchronizer. Taking mem_read_enable as an
example: it is generated by combinational logic in the DMD Trigger Control
module in the system clock domain, so it is first registered once by the
system clock and then registered twice by the memory clock.
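
The two registers in the destination clock domain can be modelled
behaviourally in Python. This sketch, with the hypothetical class name
TwoFlopSynchronizer, only shows the two-cycle delay of the register chain; it
cannot reproduce metastability, which is the reason the second register
exists in hardware.

```python
class TwoFlopSynchronizer:
    """Behavioural model of two back-to-back registers in the destination domain."""

    def __init__(self):
        self.stage1 = 0
        self.stage2 = 0

    def clock(self, async_in):
        """Advance one destination clock edge and return the synchronized value."""
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(level) for level in [1, 1, 1, 0, 0]]
```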
While sending data from the PC to the USB interface, we found that we
received redundant data. This is not a problem when using the GUI, because
the GUI sends a FIFO reset signal to clear the redundant data. Since the GUI
FIFO and the memory FIFO are different, the FIFO reset signal cannot be used
directly. We therefore used the ILA core to locate the redundant data.
Fortunately, the redundant data only appears after row 0 and row 47, so we
use a row counter to filter it out. This problem could be corrected at the
source later by using our own GUI instead of the GUI provided by TI. Since we
are not able to modify the TI GUI, the problem is solved by filtering out the
redundant data in the FPGA.
The data read from the memory were incomplete in our initial tests with the
DDR2 memory. There was a redundant zero (shown in figure 3.10) at the
beginning. Since the controller reads a fixed number of data words, the last
word was lost. Investigating the cause of this issue is hard because the design
behaves correctly in simulation. Nevertheless, the problem is solved with a
workaround. For every image write operation, we append one extra data word at
the end to make sure that the last word can be read out, and then use a counter
to bypass the first word and the appended last word. Figure 3.11 shows this
workaround.
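
The counter-based bypass can be sketched as follows. This Python model is only
an illustration under stated assumptions: read_words is a hypothetical stand-in
for the observed memory behaviour (one redundant zero in front of the stored
data, fixed read length), the four-word pattern is made up, and the read length
of n + 2 words is an assumption chosen so that the appended word also appears in
the stream.

```python
def read_words(stored, count):
    # Assumed fault model: a redundant zero precedes the stored data and
    # exactly `count` words are returned.
    return ([0] + list(stored))[:count]


def keep_pattern_words(stream, num_words):
    # A counter skips word 0 (redundant zero) and everything after word
    # num_words (the appended word), keeping only the pattern data.
    kept = []
    for counter, word in enumerate(stream):
        if 1 <= counter <= num_words:
            kept.append(word)
    return kept


pattern = [0xA1, 0xB2, 0xC3, 0xD4]  # hypothetical pattern data
n = len(pattern)

without_fix = read_words(pattern, n)  # last pattern word is lost
padded = pattern + [0]                # one extra word appended on write
with_fix = keep_pattern_words(read_words(padded, n + 2), n)

print(without_fix)
print(with_fix)
```

The same filter also works if only n + 1 words are read, because the counter
window then simply ends with the last pattern word.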

Figure 3.10: ILA result of memory read data. The first line is the valid signal.
It becomes high at time 0 with the first data "0000" (marked by the red vertical
line). However, "0000" should appear for only one cycle while the valid signal
is high, not for two consecutive cycles.

Figure 3.11: Method to solve the memory read data issue. The last pattern data
is written to the address 12284, and an extra zero data is written to the address
12288.

3.8 Additional Information

For those who want to reproduce the design, the project settings should be as
shown in figure 3.12. When importing the source files, two new libraries must
be set for two files: ddc4100 for appsfpga_dmd_types_pkg.vhd and ddr2 for
DDR2_2GB_150MHZ_pkg.vhd (see figure 3.13).

Figure 3.12: ISE project settings

Figure 3.13: ISE libraries setting

Figure 3.14 shows the FPGA hardware resource utilization from the synthesis
report. The BRAM usage is very high (75%), and this is already after disabling
some unused FIFOs; otherwise the BRAM usage would be 100%. The disabled FIFOs
are those of channels C and D in the USB IO module, which are unused for XGA
resolution. The ILA core also consumes BRAM. If no more BRAM resources are
available, one can additionally disable the FIFOs of channels C and D in the
memory IO module.

Figure 3.14: FPGA resources utilization


Chapter 4

Results

4.1 Testing

I tested my FPGA implementation with the six test images shown in figure 4.2.
The trigger is generated inside the FPGA with a defined frequency, which is also
the frame rate. The mem_en and num_patterns parameters are entered through the
ChipScope VIO (virtual input/output), shown in figure 4.1. The num_patterns
parameter (VIO1) starts from zero, which means that for n images the parameter
is n-1.

Figure 4.1: ChipScope VIO: The upper part shows the input and output in the
system clock domain (SyncIn: trigger_miss, SyncOut: mem_en). The lower part is
the number-of-patterns parameter given by the user.

Figure 4.3 shows the results for a 1/3Hz trigger. Driven by this trigger, the
DMD displays the images one by one in a loop, and each image is shown for three
seconds. Figure 4.4(a) shows the display when the frame rate is 22kHz. Because
this is far beyond the rate at which the human eye can resolve individual
frames, all images appear superimposed. A trigger miss signal (connected to VIO
SYNC_IN(0)) is created to check whether the frame rate is indeed 22kHz. The
trigger miss signal goes high when the FPGA detects a trigger pulse but discards
it, which means that the read request is rejected. At a 22kHz frame rate the
trigger miss signal always stays deasserted, whereas it is asserted when the
frame rate is above 24kHz. This implies that a 22kHz frame rate can be
guaranteed, but frame rates above 24kHz cannot. The two complementary checker
images (checker1 and checker2) are used to check the completeness of the
display. If the frame rate is above what the human eye can resolve, all pixels
should appear grey, as shown in figure 4.4(b).
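
The grey result of the complementary checker test can be checked numerically.
The following Python sketch is only an illustration: the checker square size
BLOCK is a hypothetical value (the size used in the actual test images is not
stated), and averaging the two frames is a simple stand-in for what the eye
perceives at high frame rates.

```python
WIDTH, HEIGHT = 1024, 768  # XGA resolution
BLOCK = 64                 # hypothetical checker square size in pixels


def checker(invert):
    # 1 = mirror on, 0 = mirror off; `invert` flips every pixel.
    return [[((x // BLOCK + y // BLOCK) % 2) ^ invert for x in range(WIDTH)]
            for y in range(HEIGHT)]


checker1 = checker(0)
checker2 = checker(1)

# Time average over one period of the two alternating frames.
average = [[(a + b) / 2 for a, b in zip(row1, row2)]
           for row1, row2 in zip(checker1, checker2)]

all_grey = all(value == 0.5 for row in average for value in row)
print(all_grey)
```

Any pixel that is not updated in one of the two frames would deviate from the
50% average, which is why this image pair is a useful completeness check.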

Figure 4.2: Six test images: (a) checker1, (b) checker2, (c) vertical,
(d) horizontal, (e) DLP Logo, (f) TI Logo.

Figure 4.3: Result at 1/3Hz frame rate for (a) checker1, (b) checker2,
(c) vertical, (d) horizontal, (e) DLP Logo, (f) TI Logo. Each image can be
displayed clearly.

Figure 4.4: Result at 22kHz frame rate: (a) test images, (b) complementary
checker images. Because the frame rate exceeds what the human eye can resolve,
the displayed image appears static.

4.2 Latency Analysis

The overall system latency can be estimated by counting the clock cycles
consumed by one operation. The latency mainly comes from four processes:

1. Memory read latency: The memory latency is around 50 system clock cycles.

2. Data loading: Loading 768 rows of data takes about 6144 memory clock
cycles (8192 system clock cycles).

3. Reset request: The reset request lasts 8 system clock cycles.

4. Reset delay: According to the technical document [12], the DMD needs at
most 4.5 µs for a global reset operation, which is 900 system clock cycles.

A total of 9150 system clock cycles is required for one operation, which at the
200MHz system clock gives a maximum frame rate of 21.858kHz. However, the reset
delay is usually shorter than 4.5 µs, so the frame rate can be raised slightly
to 22kHz without problems.
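
The latency budget above can be reproduced with a few lines of arithmetic. The
Python sketch below uses only the numbers given in this section and the 200MHz
system and 150MHz memory clocks; the variable names are illustrative. The last
step, which estimates how short the reset delay must be for a 22kHz trigger, is
an added back-of-the-envelope calculation rather than a measured value.

```python
SYSTEM_CLOCK_HZ = 200_000_000  # system clock
MEMORY_CLOCK_HZ = 150_000_000  # DDR2 memory clock

memory_read_cycles = 50
# 6144 memory clock cycles expressed in system clock cycles.
data_loading_cycles = 6144 * SYSTEM_CLOCK_HZ // MEMORY_CLOCK_HZ
reset_request_cycles = 8
# Worst-case global reset time of 4.5 us in system clock cycles.
reset_delay_cycles = round(4.5e-6 * SYSTEM_CLOCK_HZ)

total_cycles = (memory_read_cycles + data_loading_cycles
                + reset_request_cycles + reset_delay_cycles)
max_frame_rate_hz = SYSTEM_CLOCK_HZ / total_cycles

# Reset delay that still fits into one 22 kHz trigger period.
fixed_cycles = memory_read_cycles + data_loading_cycles + reset_request_cycles
reset_budget_us = (SYSTEM_CLOCK_HZ / 22_000 - fixed_cycles) / SYSTEM_CLOCK_HZ * 1e6

print(f"total cycles: {total_cycles}")
print(f"max frame rate: {max_frame_rate_hz / 1e3:.3f} kHz")
print(f"reset delay budget at 22 kHz: {reset_budget_us:.2f} us")
```

Under these assumptions, a 22kHz trigger is met whenever the global reset
completes in roughly 4.2 µs or less, which is consistent with the observation
that the reset delay is usually shorter than the 4.5 µs worst case.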
Chapter 5

Conclusion and Outlook

In this project, I have implemented an FPGA-based solution for high frame-rate
and low-latency control of a DMD. The proposed approach preloads the images into
a DDR2 memory and uses a trigger signal to read the data from the memory. The
maximum frame rate is increased to 22kHz, which is verified with an internally
generated trigger signal.
In the future, the trigger should be an external signal provided as a TTL
(transistor-transistor logic) input. The TTL input should be properly configured
to be compatible with the Virtex 5 FPGA in terms of threshold voltage and drive
level. Although the memory read data issue is solved by a workaround, its cause
is still worth investigating. I would suggest making a separate memory test
project and using other types of DDR2 memory for the test. In addition to using
the Discovery D4100 GUI, there are two possible methods to transfer image data
to the FPGA. One is to use the ChipScope Engine Tcl (CSE/Tcl) Scripting
Interface of the VIO core by writing a Tcl script. The other is to fully
customize the Cypress FX2 interface. It is also possible to further increase the
frame rate. The memory clock is 150MHz, which is slow compared with the 200MHz
system clock. If we use another DDR2 memory that supports a 200MHz interface
clock, the data loading latency can be shortened from 8192 to 6144 system clock
cycles. With the other latency contributions unchanged, one operation then takes
7102 system clock cycles, which leads to a frame rate of about 28.16kHz.
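
The effect of the memory clock on the achievable frame rate can be explored with
a small helper. The Python sketch below is an estimate, not a measurement: the
function name and its parameters are hypothetical, and it assumes that the reset
request (8 cycles) and reset delay (900 cycles) stay unchanged. Whether the
roughly 50-cycle memory read latency stays the same with a faster memory is not
known, so it is kept as a parameter.

```python
def max_frame_rate_hz(memory_clock_hz, system_clock_hz=200_000_000,
                      memory_read_cycles=50):
    # 6144 memory clock cycles per image, converted to system clock cycles.
    data_loading_cycles = 6144 * system_clock_hz / memory_clock_hz
    reset_request_cycles = 8
    reset_delay_cycles = 900  # 4.5 us worst case at 200 MHz
    total_cycles = (memory_read_cycles + data_loading_cycles
                    + reset_request_cycles + reset_delay_cycles)
    return system_clock_hz / total_cycles


rate_150 = max_frame_rate_hz(150_000_000)
rate_200 = max_frame_rate_hz(200_000_000)
rate_200_no_read = max_frame_rate_hz(200_000_000, memory_read_cycles=0)

for label, rate in [("150 MHz memory", rate_150),
                    ("200 MHz memory", rate_200),
                    ("200 MHz memory, read latency neglected", rate_200_no_read)]:
    print(f"{label}: {rate / 1e3:.2f} kHz")
```

Neglecting the memory read latency raises the estimate for a 200MHz memory from
about 28.16kHz to about 28.36kHz, so the result is not very sensitive to this
term.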

Bibliography

[1] G. Clausen, “Generation of Bragg spectroscopy potentials with a DMD,” ETH
Zürich, University of Oxford, 2020.

[2] K. Hueck, A. Mazurenko, N. Luick, T. Lompe, and H. Moritz, “Note:
Suppression of kHz-frequency switching noise in digital micro-mirror devices,”
Review of Scientific Instruments, vol. 88, no. 1, p. 016103, 2017.

[3] P. P. Zupancic, “Dynamic holography and beamshaping using digital
micromirror devices,” LMU München, Greiner Lab Harvard, vol. 242, 2013.

[4] B. Lee, “Introduction to ±12 degree orthogonal digital micromirror devices
(DMDs),” Texas Instruments, pp. 2018–02, 2008.

[5] Texas Instruments, DLP Discovery 4100 Development Platform User’s Guide,
2018.

[6] Xilinx, “Xilinx ISE downloads site,” website, https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vivado-design-tools/archive-ise.html.

[7] Texas Instruments, DLP Discovery 4100 Applications FPGA Pattern Generator
Design User’s Guide, 2018.

[8] F. Blanc, “Fpga_hyperproto,” website, https://redmine.laas.fr/projects/fpga_hyperproto/repository.

[9] Cypress, Designing with EZ-USB FX2LP Slave FIFO Interface.

[10] Xilinx, LogiCORE IP FIFO Generator v8.3 Product Guide, 2012.

[11] Xilinx, Memory Interface Solutions User Guide, 2010.

[12] Texas Instruments, DLPC410 DMD Digital Controller, 2020.
