Dvi2rgb - v1 - 7 Digilent FPGA Core IP Reference
Pullman, WA 99163
509.334.6306
www.digilentinc.com
1 Introduction
This user guide describes the Digilent DVI-to-RGB Video Decoder Intellectual Property. This IP interfaces directly
to raw transition-minimized differential signaling (TMDS) clock and data channel inputs as defined in DVI
1.0 specs for Sink devices. It decodes the video stream and outputs 24-bit RGB video data along with the pixel
clock and synchronization signals recovered from the TMDS link.
IP quick facts
Supported device families: Zynq®-7000, 7 series
Supported user interfaces: Xilinx®: IIC, vid_io; Digilent: TMDS
Provided with core
Design files: VHDL
Simulation model: VHDL Behavioral
Constraints file: XDC
Software driver: N/A
Tested design flows
Design entry: Vivado™ Design Suite 2016.4
Synthesis: Vivado Synthesis 2016.4
2 Features
Connects directly to top-level digital visual interface (DVI) port
24-bit video (clocked parallel video data with synchronization signals) output
Display Data Channel interface with built-in EDID ROM
Resolutions supported: 1920x1080/60Hz down to 800x600/60Hz (148.5 MHz – 40 MHz)
Selectable preferred resolution in EDID
Xilinx interfaces used: IIC, vid_io
Digilent interfaces used: TMDS
3 Performance
The IP constrains TMDS_Clk to 165 MHz, the maximum frequency outlined in DVI 1.0 specifications.
However, depending on the actual FPGA part or speed grade, the maximum supported frequency might
be lower. If the top-level design fails timing on pulse-width checks inside the IP instance, TMDS_Clk
needs to be (re)constrained to the maximum frequency supported on the project part. Check the part
datasheet for FMAX_BUFIO, which is the most likely reason for failed timing. TMDS_Clk should be constrained
to FMAX_BUFIO/5, since the serial clock runs at five times the pixel clock. Consequently, this is the maximum
pixel clock frequency supported on that FPGA family and speed grade.
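The FMAX_BUFIO/5 rule can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and the 600 MHz BUFIO limit is a placeholder, not a value taken from any datasheet.

```python
def max_pixel_clock_mhz(fmax_bufio_mhz: float, dvi_limit_mhz: float = 165.0) -> float:
    """Pixel clock limit implied by the BUFIO limit.

    The serial clock is five times the pixel clock (10 bits per character,
    sampled on both clock edges), and DVI 1.0 caps the pixel clock at 165 MHz.
    """
    return min(fmax_bufio_mhz / 5.0, dvi_limit_mhz)


def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds, rounded to picosecond resolution."""
    return round(1000.0 / freq_mhz, 3)


# Placeholder value: read the real FMAX_BUFIO from the part datasheet.
EXAMPLE_FMAX_BUFIO_MHZ = 600.0

limit = max_pixel_clock_mhz(EXAMPLE_FMAX_BUFIO_MHZ)
print(f"constrain TMDS_Clk to {limit} MHz (period {period_ns(limit)} ns)")
```

With the placeholder limit, the pixel clock would have to be constrained to 120 MHz, which is below the 148.5 MHz needed for 1920x1080/60Hz.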
4 Overview
The IP is built from multiple blocks: one clock recovery block, one data decoder block for each data
channel (see [3], [4]), one optional DDC (Display Data Channel) block and one control/reset block.
Since the clock frequencies are relatively high and the recovered clocks have tight phase requirements,
dedicated clocking primitives are instantiated inside the clock recovery block. These can be seen in Figure 2. The
MMCM primitive incorporates a voltage controlled oscillator (VCO) that has an operating range specified
in the FPGA data sheet. Since there is no single set of MMCM parameters that maps the whole range of
DVI pixel clock frequencies to the VCO range, an IP customization parameter is available to optimize for
the expected resolution and pixel clock frequency.
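A short calculation shows why one parameter set cannot cover the whole range. The sketch assumes the usual relation f_VCO = f_in x M / D and uses a 600 to 1200 MHz VCO range purely as a placeholder; the multiplier values are hypothetical and are not the ones used inside the IP.

```python
def input_range_for_ratio(ratio: float, vco_min_mhz: float, vco_max_mhz: float):
    """Input clock range that keeps f_vco = f_in * ratio inside the VCO range."""
    return vco_min_mhz / ratio, vco_max_mhz / ratio


def single_setting_possible(lo_mhz: float, hi_mhz: float,
                            vco_min_mhz: float, vco_max_mhz: float) -> bool:
    """One M/D ratio can cover [lo, hi] only if the input span fits the VCO span."""
    return hi_mhz / lo_mhz <= vco_max_mhz / vco_min_mhz


# Placeholder VCO limits: take the real ones from the FPGA data sheet.
VCO_MIN_MHZ, VCO_MAX_MHZ = 600.0, 1200.0

for ratio in (5, 10, 15, 20):  # hypothetical M/D ratios
    lo, hi = input_range_for_ratio(ratio, VCO_MIN_MHZ, VCO_MAX_MHZ)
    print(f"M/D = {ratio:2d}: pixel clock {lo:.0f} to {hi:.0f} MHz")

print(single_setting_possible(40.0, 165.0, VCO_MIN_MHZ, VCO_MAX_MHZ))
```

The supported pixel clocks span a ratio of about 4.1 (165/40), while the placeholder VCO range spans a ratio of 2, so several settings are needed regardless of the exact multiplier chosen.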
4.2.1 Synchronization
To help with synchronization, the DVI protocol specifies periodic cues (control tokens) to be sent. These
control tokens are sufficiently different from the rest of the data that their succession can be used for
synchronization. Synchronization is automatically (re)started when a stable clock is detected and
recovered. The time it takes for a lock to be achieved depends on the phase relationship of the clock and
data streams. It should not last more than 1 minute.
DVI characters are 10-bits long, so a 10:1 deserialization of the data stream is needed. This can be
achieved with two ISERDESE2 primitives in a cascaded DDR configuration. In this configuration, the
master and slave ISERDESE2 take the serial data stream and sample it on both edges of a serial clock.
Thus, for every five serial clock periods, ten data bits are sampled. This 10-bit data is then output
synchronously with a divided clock, which is our pixel clock from the clock recovery block. Although this
recovers 10-bit words from the data stream at a frequency which can be passed on to general logic
inside the FPGA, it does not guarantee that the word actually starts at a character boundary or that the data stream
is sampled when data is stable.
To find the best moment to sample the data stream (i.e., the middle of an open eye), an IDELAYE2
primitive is inserted in front of ISERDESE2. This primitive is capable of delaying the data signal in tap
increments. In this IP, a 78ps increment is used for a total of 32 increments. For the highest pixel clock
frequency supported (165 MHz), one bit period is covered in 7 tap increments. The goal is to find the tap
delay value that shifts the data enough so that it gets sampled in the middle of its stable zone. The
phase alignment module compares the 10-bit words with the four special control tokens. If a succession
of tokens is not recognized within a timeout period, the sampling point is assumed to lie in the jitter zone and the module increments the tap delay.
This is done repeatedly until control tokens are reliably recognized and the algorithm settles on the
middle of the stable bit period (open eye). An “open eye” is defined by a succession of a minimum
number of tap values (3) where the control token can be reliably detected, and it is delimited on both
ends by a tap value where it cannot be. However, using this definition will miss open eyes that begin or
end at the extremities of the tap delay range (0 or 31), because no two jitter zone delimiters will be
found. So even if the open eye begins or ends in the extremities, if it is sufficiently long (16 tap
increments), it will be considered a valid eye.
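The open-eye definition above can be sketched as a scan over per-tap detection results. This is a simplified offline model, not the IP's state machine, which steps through the taps one at a time; the function name and the boolean-list input are assumptions made for illustration.

```python
from typing import List, Optional

MIN_EYE_TAPS = 3         # minimum run delimited by jitter zones on both ends
MIN_EDGE_EYE_TAPS = 16   # minimum run when the eye touches tap 0 or tap 31


def find_eye_center(tokens_ok: List[bool]) -> Optional[int]:
    """Return the middle tap of the first valid open eye, or None.

    tokens_ok[t] is True when control tokens were reliably detected at tap t.
    """
    n = len(tokens_ok)
    tap = 0
    while tap < n:
        if not tokens_ok[tap]:
            tap += 1
            continue
        start = tap
        while tap < n and tokens_ok[tap]:
            tap += 1
        end = tap - 1  # last good tap of this run (inclusive)
        touches_edge = start == 0 or end == n - 1
        needed = MIN_EDGE_EYE_TAPS if touches_edge else MIN_EYE_TAPS
        if end - start + 1 >= needed:
            return (start + end) // 2
    return None


# Eye spanning taps 5..11, delimited by jitter zones on both sides.
scan = [False] * 5 + [True] * 7 + [False] * 20
print(find_eye_center(scan))
```

A run that touches tap 0 or tap 31 is accepted only when it is at least 16 taps long, which mirrors the exception described in the text.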
However, the IDELAYE2 primitive only provides a fine phase adjustment on the bit level, not covering
the whole character. To find the character boundary, a coarse phase adjustment is needed. This is
achieved by the “bitslip” feature of the ISERDESE2 primitive. If all the tap increments have been tried
and control tokens are still not detected, it is assumed that we are not at the right character boundary.
In this case, invoking “bitslip” causes either a shift right by one bit or a shift left by three bits in the
10-bit word. After “bitslip” completes, phase alignment begins again, looping through the tap increments
until tokens are found.
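The coarse search can be modeled in software as moving a 10-bit window across the serial stream. The sketch below is a simplification: it treats each "bitslip" as a change of window offset, assumes the two shift types strictly alternate starting with the shift by one, and uses hypothetical helper names. The four control token values are those defined by DVI 1.0.

```python
CONTROL_TOKENS = {0b1101010100, 0b0010101011, 0b0101010100, 0b1010101011}


def serialize(chars):
    """Flatten 10-bit characters into a bit list, least significant bit first."""
    return [(c >> i) & 1 for c in chars for i in range(10)]


def words_at_offset(bits, offset):
    """Group the stream into 10-bit words, starting `offset` bits in."""
    return [sum(bits[start + i] << i for i in range(10))
            for start in range(offset, len(bits) - 9, 10)]


def bitslip_offsets():
    """Window offsets visited when alternating +1 and -3 (modulo 10)."""
    offset, seq = 0, [0]
    for n in range(9):
        offset = (offset + (1 if n % 2 == 0 else -3)) % 10
        seq.append(offset)
    return seq


def find_boundary(bits):
    """Return (offset, number of bitslips) once only control tokens are seen."""
    for slips, offset in enumerate(bitslip_offsets()):
        words = words_at_offset(bits, offset)
        if words and all(w in CONTROL_TOKENS for w in words):
            return offset, slips
    return None


# A blanking-period stream that starts 4 bits into a character.
stream = serialize([0b1101010100] * 6)[4:]
print(bitslip_offsets())
print(find_boundary(stream))
```

In this model the alternating sequence visits all ten possible offsets within nine slips, so the character boundary is always reachable.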
Phase alignment is considered complete when a succession of control tokens is reliably detected on
all data channels. At this moment all three data channels are considered valid.
However, since inter-pair channel skew is not negligible and channels are aligned independently, the
recovered data streams might have different delays on them. To eliminate this skew, the channels are
bonded by buffering them in FIFO memories and holding them back independently until the video
blanking period starts at the same time on all three. At this stage, all three data channels are valid and
in-sync.
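Channel bonding can be illustrated with a small model in which each channel is a list of per-pixel blanking flags. This is only a sketch of the idea: the IP uses FIFO memories, whereas the code below uses plain list delays, and all function names are hypothetical.

```python
def first_blank_start(blank_flags):
    """Index of the first active-to-blanking transition, or None."""
    for i in range(1, len(blank_flags)):
        if blank_flags[i] and not blank_flags[i - 1]:
            return i
    return None


def bonding_delays(channels):
    """Per-channel delay (in pixel clocks) that lines up the blanking starts."""
    starts = [first_blank_start(ch) for ch in channels]
    if any(s is None for s in starts):
        return None  # a channel has not reached blanking yet
    latest = max(starts)
    return [latest - s for s in starts]


def apply_delay(samples, delay, fill=False):
    """Hold a channel back by `delay` pixel clocks, keeping the list length."""
    return [fill] * delay + samples[:len(samples) - delay]


base = [False] * 4 + [True] * 3 + [False] * 5
channels = [base, [False] + base[:-1], [False, False] + base[:-2]]  # skewed copies
delays = bonding_delays(channels)
aligned = [apply_delay(ch, d) for ch, d in zip(channels, delays)]
print(delays, [first_blank_start(ch) for ch in aligned])
```

The earliest channel is held back the longest, so that blanking begins on the same pixel clock on all three.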
4.2.2 Decoding
The TMDS standard encodes data so that the serial data stream contains few transitions (0-to-1 or 1-to-
0) and a DC balance (the same number of zeros and ones over a long time period). Every 10-bit
character actually encapsulates 8 bits of useful data. The exception to this are the control tokens, which
encapsulate 2 bits of control data. The data decoder block applies the decoding algorithm as specified in
the DVI 1.0 specifications. After decoding we are left with control data in blanking periods or pixel data
in active periods. Since each data channel carries one color, a 24-bit RGB pixel bus is output from the IP.
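The per-character decoding step can be sketched in software as follows. This is an illustrative model of the DVI 1.0 decoding rules for one channel, not the IP's VHDL; the function names are hypothetical.

```python
# Control tokens defined by DVI 1.0, mapped to (C1, C0).
CONTROL_TOKENS = {
    0b1101010100: (0, 0),
    0b0010101011: (0, 1),
    0b0101010100: (1, 0),
    0b1010101011: (1, 1),
}


def decode_pixel(char10: int) -> int:
    """Recover 8 data bits from a 10-bit TMDS data character."""
    bits = char10 & 0xFF
    if (char10 >> 9) & 1:        # bit 9 set: data bits were inverted for DC balance
        bits ^= 0xFF
    use_xor = (char10 >> 8) & 1  # bit 8 selects XOR (1) or XNOR (0) encoding
    out = bits & 1               # bit 0 passes through unchanged
    for i in range(1, 8):
        x = ((bits >> i) ^ (bits >> (i - 1))) & 1
        if not use_xor:
            x ^= 1
        out |= x << i
    return out


def decode(char10: int):
    """Classify a character as control data or pixel data."""
    if char10 in CONTROL_TOKENS:
        return ("ctl", CONTROL_TOKENS[char10])
    return ("pix", decode_pixel(char10))


print(decode(0b1101010100), decode(0x0FF), decode(0x1FF))
```

Three such decoders, one per data channel, yield the three 8-bit color components that make up the 24-bit pixel.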
5 Port Descriptions
The signals of the DVI-to-RGB core are listed and described in Table 1.
6.1 Constraints
The TMDS clock input Clk_p/n is constrained in the IP to the maximum DVI clock frequency, 165 MHz.
On some architectures this might result in timing that is impossible to meet. Depending on the application, if a
lower pixel clock frequency is acceptable, the clock can be constrained at the top level, which will override
the IP-internal constraints.
For example, to constrain the design for 720p resolution (74.25 MHz), calculate the clock period (13.468
ns), and add the following to a project XDC file to constrain the clock on the top-level input port:
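As a hedged illustration, the period calculation and a matching constraint line can be generated with a few lines of Python. The `create_clock` command is standard XDC syntax; the port name `TMDS_Clk_p` and the 50% duty-cycle waveform are assumptions and must be adapted to the actual top-level design.

```python
def create_clock_xdc(pixel_mhz: float, port: str = "TMDS_Clk_p") -> str:
    """Build an XDC create_clock line for a given pixel clock frequency.

    `port` is an assumed top-level port name; replace it with the real one.
    """
    period = round(1000.0 / pixel_mhz, 3)  # period in ns
    half = round(period / 2.0, 3)          # falling edge for a 50% duty cycle
    return (f"create_clock -period {period:.3f} "
            f"-waveform {{0.000 {half:.3f}}} [get_ports {port}]")


print(create_clock_xdc(74.25))
```

For 74.25 MHz this yields a period of 13.468 ns, matching the value quoted above.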
6.2 Customization
The IP provides the following customizable parameters: the polarity of reset signals, PixelClk clock buffer
type, the frequency range of TMDS clock, the preferred resolution to be declared in the bundled EDID,
the availability of the DDC channel, and the serial clock output.
Enabling the DDC channel and the serial clock output adds the respective ports to the IP, making them available to user
logic.
The parallel pixel clock (PixelClk) is recovered by the use of a BUFR buffer. Since BUFR is restricted to a
single clock region and the video data output from the core is synchronous to PixelClk, any downstream
logic consuming video data is also restricted to this clock region. The option to re-buffer PixelClk
introduces a BUFG after the BUFR and re-registers video data into the BUFG-domain. This will allow
downstream logic to be placed anywhere on the device.
Setting the expected TMDS clock frequency enables the IP to instantiate FPGA primitives that respect
timing requirements in the clock recovery logic. If the actual pixel clock recovered from the stream falls
outside the range set here, the VCO operating range of the FPGA might not be respected and in extreme
cases clock recovery might fail and the video stream will not be decoded properly.
The preferred resolution can be set if the DDC channel is enabled. The resolution set here selects the
proper initialization file for the emulated EDID ROM. Connected sources read out this EDID
and might choose to transmit at this resolution.
One use case for the serial clock output is to drive the Digilent RGB2DVI core, sharing the clocking logic
between the two cores.
7 References
The following documents provide additional information on the subjects discussed:
1. Xilinx Inc., UG471: 7 Series FPGAs SelectIO Resources, v1.3, October 31, 2012.
2. Xilinx Inc., UG472: 7 Series FPGAs Clocking Resources, v1.6, October 2, 2012.
3. Xilinx Inc., XAPP460: Video Connectivity Using TMDS I/O in Spartan-3A FPGAs, v1.1, June 24, 2011.
4. Xilinx Inc., XAPP495: Implementing a TMDS Video Interface in the Spartan-6 FPGA, v1.0,
December 13, 2010.
5. Xilinx Inc., WP249: SPI-4.2 Dynamic Phase Alignment, v1.3, July 6, 2011.
6. DDWG: Digital Visual Interface DVI, Revision 1.0, April 2, 1999.