Xapp1322 Transceiver Link Tuning
Summary
Manual link fine tuning is tedious and time-consuming: the search for the highest link margin
can keep an engineer in the laboratory for days. This application note provides an automated
fine tuning method for transceiver links. It walks step by step through how to use the
GT Debugger methodology for the best equalization setup, power reduction, crosstalk analysis,
and measurement documentation.
Download the reference design files for this application note from the Xilinx website. For
detailed information about the design files, see Reference Design.
Introduction
In a typical transceiver link tuning analysis, the user manually modifies the transmitter setup
(amplitude, post-emphasis, and pre-emphasis) according to the signal measured at the receiver
end and then chooses the best compromise. The goal is to find the transmitter setup that
guarantees the highest receiver margin. In cases where power consumption or crosstalk from
multiple aggressors must be controlled or reduced, the problem becomes even more intricate
because additional variables are present.
Link tuning is a tedious procedure that must be repeated for every link in the design. Because
of the high number of configurations tested, the main challenge with manual tuning is keeping
a well-ordered record of all experiments.
This application note presents a method to run automatic tuning of the channel. This method is
applicable when both sides of the link (transmitter and receiver) are Xilinx parts and are
controlled by hardware manager in the Vivado® design suite.
The application note makes extensive use of the GT Debugger tool presented in Automatic
Insertion of Debug Logic for Transceivers in Synthesis DCP (XAPP1295) [Ref 1]. The GT Debugger
offers a set of pre-built Tcl functions enabling transceiver ports probing and driving, and DRP
access.
This application note is organized as a walk-through debug session. It starts from a guided
implementation of the design in Automatic Insertion of Debug Logic for Transceivers in Synthesis
DCP (XAPP1295) and then focuses on signal integrity, power and noise optimization, and test
documentation. Simple procedures are combined to form a more complex analysis path.
Features
The GT Debugger is a set of Tcl procedures that allow low-level transceiver debugging
without adding any HDL code to the design under test. For this reason, the GT Debugger is
well suited to cases where the HDL code is protected or encrypted, or where the HDL code is
simply not available.
The GT Debugger inserts its logic beside the original design, providing complete control of any
transceiver input or output port. The GT Debugger also allows you to modify any of the
transceiver registers during runtime and provides high-level procedures for safe
read-modify-write operations.
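The read-modify-write idea can be illustrated with a minimal Python sketch. It is not the GT Debugger Tcl procedure: the function name, the `drp_read`/`drp_write` callables, and the address `0x0028` are hypothetical, and a dictionary stands in for the 16-bit DRP address space.

```python
def read_modify_write(drp_read, drp_write, addr, msb, lsb, value):
    """Update bits [msb:lsb] of a 16-bit DRP word and preserve all other bits."""
    width = msb - lsb + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    mask = ((1 << width) - 1) << lsb
    old = drp_read(addr)                           # read the current word
    new = (old & ~mask & 0xFFFF) | (value << lsb)  # modify only the field
    drp_write(addr, new)                           # write the word back
    return old, new

# Stand-in for the hardware (assumption): a dictionary acting as the DRP space.
fake_drp = {0x0028: 0xABCD}
old, new = read_modify_write(fake_drp.__getitem__, fake_drp.__setitem__,
                             0x0028, msb=7, lsb=4, value=0x5)
print(hex(old), hex(new))
```

Only bits 7:4 change in this example; the neighboring fields in the same word keep their previous values, which is the point of a safe read-modify-write.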
Some higher level Tcl routines combining general-purpose I/O (GPIO) and dynamic
reconfiguration port (DRP) access are presented here as examples. These allow flexible control
of the transceiver and more sophisticated analysis like the eye scan, or even a full automatic TX
and RX link tuning.
An automatic procedure analyzes the DCP file and creates a summary file with the extension
.do. You can edit the .do file manually or let the GUI edit it according to your preferences
(refer to Helping GUI for GT Debugger). It is also possible to parse the .do file with a Tcl script,
which is the quickest method for repetitive debug activities (see Figure 1).
[Figure 1: Synthesized DCP, Analysis, Design modification, Implementation]
Installation
Unzip the provided reference design file xapp1322-transceiver-link-tuning.zip to
any location. A directory called gt_debugger_<version_number> is created. Here,
<version_number> is the latest available version and can be 1.0 or higher. The path to the
gt_debugger GUI should be stored in a new Windows environment variable named GTDBG
(Figure 2).
You can create a directory in which to store the modified designs and name it, for example,
temp_designs. In the Linux environment, you can set the variable with
setenv GTDBG /myworkdirectory/gt_debugger_1.0.
The file config.tcl is available in the gt_debugger_1.0 directory after the tool is installed.
In this release, the config file defines the text editor used to open the log files. The text
editor can be defined for both platforms, Linux and Windows. For Linux, the name of the
editor is enough. For Windows, the complete path is needed. If no editor is set, the default
is the native Windows Notepad.
To modify the setup of the text editor, edit lines 2 and/or 6 (bold text only) in config.tcl.
The Vivado design suite does not embed a Tcl/Tk distribution on Linux, so a Tcl/Tk
installation is needed. Tcl/Tk normally comes with the Linux distribution, so there should be
no need to install it separately.
For Windows, a shortcut named gt_dbg is provided. Before starting it, open the properties
window of the shortcut and set the correct path to the Vivado tools installation. By default, the
tool starts from c:\Temp. If this folder doesn’t exist, you can create it or modify the path in the
Start in box of the gt_dbg shortcut properties. By clicking the shortcut, a new Vivado tools Tcl
shell opens and the GUI appears after a few seconds (Figure 3).
In a new project, only the Analysis button is enabled. The other buttons are enabled as soon as
the mandatory steps of the flow are completed. Opening an existing project enables the
buttons depending on the steps that have already been completed. The analysis of the DCP file
generates the .do file, which is a list of available transceivers with all available ports and
interfaces (Figure 5).
• GPIO: Adding a port to this set allows you to drive or read the port from the Tcl console.
• VIO: Adding a port to this set allows you to probe the activity directly with a virtual
input/output (VIO).
• GPIO + ILA: This is the same as GPIO, but it also adds the integrated logic analyzer (ILA)
feature for a complete signal activity measurement.
• VIO + ILA: This is the same as VIO, but it also adds the ILA feature for a complete signal
activity measurement.
[Figure labels: DMONITORCLK; Probing Method; Clock Connection; DRP Usage; Common and Channel
Entities in the Design]
The GT Debugger also allows you to probe the DMONITOR port. The DMONITOR port requires
a free running clock to be connected to DMONITORCLK. This clock can be internal, external, or
neither. Connect the DMONITORCLK only if the DMONITOR is used.
The DRP port can also be accessed with read and write operations. You can select how to
connect the DMONITORCLK and the DRP port in the GUI according to this sequence of steps:
1. Select the channel or common to configure. The port column in the middle is populated.
2. Select the probe (i.e., GPIO). The assignment arrow is activated.
3. Move the ports to be probed to the right column.
4. Save and then exit.
There is a quicker method to connect some ports that allows you to skip the steps described
above. By clicking the button Save Default, the tool loads and saves a default configuration. In
the root directory of the tool is a file called default.tcl. You can edit this file to define a
default configuration. If Save Default is used after some ports have been connected, the tool
disconnects all ports and then connects them according to default.tcl.
After the Save Default operation, you can either exit without clicking Save All or connect
other ports according to the steps described above. In the latter case, the selected ports are
appended to the .do file, and you should click Save All before exiting.
Clicking the Unconnect All button cleans the .do file by removing all the connections.
When the configuration is done, it is possible to move to the next step and run Generate
Bitstream. This step launches synthesis, implementation, and bitstream generation. After
generating the bitfile (Figure 7), you can close the GUI and continue the GT Debugger session
in the Vivado hardware manager.
• Report:
° Configuration DumpFile: Extracts the connected ports from the .do file and opens the
file.
• Edit Configuration: Opens the .do file to be edited manually in the text editor.
• Run Synthesis: Runs synthesis only.
• Run Implementation: Runs implementation and bitstream generation.
Copy the commands from the TopDemo.tcl file and paste them into the Vivado tools Tcl shell.
The command execution order is important, and the first operation required is to source the file
tuningproc.tcl. This is the repository for the procedures called during the experiments. A
third file, configuration.tcl, describes the link connections and the tuning strategy.
Hardware Description
All figures and examples in this application note refer to a test case designed for the GTH
transmitter and receiver in UltraScale devices. The method is general and can be replicated with
any combination of transceivers from the 7 series, UltraScale, and UltraScale+ devices.
For a fully functional test, two asynchronous sets of transceivers should be present. In the
reference design as shown in Figure 10, two asynchronous sets of transceivers on two different
boards (near-end (NE) and far-end (FE) side) are configured and controlled with two
independent Vivado hardware managers using independent JTAG links.
[Figure 10: NE FPGA with NE JTAG link, FE FPGA with FE JTAG link]
The two FPGAs in the test, referred to as the NE and FE FPGAs, can be programmed with the
same design. In this application note, two transceiver characterization boards, UC1250 and
UC1287, were used (Figure 20).
Although the Vivado tools allow you to target multiple JTAG servers and multiple FPGAs with a
single hardware manager, due to some temporary restrictions the GT Debugger still requires
that one hardware manager be dedicated to only one FPGA.
Automatic resets on errors should be prevented. Figure 11 shows the VIO gating the resets due
to PRBS errors (resetonerror). Table 2 summarizes the main characteristics of the GTH Wizard
example design. Raw data is transmitted and received (with no encoder and decoder), and both
TX and RX buffers are used.
• Differential free-running clock: The original GT Wizard example design has a single-ended
free-running clock. The input buffer has been replaced with a differential input buffer.
• Gated reset on error: This is necessary to prevent unwanted automatic resets following data
errors. The signal prbs_error_any_sync is gated by the signal resetonerror. A VIO drives the
resetonerror signal.
• DRP
• RXLPMEN
• TXDIFFCTRL
• TXPOSTCURSOR
• TXPRECURSOR
• DMONITOROUT
• DMONITORCLK
• LOOPBACK
The 100 MHz free-running clock for the reset sequence also sources the DRP interface and the
DMONITORCLK. The GUI provided with this application note makes the port selection
immediate. From the GUI, it is possible to load a default set of ports or interfaces to be probed
and driven, and it is possible to manually customize the connections. If a signal is marked in the
HDL with KEEP or DONT_TOUCH, this signal should not be added to the probed port list,
otherwise it generates an error later during implementation.
Figure 11: Stable Condition without Data Errors and Reset on Error Function Gated
In the file configuration.tcl, a dictionary called “links” describes the connections between
the transceivers and should match the cable connections. The same transceivers can at the
same time send data to themselves (through NE PMA loopback) and to any other FPGA receiver
(through the cable connection). For example, in Figure 12 link 1 represents a cable connection
existing from FE transmitter cX0Y8 to NE receiver cX0Y8, and link 5 is a near-end loopback on FE
FPGA cX0Y8. When the NE PMA LOOPBACK is active, the signal is still present on the output pin
so one transmitter can indeed transmit the signal to two receivers.
Figure 12: Example in configuration.tcl and Relative Cable Connection between NE FPGA
and FE FPGA
Note: The FE FPGA transceivers in Figure 12 are all set to PMA Loopback mode.
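As an illustration of the link description, the two links named in the text can be modeled as follows. This is a Python sketch with a hypothetical layout; the actual "links" dictionary in configuration.tcl is a Tcl dictionary and its keys may differ.

```python
# Hypothetical layout (assumption): each link names its transmitter, its
# receiver, and how the two are connected.
links = {
    1: {"tx": ("FE", "cX0Y8"), "rx": ("NE", "cX0Y8"), "via": "cable"},
    5: {"tx": ("FE", "cX0Y8"), "rx": ("FE", "cX0Y8"), "via": "ne_pma_loopback"},
}

def links_driven_by(links, board, channel):
    """Return the numbers of all links fed by one transmitter."""
    return sorted(n for n, link in links.items() if link["tx"] == (board, channel))

# The FE transmitter cX0Y8 feeds both the cable link and its own loopback.
print(links_driven_by(links, "FE", "cX0Y8"))
```

Looking links up by transmitter makes the one-transmitter, two-receivers case explicit: a change to that transmitter setup affects every link returned.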
The actual transceiver attributes can differ from what is stated in the HDL. The DRP space can be
overwritten (for example, during the startup sequence or due to real-time protocol
replacement) and modified from the original setup. The attribute file is stored in the GT Debug
design directory (Figure 14).
The socket opens a TCP network connection and allows a client and server to exchange
information. After the socket link is initiated, the Vivado tools become a powerful environment
allowing an efficient information exchange between Tcl shells. In this application note, the
socket is the method used to send and receive information between two Vivado tools Tcl shells.
There are multiple examples of possible socket configurations available on the Internet. A
complete socket description is beyond the scope of this application note.
The example design contains two Vivado hardware managers driving one FPGA each via JTAG,
and a serial data link is present between the two boards. The link is established by configuring
a server and a client. The server is created by sourcing the FE_server.tcl script at the
far-end Tcl shell. The FE_server.tcl script contains the following instruction:
This command opens a network socket and returns a channel identifier. In this case a server side
socket is opened on port 2828, and the accept command is run. The procedure accept
contains a fileevent statement that allows it to process the incoming data whenever something
is present in the channel. The vwait command enters the Tcl event loop to process events until
some event handler sets the value of variable x, blocking the application if no events are ready.
Thus, after the FE side socket server is started, the corresponding Tcl shell remains blocked
until the channel is closed. The client is activated by sourcing NE_client.tcl at the
near-end Tcl shell. After the channel is active, because the FE hardware manager shell is
blocked, all commands must be routed to the NE hardware manager Tcl shell only.
In the next set of experiments, the procedure sendcomm executed at the NE side can send a
command line through the socket, have it executed at the FE side, and receive feedback from
the FE. The experiment completes by closing the channel. Always follow the steps in the order
presented here:
At this point, the FE Tcl shell is released, but you should still close the channel at the FE side.
In summary, the socket allows you to send any instruction from the NE (client) to the FE (server)
and have this instruction executed at the FE side. This feature is particularly important, for
example, when a variable on the FE side should be known by the NE side.
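The client and server exchange can be sketched in Python as a rough analogue of the Tcl scripts, not a copy of them. The name sendcomm is borrowed from the text, but its signature here, the `get`/`set` commands, and the `fe_state` variable are assumptions. The server replies through a small table of allowed commands instead of executing arbitrary text, and port 0 lets the operating system pick a free port.

```python
import socket
import threading

def serve(listener, handlers):
    """FE side: accept one client and answer newline-terminated commands."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r", encoding="utf-8") as rfile, \
            conn.makefile("w", encoding="utf-8") as wfile:
        for line in rfile:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "close":          # client asks to close the channel
                break
            handler = handlers.get(parts[0])
            try:
                reply = handler(*parts[1:]) if handler else "error: unknown command"
            except TypeError:
                reply = "error: bad arguments"
            wfile.write(f"{reply}\n")
            wfile.flush()

def sendcomm(rfile, wfile, command):
    """NE side: send one command line and wait for the one-line reply."""
    wfile.write(command + "\n")
    wfile.flush()
    return rfile.readline().rstrip("\n")

fe_state = {"txdiffctrl": "13"}              # hypothetical FE-side variable

def fe_get(name):
    return fe_state.get(name, "undefined")

def fe_set(name, value):
    fe_state[name] = value
    return "ok"

listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
server = threading.Thread(target=serve,
                          args=(listener, {"get": fe_get, "set": fe_set}))
server.start()

client = socket.create_connection(("127.0.0.1", port))
rfile = client.makefile("r", encoding="utf-8")
wfile = client.makefile("w", encoding="utf-8")
replies = [sendcomm(rfile, wfile, c)
           for c in ("get txdiffctrl", "set txdiffctrl 5", "get txdiffctrl")]
wfile.write("close\n")                       # release the server loop
wfile.flush()
server.join(timeout=5)
rfile.close()
wfile.close()
client.close()
listener.close()
print(replies)
```

As in the flow described above, the server side stays blocked in its loop until the client closes the channel, so all commands are issued from the client side.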
The socket is extensively used in this application note to join two hardware managers and drive
multiple FPGAs at the same time. This method allows you to share information between
independent hardware managers managed by different laptops if the laptops are connected to
the same network.
The vector eye scan differs from the brute force scan in the integrated bit error ratio test
(IBERT), where the offset sampler is swept over the whole eye space. In the vector eye scan, the
measurement points lie on a vector starting from the center of the eye. The eye is explored
by N = 2^M vectors, with M ∈ ℕ (Figure 16).
Figure 16: Positions of Exploring Vectors in the Vector Eye Scan Procedure
The algorithm used is binary search, also known as half-interval or logarithmic search. It is
assumed that the bit error rate (BER) grows if the distance between the offset sampler and data
sampler increases.
In Figure 17, the circle represents a measurement with no bit errors, and a cross is a
measurement with bit errors. The measurement point is initially chosen in the middle of the
vector. If there are no bit errors (measurement 1), the measurement point is moved farther from
the center in the middle of the remaining vector segment (measurement 2).
Because no errors are found in measurement 2, the next measurement point is chosen in the
middle of the right-most remaining segment (measurement 3). If there are bit errors, the next
measurement point is moved closer to the center, again in the middle of the remaining segment
(measurement 4).
In a few steps (O(log n) comparisons, on average) the center-most failing point is found
(measurement 4). This is a great advantage for a slow Tcl-based eye scan.
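The search along one vector can be sketched as follows, in Python for illustration. The `has_errors` callable is a hypothetical stand-in for one eye scan measurement at a given distance from the eye center, and the sketch relies on the monotonic BER assumption stated above.

```python
def first_error_step(has_errors, length):
    """Binary search for the failing point closest to the eye center.

    has_errors(step) reports whether a measurement at `step` (1..length,
    counted from the center) shows bit errors. Errors are assumed to be
    monotonic: once a step fails, every step farther out fails as well.
    Returns (first failing step or None, number of measurements taken).
    """
    lo, hi = 1, length
    first_bad = None
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if has_errors(mid):
            first_bad = mid      # errors here: look closer to the center
            hi = mid - 1
        else:
            lo = mid + 1         # clean here: look farther from the center
    return first_bad, probes

# Hypothetical vector of 32 steps whose eye closes at step 23.
result = first_error_step(lambda step: step >= 23, 32)
print(result)
```

In this example the boundary is located with five measurements instead of the 23 that a linear walk from the center would need.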
Figure 17: First Error Search Algorithm in the Vector Eye Diagram
The tool calculates the actual BER and marks the point with a number corresponding to the
negative BER exponent. For example, if a BER = 10^-8 is measured, the spot is marked with 8
(Figure 18). The achievable BER depends on the prescale setting, and this in turn affects the
measurement time.
Experiment 5 - BF_Scan
The BF_Scan procedure repeats the single vector eye scan over a set of transmitter
configurations. When calling the procedure, you should declare the ranges of TXDIFFCTRL,
TXPOSTCURSOR, and TXPRECURSOR, the incremental step, and the eye scan prescale.
This procedure needs the socket channel to connect the NE and FE boards. A receiver is used on
the NE board, and a transmitter is used on the FE board. The NE board is where the BF_Scan
should be launched. The NE Tcl shell:
The dictionary margin_d keeps a memory of all measurements ordered by prescale, link name,
and transmitter setup. It is possible to skip the repetition of the measurement if the same setup
was already examined in the past.
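The sweep with measurement reuse can be sketched in Python as below. The function signature, the `measure` callable, and the fake margin formula are assumptions for illustration; only the idea of a cache keyed by prescale, link name, and transmitter setup comes from the text.

```python
from itertools import product

def bf_scan(measure, link, prescale, diff_range, post_range, pre_range,
            step, margin_d):
    """Sweep all transmitter setups in the given inclusive ranges.

    Results are stored in margin_d under (prescale, link, setup); a setup
    already present there is not measured again.
    Returns the number of new measurements actually taken.
    """
    taken = 0
    axes = [range(lo, hi + 1, step)
            for lo, hi in (diff_range, post_range, pre_range)]
    for setup in product(*axes):
        key = (prescale, link, setup)
        if key not in margin_d:
            margin_d[key] = measure(link, prescale, *setup)
            taken += 1
    return taken

def fake_measure(link, prescale, diff, post, pre):
    """Stand-in for an eye scan (assumption): returns a made-up margin."""
    return 100 - abs(diff - 8) - post - pre

margin_d = {}
first = bf_scan(fake_measure, "link1", 5, (0, 10), (0, 4), (0, 4), 2, margin_d)
repeat = bf_scan(fake_measure, "link1", 5, (0, 10), (0, 4), (0, 4), 2, margin_d)
wider = bf_scan(fake_measure, "link1", 5, (0, 12), (0, 4), (0, 4), 2, margin_d)
print(first, repeat, wider)
```

Repeating the same sweep costs nothing, and widening one range only measures the setups that are new.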
Progressive tuning explores the link behavior with a "zooming" strategy. It first runs a coarse
analysis over the full range of the variables, then progressively zooms in on the best position
found while increasing the sampling density.
The search strategy is set in the configuration.tcl file, in the strategy_d dictionary. The
search for the best configuration goes by steps. In each step, the TXDIFFCTRL, TXPOSTCURSOR,
and TXPRECURSOR variation range and number of increments is defined.
In the example shown in Figure 19, the search is divided into three steps. The TXDIFFCTRL is
initially explored on 100% of the range, in the second run on 40% of the range, and finally on
20% of the range in the last run. The step size is reduced progressively from 5 to 3, and finally
to 1. At every new step, the focus is moved on the best position from the previous scan. The
number of steps, number of increments, and range are configurable. The script recognizes
when a setup has already been measured, and if it has, the measurement is skipped.
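The zooming strategy can be sketched in one dimension as follows. This Python sketch is illustrative only: the strategy tuples mirror the 100%, 40%, 20% ranges and the step sizes 5, 3, 1 from the example above, while the full range 0 to 31 and the margin function are hypothetical.

```python
def progressive_tune(measure, full_range, strategy):
    """Coarse-to-fine search over one transmitter variable.

    strategy is a list of (fraction_of_full_range, step_size) pairs. Each
    pass is centered on the best value found so far, and values already
    measured are taken from the cache instead of being measured again.
    """
    lo_full, hi_full = full_range
    span = hi_full - lo_full
    cache = {}
    best = (lo_full + hi_full) // 2
    for fraction, step in strategy:
        half = round(span * fraction / 2)
        lo = max(lo_full, best - half)
        hi = min(hi_full, best + half)
        for value in range(lo, hi + 1, step):
            if value not in cache:
                cache[value] = measure(value)
        best = max(cache, key=cache.get)
    return best, cache

def fake_margin(value):
    """Stand-in for an eye margin measurement (assumption): peak at 18."""
    return 100 - (value - 18) ** 2

best, cache = progressive_tune(fake_margin, (0, 31),
                               [(1.0, 5), (0.4, 3), (0.2, 1)])
print(best, len(cache))
```

With these made-up numbers the optimum is found after 14 measurements instead of the 32 of an exhaustive sweep. The approach assumes the coarse pass lands near the true optimum; a margin surface with several separate peaks could mislead it.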
[Figure 19 labels: Strategy steps; Sweep extension; Increments on TXDIFFCTRL]
Note: In Figure 20, the blue cable has been used to allow some coupling between links 1 and 2. Most of
the coupling is actually due to the connectors.
With the GT Debugger, crosstalk can be estimated by comparing the signal eye with and
without the aggressor present. The crosstalk reduction script identifies the main aggressors
and helps to modify the aggressor signal energy. However, crosstalk is not only a matter of
signal amplitude but also and mainly a matter of signal spectrum. Xilinx recommends
completing the crosstalk analysis with a signal and channel frequency domain measurement as
a preferred way to mitigate the crosstalk noise.
The first procedure xtalk01 runs a tuning of all links in the list when all aggressors are switched
off. The second procedure xtalk02 compares the ideal eyes (with no crosstalk) and real eyes
(with crosstalk) and represents the result in two tables.
The coupling between channel 1 and channel 2 reduces the eye width and eye height. This is
reported by the jitter influence and amplitude influence tables (Figure 21).
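The two tables can be thought of as differences between the ideal and the real eye of each victim. The Python sketch below shows that arithmetic; the data layout and the eye values are hypothetical and are not taken from the xtalk01 or xtalk02 procedures.

```python
def influence_tables(ideal, real):
    """Build jitter and amplitude influence tables.

    ideal[victim]             = (width, height) with all aggressors off
    real[(aggressor, victim)] = (width, height) with that aggressor on
    Each table entry is the eye reduction, in scan steps, caused by one
    aggressor on one victim.
    """
    jitter, amplitude = {}, {}
    for (aggressor, victim), (width, height) in real.items():
        ideal_width, ideal_height = ideal[victim]
        jitter[(aggressor, victim)] = ideal_width - width
        amplitude[(aggressor, victim)] = ideal_height - height
    return jitter, amplitude

# Hypothetical eye apertures for two coupled links (assumption).
ideal = {1: (22, 128), 2: (20, 120)}
real = {(2, 1): (21, 110), (1, 2): (18, 112)}

jitter, amplitude = influence_tables(ideal, real)
worst = max(amplitude, key=amplitude.get)   # (aggressor, victim) pair
print(jitter, amplitude, worst)
```

Sorting the entries by reduction points to the aggressor whose signal energy is worth lowering first.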
From this crosstalk measurement, the highest crosstalk is measured between channels 1 and 2.
In particular, when the aggressor is channel 1, the victim channel 2 eye width is reduced by two
steps, and the height is reduced by 8 steps. When the aggressor is channel 2, the victim
channel 1 eye width is reduced by about one step, and the height is reduced by 18 steps. The
transmitter setup and the eye aperture at the receiver side of each link can be recalled by
entering the commands shown in Figure 22.
[Figure 23 labels: Actual setup; Lower energy setup]
Figure 23: Comparison of Two Configurations with Same Eye Width but Different Signal Energy
From Figure 23, the link 1 original configuration appears at position 0: TXDIFFCTRL = 13;
TXPOSTCURSOR = 0; TXPRECURSOR = 0. The eye width is 22 and eye height is 128. List item 77
shows that configuration TXDIFFCTRL = 5; TXPOSTCURSOR = 0; TXPRECURSOR = 0 still has the
same eye width and acceptable eye height of 77. The same approach can be repeated to reduce
the energy on link 2. After having replaced the link 1 and link 2 setup in the best_d dictionary,
you can run the crosstalk analysis again (Figure 24).
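The selection of a lower-energy setup can be sketched as a filter over the stored results. In this Python sketch the two entries for TXDIFFCTRL = 13 and TXDIFFCTRL = 5 use the values quoted above; the other entries, the minimum height of 70, and the use of TXDIFFCTRL as a proxy for signal energy are assumptions.

```python
def lowest_energy_setup(results, reference, min_height):
    """Pick the weakest setup that keeps the reference eye width.

    results[(txdiffctrl, txpostcursor, txprecursor)] = (eye_width, eye_height)
    A candidate must match or exceed the eye width of `reference` and reach
    at least `min_height`. Energy is approximated by TXDIFFCTRL first, then
    by the total emphasis (assumption).
    """
    ref_width = results[reference][0]
    candidates = [setup for setup, (width, height) in results.items()
                  if width >= ref_width and height >= min_height]
    return min(candidates, key=lambda s: (s[0], s[1] + s[2]))

results = {
    (13, 0, 0): (22, 128),   # original link 1 setup, from the text
    (9, 0, 0): (22, 101),    # hypothetical intermediate point
    (5, 0, 0): (22, 77),     # lower-energy setup, from the text
    (3, 0, 0): (20, 60),     # hypothetical: eye starts to close
}
choice = lowest_energy_setup(results, reference=(13, 0, 0), min_height=70)
print(choice)
```

The height threshold is the engineering judgment in this step: it decides how much vertical margin may be traded for lower aggressor energy.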
Figure 24: New Aggressor Tables after Link 1 and Link 2 Energy Optimization
With the new setup for link 1 and 2, the crosstalk was greatly reduced on victim 1 but remained
almost the same on victim 2. You might wonder if this is really a success, because you started
from a lower height margin. However, this simple example with just one aggressor and one
victim should be extended to a general case with many aggressors. When many aggressors are
present, reducing their strength provides a significant benefit on the victim signal. In general,
avoiding wastage of signal power helps to keep the overall crosstalk contribution low and
mitigates reflections caused by transmission line impedance mismatches.
Reference Design
Download the reference design files for this application note from the Xilinx website.
Conclusion
This application note provides an easy way to use the GT Debugger and gives some hints
about what the GT Debugger can do for channel analysis, tuning, power, and noise
optimization. It also shows how to share information between multiple Vivado hardware
managers, which is valuable when the transmitter and receiver of the link being optimized
do not belong to the same FPGA.
References
1. Automatic Insertion of Debug Logic for Transceivers in Synthesis DCP (XAPP1295).
2. UltraScale Architecture Transceivers User Guides:
Revision History
The following table shows the revision history for this document.