Configuration
User Guide
Introduction
Virtex™ UltraScale+™ devices provide the highest performance and integration capabilities
in a FinFET node, including both the highest serial I/O and signal processing bandwidth, as
well as the highest on-chip memory density. As the industry's most capable FPGA family,
the Virtex UltraScale+ devices are ideal for applications including 1+Tb/s networking and
data center, and fully integrated radar/early-warning systems.
Virtex UltraScale devices provide the greatest performance and integration at 20 nm,
including serial I/O bandwidth and logic capacity. As the industry's only high-end FPGA at
the 20 nm process node, this family is ideal for applications including 400G networking,
large scale ASIC prototyping, and emulation.
Artix™ UltraScale+ devices provide high serial bandwidth and signal compute density in a
cost-optimized device for critical networking applications, vision and video processing, and
secured connectivity. Coupled with the innovative InFO packaging, which provides excellent
thermal and power distribution, Artix UltraScale+ devices are perfectly suited to
applications requiring high compute density in a small footprint.
Zynq™ UltraScale+ MPSoC devices provide 64-bit processor scalability while combining
real-time control with soft and hard engines for graphics, video, waveform, and packet
processing. Integrating an Arm®-based system for advanced analytics and on-chip
programmable logic for task acceleration creates unlimited possibilities for applications
including 5G Wireless, next generation ADAS, and Industrial Internet-of-Things.
This user guide describes configuration of the UltraScale architecture-based FPGAs and is
part of the UltraScale architecture documentation suite available at:
www.xilinx.com/documentation.
Overview
This chapter provides a brief overview of the configuration methods and features for the
UltraScale architecture-based FPGAs. Subsequent chapters provide more detailed
descriptions of each configuration method and feature.
AMD FPGAs are highly flexible, reprogrammable logic devices. Like processors, AMD FPGAs
are fully user programmable. For FPGAs, the program is called a bitstream, which defines
the application-specific FPGA functionality. The bitstream loads into the FPGA internal
memory at system power-up or on demand by the system.
Like processors and processor peripherals, AMD FPGAs can be reprogrammed, in system,
on demand, an unlimited number of times. After programming, the FPGA bitstream is
stored in highly robust CMOS configuration latches (CCLs). Although CCLs are
reprogrammable like SRAM memory, CCLs are designed primarily for data integrity. Because
the AMD FPGA bitstream is stored in CCLs, the device must be reconfigured after it is power
cycled.
The process whereby the defining data is loaded or programmed into the FPGA is called
configuration. Configuration is designed to be flexible to accommodate different
application needs and, wherever possible, to leverage existing system resources to
minimize system costs.
Similar to processors, AMD FPGAs optionally load or boot themselves automatically from an
external nonvolatile memory device. Alternatively, similar to processor peripherals, AMD
FPGAs can be downloaded or programmed by an external device, such as a microprocessor,
DSP processor, microcontroller, PC, or board tester. The configuration datapath can be serial
to minimize pin requirements, including configuration through the industry-standard IEEE
1149.1 JTAG boundary scan interface. A parallel configuration datapath provides maximum
performance and access to industry-standard interfaces, ideal for external data sources like
processors, or x8- or x16-parallel flash memory.
The configuration bitstream is loaded into the FPGA through special configuration pins.
These configuration pins serve as the interface for a number of different configuration
modes:
• Slave serial
• Slave SelectMAP (parallel) (x8, x16, and x32)
• JTAG boundary scan
• Master SPI (serial peripheral interface) (serial NOR flash x1, x2, x4, and dual x4,
effectively x8)
• Master BPI (byte peripheral interface) (parallel NOR flash x8 and x16)
• Master serial
• Master SelectMAP (parallel) (x8 and x16)
The terms master and slave refer to the direction of the configuration clock (CCLK):
• In master configuration modes, the FPGA drives CCLK from an internal oscillator.
Configuration options are used to select the desired frequency. After configuration, the
CCLK is turned off by default, and the CCLK pin is 3-stated with a weak pull-up.
• In slave configuration modes, CCLK is an input.
The specific configuration mode is selected by setting the appropriate level on the mode
input pins M[2:0]. The M2, M1, and M0 mode pins should be set at a constant DC voltage
level, either through pull-up or pull-down resistors (≤1 kΩ), or tied directly to ground or
VCCO_0. The JTAG (boundary scan) configuration interface is always available, regardless of
the mode pin settings.
The FPGA can also control its own configuration through internal connections from the
FPGA logic to the configuration logic. The device can either be fully reprogrammed with an
alternative design it has selected, or be partially reconfigured, in which case specific regions
of the FPGA are reprogrammed with new functionality while applications continue to run in
the remainder of the device.
• Master serial and master SelectMAP configuration modes are not supported in the
UltraScale+ FPGAs. These modes are not recommended in the other UltraScale families.
• The configuration interface can operate only at 1.8V or 1.5V in the UltraScale+ FPGAs.
There is no CFGBVS pin in UltraScale+ devices. When migrating from an UltraScale
FPGA to an UltraScale+ FPGA, the CFGBVS pin location becomes RSVDGND and must
be connected to GND.
• The configuration timing and configuration rate options are different between
UltraScale FPGAs and UltraScale+ FPGAs. The configuration frame size is 93 32-bit
words in the UltraScale+ FPGAs and 123 32-bit words in the UltraScale FPGAs.
Notes:
1. The master SPI and BPI configuration modes are recommended over the legacy master serial and master
SelectMAP modes because they provide a wider flash density selection and lower cost solution. See Differences
Between UltraScale FPGA Families, page 9.
The master configuration modes are optimized to work with standard third-party flash
memories. The SPI mode interfaces to standard x1, x2, or x4 serial NOR flash memories,
while the BPI mode interfaces to x8 or x16 parallel NOR flash memories.
TIP: The master serial and master SelectMAP configuration modes are supported but not needed for
most applications. The 7 series FPGAs supported master serial mode for configuration from legacy
serial PROMs or for custom, CPLD-based configuration state machines driven by the FPGA CCLK. The
master SelectMAP mode has been superseded by the BPI configuration mode for direct configuration
from parallel flash. See Differences Between UltraScale FPGA Families, page 9.
The UltraScale architecture-based FPGAs add a new configuration mode for configuring
from two quad SPI flash memories in parallel. The resulting x8 configuration reduces the
configuration time while still allowing for the use of standard, high-speed, low-cost serial
NOR configuration memories.
The UltraScale architecture-based FPGAs combine RDWR_B and FCS_B on one pin and move
it into the dedicated configuration bank 0. CSI_B and ADV_B are combined on another pin.
The I/O flexibility is maximized by reducing the number of pins required for configuration.
Other configuration pins that were dual-purpose I/O in the 7 series and are dedicated in
bank 0 for the UltraScale architecture-based FPGAs include the first four data pins D[03:00]
and the control pin for pull-ups during configuration, PUDC_B.
The UltraScale architecture-based FPGAs use only one I/O bank (bank 65) for multi-function
pins needed for some configuration modes. The pin-outs describe the banks for each pin.
Configuration interfaces can be powered at 1.5V, 1.8V, 2.5V, or 3.3V. See Configuration
Banks Voltage Select (Kintex UltraScale and Virtex UltraScale FPGAs), page 35 for voltage
ranges supported by mode and by device.
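The voltage rules stated in this chapter can be captured in a small helper. The following Python sketch is illustrative only: the function name and the family strings are assumptions made for this example, and the returned strings simply restate the CFGBVS and RSVDGND guidance given in this guide. Always confirm the supported voltage ranges for the specific device and mode.

```python
def cfgbvs_connection(family: str, vcco_0: float) -> str:
    """Return where the CFGBVS pin location should be tied for a bank 0 supply.

    family is "UltraScale" or "UltraScale+" (hypothetical labels for this sketch).
    vcco_0 is the nominal bank 0 supply voltage in volts.
    """
    if family == "UltraScale+":
        # UltraScale+ devices have no CFGBVS pin; the location is RSVDGND
        # and the configuration interface operates only at 1.5V or 1.8V.
        if vcco_0 not in (1.5, 1.8):
            raise ValueError("UltraScale+ configuration requires 1.5V or 1.8V")
        return "GND (RSVDGND)"
    if family == "UltraScale":
        if vcco_0 in (2.5, 3.3):
            return "VCCO_0"  # CFGBVS tied High
        if vcco_0 in (1.5, 1.8):
            return "GND"     # CFGBVS tied Low
        raise ValueError("VCCO_0 must be 1.5V, 1.8V, 2.5V, or 3.3V")
    raise ValueError(f"unknown family: {family}")


if __name__ == "__main__":
    print(cfgbvs_connection("UltraScale", 3.3))
    print(cfgbvs_connection("UltraScale+", 1.8))
```

A check of this kind can be run against a board's power plan before schematic review, so that a bank 0 supply change is not made without revisiting the CFGBVS connection.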
Chapter 7, Design Entry provides details on the configuration and boundary scan
components. The following primitives are different from the 7 series primitives:
• DNA_PORTE2
° Extended to 96 bits
• FRAME_ECCE3
° No longer supported
° Use the Vivado® Integrated Logic Analyzer to monitor the internal signals of a
design. See Integrated Logic Analyzer Product Guide (PG172) [Ref 3].
• ICAPE3
KU025 Differences
The smallest Kintex UltraScale device, the KU025, has a reduced set of configuration
features. The KU025 does not support RSA authentication, SEU mitigation (SEM) IP, or
post-configuration CRC.
Table 1-2 summarizes the differences between the 7 series and UltraScale families,
including the features that are restricted in the Kintex UltraScale KU025 device.
Differences in 3D ICs
3D ICs using SSI (Stacked Silicon Interconnect) technology support the same configuration
modes as the monolithic devices. See the UltraScale Architecture and Products Overview
(DS890) [Ref 5] and the 3D ICs website [Ref 6] for more information on devices using SSI
technology.
3D ICs have multiple super logic regions (SLRs), each with its own configuration controller.
One SLR is defined as the master, while the others are the slaves (see Table 1-3). The
configuration banks and the PCIe blocks that support tandem configuration are always
located in the master SLR. The SLRs are numbered from 0 at the bottom of the device
floorplan.
Notes:
1. All Master SLRs have a CONFIG_ORDER_INDEX value of 0.
Design Considerations
To make an efficient system, it is important to choose the FPGA configuration mode that
best matches the system's requirements. Each configuration mode dedicates certain FPGA
pins and can temporarily use other multi-function pins during configuration. These
multi-function pins are then released for general use when configuration is completed.
Similarly, the configuration mode can place voltage restrictions on an FPGA I/O bank.
Several different configuration options are available, and while the options are flexible,
there is often an optimal solution for each system. Several topics must be considered when
choosing the best configuration option: overall setup, speed, cost, and complexity.
Master Modes
The self-loading FPGA configuration modes, generically called master modes, are available
with either a serial or parallel datapath. In master mode, the FPGA's configuration bitstream
typically resides in nonvolatile memory on the same board. The FPGA internally generates a
configuration clock signal called CCLK, and the FPGA controls the configuration process by
sending a clock or addresses to the flash memory. See Figure 1-1.
[Figure 1-1: master configuration modes, with serial and parallel datapaths]
Slave Modes
The externally controlled loading FPGA configuration modes, generically called slave
modes, are also available with either a serial or parallel datapath. In slave mode, an external
processor, microcontroller, DSP processor, or tester downloads the configuration image into
the FPGA, as shown in Figure 1-2. The advantage of the slave configuration modes is that
the FPGA bitstream can reside almost anywhere in the overall system. The bitstream can
reside in flash along with the host processor's code, on a hard disk, or somewhere over a
network connection.
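[Figure 1-2: slave configuration modes. Serial (slave serial mode): the processor or
microcontroller drives SERIAL_DATA to the FPGA DIN pin and CLOCK to CCLK. Parallel: the
processor or microcontroller connects an 8-, 16-, or 32-bit DATA bus ([07:00], [15:00], or
[31:00]) to the FPGA D pins, SELECT to CSI_B, READ/WRITE to RDWR_B, and CLOCK to CCLK.]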
JTAG Connection
The JTAG mode is also a serial configuration mode, popular for prototyping and highly
utilized for board test. The four-pin JTAG boundary scan interface is common on board
testers and debugging hardware. The AMD programming cables for UltraScale
architecture-based FPGAs use the JTAG interface for prototype download and debugging.
Regardless of the configuration mode ultimately used in the application, it is best to also
include a JTAG configuration path to ease design development. See Figure 1-3.
[Figure 1-3: JTAG connection. The JTAG tester, processor, or microcontroller connects
DATA_OUT to the FPGA TDI pin, MODE_SELECT to TMS, CLOCK to TCK, and DATA_IN to TDO.]
the FPGA can read a bitstream from a standard serial NOR flash device. The AMD tools
provide device programming support for select flash memories, by communicating with the
FPGA through its standard JTAG interface and programming the flash indirectly through the
FPGA.
• If there is spare nonvolatile memory already available in the system, the bitstream can
be stored in system memory. It can even be stored on a hard drive or downloaded
remotely over a network connection. If so, one of the slave modes should be
considered: slave serial or slave parallel mode, or JTAG mode.
• If nonvolatile memory is already required for an application, it is possible to leverage it
to also store FPGA configuration bitstreams with small or no incremental cost. For
example, the FPGA configuration bitstream can be stored with any processor code for
the board. If the processor is a MicroBlaze™ embedded processor in the FPGA, the
FPGA configuration data and the MicroBlaze processor code can share the same
nonvolatile memory device.
• At the same clock frequency, parallel configuration modes are inherently faster than
the serial modes because they program 8, 16, or 32 bits at a time.
• Configuring a single FPGA is inherently faster than configuring multiple FPGAs in a
daisy-chain. In a multi-FPGA design, where configuration speed is a concern, each
FPGA should be configured separately.
• In master modes, the FPGA internally generates the CCLK configuration clock signal.
The maximum supported CCLK frequency setting depends on the read specifications
for the attached nonvolatile memory. A faster memory enables faster configuration.
When using the internal oscillator source for CCLK, the output frequency can vary with
process, voltage, or temperature.
• Using the external EMCCLK clock source option enables a precision external clock
source for optimal configuration performance.
The general calculation used to estimate the configuration time is given in Equation 1-1.
ConfigurationTime = BitstreamSize / (ConfigurationRate × DatabusWidth)    (Equation 1-1)
The UltraScale FPGA bitstream size can be found in Table 1-4, page 18. The maximum
configuration clock frequency is dependent on the configuration mode and application
implementation. Guidelines to calculate the maximum configuration clock frequency are
provided in Chapter 2, Master SPI Configuration Mode, and Chapter 4, Master BPI
Configuration Mode. If the configuration time from power-up is required, then TPOR should
be added to the configuration time.
The Vivado tools provide the Tcl command calc_config_time which can be used to
estimate configuration time. Use help calc_config_time for usage information.
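As a rough illustration of Equation 1-1, the following Python sketch estimates configuration time. It assumes that ConfigurationRate is the configuration clock frequency in Hz and that one databus-width transfer occurs per clock cycle. The function name, the bitstream size, and the clock frequency in the example are hypothetical values chosen for this sketch, not device data. Take actual bitstream sizes from Table 1-4 and TPOR from the data sheet.

```python
def configuration_time_s(bitstream_bits: int,
                         cclk_hz: float,
                         bus_width_bits: int,
                         t_por_s: float = 0.0) -> float:
    """Estimate configuration time in seconds per Equation 1-1.

    bitstream_bits: bitstream size in bits
    cclk_hz:        configuration clock frequency in Hz (assumed rate)
    bus_width_bits: configuration databus width (1, 2, 4, 8, 16, or 32)
    t_por_s:        optional TPOR to add when timing from power-up
    """
    if bus_width_bits not in (1, 2, 4, 8, 16, 32):
        raise ValueError("unsupported databus width")
    if bitstream_bits <= 0 or cclk_hz <= 0:
        raise ValueError("bitstream size and clock rate must be positive")
    return bitstream_bits / (cclk_hz * bus_width_bits) + t_por_s


if __name__ == "__main__":
    # Hypothetical 128,000,000-bit bitstream clocked at 50 MHz.
    for width in (1, 4, 8):
        t = configuration_time_s(128_000_000, 50e6, width)
        print(f"x{width}: {t:.2f} s")
```

With these hypothetical inputs, widening the databus from x1 to x4 reduces the estimate from 2.56 s to 0.64 s. This is the effect described above for parallel versus serial modes at the same clock frequency.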
Protecting a Bitstream
Like processor code, a bitstream that defines the device’s functionality loads into the device
during power-on. Because this configuration data is stored off chip, it can be duplicated or
modified without authorization.
As with processors, multiple techniques are available to protect the bitstream and any
embedded intellectual property (IP) cores. The surest way to protect the confidentiality of
your IP is to encrypt the configuration data using an AES-256 key. Keys for the on-chip
decryption logic can be stored in either battery-backed RAM or one-time programmable
eFUSEs. This technique allows for off-chip storage of your IP protected with high-grade
encryption.
Notes:
1. All UltraScale FPGA configuration frames consist of 123 32-bit words. All UltraScale+ FPGA configuration frames consist of
93 32-bit words.
2. Configuration array size equals the number of configuration frames times the number of words per frame.
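Note 2 can be expressed as a short calculation. The Python sketch below uses the frame sizes stated in note 1. The function names and the frame count in the example are hypothetical and serve only to show the arithmetic. Use the frame counts from the table for a real device.

```python
# Words per configuration frame, as stated in the notes above.
WORDS_PER_FRAME = {"UltraScale": 123, "UltraScale+": 93}
BITS_PER_WORD = 32


def configuration_array_words(family: str, frame_count: int) -> int:
    """Configuration array size in 32-bit words: frames x words per frame."""
    if frame_count < 0:
        raise ValueError("frame count must not be negative")
    return frame_count * WORDS_PER_FRAME[family]


def configuration_array_bits(family: str, frame_count: int) -> int:
    """Configuration array size in bits."""
    return configuration_array_words(family, frame_count) * BITS_PER_WORD


if __name__ == "__main__":
    frames = 10_000  # hypothetical frame count, not taken from any device
    for family in WORDS_PER_FRAME:
        print(family, configuration_array_words(family, frames), "words")
```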
Table 1-5: JTAG and IDCODE for UltraScale Architecture-based FPGAs (Cont’d)
Notes:
1. The “X” in the JTAG IDCODE value represents the revision field (IDCODE[31:28]) which can vary.
2. If the IDCODE does not match the expected value, ensure that PROGRAM_B is not held Low.
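Because the revision field in IDCODE[31:28] can vary, host software that checks a device typically masks those four bits before comparing. The Python sketch below shows one way to do this. The function names and the IDCODE values are hypothetical and are not taken from Table 1-5.

```python
REVISION_SHIFT = 28
REVISION_MASK = 0xF << REVISION_SHIFT  # IDCODE[31:28]
IDCODE_MASK = 0xFFFFFFFF               # IDCODE is a 32-bit value


def idcode_revision(idcode: int) -> int:
    """Extract the revision field, IDCODE[31:28]."""
    return (idcode & REVISION_MASK) >> REVISION_SHIFT


def idcode_matches(read_idcode: int, expected_idcode: int) -> bool:
    """Compare two IDCODE values while ignoring the revision field."""
    compare_mask = IDCODE_MASK & ~REVISION_MASK  # 0x0FFFFFFF
    return ((read_idcode ^ expected_idcode) & compare_mask) == 0


if __name__ == "__main__":
    expected = 0x01234093  # hypothetical value standing in for a table entry
    read_back = 0x21234093  # same device code, different revision
    print(idcode_matches(read_back, expected), idcode_revision(read_back))
```

If the comparison fails even after masking, note 2 applies: check that PROGRAM_B is not held Low before suspecting the JTAG chain.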
• Correct DRC errors and review any warnings when the configuration bitstream is
generated.
• Determine if remote upgrade will be needed and plan for it. See SPI Flash Programming
Including Bitstream Revision Selection (XAPP1191) [Ref 4].
Design Tools
Before implementing the design and generating the bitstream data file, it is important to
review the configuration settings to make sure they are correct for your design. Once the
bitstream settings are correct, the bitstream data file can be generated using the
write_bitstream Tcl command or by using the Generate Bitstream button in the Vivado
flow navigator. There are three types of configuration settings in Vivado IDE:
Both design properties for configuration and configuration bitstream settings can be
defined by using the Edit Device Properties dialog for the selected synthesized or
implemented design netlist. The following steps describe how to set various properties
using this method:
3. In the Edit Device Properties dialog, select one of the categories in the left-hand
column.
4. Set the properties to the desired values, and click OK.
5. Select File > Save Constraints to save the updated properties to the target XDC file.
You can also set these properties using the set_property command in an XDC file or in
the Tcl command window.
Pinout Planning
The configuration mode(s) used in an application can affect the planning of the design
pin-out. It is important to determine and plan for the configuration modes before
beginning floorplanning or pin selection. The configuration mode not only determines the
connectivity of selected pins, it also determines the VCCO voltage required for the I/O bank
that includes multi-function pins.
1. Determine the configuration mode(s) for the FPGA. Be sure to account not only for the
primary configuration mode, but any additional configuration modes that are used for
debugging or updates.
2. Determine the set of pins and the bank locations for both the primary and secondary
configuration modes planned.
3. Determine how each of these pins is used and any restrictions placed on design usage
as standard I/O. Consider internal and external pull-ups or pull-downs, connections to
external devices, etc.
4. For each set of configuration pins, determine the common required I/O voltage support
for the required configuration bank(s). Only compatible I/O standards can be used
elsewhere in that bank.
Configuration Interfaces
AMD UltraScale FPGAs have seven configuration interfaces, and UltraScale+ FPGAs have
five configuration interfaces. Each configuration interface corresponds to one or more
configuration modes and bus widths, as shown in Table 1-6. For detailed interface timing
information, see the respective data sheet [Ref 9] or [Ref 10].
Notes:
1. Not recommended in UltraScale FPGAs, and not supported in UltraScale+ FPGAs. See
Differences Between UltraScale FPGA Families, page 9.
2. JTAG mode is always available independent of the Mode pin settings. Setting the Mode
pins to JTAG-only is not recommended for devices based on SSI technology due to
restrictions on ICAP access.
3. Slave serial is the default setting due to internal pull-up resistors on the Mode pins.
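The mode selection and the family restrictions in the notes above can be summarized in a lookup table. The Python sketch below is illustrative: the M[2:0] encodings are an assumption of this example and must be checked against Table 1-6, although the slave serial value of 111 is consistent with note 3 (internal pull-ups on the Mode pins). The class and function names are hypothetical.

```python
from typing import List, NamedTuple


class ConfigMode(NamedTuple):
    name: str
    m_pins: str   # assumed M[2:0] setting; verify against Table 1-6
    legacy: bool  # True if not supported in UltraScale+ FPGAs (note 1)


CONFIG_MODES = [
    ConfigMode("Master serial", "000", True),
    ConfigMode("Master SPI", "001", False),
    ConfigMode("Master BPI", "010", False),
    ConfigMode("Master SelectMAP", "100", True),
    ConfigMode("JTAG", "101", False),
    ConfigMode("Slave SelectMAP", "110", False),
    ConfigMode("Slave serial", "111", False),
]


def supported_modes(family: str) -> List[ConfigMode]:
    """List the configuration modes available in the given family."""
    if family == "UltraScale":
        return list(CONFIG_MODES)
    if family == "UltraScale+":
        return [mode for mode in CONFIG_MODES if not mode.legacy]
    raise ValueError(f"unknown family: {family}")


if __name__ == "__main__":
    for mode in supported_modes("UltraScale+"):
        print(f"M[2:0] = {mode.m_pins}: {mode.name}")
```

The list lengths returned for the two families, seven and five, correspond to the interface counts stated at the start of this section.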
Configuration Pins
Each configuration mode has a corresponding set of interface pins that span one or two
banks on the FPGA. Bank 0 contains the dedicated configuration pins and is always part of
every configuration interface. Bank 65 contains multi-function pins that are involved in a
few of the configuration modes. If the Persist option is used (see Persist Option, page 179),
the multi-function I/Os for the selected configuration mode remain active after
configuration. Table 1-7 and Table 1-8 show the configuration pins and their locations
across the I/O banks. See Differences Between UltraScale FPGA Families, page 9.
PUDC_B (2) 0 PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2)
D[07:04] 65 - - - - D[07:04] - -
D[15:08] 65 - - - - - - -
A[15:00]_D[31:16] 65 - - - - - - -
A[28:16] 65 - - - - - - -
EMCCLK (3) 65 - EMCCLK (3) EMCCLK (3) EMCCLK (3) EMCCLK (3) - EMCCLK (3)
CSI_ADV_B 65 - - - - - - -
FOE_B 65 - - - - - - -
FWE_FCS2_B 65 - - - - FCS2_B - -
Notes:
1. CFGBVS is available in UltraScale FPGAs only.
2. PUDC_B has special functionality during configuration but is independent of all configuration interfaces, i.e. PUDC_B does
not need to be voltage compatible with other pins in a configuration interface.
3. EMCCLK is only used when the external master CCLK enable option enables EMCCLK as an input for clocking the master
configuration modes.
4. DOUT is only used in a serial configuration daisy-chain for outputting data to the downstream FPGA (or for the Debug
Bitstream option). Otherwise, DOUT is high-impedance.
5. CSO_B is only used in a parallel configuration daisy-chain for outputting a chip-enable signal to a downstream device.
Otherwise, CSO_B is high-impedance.
6. RS0 and RS1 are only driven in BPI mode when a MultiBoot event is initiated or when the Configuration Fallback option is
enabled and a Fallback event occurs. Otherwise, RS0 and RS1 are high-impedance. RS[1:0] pins are not recommended to be
used in User mode when they are used for configuration.
7. Dashes indicate that the pin is not used in the configuration mode and is high-impedance and ignored during configuration.
PUDC_B (2) 0 PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2) PUDC_B (2)
EMCCLK (3) 65 EMCCLK (3) EMCCLK (3) EMCCLK (3) EMCCLK (3) - - -
DOUT_CSO_B (4)(5) 65 CSO_B (5) CSO_B (5) CSO_B (5) CSO_B (5) CSO_B (5) CSO_B (5) CSO_B (5)
Notes:
1. CFGBVS is available in UltraScale FPGAs only.
2. PUDC_B has special functionality during configuration but is independent of all configuration interfaces, i.e. PUDC_B does
not need to be voltage compatible with other pins in a configuration interface.
3. EMCCLK is only used when the external master CCLK enable option enables EMCCLK as an input for clocking the master
configuration modes.
4. DOUT is only used in a serial configuration daisy-chain for outputting data to the downstream FPGA (or for the Debug
Bitstream option). Otherwise, DOUT is high-impedance.
5. CSO_B is only used in a parallel configuration daisy-chain for outputting a chip-enable signal to a downstream device.
Otherwise, CSO_B is high-impedance.
6. RS0 and RS1 are only driven in BPI mode when a MultiBoot event is initiated or when the Configuration Fallback option is
enabled and a Fallback event occurs. Otherwise, RS0 and RS1 are high-impedance. RS[1:0] pins are not recommended to be
used in User mode when they are used for configuration.
7. Dashes indicate that the pin is not used in the configuration mode and is high-impedance and ignored during configuration.
POR_ N/A Dedicated Input Power On All N/A Reduces TPOR time (from power up to INIT_B
OVERRIDE Reset Delay rise) as specified in data sheet. Connect
Override directly to VCCINT for a shorter T POR time if
required and if supported by the power-up
timing of the configuration data source.
Connect to GND for standard longer POR
delay.
VBATT N/A Supply N/A Battery Serial, SPI, N/A Battery backup supply for the FPGA internal
Voltage Backup Supply SelectMAP, BPI, volatile memory that stores the key for the AES
JTAG decryptor. For encrypted bitstreams that
(with BBRAM require the decryptor key from the volatile key
encryption) memory area, connect this pin to a battery to
preserve the key when the FPGA is unpowered.
M[2:0] 0 Dedicated Input Configuration All ≤1 kΩ Determine the configuration mode. See
Mode Table 1-6 for the configuration mode settings.
Connect each mode pin either directly, or via a
≤1 kΩ resistor, to V CCO_0 or GND.
TCK 0 Dedicated Input IEEE Std JTAG 10 kΩ Clock for all devices on a JTAG chain. Connect
1149.1 (JTAG) to AMD cable header's TCK pin. Treat as a
Test Clock critical clock signal and buffer the cable
header TCK signal as necessary for multiple
device JTAG chains. If the TCK signal is
buffered, connect the buffer input to an
external weak (e.g. 10 kΩ) pull-up resistor to
maintain a valid High when no cable is
connected.
TMS 0 Dedicated Input JTAG Test JTAG 10 kΩ Mode select for all devices on a JTAG chain.
Mode Select Connect to AMD cable header's TMS pin.
Buffer the cable header TMS signal as
necessary for multiple device JTAG chains. If
the TMS signal is buffered, connect the buffer
input to an external weak (e.g. 10 kΩ) pull-up
resistor to maintain a valid High when no cable
is connected.
TDI 0 Dedicated Input JTAG Test Data JTAG N/A JTAG chain serialized data input. For an
Input isolated device or for the first device in a JTAG
chain, connect to AMD cable header's TDI pin.
Otherwise, when the FPGA is not the first
device in a JTAG chain, connect to the TDO pin
of the upstream JTAG device in the JTAG scan
chain. If the TCK signal is buffered, connect the
buffer input to an external weak (e.g. 10 kΩ)
pull-up resistor to maintain a valid High when
no cable is connected.
TDO 0 Dedicated Output JTAG Test Data JTAG N/A JTAG chain serialized data output. For an
Output isolated device or for the last device in a JTAG
chain, connect to AMD cable header's TDO pin.
Otherwise, when the FPGA is not the last
device in a JTAG chain, connect to the TDI pin
of the downstream JTAG device in the JTAG
scan chain.
PROGRAM_B 0 Dedicated Input Program (bar) All ≤ 4.7 kΩ Active-Low reset to configuration logic. When
PROGRAM_B is pulsed Low, the FPGA
configuration is cleared and a new
configuration sequence is initiated.
Configuration reset is initiated upon the
falling edge, and configuration (i.e.
programming) sequence begins upon the
following rising edge. PROGRAM_B can
externally be held Low during power-up to
stall the power-on configuration sequence at
the end of the initialization process. If
PROGRAM_B is held Low, JTAG operations can
be restricted. Dedicated pins remain disabled
while PROGRAM_B is held Low.
Connect PROGRAM_B to an external ≤ 4.7 kΩ
pull-up resistor to VCCO_0 to ensure a stable
High input. Recommended push-button to
GND to enable manual configuration reset.
INIT_B 0 Dedicated Bidirectional Initialization All 4.7 kΩ Active-Low FPGA initialization pin or
(open-drain) (bar) configuration error signal. The FPGA drives
this pin Low when the FPGA is in a
configuration reset state, when the FPGA is
initializing (clearing) its configuration
memory, or when the FPGA has detected a
configuration error. Note that INIT_B does not
drive Low when VCCINT is powered down. In
UltraScale+ devices the INIT_B pin might be
seen as High (because of external resistors on
board for INIT_B) for approximately 40 ms
after power ON. (The initial High time depends
on the POR_OVERRIDE setting. With
POR_OVERRIDE Low, the High time is approx.
40 ms. With POR_OVERRIDE High, the High
time is approx. 9 ms.)
Upon completing the FPGA initialization
process, INIT_B is released to high-impedance
at which time an external resistor is expected
to pull INIT_B High. INIT_B can externally be
held Low during power-up to stall the
power-on configuration sequence at the end
of the initialization process. When a High is
detected at the INIT_B input after the
initialization process, the FPGA proceeds with
the remainder of the configuration sequence
dictated by the M[2:0] pin settings. After
configuration, INIT_B can optionally be
leveraged to indicate when the FPGA has
detected a configuration error.
Connect INIT_B to a 4.7 kΩ pull-up resistor to
V CCO_0 to ensure clean Low-to-High
transitions.
DONE 0 Dedicated Bidirectional Done All 4.7 kΩ A High signal on the DONE pin indicates
completion of the configuration sequence. By
default, the DONE output is open-drain.
Note: DONE has a default internal pull-up
resistor of approximately 10 kΩ. External
4.7 kΩ resistor circuits are not required but are
recommended. For multiple SLR devices the
DONE pull-up resistor cannot be stronger than
4.7 kΩ.
In UltraScale+ devices the DONE pin might be
seen as High (because of external resistors on
board for DONE) for approximately 40 ms after
power ON. (The initial High time depends on
the POR_OVERRIDE setting. With
POR_OVERRIDE Low, the High time is approx.
40 ms. With POR_OVERRIDE High, the High
time is approx. 9 ms.)
CCLK 0 Dedicated Input or Configuration Master Serial, N/A Runs the synchronous FPGA configuration
Output Clock Master sequence by default. The FPGA sources the
SelectMAP, SPI, configuration clock and drives CCLK as an
BPI output. Note: Treat CCLK as a critical clock
(synchronous) signal to ensure good signal integrity.
PUDC_B 0 Dedicated Input Pull-Up During All ≤1 kΩ Active-Low input enables internal pull-up
Configuration resistors on the SelectIO pins after power-up
(bar) and during configuration, including
multipurpose configuration pins when not
used for the selected configuration mode.
When PUDC_B is Low, internal pull-up resistors
are enabled on each SelectIO pin. When
PUDC_B is High, internal pull-up resistors are
disabled on each SelectIO pin. PUDC_B must
be tied either directly, or via an ≤ 1 kΩ resistor,
to V CCO_0 or GND.
RDWR_FCS_B 0 Dedicated Input or Read/Write SelectMAP N/A An external device controls the RDWR_B signal
Output (bar) or Flash (RDWR_B) to control the direction of the SelectMAP data
Chip Select bus for read/write from/to the SelectMAP
(bar) interface. When RDWR_B is High, the FPGA
outputs read data onto the SelectMAP data
bus. When RDWR_B is Low, an external
controller can write data to the FPGA through
the SelectMAP data bus.
D00_MOSI 0 Dedicated Bidirectional Data Bit 0 or SPI x1 (MOSI) N/A Output for sending commands to the serial
Master-Output (slave) flash device. Connect to the flash serial
Slave-Input data input (DQ0/D/SI/IO0) pin.
D01_DIN 0 Dedicated Input or Data Bit 1 or BPI, SelectMAP N/A Multi-purpose pin that functions as the D01
Bidirectional Data Input (D01) data input pin. See D[31:00] row in this table.
Serial, SPI x1 N/A Data input that receives serial data from the
(DIN) data source. Connect DIN to the serial data
output pin of the serial data source
(DQ1/Q/SO/IO1 pin). By default, data from
DIN is captured on the rising edge of CCLK.
SPI x2/x4/x8 N/A Data input from dual or quad flash device.
(D01) Connect to the flash data output pin
(DQ1/Q/SO/IO1 pin).
D02 0 Dedicated Input or Data Bit 2 SPI x4/x8 4.7 kΩ Connect to the flash quad data bit 2 output
Bidirectional (DQ2/W#/WP#/IO2) pin and connect to an
external 4.7 kΩ pull-up resistor to V CCO_0.
D03 0 Dedicated Input or Data Bit 3 SPI x4/x8 4.7 kΩ Connect to the flash quad data bit 3 output
Bidirectional (DQ3/HOLD#/IO3) pin and connect to an
external 4.7 kΩ pull-up resistor to V CCO_0.
CFGBVS 0 Dedicated Input or Configuration All N/A Supported in Kintex UltraScale and Virtex
Bidirectional Banks Voltage UltraScale FPGAs only. Determines the I/O
Select voltage operating range and voltage tolerance
for the dedicated configuration bank 0, and
during configuration for the configuration
pins in bank 65, when those banks are HR I/O
banks. Connect CFGBVS High or Low per the
bank voltage requirements. If the V CCO_0
supply for bank 0 is supplied with 2.5V or 3.3V,
then this pin must be tied High (connected to
V CCO_0). Tie CFGBVS to Low (connect to GND)
only if the VCCO_0 for bank 0 is 1.5V or 1.8V.
When bank 65 is used for configuration, it
should have the same voltage as bank 0. See
Configuration Banks Voltage Select (Kintex
UltraScale and Virtex UltraScale FPGAs),
page 35.
EMCCLK
  Bank: 65
  Type: Multi-function
  Direction: Input
  Description: External Master Configuration Clock
  Mode SPI, BPI, Master Serial, Master SelectMAP, external pull-up N/A: Optional
    external clock input for running the configuration logic in a master mode (versus
    the internal configuration oscillator). The FPGA can optionally switch to EMCCLK as
    the clock source, instead of the internal oscillator, for driving the internal
    configuration engine. The EMCCLK frequency can optionally be divided via a bitstream
    setting and is forwarded for output as the master CCLK signal.
CSI_ADV_B
  Bank: 65
  Type: Multi-function
  Direction: Input or Output
  Description: Chip Select Input (bar) or Address Valid (bar)
  Mode SelectMAP (CSI_B), external pull-up N/A: Active-Low input that enables the FPGA
    SelectMAP configuration interface. An external configuration controller can control
    CSI_B for selecting the active FPGA on the SelectMAP bus, or in a parallel
    configuration daisy-chain, connect to the CSO_B pin of the upstream FPGA.
DOUT_CSO_B
  Bank: 65
  Type: Multi-function
  Direction: Output
  Description: Data Output or Chip Select Output (bar)
  Mode Serial, SPI x1 (DOUT), external pull-up N/A: Data output for a serial
    configuration daisy-chain. If the device is in a serial configuration daisy-chain,
    then connect to the DIN of the downstream slave-serial FPGA.
  Mode All, external pull-up N/A: Note: DOUT can output data when the Debug Bitstream
    option is enabled.
D[31:00]
  Bank: 65 (D[03:00] are in Bank 0)
  Type: Multi-function
  Direction: Input or Bidirectional
  Description: Data Bus
  Mode Serial, SPI, SelectMAP, BPI, external pull-up N/A: A subset or all of the
    D[31:00] pins are the data bus interface for the serial, SPI, SelectMAP, or BPI
    modes. By default, data from the data bus is captured on the rising edge of CCLK.
    The remaining data pins are unused, ignored, and high impedance during
    configuration, and D[31:04] can be used as I/O after configuration.
    The data bus signals are inputs when reading the configuration data from the flash.
    Data bus signals can be outputs during a write to a parallel flash Read
    Configuration register or during SelectMAP readback. For serial and SPI modes, also
    see the D00-D03 rows in this table.
  Mode JTAG, external pull-up N/A: All data pins are unused, ignored, and high
    impedance during configuration.
A[15:00]_D[31:16]
  Bank: 65
  Type: Multi-function
  Direction: Input or Output
  Description: Address Bus LSBs or Data Bus MSBs
  Mode BPI (A[15:00]), external pull-up N/A: Address Bus bits 15 to 0 - see A[28:00]
    row in this table.
  Mode SelectMAP x32, external pull-up N/A: Data Bus bits 31 to 16 - see D[31:00] row
    in this table.
A[28:00]
  Bank: 65
  Type: Multi-function
  Direction: Output
  Description: Address Bus
  Mode BPI, external pull-up N/A: Output addresses to a parallel NOR flash. A00 is the
    least-significant address bit. Connect the FPGA A[28:00] pins to the parallel NOR
    flash address pins with the FPGA A00 pin connected to the least-significant flash
    address input pin that is valid for the used data bus width. Depending on the flash
    type and used data bus width, the least-significant address bit of the flash can be
    A1, A0, or A-1. Note that any upper address pins that exceed the address bus width
    of the parallel NOR flash are driven during configuration, but can be used as I/O
    after configuration.
FOE_B
  Bank: 65
  Type: Multi-function
  Direction: Output
  Description: Flash Output-Enable (bar)
  Mode BPI, external pull-up ≤ 4.7 kΩ: Active-Low output-enable control signal for a
    parallel NOR flash. Connect to the flash output-enable input and connect to an
    external ≤ 4.7 kΩ pull-up resistor to VCCO_65.
FWE_FCS2_B
  Bank: 65
  Type: Multi-function
  Direction: Output
  Description: Flash Write-Enable (bar) or Flash Chip Select 2 (bar)
  Mode BPI (FWE_B), external pull-up ≤ 4.7 kΩ: Active-Low write-enable control signal
    for a parallel NOR flash. Connect to the flash write-enable input and connect to an
    external ≤ 4.7 kΩ pull-up resistor to VCCO_65.
  Mode SPI x8 (FCS2_B), external pull-up 2.4 kΩ: Active-Low chip select output that
    enables the second SPI quad flash device for configuration in x8 mode.
RS[1:0]
  Bank: 65
  Type: Multi-function
  Direction: Output
  Description: Revision Select
  Mode BPI only, with fallback, or with MultiBoot, external pull-up N/A: Revision
    selection output pins. Normally high-impedance during configuration. When the
    bitstream configuration fallback option is enabled, the FPGA drives RS0 and RS1 Low
    during the fallback configuration process that follows a detected configuration
    error. When a user-invoked MultiBoot configuration is initiated, the FPGA can drive
    the RS0 and RS1 pins to a user-defined state during the MultiBoot configuration
    process.
Configuration is not supported below the minimum recommended operating voltage for
1.5V as specified in the data sheet. The CFGBVS pin setting determines the I/O voltage
support for bank 0 at all times, before, during, and after configuration. CFGBVS similarly
controls the voltage tolerance on bank 65, but only during configuration.
The UltraScale FPGAs have two I/O bank types for configuration: high-range (HR) I/O banks
support 3.3V and lower I/O standards, and high-performance (HP) banks support I/O
standards of 1.8V or lower. The dedicated configuration and JTAG I/O are located in bank 0,
which is a high-range bank type on all Kintex UltraScale and Virtex UltraScale devices, and
a high-performance bank in Artix UltraScale+, Kintex UltraScale+, and Virtex UltraScale+
devices. Several of the configuration modes also rely on pins in bank 65. Bank 65 is an HR
bank in most Kintex UltraScale FPGAs, an HP bank in the KU095 and Virtex UltraScale
FPGAs, and an HP bank in all Artix UltraScale+, Kintex UltraScale+, and Virtex UltraScale+
FPGAs.
Table 1-10 shows the CFGBVS pin connection options and the corresponding set of valid
VCCO_0 supply and I/O voltages.
Table 1-10: CFGBVS Pin Connection for Bank VCCO Supplies and I/O Signal Voltages

CFGBVS Pin Connection     Supported Banks VCCO Supply and I/O Signal Voltages
                          Kintex UltraScale except KU095    Virtex UltraScale and KU095
                          (Banks 0, 65)                     (Bank 0)
VCCO_0 (3.3V or 2.5V)     3.3V or 2.5V                      3.3V or 2.5V
GND                       1.8V or 1.5V                      1.8V or 1.5V
CAUTION! When CFGBVS is connected to GND for 1.8V or 1.5V I/O operation, the VCCO_0 and I/O
signals to bank 0 must be 1.8V (or lower). Otherwise, the device can be damaged from the application
of voltages to pins on Bank 0 that are greater than the 1.8V operation maximum.
The interface pins associated with the configuration mode can span bank 0 and bank 65,
primarily when using 8-bit or wider data interfaces. When both banks are used for a
configuration interface, the VCCO pins for both banks must receive the same voltage to
ensure a consistent I/O voltage interface and timing for all of the configuration interface
pins. Using the same voltage for banks 0 and 65 is recommended because it allows the
option of using an 8-bit or wider configuration mode, and avoids the I/O transition
described under I/O Transition at the End of Startup, page 155.
1. Determine the configuration mode(s) for the FPGA. Note that the JTAG interface is
always supported in bank 0 at the VCCO_0 voltage level regardless of the configuration
mode.
2. For each configuration mode to be used for the FPGA, determine the set of pins used for
the configuration mode and the bank locations (see Table 1-7 and Table 1-8).
3. For each set of configuration pins, determine the common required I/O voltage support
for the required configuration bank(s).
4. Determine the target FPGA family. The Virtex UltraScale and Kintex UltraScale KU095
FPGAs only support 1.8V/1.5V configuration on bank 65.
5. Set the CFGBVS pin to support the required configuration I/O voltage. See Table 1-11
and Table 1-12 for the appropriate CFGBVS pin setting.
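The selection steps above can be sketched as a small check. This is only an illustration under stated assumptions: the function name `select_cfgbvs` and its parameters are hypothetical, the voltage sets come from Table 1-10, and the rule that bank 65 must match bank 0 when it carries configuration pins comes from the surrounding text. It applies only to Kintex UltraScale and Virtex UltraScale devices, and the data sheet remains the authority.

```python
# Hypothetical helper: names and structure are illustrative, not a vendor API.
HIGH_RANGE = (2.5, 3.3)  # VCCO_0 values that call for CFGBVS tied to VCCO_0
LOW_RANGE = (1.5, 1.8)   # VCCO_0 values that call for CFGBVS tied to GND


def select_cfgbvs(vcco_0, uses_bank65, bank65_is_hr):
    """Return (CFGBVS connection, required VCCO_65 or None).

    vcco_0       -- bank 0 supply voltage in volts
    uses_bank65  -- True if the configuration mode uses pins in bank 65
    bank65_is_hr -- True for an HR bank 65 (most Kintex UltraScale devices),
                    False for an HP bank 65 (KU095 and Virtex UltraScale)
    """
    if vcco_0 in HIGH_RANGE:
        cfgbvs = "VCCO_0"
    elif vcco_0 in LOW_RANGE:
        cfgbvs = "GND"
    else:
        raise ValueError(f"unsupported VCCO_0 voltage: {vcco_0}")

    if not uses_bank65:
        # Bank 65 is free to follow the needs of the post-configuration I/O.
        return cfgbvs, None

    if not bank65_is_hr and vcco_0 in HIGH_RANGE:
        # An HP bank is limited to 1.8V or lower, so it cannot match bank 0.
        raise ValueError("HP bank 65 cannot match a 2.5V/3.3V bank 0")

    # Both banks must receive the same voltage when both carry configuration pins.
    return cfgbvs, vcco_0
```

For example, `select_cfgbvs(1.8, True, False)` describes a KU095-style device using a wide configuration bus with both banks at 1.8V and CFGBVS tied to GND.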
Table 1-11: Kintex UltraScale (Except KU095) Compatible Voltages and CFGBVS Pin Connection
Notes:
1. JTAG interface is always supported in bank 0 at the VCCO_0 voltage level regardless of the configuration mode.
2. In most Kintex UltraScale FPGAs, bank 65 is an HR I/O bank, supporting voltages up to 3.3V. In the Virtex UltraScale and
KU095 FPGAs, bank 65 is an HP I/O bank, limited to 1.8V or lower I/O standards.
3. Using 2.5V or 3.3V on bank 65 is recommended when bank 0 is at 2.5V or 3.3V. See I/O Transition at the End of Startup,
page 155.
4. Bank 65 is not used for configuration in this mode. VCCO_65 can be set according to the needs of the I/O interface after
configuration.
Table 1-12: Virtex UltraScale and Kintex UltraScale KU095 Compatible Voltages and CFGBVS Pin
Connection
Notes:
1. JTAG interface is always supported in bank 0 at the VCCO_0 voltage level regardless of the configuration mode.
2. In the Virtex UltraScale FPGAs and Kintex UltraScale KU095 FPGA, bank 65 is a high-performance bank, limited to 1.8V
or lower I/O standards. CFGBVS does not affect that bank.
3. Bank 65 is not used for configuration in this mode. VCCO_65 can be set according to the needs of the I/O interface after
configuration.
Power-On Reset
To ensure proper power-on behavior, the guidelines in the respective family data sheet
must be followed. For configuration, UltraScale architecture-based FPGAs require power on
the VCCO_0, VCCAUX, VCCBRAM, and VCCINT pins. Power sequencing requirements are
described in the data sheet. The power supplies should ramp monotonically within the
power supply ramp time range specified in the data sheet. All supply voltages should be
within the recommended operating ranges; any dips in VCCINT or VCCAUX below their data
retention voltages in the data sheet can result in loss of configuration data.
The FPGA automatically provides a delay between power-on and the beginning of
configuration, called the power-on reset (POR) delay. The POR delay count is short or long
depending on POR_OVERRIDE. The TPOR delay starts from the time the last required supply
rail is supplied to the FPGA at 95% of its nominal value, and ends with the FPGA asserting
the INIT_B pin, sampling the Mode pins, and starting to toggle the CCLK if master mode is
selected.
POR_OVERRIDE
The Power On Reset Override select (POR_OVERRIDE) pin must be set High or Low to
determine the power-on delay before configuration begins. POR_OVERRIDE is a logic
input pin referenced between VCCINT and GND. When the VCCO_0, VCCAUX, VCCBRAM, and VCCINT
power supplies are all ramped up to 95% of their nominal value in a total time of ≤ 2 ms,
the POR_OVERRIDE pin can be tied High at power-up (e.g., connected to the VCCINT supply
rail). With the POR_OVERRIDE pin tied High, the POR delay is shortened as specified in the
data sheet (fast POR counter). When the POR_OVERRIDE pin is Low (e.g., connected to
GND), the POR delay is longer (slow POR counter). POR_OVERRIDE should be connected to
GND unless the flash will always be ready as soon as the FPGA is powered up (see Power-On
Sequence Precautions for Flash).
IMPORTANT: Do not connect POR_OVERRIDE to VCCO_0 as with bank 0 pins. POR_OVERRIDE must be
connected to VCCINT or GND. Do not leave POR_OVERRIDE floating.
The device always waits for the VCCINT power-on threshold to be met before determining
the POR_OVERRIDE value, eliminating the possibility of false High readings. VCCINT is
recommended to ramp first.
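The POR_OVERRIDE guidance above reduces to two conditions, sketched below. The function name and parameters are hypothetical and exist only for illustration; the 2 ms limit is the figure quoted in this section, and the actual POR delays must be taken from the data sheet.

```python
# Hypothetical helper illustrating the POR_OVERRIDE tie-off rule in this section.
FAST_POR_MAX_RAMP_MS = 2.0  # all required supplies at 95% of nominal within this time


def por_override_connection(total_ramp_time_ms, flash_ready_at_power_up):
    """Return "VCCINT" (fast POR counter) or "GND" (slow POR counter).

    total_ramp_time_ms      -- time for VCCO_0, VCCAUX, VCCBRAM, and VCCINT to all
                               reach 95% of their nominal values
    flash_ready_at_power_up -- True only if the flash is always ready as soon as
                               the FPGA is powered up
    """
    ramp_is_fast_enough = total_ramp_time_ms <= FAST_POR_MAX_RAMP_MS
    if ramp_is_fast_enough and flash_ready_at_power_up:
        return "VCCINT"
    # Default to the longer POR delay. Never VCCO_0, and never left floating.
    return "GND"
```

Defaulting to GND mirrors the recommendation that the longer delay is the safe choice whenever flash readiness is uncertain.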
IMPORTANT: Configuration modes with a bus width of 8, 16, or 32 require VCCO_65, in addition to the
VCCO_0 that is built in to the power-on sequence requirement of the FPGA. Make sure VCCO_65 is
supplied at or before VCCO_0 to ensure proper configuration.
In general, the system design must consider the effect of the power sequence, the power
ramps, FPGA power-on reset timing, and flash power-up timing on the timing relationship
between the start of FPGA configuration and the readiness of the flash. Refer to the AMD
data sheet for FPGA power supply requirements and timing, and check the flash data sheet
for the flash power-up timing requirements.
One of these system design approaches can ensure that the flash is ready to receive
commands before the FPGA starts its configuration procedure:
• Control the sequence of the power supplies such that the flash is certain to be powered
and ready before the FPGA begins its configuration procedure.
• Use the longer TPOR delay by connecting POR_OVERRIDE to GND.
• Hold the FPGA INIT_B pin Low from power-up to delay the start of the FPGA
configuration procedure. Release the INIT_B pin to High after the flash becomes ready.
Dedicated configuration logic can divide the EMCCLK input or use the full rate (divide by 1).
The EMCCLK option is set in the Vivado tools with the
BITSTREAM.CONFIG.EXTMASTERCCLK_EN property (see Vivado Design Suite User Guide
Programming and Debugging (UG908) [Ref 8] for details).
Connect the EMCCLK input to the oscillator or other clock source on the board. Use good
signal integrity design practices, especially for very high-speed clocks, to avoid signal
integrity issues that can cause errors during configuration. EMCCLK is a single-ended clock
input.
The configuration begins with the CCLK generated by the FPGA internal oscillator until the
bitstream header is read. If the EMCCLK option is enabled then the FPGA switches from the
internal oscillator to the clock found on the EMCCLK pin.
VBATT
The dedicated VBATT pin provides a backup power supply for the AES decryptor key
memory, similar to VCCBATT in the 7 series FPGAs. VBATT is required only when bitstream
encryption is used, the key is stored in battery-backed RAM, and a backup supply to the key
space (powered by VCCAUX) is desired. If encryption is not used or is used with the eFUSE
key, tie VBATT to VCCAUX or GND. For more details on how VBATT is used for encryption
applications, see Chapter 8, Bitstream Security, eFUSEs, and Device DNA.
Introduction
The master SPI configuration mode in AMD UltraScale™ architecture-based FPGAs enables
the use of low pin count, industry-standard serial NOR flash devices for bitstream storage.
The FPGA supports a direct connection to the de facto standard, four-pin SPI interface of a
serial NOR flash device for reading a stored bitstream.
[Figure: master SPI configuration interface signals: M[2:0], D00_MOSI, D01_DIN, D02 to D07, INIT_B, PROGRAM_B, PUDC_B, EMCCLK, DOUT, FCS_B, FCS2_B, DONE, CCLK (ug570_c2_01_073014)]
[Figure: master SPI configuration interface example schematic. FPGA bank 0 pins connect to the SPI flash as D00_MOSI to D, D01_DIN to Q, FCS_B to S, and CCLK to C; the schematic also shows the CFGBVS, PUDC_B, and M[2:0] tie-offs, the flash W and HOLD pins, pull-up resistors, and the Xilinx cable header (JTAG interface). ug570_c2_02_031915]
Refer to the Notes following this figure for related information.
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
See Table 1-9, page 27 for INIT_B signal details.
3. CCLK signal integrity is critical.
4. DOUT should be connected to the DIN of the downstream FPGA for daisy-chained SPI x1
configuration mode. Daisy-chaining is not supported for x2, x4, or x8 master SPI
configuration modes.
5. A series resistor should be considered for the datapath from the flash to the FPGA to
minimize overshoot. The proper resistor value can be determined from simulation.
6. The FPGA VCCO_0 supply must be compatible with the VCC for the I/O of the flash device.
7. Data is clocked out of the flash on the CCLK falling edge and clocked into the FPGA on
the rising edge, unless negative edge clocking is enabled in the Vivado Edit Device
Properties dialog.
8. The CCLK frequency is adjusted by the Vivado Configuration Rate bitstream setting
(BITSTREAM.CONFIG.CONFIGRATE) if the source is the internal oscillator. Alternatively,
the Enable External Configuration Clock option
(BITSTREAM.CONFIG.EXTMASTERCCLK_EN) can switch the CCLK to source from the
EMCCLK pin to use an external clock source. See EMCCLK Option, page 64 and File
Generation, page 71 for details.
9. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. See
Table 1-9, page 27 for PUDC_B signal details.
The Vivado tools provide control of configuration bitstream options through Tcl command
line properties, and also provide support through a configuration dialog box. After loading
a design, you can select Tools > Edit Device Properties to edit programming and
configuration properties in the Edit Device Properties dialog box. For more details, see
Vivado Design Suite User Guide Programming and Debugging (UG908) [Ref 8].
The Vivado tools provide the ability to program a serial flash using an indirect programming
method. This downloads a new FPGA design that provides a connection from the Vivado
tools through the FPGA to the flash. Previous FPGA memory contents are lost during this
operation. For the specific densities supported by the programming tools, consult UG908
[Ref 8].
For additional details on the SPI x1, x2, and x4 operation, including programming
instructions, see SPI Configuration and Flash Programming in UltraScale FPGAs (XAPP1233)
[Ref 12]. The SPI x1 mode sequence diagram is shown in Figure 2-3.
[Figure 2-3: SPI x1 mode sequence diagram showing PROGRAM_B, INIT_B, M[2:0] = 001, CCLK, FCS_B, D01_DIN (bitstream data, Fast Read), and DONE]
1. Waveforms represent the relative sequence of events and are not to scale. See the flash
memory data sheet for detailed SPI command and data timing.
[Figure: master SPI quad configuration interface example schematic. FPGA bank 0 pins connect to the SPI flash as D00_MOSI to DQ0, D01_DIN to DQ1, D02 to DQ2, D03 to DQ3, FCS_B to S#, and CCLK to C; the schematic also shows the VBATT, CFGBVS, PUDC_B, and M[2:0] tie-offs, pull-up resistors, and the Xilinx cable header (JTAG interface). UG570_c2_04_022218]
Refer to the Notes following this figure for related information.
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
[Figure 2-5 schematic. The primary SPI flash connects to bank 0 pins: D00_MOSI to DQ0, D01_DIN to DQ1, D02 to DQ2, D03 to DQ3, FCS_B to S#, and CCLK to C. The secondary SPI flash connects to bank 65 pins: D04 to DQ0, D05 to DQ1, D06 to DQ2, D07 to DQ3, and FCS2_B to S#. The schematic also shows the VBATT, POR_OVERRIDE, CFGBVS, PUDC_B, and M[2:0] tie-offs, the optional EMCCLK input, pull-up resistors, and the Xilinx cable header (JTAG interface). UG570_c2_05_022218]
Refer to the Notes following this figure for related information.
Figure 2-5: Master SPI Dual Quad (x8) Configuration Interface Example
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
To generate a bitstream for x8 SPI mode, the bitstream should be generated with the
CONFIG_MODE property set to SPIx8. For x8 SPI configuration, the primary flash must contain
the initial portion of a configuration bitstream that includes the x8 SPI configuration
command. When the FPGA reads in this command, it will issue either Quad Output Fast
Read (6Bh) or Quad Output Fast Read, 32-bit address (6Ch) simultaneously to both the
primary and secondary flash memories. The secondary flash should contain dummy
information that is equal in size to the initial portion of the bitstream in the primary flash.
Beginning at the next address after the initial portion of the bitstream in the primary flash
and after the dummy data in the secondary flash, the configuration bitstream is split evenly
between the flash devices beginning with the first four bits in the primary flash and the next
four bits in the secondary flash. The entire configuration bitstream will then be split
between the two flash devices with the least significant nibble of each byte in the primary
flash and the most significant nibble at the same address in the secondary flash.
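The nibble split described above can be illustrated with a short sketch. It covers only the portion of the bitstream that follows the initial x8 command section, and it returns nibble streams rather than flash images: how nibbles are packed into flash bytes is not stated here, so that step is deliberately left out. The function names are hypothetical, and in practice the Vivado tools generate the flash programming files.

```python
# Illustrative only: split each bitstream byte into the nibble destined for the
# primary flash (least significant) and the secondary flash (most significant).

def split_x8(body: bytes):
    """Return (primary_nibbles, secondary_nibbles) for the x8 portion of a bitstream."""
    primary = [byte & 0x0F for byte in body]           # least significant nibble
    secondary = [(byte >> 4) & 0x0F for byte in body]  # most significant nibble
    return primary, secondary


def merge_x8(primary, secondary) -> bytes:
    """Recombine the two nibble streams, as the FPGA effectively does on D[07:00]."""
    if len(primary) != len(secondary):
        raise ValueError("both flash devices must supply the same number of nibbles")
    return bytes((hi << 4) | lo for lo, hi in zip(primary, secondary))
```

Because both streams advance in lockstep, any latency mismatch between the two flash devices would misalign the nibbles, which is why the devices must be configured identically.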
The x8 SPI master configuration mode requires that the flash devices be identical and
identically configured. For example, some flash devices have programmable latency or
dummy cycles via nonvolatile configuration bits that may need to be set to allow high clock
rates for the read commands. The latency cycles must be the same between the primary and
secondary flash devices in order to maintain bit alignment.
The solution supported by the UltraScale FPGAs requires the flash to boot up in a 24-bit
addressing mode for the 0Bh, 3Bh, and 6Bh commands and 32-bit addressing for the 0Ch,
3Ch, and 6Ch commands. The Vivado tool Edit Device Properties dialog box provides the
option to enable 32-bit addressing. To generate a bitstream for flash densities over 128 Mb
the property BITSTREAM.CONFIG.SPI_32BIT_ADDR should be set to Yes. See Vivado Design
Suite User Guide Programming and Debugging (UG908) [Ref 8] for details. Valid flash devices
must support the instructions in Table 2-1 (SPI Instructions and Required Opcodes) to
interface with the UltraScale FPGAs.
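As a rough illustration of the addressing rule above, the sketch below picks a read opcode from the data width and flash density. The opcode pairs are the ones named in this section; mapping 0Bh/0Ch to x1, 3Bh/3Ch to x2, and 6Bh/6Ch to x4 and x8 follows common serial NOR flash conventions and is an assumption here, as are the function names. The 128 Mb threshold reflects that a 24-bit address reaches 2^24 bytes (16 MB, or 128 Mb).

```python
# Illustrative mapping: (24-bit address opcode, 32-bit address opcode) per data width.
READ_OPCODES = {
    1: (0x0B, 0x0C),  # fast read
    2: (0x3B, 0x3C),  # dual output fast read
    4: (0x6B, 0x6C),  # quad output fast read
    8: (0x6B, 0x6C),  # x8: the quad command is issued to both flash devices
}

MAX_24BIT_DENSITY_MBIT = 128  # 2**24 bytes * 8 bits = 128 Mb


def needs_32bit_addressing(density_mbit: int) -> bool:
    """True when the flash is larger than a 24-bit address can reach."""
    return density_mbit > MAX_24BIT_DENSITY_MBIT


def read_opcode(data_width: int, density_mbit: int) -> int:
    """Return the read opcode for the given SPI width and per-device flash density."""
    if data_width not in READ_OPCODES:
        raise ValueError(f"unsupported SPI data width: x{data_width}")
    opcode_24, opcode_32 = READ_OPCODES[data_width]
    return opcode_32 if needs_32bit_addressing(density_mbit) else opcode_24
```

A design whose flash needs the 32-bit opcodes is the case where BITSTREAM.CONFIG.SPI_32BIT_ADDR is set to Yes.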
[Figure: SPI configuration timing diagram showing CCLK and MISO[3:0] (from flash)]
The configuration rate setting can be increased for a faster configuration time, if the timing
requirements discussed in this section are satisfied. When determining a valid
configuration rate setting, these timing parameters must be considered:
To maximize performance, the FPGA needs to use the falling edge clocking mode to take
advantage of the entire clock period (see SPI Configuration Timing). The following details
assume this option has been enabled in the Vivado tool Edit Device Properties dialog box.
The FPGA master configuration clock has a tolerance of FMCCKTOL. Due to the master
configuration clock tolerance (FMCCKTOL), the Vivado tool Edit Device Properties dialog
box configuration rate option must be checked so that the period for the worst-case
(fastest) master CCLK frequency is greater than the sum of the FPGA address valid time, SPI
clock low to output valid, and FPGA setup time, as shown in Equation 2-1.
\[ \frac{1}{\mathit{ConfigRate} \times \left(1 + \mathit{FMCCKTOL}_{\mathrm{MAX}}\right)} \ge T_{\mathrm{SPITCO}} + T_{\mathrm{SPIDCC}} \qquad \text{(Equation 2-1)} \]
The frequency tolerance of the FPGA master configuration clock can be a significant factor
in this calculation at higher CCLK rates. If maximum configuration speeds are needed, it is
recommended to use an external clock to minimize the impact of that variable. This requires
connection to the EMCCLK pin and enabling this option in the Vivado tool Edit Device
Properties dialog box.
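Equation 2-1 can be checked numerically before committing to a configuration rate. The sketch below is a minimal illustration: the function names are hypothetical, and the example numbers in the usage note are placeholders rather than data sheet values. Take FMCCKTOL and TSPIDCC from the FPGA data sheet and TSPITCO from the flash data sheet.

```python
# Minimal check of Equation 2-1. Rates are in MHz, times in ns.

def worst_case_cclk_period_ns(config_rate_mhz: float, fmcck_tol_max: float) -> float:
    """Shortest CCLK period once the oscillator runs fast by its maximum tolerance."""
    fastest_mhz = config_rate_mhz * (1.0 + fmcck_tol_max)
    return 1000.0 / fastest_mhz  # 1 / MHz = microseconds, so scale to nanoseconds


def config_rate_is_valid(config_rate_mhz: float, fmcck_tol_max: float,
                         t_spitco_ns: float, t_spidcc_ns: float) -> bool:
    """True when the worst-case period covers flash clock-to-out plus FPGA setup."""
    period = worst_case_cclk_period_ns(config_rate_mhz, fmcck_tol_max)
    return period >= t_spitco_ns + t_spidcc_ns
```

With placeholder values of a 15% tolerance, 8 ns clock-to-out, and 3 ns setup, a 50 MHz setting passes (worst-case period of about 17.4 ns against 11 ns), while 90 MHz does not (about 9.7 ns). An external EMCCLK source with a tighter tolerance shrinks the tolerance term accordingly.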
Additional Information
For step-by-step instructions for using the SPI configuration mode with serial NOR flash,
see SPI Configuration and Flash Programming in UltraScale FPGAs (XAPP1233) [Ref 12].
For examples of how to use the SPI flash after configuration, to store non-volatile user data
or to remotely update configuration images, see UltraScale FPGA Post-Configuration Access
of SPI Flash Memory using STARTUPE3 (XAPP1280) [Ref 13].
Introduction
In serial configuration modes, the FPGA is configured by loading one configuration bit per
CCLK cycle. CCLK is an output in master serial mode and an input in slave serial mode.
Figure 3-1 shows the basic serial configuration interface.
[Figure 3-1: basic serial configuration interface signals: M[2:0], DIN, DOUT, INIT_B, PUDC_B, PROGRAM_B, DONE, CCLK (ug570_c3_01_110413)]
[Figure: serial configuration interface example schematic. A microprocessor or CPLD with a configuration memory source drives the FPGA CCLK (from CLOCK), DIN (from SERIAL_OUT), and PROGRAM_B pins and connects to DONE and INIT_B; the schematic also shows the M[2:0] tie-offs, pull-up resistors, and the Xilinx cable header (JTAG interface). ug570_c3_02_022117]
Refer to the Notes following this figure for related information.
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
See Table 1-9, page 27 for INIT_B signal details.
3. CCLK signal integrity is critical.
4. See the respective data sheet ([Ref 9] or [Ref 10]) for the VCCINT, VCCAUX, and VCCO_0
supply voltages.
5. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. See
Table 1-9, page 27 for PUDC_B signal details.
The AMD Kintex UltraScale and Virtex UltraScale FPGAs support master serial mode for
configuration from legacy serial PROMs (when applicable) or for custom, CPLD-based
configuration state machines driven by the FPGA CCLK. The Artix UltraScale+, Kintex
UltraScale+, and Virtex UltraScale+ FPGAs do not support master serial mode. AMD
Platform Flash PROMs do not support UltraScale architecture-based FPGAs.
RECOMMENDED: The alternative master SPI mode is the dominant configuration mode for a low-pin
count configuration from a serial-type flash device. Master serial mode is not recommended for new
designs. See Differences Between UltraScale FPGA Families, page 9.
[Figure: serial configuration clocking sequence showing PROGRAM_B, INIT_B, CCLK (marked "Master CLK Begins Here", see note 2), and DONE (UG470_c2_03_110413)]
1. Bit 0 represents the MSB of the first byte. For example, if the first byte is 0xAA
(1010_1010), bit 0 = 1, bit 1 = 0, bit 2 = 1, etc.
2. For master serial configuration mode, CCLK is driven only from the time INIT_B goes High
until shortly after DONE goes High. Otherwise CCLK is in a high-impedance state.
3. CCLK can be free-running in slave serial mode.
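The bit ordering in note 1 can be made concrete with a short sketch of how an external controller might serialize configuration bytes onto DIN, one bit per CCLK cycle. The generator name is hypothetical; only the MSB-first ordering comes from the note.

```python
# Illustrative serializer: emit each byte MSB first, so "bit 0" is the MSB of byte 0.

def serial_bits(data: bytes):
    """Yield the bits of data in the order they are presented on DIN."""
    for byte in data:
        for shift in range(7, -1, -1):  # bit 7 (MSB) down to bit 0 (LSB)
            yield (byte >> shift) & 1
```

For the first byte 0xAA, `list(serial_bits(b"\xAA"))` gives `[1, 0, 1, 0, 1, 0, 1, 0]`, matching bit 0 = 1, bit 1 = 0, bit 2 = 1 in the note.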
Introduction
The AMD UltraScale™ architecture-based FPGAs master BPI configuration mode enables
the use of high speed industry-standard parallel NOR flash devices for bitstream storage.
The FPGA supports a direct connection to the address, data, and control signals of a parallel
NOR flash for extracting a stored design image bitstream.
The master BPI configuration interface is represented in Figure 4-1. Detailed connections
between the FPGA and the parallel NOR flash for master BPI configuration mode are shown
in Figure 4-2 and Figure 4-4. The FPGA signals are defined in Table 1-9, page 27.
[Figure 4-1: master BPI configuration interface signals: M[2:0], D[15:00], INIT_B, PUDC_B, PROGRAM_B, EMCCLK, A[28:00], CSO_B, RS[1:0], CCLK, FCS_B, FOE_B, FWE_B, DONE, ADV_B (ug570_c4_01_073014)]
When choosing a parallel NOR flash for the configuration storage several factors should be
considered:
• The storage capacity required by the application (current and migration options)
• The data bus width options for reduced configuration time
• The flash I/O voltage range
Refer to Table 1-4, page 18 for information on the bitstream size in order to determine the
minimum flash density required for configuration. The configuration pins in bank 0 and the
multi-purpose pins in bank 65 are used by the master BPI configuration mode interface and
must receive the same VCCO voltage and be compatible with the parallel NOR flash I/O
specification. The flash data sheet should be reviewed carefully to ensure the feature and
voltage requirements are supported.
Parallel NOR flash examples are provided in this section for Synchronous and Asynchronous
read options. See Vivado Design Suite User Guide Programming and Debugging (UG908)
[Ref 8] for details on the flash families that are supported and can be indirectly programmed
using the Vivado® tool device programmer.
Table 4-1 provides an overview of the UltraScale architecture-based FPGAs feature support
with the flash read options. Refer to Master BPI Synchronous Read, page 60 and Master BPI
Asynchronous Read, page 64 for details.
The UltraScale™ architecture-based FPGAs master BPI configuration mode can read a
bitstream from select parallel NOR devices that support burst, synchronous reads. The
master BPI configuration mode with synchronous read is the fastest direct flash
configuration option for UltraScale architecture-based FPGAs without the need for
customized external control logic.
Figure 4-2 provides the connectivity diagram between the FPGA and parallel NOR flash for
the master BPI configuration mode to support the synchronous read and the EMCCLK
(external master configuration clock). Refer to the External Master Configuration Clock
(EMCCLK) Option in Chapter 1 for configuration clock option details. Figure 4-2 supports
both synchronous and asynchronous read modes. If only asynchronous mode is required for
the application, refer to Figure 4-4 for connections that are optional.
[Figure 4-2 schematic. The parallel NOR flash connects to the FPGA as CLK to CCLK, CE to FCS_B, WE to FWE_B, OE to FOE_B, ADV to ADV_B, DQ[15:0] to D[03:00] (bank 0) and D[15:04] (bank 65), and A[n:1] to A[28:00]; the schematic also shows the VBATT, POR_OVERRIDE, CFGBVS, PUDC_B, and M[2:0] (Mode = Master BPI) tie-offs, the RS0, RS1, and CSO_B outputs, pull-up resistors, and the Xilinx 14-pin JTAG ribbon cable header. UG570_c4_02_022117]
Figure 4-2: Master BPI Configuration Interface Example for x16 Synchronous Read
IMPORTANT: Review the flash vendor's data sheet carefully to ensure that the flash LSB address signal
is connected to the FPGA LSB address signal A[00].
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
See Table 1-9, page 27 for INIT_B signal details.
3. CCLK signal integrity is critical.
4. For the synchronous read example the x16 data bus interface is supported. x8 data bus
interface is only supported in asynchronous read mode.
5. CSO_B should be connected to the CSI_B of the downstream FPGA for parallel
daisy-chains.
6. The FPGA VCCO_0 supply must be compatible with the supply voltage for the I/O of the
selected parallel NOR device.
7. The CCLK frequency is adjusted by the Vivado Configuration Rate bitstream setting
(BITSTREAM.CONFIG.CONFIGRATE) if the source is the internal oscillator. Alternatively,
the Enable External Configuration Clock option
(BITSTREAM.CONFIG.EXTMASTERCCLK_EN) can switch the CCLK to source from the
EMCCLK pin to use an external clock source. See EMCCLK Option, page 64 and File
Generation, page 71 for details.
8. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. See
Table 1-9, page 27 for PUDC_B signal details.
9. See the respective data sheet ([Ref 9] or [Ref 10]) for the VCCINT, VCCAUX, and VCCO_0
supply voltages.
10. The ADV_B and CCLK connections are required for synchronous read operation, but
these connections to the flash are optional for asynchronous read mode. The CCLK
output is not used to connect to flash in the asynchronous read mode, but it is used to
sample flash read data during configuration. All timing is referenced to CCLK. On setups
only targeting asynchronous read, the flash ADV_B and CLK lines must be tied to GND.
11. The RS[1:0] pins are not connected, as shown in Figure 4-2. This sample schematic
supports single bitstream configuration. These output pins are optional and can be
used for MultiBoot configuration.
12. The JTAG connections are shown for a simple, single-device JTAG scan chain. When
multiple devices are on the JTAG scan chain, use the proper IEEE Std 1149.1 daisy-chain
technique to connect the JTAG signals. The TCK signal integrity is critical for JTAG
operation. Route, terminate, and if necessary, buffer the TCK signal appropriately to
ensure signal integrity for the devices in the JTAG scan chain.
[Figure: master BPI synchronous read sequence showing INIT_B, CCLK, FCS_B, FOE_B, FWE_B, ADV_B, and the data bus words D0, D1, D2, ... Dn (UG570_c4_07_021715)]
First, the FPGA reads the bitstream asynchronously to determine the targeted read option.
The read always starts at the default internal CCLK rate. After the INIT_B signal is released
and the control signals FCS_B, FOE_B, and ADV_B are asserted with a valid address A[28:00]
then data is captured from the parallel NOR flash on the data bus D[15:0]. The FPGA reads
the bitstream header to determine the flash read option selected for reading the
configuration data. When a synchronous command is read in the bitstream header, the
FPGA configuration controller initiates an asynchronous write to the Read Configuration
Register (RCR) of the connected parallel NOR flash.
Next, the FPGA writes the flash RCR synchronous and latency bits to enable a flash
synchronous read. To perform the asynchronous write operation, the FPGA asserts the
FCS_B and FWE_B while the INIT_B and FOE_B are deasserted. The FPGA issues the Flash
Configuration register write sequence of two write cycles. The first cycle has the Read
Configuration Register (RCR) data on A[16:01] and command 0x60 on the data bus. The
second cycle has the RCR data on A[16:01] and the command 0x03 on the data bus. The
RCR values are different for the different flash families and are determined by the bitstream
options, described in File Generation, page 71.
Lastly, the FPGA switches from the asynchronous read to synchronous read protocol and
reinitiates the bitstream read. This sequence is implemented by the FPGA asserting the
FCS_B and the FOE_B signals and having ADV_B asserted for one cycle with a valid address.
The configuration data is then burst from the flash and read back by the FPGA. Once the
header information is read, the configuration clock source can change to the user selection.
IMPORTANT: The flash is left in the same read mode that is used for configuration. For example,
the flash is left in synchronous read mode after the FPGA is configured in the synchronous read
mode.
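The two-cycle RCR write described above can be sketched as a small helper. This is an illustrative model only: the function name is hypothetical, the example RCR value is made up (real values depend on the flash family and bitstream options), and driving A00 as 0 is an assumption of the sketch.

```python
def rcr_write_cycles(rcr_value):
    """Return the two asynchronous write cycles as (address, data) pairs."""
    if not 0 <= rcr_value <= 0xFFFF:
        raise ValueError("RCR value must fit in 16 bits")
    # The RCR data rides on A[16:01]; A00 = 0 is an assumption of this sketch.
    address = rcr_value << 1
    # First cycle carries command 0x60, second cycle carries command 0x03.
    return [(address, 0x60), (address, 0x03)]


# 0x1234 is a made-up RCR value used only for illustration.
for address, data in rcr_write_cycles(0x1234):
    print(f"A=0x{address:05X} D=0x{data:04X}")
```

Both cycles present the same address; only the command on the data bus changes.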
EMCCLK Option
By default, the master BPI configuration mode uses an internally generated configuration
clock source, CCLK. This clock option is convenient because no external clock generator is
required. However, for applications where reducing configuration time is critical, use the
external master configuration clock (EMCCLK). The EMCCLK allows the use of an external
clock source that is more precise than the FPGA's internal clock, which is subject to the
master CCLK frequency tolerance (FMCCKTOL). UltraScale FPGAs can dynamically switch to an
external clock source (EMCCLK) when in master BPI mode. For more details, see External
Master Configuration Clock (EMCCLK) Option in Chapter 1.
[Schematic: an UltraScale FPGA in master BPI mode (mode pins M[2:0]) connected to a parallel
NOR flash through bank 0 and bank 65, with pull-up resistors and a 14-pin IEEE 1149.1 JTAG
port. VBATT: tie to the battery supply when needed, otherwise tie to VCCAUX or GND; see the
VBATT section.]
Figure 4-4: Master BPI Configuration Interface Example for x16 Asynchronous Read
1. Parallel NOR flash devices that have a BYTE# signal must have it set appropriately. For an
x16 data bus width, the BYTE# signal must be set High. For an x8 data bus width, the BYTE#
signal must be set Low. Refer to the flash data sheet for details.
2. Review the flash vendor's data sheet carefully to ensure that the flash LSB address signal
is connected correctly for the vendor and data bus width used. On parallel NOR flash with
the dual-purpose DQ15/A-1 signal, ensure that this signal is connected properly.
DQ15/A-1 is a data pin in x16 mode. In x8 mode, the flash DQ15/A-1 is an LSB
address line and needs to be connected to the FPGA A00.
3. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
4. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
See Table 1-9, page 27 for INIT_B signal details.
5. The x16 BPI interface is shown in Figure 4-4. For x8 BPI interfaces, only D[07:00] are
used.
6. Refer to the flash vendor data sheet for flash signal connectivity details.
Ensure the FPGA LSB A00 is aligned to the flash LSB address (dependent on the flash
family and data width selected).
7. CSO_B should be connected to the CSI_B of the downstream FPGA for parallel
daisy-chains.
8. The FPGA VCCO_0 supply must be compatible with the VCC for the I/O of the selected
parallel NOR device.
9. The CCLK frequency is adjusted by the Vivado Configuration Rate bitstream setting
(BITSTREAM.CONFIG.CONFIGRATE) if the source is the internal oscillator. Alternatively,
the Enable External Configuration Clock option
(BITSTREAM.CONFIG.EXTMASTERCCLK_EN) can switch the CCLK to source from the
EMCCLK pin to use an external clock source. See EMCCLK Option, page 64 and File
Generation, page 71 for details.
10. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. See
Table 1-9, page 27 for PUDC_B signal details.
11. See the respective data sheet ([Ref 9] or [Ref 10]) for the VCCINT, VCCAUX, and VCCO_0
supply voltages.
12. ADV_B and CCLK connections are available on some supported flash families, but the
connections are not required for asynchronous read operation. The CCLK output is not
used to connect to flash in the asynchronous read mode, but it is used to sample flash
read data during configuration. All timing is referenced to CCLK. On asynchronous read
setups, if the flash has ADV_B and CLK lines, they must be tied to GND.
13. The RS[1:0] pins are not connected, as shown in Figure 4-4. This sample schematic
supports single bitstream configuration. These output pins are optional and can be
used for MultiBoot configuration.
14. The JTAG connections are shown for a simple, single-device JTAG scan chain. When
multiple devices are on the JTAG scan chain, use the proper IEEE Std 1149.1 daisy-chain
technique to connect the JTAG signals. The TCK signal integrity is critical for JTAG
operation. Route, terminate, and if necessary, buffer the TCK signal appropriately to
ensure signal integrity for the devices in the JTAG scan chain.
drives the flash control signals FWE_B High, FOE_B Low, and FCS_B Low. Although the CCLK
output is not required to be connected to the parallel NOR flash device for asynchronous
read, the FPGA outputs an address after the rising edge of CCLK, and the data is still
sampled on the next rising edge of CCLK. In the master BPI mode with asynchronous read,
the address starts at 0 and increments by 1 until the DONE pin is asserted. If the address
reaches the maximum value (29'h1FFFFFFF) and configuration is not done (DONE is not
asserted), a wraparound error flag is raised in the Status register, and fallback
reconfiguration starts.
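The address behavior described above can be modeled with a short loop. This is a minimal sketch, not the configuration controller's actual logic: the function name and return labels are hypothetical, and it assumes DONE is asserted as soon as the last bitstream word has been read. The `max_address` parameter exists only so the wraparound case can be tried without iterating over the full 29-bit range.

```python
MAX_BPI_ADDRESS = 0x1FFFFFFF  # 29'h1FFFFFFF


def async_read(bitstream_words, max_address=MAX_BPI_ADDRESS):
    """Return (outcome, last_address) for an asynchronous read from address 0."""
    address = 0
    words_read = 0
    while True:
        words_read += 1  # one word is sampled per rising CCLK edge
        if words_read == bitstream_words:
            return ("done", address)  # simplification: DONE follows the last word
        if address == max_address:
            # Maximum address reached without DONE: wraparound error, fallback starts.
            return ("wraparound_fallback", address)
        address += 1  # the address increments by 1 for the next read


print(async_read(4, max_address=7))
print(async_read(9, max_address=7))
```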
[Figure 4-5: waveform showing PROGRAM_B, INIT_B, FWE_B = 1, FOE_B = 0, FCS_B = 0, A[n:00] (A0, A1, ... An), CCLK, D[n:00] (D0, D1, ... Dn), RS[1:0] = Z, CSO_B = Z, and DONE.]
In the asynchronous page read sequence, the first word read from a multiword page takes
a standard asynchronous read time and a subsequent read of another word from the same
page takes significantly less time. The sequence of the page read operation is controlled by
the FPGA bitstream. The generation of a bitstream with page read support requires the
setting of multiple bitstream properties to take advantage of page read and maximize the
CCLK frequency. The page size bitstream property determines the number of words in each
page. Words other than the first word of each page are read in one master CCLK cycle. The
first read cycle bitstream property determines the number of CCLK cycles that are allotted
to reading the first word of each page. Refer to File Generation, page 71 for details on how
to set the page size and first read cycle.
After an FPGA reset, the default page size is 1, the first access CCLK is 1, and the master
CCLK runs at the slowest default frequency. The configuration register (COR1)
contains the parallel NOR flash page read control bits. After the COR1 register is programmed,
the BPI address timing switches at the page boundary, as shown in Figure 4-6. When the
command is received, the master CCLK switches to the user-selected frequency, which is then
used to load the rest of the configuration.
[Waveform: CCLK, FCS_B, FOE_B, FWE_B, A[2:0] (7, 0, 1, 2, 3, 4, 5, 6, 7), and D[n:0] (D0 through D7), annotated with CCLK = 2 and PAGE_SIZE = 4 for each page.]
Figure 4-6: Master BPI Configuration Mode Page Read Waveform (Page Size=4, First Access CCLK=2)
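As a rough illustration of the page read timing, the number of CCLK cycles needed to read a block of words can be estimated from the page size and first access CCLK settings. The helper below is a hypothetical sketch that assumes the read starts on a page boundary; it is not taken from the device documentation.

```python
def page_read_cclk_cycles(words, page_size, first_read_cycle):
    """Estimate CCLK cycles to read `words` words starting at a page boundary."""
    if words < 0 or page_size < 1 or first_read_cycle < 1:
        raise ValueError("arguments must be positive (words may be 0)")
    full_pages, remainder = divmod(words, page_size)
    # The first word of each page takes first_read_cycle CCLKs;
    # every other word in the page takes one CCLK.
    cycles = full_pages * (first_read_cycle + page_size - 1)
    if remainder:
        cycles += first_read_cycle + remainder - 1
    return cycles


# Settings from Figure 4-6: page size 4, first access CCLK 2.
print(page_read_cclk_cycles(8, page_size=4, first_read_cycle=2))
```

With the default settings (page size 1, first access CCLK 1), the estimate reduces to one CCLK cycle per word.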
Configuration Time
Synchronous Read
For the fastest parallel NOR flash configuration time, use the master BPI configuration
mode synchronous x16 read option with the EMCCLK. Several system factors must be
considered when you determine the maximum configuration clock rate for
synchronous reads in your application. The following parameters should be considered:
The following example utilizes the EMCCLK. The parallel NOR flash clock-to-out and the
FPGA setup data sheet specifications are used to determine the maximum EMCCLK
frequency. Board trace delay is another factor that should be considered. An estimate
of the maximum BPI Fast Configuration EMCCLK can be calculated with Equation 4-1 and
must be less than the supported EMCCLK frequency (FEMCCK) specified in the FPGA data
sheet.
\[ \mathrm{MaxFreq} = \frac{1}{\mathrm{FlashClockToOut}\,(T_{CHQV}) + \mathrm{FPGADataSetup}\,(T_{BPIDCC}) + \mathrm{BoardDelay}} \qquad \text{(Equation 4-1)} \]
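A worked instance of Equation 4-1 is sketched below. The function name and the timing values are hypothetical placeholders, not data sheet numbers; substitute the flash T_CHQV, the FPGA T_BPIDCC, and the measured board delay for a real design, and confirm the result is below FEMCCK.

```python
def max_emcclk_hz(t_chqv_ns, t_bpidcc_ns, board_delay_ns):
    """Equation 4-1: maximum EMCCLK frequency estimate in Hz (inputs in ns)."""
    period_ns = t_chqv_ns + t_bpidcc_ns + board_delay_ns
    if period_ns <= 0:
        raise ValueError("total delay must be positive")
    return 1e9 / period_ns


# Placeholder timing values chosen only to make the arithmetic easy to follow.
estimate = max_emcclk_hz(t_chqv_ns=5.5, t_bpidcc_ns=3.5, board_delay_ns=1.0)
print(f"{estimate / 1e6:.1f} MHz")
```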
• External master configuration clock frequency (EMCCLK Rate) or FPGA nominal master
CCLK frequency (Configuration Rate)
The BPI Page Size option is set to the number of words in a page, as defined by the Parallel
NOR flash data sheet. Because the Parallel NOR flash has different timing for the read of the
first word of a page, and for the subsequent read of a word from the same page, two timing
checks are required in order to ensure the validity of the Configuration Rate and First Read
Cycle attribute values.
First, the Configuration Rate setting must be checked. The period for the worst-case
(fastest) master CCLK frequency must be greater than the sum of the FPGA address valid
time, flash page access time, and FPGA setup time, as shown in Equation 4-2.
\[ \frac{1}{\mathrm{ConfigRate} \times (1 + \mathrm{FMCCKTOL}_{MAX})} \ge T_{BPICCO} + T_{APA} + T_{BPIDCC} \qquad \text{(Equation 4-2)} \]
Second, the First Read Cycle must be checked. The First Read Cycle option specifies the
number of FPGA CCLK cycles allocated for the reading of the first word of each page. The
duration of the first read cycle is equivalent to the period of one CCLK cycle multiplied by
the First Read Cycle option value. The worst-case allocated duration of the first read cycle
must be greater than the sum of the FPGA address valid time, Parallel NOR flash read access
time, and FPGA setup time, as shown in Equation 4-3.
\[ \frac{\mathrm{BPIFirstReadCycle}}{\mathrm{ConfigRate} \times (1 + \mathrm{FMCCKTOL}_{MAX})} \ge T_{BPICCO} + T_{ACC} + T_{BPIDCC} \qquad \text{(Equation 4-3)} \]
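The two timing checks in Equation 4-2 and Equation 4-3 can be evaluated together. The sketch below uses hypothetical names and made-up timing values (not data sheet numbers); it returns one boolean per equation so that a failing check is easy to identify.

```python
def page_read_timing_ok(config_rate_mhz, fmcck_tol_max, t_bpicco_ns, t_apa_ns,
                        t_acc_ns, t_bpidcc_ns, first_read_cycle):
    """Return (equation_4_2_ok, equation_4_3_ok) for asynchronous page reads."""
    # Worst-case (fastest) CCLK period in ns: 1 / (ConfigRate * (1 + FMCCKTOL_MAX)).
    worst_period_ns = 1e3 / (config_rate_mhz * (1 + fmcck_tol_max))
    eq_4_2 = worst_period_ns >= t_bpicco_ns + t_apa_ns + t_bpidcc_ns
    eq_4_3 = (first_read_cycle * worst_period_ns
              >= t_bpicco_ns + t_acc_ns + t_bpidcc_ns)
    return eq_4_2, eq_4_3


# Placeholder values: 20 MHz nominal rate with a 15% maximum tolerance.
params = dict(config_rate_mhz=20, fmcck_tol_max=0.15, t_bpicco_ns=10,
              t_apa_ns=25, t_acc_ns=100, t_bpidcc_ns=4)
print(page_read_timing_ok(first_read_cycle=3, **params))
print(page_read_timing_ok(first_read_cycle=2, **params))
```

With these placeholder numbers, the page access check passes in both cases, while the first read check passes only when three CCLK cycles are allotted to the first word.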
Asynchronous Read
The master BPI mode asynchronous read is the simplest parallel NOR flash setup, but it has
significantly slower configuration times than the other read options. For the asynchronous
read calculation, the following parameters must be considered:
• External master configuration clock frequency (EMCCLK Rate) or FPGA nominal master
CCLK frequency (Configuration Rate)
• External master configuration clock frequency tolerance (EMCCLK Tolerance) or FPGA
master CCLK frequency tolerance (FMCCKTOL)
• FPGA CCLK rising edge to address valid (TBPICCO)
• Parallel NOR flash address to output valid (access) time (TACC)
• FPGA data setup time (TBPIDCC)
Equation 4-4 is the basic calculation to determine the configuration clock rate or EMCCLK
rate:
\[ \frac{1}{\mathrm{ConfigRate} \times (1 + \mathrm{FMCCKTOL}_{MAX})} \ge T_{BPICCO} + T_{ACC} + T_{BPIDCC} \qquad \text{(Equation 4-4)} \]
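Equation 4-4 can be rearranged to give an upper bound on the nominal configuration rate. The following sketch shows that rearrangement with hypothetical names and made-up timing values; the chosen rate must also be one that the tools actually offer.

```python
def max_async_config_rate_mhz(t_bpicco_ns, t_acc_ns, t_bpidcc_ns, fmcck_tol_max):
    """Largest nominal ConfigRate (MHz) that still satisfies Equation 4-4."""
    required_period_ns = t_bpicco_ns + t_acc_ns + t_bpidcc_ns
    # ConfigRate <= 1 / (required_period * (1 + FMCCKTOL_MAX))
    return 1e3 / (required_period_ns * (1 + fmcck_tol_max))


# Placeholder values, not data sheet numbers.
limit = max_async_config_rate_mhz(t_bpicco_ns=10, t_acc_ns=100,
                                  t_bpidcc_ns=4, fmcck_tol_max=0.15)
print(f"Nominal configuration rate must not exceed {limit:.2f} MHz")
```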
flash power-on sequence or power-on ramps is essential. The parallel NOR flash interface
signals are within FPGA dedicated bank 0 and I/O bank 65.
The FPGA sends the address to the parallel NOR flash to acquire the bitstream after the
FPGA has completed its power-on reset sequence. The parallel NOR flash is not ready to
receive an address until the parallel NOR flash power-on reset sequence has completed.
Under specific conditions when the VCC power supply to the parallel NOR flash powers up
after the FPGA VCCINT and VCCAUX power supplies, the FPGA address counter can pass the
critical start of the bitstream within the parallel NOR flash before the flash becomes
responsive. The system must be designed such that the parallel NOR flash is ready to
receive the address before the FPGA sends the address. For more details, see Power-On
Sequence Precautions for Flash in Chapter 1.
File Generation
To implement a master BPI configuration mode solution you must create a design
bitstream, then convert the bitstream into a flash programming file, and finally program the
parallel NOR flash device. The following key properties should be reviewed when
generating a bitstream for the master BPI configuration mode. These properties are also
available through the Vivado tool Edit Device Properties dialog box. Refer to Vivado
Design Suite User Guide Programming and Debugging (UG908) [Ref 8] for more details.
set_property BITSTREAM.CONFIG.EXTMASTERCCLK_EN <value> [current_design]
where <value> is one of: Disable|Div-1|Div-2|Div-3|Div-4|Div-6|Div-8|Div-12|Div-16|Div-24|Div-48
For faster performance, synchronous reads can be enabled for select parallel NOR flash.
Specify the property BITSTREAM.CONFIG.BPI_SYNC_MODE with Type1 option or Type2
option according to what the selected family supports.
If master BPI configuration with asynchronous read is required, but faster performance is
desired, the page mode and read cycle options can be used. To enable these features use
the properties BITSTREAM.CONFIG.BPI_PAGE_SIZE and
BITSTREAM.CONFIG.BPI_1ST_READ_CYCLE:
• Page sizes are 1 (default), 4, or 8. If the actual flash page size is larger than 8, the value
of 8 should be used to maximize the efficiency.
• First access CCLK cycles of 1 (default), 2, 3, or 4. CCLK cycles must be 1 if the page size
is 1.
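The constraints in the two bullets above can be captured in a small validation helper. This is an illustrative sketch: the function name is hypothetical, and mapping a flash page size that is not exactly 1, 4, or 8 words down to the next smaller supported value is an assumption, apart from the "larger than 8 uses 8" rule stated above.

```python
PAGE_SIZE_PROPERTY = "BITSTREAM.CONFIG.BPI_PAGE_SIZE"
FIRST_READ_PROPERTY = "BITSTREAM.CONFIG.BPI_1ST_READ_CYCLE"


def bpi_page_read_properties(flash_page_words, first_read_cycle):
    """Pick property values that respect the documented page read limits."""
    if flash_page_words >= 8:
        page_size = 8  # flash pages larger than 8 words use the value 8
    elif flash_page_words >= 4:
        page_size = 4  # assumption: round down to a supported page size
    else:
        page_size = 1
    if first_read_cycle not in (1, 2, 3, 4):
        raise ValueError("first access CCLK cycles must be 1, 2, 3, or 4")
    if page_size == 1 and first_read_cycle != 1:
        raise ValueError("first access CCLK cycles must be 1 when page size is 1")
    return {PAGE_SIZE_PROPERTY: page_size, FIRST_READ_PROPERTY: first_read_cycle}


print(bpi_page_read_properties(flash_page_words=16, first_read_cycle=3))
```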
After bitstream generation, it is also important to ensure during flash programming file
generation that the data ordering is set up correctly. On AMD FPGAs, data bit D00 is the
most-significant bit (MSB) and bit D15 is the least significant bit (LSB). Consequently, it is
crucial to understand how the data ordering in the configuration data file corresponds to
the data ordering expected by the FPGA. UltraScale FPGA bitstream files (.bit, .rbt) are never
bit-swapped. By default, for the BPI and SelectMAP modes the .mcs file formats are
bit-swapped (see Bit Swapping, page 141). This convention is consistent across all AMD
FPGAs. The master BPI configuration mode data ordering is the same as the SelectMAP data
ordering. During the flash programming file generation the data bus width option must be
set to x8 or x16 appropriately based on the target parallel NOR flash. Refer to Vivado Design
Suite User Guide Programming and Debugging (UG908) [Ref 8] for details.
° This method is popular if programming must be done on-board. This method can
accommodate multiple design iterations and is extremely useful for debugging in a
lab environment. The Vivado programming tool provides the ability to program a
parallel NOR flash indirectly. An FPGA design bitstream is downloaded first to
provide a connection from the Vivado tools through the FPGA to the parallel NOR
flash. When using this method, it is important to recognize that the previous FPGA
memory design contents are lost during the flash operations. I/O signals that are
not a part of the master BPI configuration mode interface are disabled. You must
understand the behavior of the FPGA during this process and how it can affect
other devices in the system. Refer to Vivado Design Suite User Guide Programming
and Debugging (UG908) [Ref 8] for the specific flash family members supported by
the programming tools.
For step-by-step instructions for using the BPI configuration mode with parallel NOR flash,
see UltraScale FPGA BPI Configuration and Flash Programming (XAPP1220) [Ref 14].
Introduction
The SelectMAP configuration interface provides an 8-bit, 16-bit, or 32-bit bidirectional data
bus interface to the FPGA configuration logic that can be used for both configuration and
readback. Both master SelectMAP and slave SelectMAP interfaces are supported. See
Differences Between UltraScale FPGA Families, page 9.
RECOMMENDED: The alternative master BPI mode is the dominant configuration mode for
configuration from a parallel-type flash device. Master SelectMAP mode is not recommended for new
designs. See Differences Between UltraScale FPGA Families, page 9.
Readback and the read direction of the data bus are applicable only to slave SelectMAP
mode. The bus width of SelectMAP is automatically detected. One or more devices can be
configured through the SelectMAP bus.
Typical setup includes a processor providing data and clock. Alternatively, another
programmable logic device, such as a CPLD, can be used as a configuration manager
that configures the FPGA through the FPGA slave SelectMAP interface.
Multiple FPGAs are configured in series with different images from a flash memory or
processor.
Multiple FPGAs are configured in parallel with the same image from a flash memory or
processor.
The basic master SelectMAP and slave SelectMAP configuration methods are described in
this chapter.
The SelectMAP configuration interface pins shown in Figure 5-1 are defined in Table 1-9,
page 27.
[Figure 5-1: SelectMAP configuration interface pins: M[2:0], D[31:00], INIT_B, PUDC_B, CSO_B, PROGRAM_B, RDWR_B, CSI_B, DONE, and CCLK.]
[Schematic: a microprocessor or CPLD with a configuration memory source drives the UltraScale
FPGA slave SelectMAP interface (SELECT to CSI_B, READ/WRITE to RDWR_B, CLOCK to CCLK,
D[31:0], PROGRAM_B, DONE, INIT_B), with a 14-pin Xilinx cable header for the JTAG interface.
VBATT: tie to the battery supply when needed, otherwise tie to VCCAUX or GND; see the VBATT
section. POR_OVERRIDE: tie to VCCINT or GND; see the Power On Reset section. CFGBVS: tie to
VCCO_0 or GND; see the Configuration Banks Voltage Select section. PUDC_B: tie to VCCO_0 or
GND.]
1. Refer to Using a Microprocessor to Configure 7 Series FPGAs via Slave Serial or Slave
SelectMAP Mode (XAPP583) [Ref 15], for a discussion of one possible implementation.
2. The processor or CPLD I/O needs to support a voltage that is compatible with the
connected FPGA pins.
3. The DONE pin is an open-drain output. See Table 1-9, page 27 for DONE signal details.
4. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
See Table 1-9, page 27 for INIT_B signal details.
5. The CSI_B and RDWR_B signals can be tied to GND if only one FPGA is going to be
configured and readback is not needed.
6. CCLK signal integrity is critical.
7. Data bus width can be x8, x16, or x32 for slave SelectMAP configuration.
8. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. See
Table 1-9, page 27 for PUDC_B signal details.
CSI_B
The chip select input (CSI_B) enables the SelectMAP bus. When CSI_B is High, the FPGA
ignores the SelectMAP interface, neither registering any inputs nor driving any outputs. The
D[31:00] pins are placed in a High-Z state, and RDWR_B is ignored.
If only one device is being configured through the SelectMAP interface and readback is not
required, the CSI_B signal can be tied to ground.
RDWR_B
RDWR_B is an input to the FPGA that controls whether the data pins are inputs or outputs:
For configuration, RDWR_B must be set for write control (RDWR_B = 0). For readback,
RDWR_B must be set for read control (RDWR_B = 1) while CSI_B is asserted.
3D ICs do not support the ABORT sequence. In monolithic devices, changing the value of
RDWR_B from Low to High while CSI_B is Low triggers an ABORT, and the configuration I/O
changes from input to output asynchronously. The ABORT status appears on the data pins
synchronously. Changing the value of RDWR_B from High to Low while CSI_B is Low also
triggers an ABORT, and the configuration I/O changes from output to input asynchronously
with no ABORT status readback. If readback is not needed, RDWR_B can be tied to ground
or used for debugging with SelectMAP ABORT.
The RDWR_B signal is ignored while CSI_B is de-asserted. Read/write control of the
3-stating of the data pins is asynchronous. The FPGA actively drives SelectMAP data without
regard to CCLK if RDWR_B is set for read control (RDWR_B = 1, Readback) while CSI_B is
asserted.
CCLK
All activity on the SelectMAP data bus is synchronous to CCLK. When RDWR_B is set for
write control (RDWR_B = 0, Configuration), the FPGA samples the SelectMAP data pins on
rising CCLK edges. When RDWR_B is set for read control (RDWR_B = 1, Readback), the FPGA
updates the SelectMAP data pins on rising CCLK edges.
On the next rising CCLK edge, the device begins sampling the data pins. Only D[07:00] are
sampled by configuration until the bus width is determined. After bus width is determined,
the proper width of the data bus is sampled for the Synchronization word search.
Configuration begins after the synchronization word is clocked into the device.
After the configuration bitstream is loaded, the device enters the startup sequence. The
device asserts its DONE signal High in the phase of the startup sequence that is specified by
the bitstream. The configuration controller should continue sending CCLK pulses until after
the startup sequence has finished. This can require several CCLK pulses after DONE goes
High.
After configuration, the CSI_B and RDWR_B signals can be de-asserted, or they can remain
asserted. Because the SelectMAP port is inactive, toggling RDWR_B at this time does not
cause an ABORT. Figure 5-3 summarizes the timing of SelectMAP configuration with
continuous data loading.
[Figure 5-3: waveform showing PROGRAM_B, INIT_B, CCLK, CSI_B, RDWR_B, D[07:00] (Byte 0, Byte 1, ... Byte n), and DONE, with numbered callouts (1) through (12) that correspond to the notes below.]
1. CSI_B signal can be tied Low if there is only one device on the SelectMAP bus. If CSI_B
is not tied Low, it can be asserted at any time.
2. RDWR_B can be tied Low if readback is not needed. RDWR_B should not be toggled after
CSI_B has been asserted because this triggers an ABORT on the next CCLK.
3. The Mode pins are sampled when INIT_B goes High.
4. RDWR_B should be asserted before CSI_B to avoid causing an ABORT on the next CCLK.
5. CSI_B is asserted, enabling the SelectMAP interface.
6. The first byte is loaded on the first rising CCLK edge after CSI_B is asserted.
7. The configuration bitstream is loaded one byte per rising CCLK edge.
8. After the startup command is loaded, the device enters the startup sequence.
9. The startup sequence lasts a minimum of eight CCLK cycles.
10. The DONE pin goes High during the startup sequence. Additional CCLKs can be required
to complete the startup sequence.
11. After configuration has finished, the CSI_B signal can be deasserted.
12. After the CSI_B signal is deasserted, RDWR_B can be deasserted.
13. The data bus can be x8, x16, or x32 (for slave SelectMAP).
Configuration can be paused in two ways: by deasserting the CSI_B signal (Free-Running
CCLK method, Figure 5-4) or by halting CCLK (Controlled CCLK method, Figure 5-5). For
encrypted bitstreams using an obfuscated key with the SelectMAP or ICAP interface, do not
pause bitstream loading by temporary de-assertion of the configuration interface
chip-select (CSI_B). Instead, keep CSI_B asserted and stop the CCLK to pause bitstream
loading. See Answer 73656 for details.
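The Free-Running CCLK method can be pictured as a filter on rising CCLK edges: data is loaded only on edges where CSI_B is asserted (Low). The following is a minimal behavioral sketch with hypothetical names and made-up data values; it does not model setup and hold timing or the encrypted bitstream restriction noted above.

```python
def loaded_words(edges):
    """Return the data words loaded, given (csi_b, data) at each rising CCLK edge."""
    # CSI_B is active-Low: a word is loaded only when csi_b == 0.
    return [data for csi_b, data in edges if csi_b == 0]


# CCLK keeps running; the user deasserts CSI_B for two edges to pause loading.
edges = [(0, 0xAA), (0, 0xBB), (1, 0xFF), (1, 0xFF), (0, 0xCC)]
print([hex(word) for word in loaded_words(edges)])
```

In the Controlled CCLK method, the same pause is achieved by not generating the clock edges at all while CSI_B stays asserted.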
[Figure 5-4: waveform showing PROGRAM_B, INIT_B, CCLK, CSI_B, D[n:00], and RDWR_B, with numbered callouts (1) through (12) that correspond to the notes below.]
1. RDWR_B is driven Low by the user, setting the D[n:00] pins as inputs for configuration.
RDWR_B can be tied Low if readback is not needed. RDWR_B should not be toggled after
CSI_B has been asserted because this triggers an ABORT on the next CCLK.
2. The device is ready for configuration after INIT_B goes High.
3. A byte is loaded on the rising CCLK edge. The data bus can be x8, x16, or x32 wide (for
slave SelectMAP).
4. A byte is loaded on the rising CCLK edge.
5. The user deasserts CSI_B, and the byte is ignored.
6. The user deasserts CSI_B, and the byte is ignored.
7. A byte is loaded on the rising CCLK edge.
8. A byte is loaded on the rising CCLK edge.
9. The user deasserts CSI_B, and the byte is ignored.
[Figure 5-5: waveform showing CCLK, CSI_B, RDWR_B, and D[n:00] (Byte 0, Byte 1, ... Byte n), with numbered callouts that correspond to the notes below.]
1. The Data pins are in the High-Z state while CSI_B is deasserted. The data bus can be x8,
x16, or x32 (for slave SelectMAP).
2. RDWR_B has no effect on the device while CSI_B is deasserted.
3. CSI_B is asserted by the user. The device begins loading configuration data on rising
CCLK edges.
4. A byte is loaded on the rising CCLK edge.
5. A byte is loaded on the rising CCLK edge.
6. A byte is loaded on the rising CCLK edge.
In SelectMAP x8 mode, configuration data is loaded at one byte per CCLK, with the MSB of
each byte presented to the D00 pin. This convention (D00 = MSB, D07 = LSB) differs from
many other devices. This convention can be a source of confusion when designing custom
configuration solutions. Table 5-1 shows how to load the hexadecimal value 0xABCD into
the SelectMAP data bus.
Notes:
1. D[07:00] represent the SelectMAP DATA pins.
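The D00 = MSB convention for x8 mode can be checked with a few lines of code. The sketch below uses hypothetical helper names: it lists the level driven on each pin for a bitstream byte and shows the value that results when D07..D00 are read as a conventional byte (D07 as the most-significant bit).

```python
def selectmap_x8_pin_levels(byte):
    """Return the levels on D00..D07 (in that order) for one bitstream byte."""
    # D00 carries the MSB of the byte, D07 carries the LSB.
    return [(byte >> (7 - pin)) & 1 for pin in range(8)]


def as_conventional_byte(byte):
    """Value seen when D07..D00 are read with D07 as the MSB (bit-swapped byte)."""
    levels = selectmap_x8_pin_levels(byte)
    return sum(level << pin for pin, level in enumerate(levels))


for value in (0xAB, 0xCD):  # the two bytes of 0xABCD, loaded one per CCLK
    print(hex(value), selectmap_x8_pin_levels(value), hex(as_conventional_byte(value)))
```

A controller that treats D07 as its most-significant data bit therefore has to bit-swap each byte before driving the bus, unless the programming file is already bit-swapped.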
Table 5-2 shows the bit ordering for the SelectMAP x8, x16, and x32 data bus widths.
x16 8 9 10 11 12 13 14 15 0 1 2 3 4 5 6 7
x8 0 1 2 3 4 5 6 7
SelectMAP ABORT
3D ICs do not support the ABORT sequence. In monolithic devices an ABORT is an
interruption in the SelectMAP configuration or readback sequence occurring when the state
of RDWR_B changes while CSI_B is asserted as sampled by CCLK. During a configuration
ABORT, internal status is driven onto the D[04:07] pins over the next four CCLK cycles. The
other D pins are always High. After the ABORT sequence finishes, the user can
resynchronize the configuration logic and resume configuration. For applications that must
deassert RDWR_B between bytes, see the Controlled CCLK method shown in Figure 5-5.
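The ABORT trigger condition can be expressed as a check over the values sampled at successive rising CCLK edges. This is a simplified behavioral model with hypothetical names: it flags an edge when RDWR_B differs from the previous sample and CSI_B is Low at both samples, and it does not model the status readback or 3D ICs.

```python
def find_abort_edges(samples):
    """Return indexes of rising CCLK edges at which an ABORT would be triggered.

    samples: list of (csi_b, rdwr_b) values as sampled on each rising CCLK edge.
    """
    aborts = []
    for index in range(1, len(samples)):
        prev_csi_b, prev_rdwr_b = samples[index - 1]
        csi_b, rdwr_b = samples[index]
        # Simplified rule: RDWR_B changed while CSI_B stayed asserted (Low).
        if prev_csi_b == 0 and csi_b == 0 and rdwr_b != prev_rdwr_b:
            aborts.append(index)
    return aborts


# RDWR_B toggles at edge 3 while CSI_B is Low, and again at edge 5 while CSI_B is High.
samples = [(1, 1), (0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (0, 0)]
print(find_abort_edges(samples))
```

Only the toggle at edge 3 is flagged, because RDWR_B is ignored while CSI_B is deasserted.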
[Figure: ABORT waveform showing CCLK, CSI_B, RDWR_B, DATA[00:07] carrying STATUS, and the ABORT indication.]
[Figure: ABORT waveform showing CCLK, CSI_B, RDWR_B, DATA[00:07] carrying DATA, and the ABORT indication.]
ABORTs during readback are not followed by a status word because the RDWR_B signal is
set for write control (FPGA D[x:00] pins are inputs).
The ABORT sequence lasts four CCLK cycles. During those cycles, the status word changes
to reflect data alignment and ABORT status. A typical sequence might be:
After the last cycle, the synchronization word can be reloaded to establish data alignment.
Introduction
AMD Kintex® UltraScale™ and Virtex® UltraScale FPGAs support IEEE standards 1149.1 and
1149.6 (ACJTAG), defining a Test Access Port (TAP) and boundary-scan architecture. The Test
Access Port and boundary-scan architecture is commonly referred to collectively as JTAG.
JTAG is an acronym for the Joint Test Action Group, the technical subcommittee initially
responsible for developing this standard. The boundary-scan architecture is used to ensure
the board-level integrity of individual components and the interconnections between them.
With multi-layer PC boards becoming increasingly dense and with more sophisticated
surface mounting techniques in use, boundary-scan testing is becoming widely used as an
important debugging tool.
Devices containing boundary-scan logic can send data out on I/O pins to test connections
between devices at the board level. The circuitry can also be used to send signals internally
to test the device-specific behavior. These tests are commonly used to detect opens and
shorts at both the board and device level.
For compliance with the pre-configuration BSDL file description, PUDC_B should be tied to
VCCO_0 to disable pull-ups, which matches the pre-configuration BSDL file disable result
description of ‘Z’ for when a boundary-scan controller disables the output to a pin.
Otherwise, if PUDC_B is tied to GND, then pre-configuration weak pull-up resistors are
enabled and the corresponding output disable result of ‘PULL1’ in the BSDL file is a more
accurate match to the device pin behavior. However, to maximize boundary-scan tests for
external pull-up or pull-down resistors with pre-configured devices, the default disable
result value ‘Z’ in the pre-configuration BSDL file is recommended for both settings of
PUDC_B, and all external pull-down resistors must be sufficiently strong to override
potential internal pull-ups that are enabled when PUDC_B is tied to GND.
In addition, for compliance the PROGRAM_B pin must be High to have proper JTAG
functionality. For example, if the PROGRAM_B pin is held Low, the IDCODE value might be
incorrect.
Notes:
1. TMS and TDI have default weak internal pull-up resistors, as specified by the IEEE Std 1149.1, as do TDO and TCK. These
internal pull-up resistors are active, regardless of the mode selected. Refer to the data sheet for internal pull-up values.
[Figure: JTAG timing waveform showing TMS, TDI, and TCK with the timing parameters TTAPTCK, TTCKTAP, and TTCKTDO.]
Figure 6-2 shows a typical JTAG setup with the simple connections required to attach a
single device to a JTAG signal header, which can be driven from a processor or an AMD
programming cable under control of the configuration tools. TCK is the clock used for
boundary-scan operations. The TDO-TDI connections create a serial datapath for shifting
data through the JTAG chain. TMS controls the transition between states in the TAP
controller. Proper physical connections of all of these signals are essential to JTAG
functionality.
[Figure 6-2: JTAG header connected to a single FPGA.]
Additionally, if the chain is large (three devices or more), TMS and TCK should be buffered
to ensure that they have sufficient drive strength at all receivers, and the voltage at logic
High must be compatible with all devices in the chain.
When interfacing to devices from other manufacturers, optional JTAG signals can be
present (such as TRST and enables) and might need to be driven.
Providing Power
To ensure proper power-on behavior, the guidelines in the data sheet must be followed. The
power supplies should ramp monotonically within the power supply ramp time range
specified in the data sheet. All supply voltages should be within the recommended
operating ranges; any dips below the data retention values in the data sheet can result in
loss of configuration data.
[Figure: TAP controller and register block diagram. TMS and TCK drive the TAP state machine
(Test-Logic-Reset, Run-Test/Idle, and the Select, Capture, Shift, Exit1, Pause, Exit2, and
Update states for DR and IR). The instruction decoder selects among the Bypass[1],
IDCODE[32], and Boundary[N] registers, and the selected data is output on TDO.]
Figure 6-5 shows a 16-state finite state machine. The four TAP pins control how data is
scanned into the various registers. The state of the TMS pin at the rising edge of TCK
determines the sequence of state transitions. There are two main sequences, one for
shifting data into the data register and the other for shifting an instruction into the
Instruction register.
A transition between the states only occurs on the rising edge of TCK, and each state has a
different name. The two vertical columns with seven states each represent the Instruction
Path and the Data Path. The data registers operate in the states whose names end with “DR,”
and the Instruction register operates in the states whose names end in “IR.” The states are
otherwise identical.
Figure 6-5: TAP controller state diagram (UG570_c6_05_111313). States shown:
TEST-LOGIC-RESET, RUN-TEST/IDLE, SELECT-DR-SCAN, CAPTURE-DR, SHIFT-DR, EXIT1-DR,
PAUSE-DR, EXIT2-DR, UPDATE-DR, SELECT-IR-SCAN, CAPTURE-IR, SHIFT-IR, EXIT1-IR,
PAUSE-IR, EXIT2-IR, and UPDATE-IR.
NOTE: The value shown adjacent to each state transition in this figure
represents the signal present at TMS at the time of a rising edge at TCK.
Test-Logic-Reset
All test logic is disabled in this controller state, enabling the normal operation of the IC. The
TAP controller state machine is designed so that regardless of the initial state of the
controller, the Test-Logic-Reset state can be entered by holding TMS High and pulsing TCK
five times. Consequently, the Test Reset (TRST) pin is optional.
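The five-clock reset property can be checked with a small software model. The following is a minimal sketch, assuming the standard IEEE Std 1149.1 next-state assignments for the 16 TAP states; the names TAP_NEXT, clock_tap, and worst_case_reset_clocks are illustrative and are not part of any AMD tool.

```python
# Next-state table for the 16-state TAP controller (IEEE Std 1149.1).
# Each entry maps a state to (next state when TMS=0, next state when TMS=1).
TAP_NEXT = {
    "TEST-LOGIC-RESET": ("RUN-TEST/IDLE", "TEST-LOGIC-RESET"),
    "RUN-TEST/IDLE": ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-DR-SCAN": ("CAPTURE-DR", "SELECT-IR-SCAN"),
    "CAPTURE-DR": ("SHIFT-DR", "EXIT1-DR"),
    "SHIFT-DR": ("SHIFT-DR", "EXIT1-DR"),
    "EXIT1-DR": ("PAUSE-DR", "UPDATE-DR"),
    "PAUSE-DR": ("PAUSE-DR", "EXIT2-DR"),
    "EXIT2-DR": ("SHIFT-DR", "UPDATE-DR"),
    "UPDATE-DR": ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
    "SELECT-IR-SCAN": ("CAPTURE-IR", "TEST-LOGIC-RESET"),
    "CAPTURE-IR": ("SHIFT-IR", "EXIT1-IR"),
    "SHIFT-IR": ("SHIFT-IR", "EXIT1-IR"),
    "EXIT1-IR": ("PAUSE-IR", "UPDATE-IR"),
    "PAUSE-IR": ("PAUSE-IR", "EXIT2-IR"),
    "EXIT2-IR": ("SHIFT-IR", "UPDATE-IR"),
    "UPDATE-IR": ("RUN-TEST/IDLE", "SELECT-DR-SCAN"),
}


def clock_tap(state, tms_bits):
    """Apply one rising TCK edge per TMS value and return the final state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


def worst_case_reset_clocks():
    """Largest number of TCK pulses (TMS held High) needed to reach reset."""
    worst = 0
    for start in TAP_NEXT:
        state, pulses = start, 0
        while state != "TEST-LOGIC-RESET":
            state = TAP_NEXT[state][1]
            pulses += 1
        worst = max(worst, pulses)
    return worst


if __name__ == "__main__":
    print(worst_case_reset_clocks())
    print(clock_tap("TEST-LOGIC-RESET", [0, 1, 1, 0, 0]))
```

The same table can be used to plan TMS sequences. For example, starting from Test-Logic-Reset, the TMS values 0, 1, 1, 0, 0 lead through Run-Test/Idle, Select-DR-Scan, Select-IR-Scan, and Capture-IR to Shift-IR.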
Run-Test-Idle
In this controller state, the test logic in the IC is active only if certain instructions are
present. For example, if an instruction activates the self test, then it is executed when the
controller enters this state. The test logic in the IC is idle otherwise.
Select-DR-Scan
This controller state controls whether to enter the Data Path or the Select-IR-Scan state.
Select-IR-Scan
This controller state controls whether or not to enter the Instruction Path. The controller can
return to the Test-Logic-Reset state otherwise.
Capture-IR
In this controller state, the shift register bank in the Instruction Register parallel-loads a
pattern of fixed values on the rising edge of TCK. The two least significant bits must always
be 01.
Shift-IR
In this controller state, the Instruction register gets connected between TDI and TDO, and
the captured pattern gets shifted on each rising edge of TCK. The instruction available on
the TDI pin is also shifted in to the Instruction register.
Exit1-IR
This controller state controls whether to enter the Pause-IR state or Update-IR state.
Pause-IR
This state allows the shifting of the Instruction register to be temporarily halted.
Exit2-IR
This controller state controls whether to enter the Shift-IR state or the Update-IR state.
Update-IR
In this controller state, the instruction in the Instruction register is latched to the latch bank
of the Instruction register on every falling edge of TCK. This instruction becomes the current
instruction after it is latched.
Capture-DR
In this controller state, the data is parallel-loaded into the data registers selected by the
current instruction on the rising edge of TCK.
The Shift-DR, Exit1-DR, Pause-DR, Exit2-DR, and Update-DR controller states are similar to
the Shift-IR, Exit1-IR, Pause-IR, Exit2-IR, and Update-IR states in the Instruction path.
UltraScale FPGAs support the mandatory IEEE Std 1149.1 commands as well as several AMD
vendor-specific commands. The EXTEST, SAMPLE/PRELOAD, BYPASS, IDCODE, and
USERCODE instructions are all included. The TAP also supports internal user-defined
registers (USER1, USER2, USER3, and USER4) and configuration/readback of the device.
INTEST is not supported. The HIGHZ_IO command is similar to the standard HIGHZ
command but only disables the user I/O pins.
For details on the standard boundary-scan instructions EXTEST and BYPASS, refer to IEEE
Std 1149.1.
Notes:
1. The Instruction register size increases in the devices based on SSI technology. See the BSDL files for
device-specific information.
Boundary Register
The primary test data register is the Boundary register. Boundary-scan operation is
independent of individual IOB configurations. Each IOB, bonded or unbonded, starts as
bidirectional with 3-state control. Later, it can be configured to be an input, output, or
3-state only. Therefore, three data register bits are provided per IOB. Figure 6-6 is a
representation of the UltraScale FPGA boundary-scan architecture.
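Figure 6-6: UltraScale FPGA boundary-scan logic. Starting from TDI, each IOB contributes
three register stages (a shift flip-flop followed by an update latch with LE) for IOB.I,
IOB.O, and IOB.T, with multiplexers controlled by the shift/data select (sd) and EXTEST.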
The update latch is opened each time the TAP controller enters the UPDATE-DR state. Care
is necessary when exercising an EXTEST to ensure that the proper data has been latched
before exercising the command. This is typically accomplished by using the
SAMPLE/PRELOAD instruction.
Internal pull-up and pull-down resistors should be considered when test vectors are being
developed for testing opens and shorts. The PUDC_B pin determines whether the IOB has a
pull-up resistor.
This section describes the order of each non-TAP IOB. The input is first, the output second,
and the 3-state IOB control third. The 3-state IOB control is closest to the TDO. The input-
only pins contribute only the input bit to the boundary-scan I/O data register. The bit
sequence of the device is obtainable from the Boundary-Scan Description Language Files
(BSDL files) for the UltraScale FPGAs. (The BSDL files can be obtained from the AMD
download area and represent an unconfigured FPGA.) The bit sequence always has the same
bit order and the same number of bits and is independent of the design.
For boundary-scan testing with a configured FPGA, AMD offers the write_bsdl utility to
automatically modify the BSDL file for post-configuration interconnect testing. The
write_bsdl utility obtains the necessary FPGA design information from the implemented
design, and generates a BSDL file that reflects the post-configuration boundary-scan
architecture of the device.
Instruction Register
The Instruction register (IR) is connected between TDI and TDO during an instruction scan
sequence. In preparation for an instruction scan sequence, the Instruction register is
parallel-loaded with a fixed instruction capture pattern. This pattern is shifted out onto TDO
(LSB first), while an instruction is shifted into the Instruction register from TDI.
To determine the operation to be invoked, an OPCODE necessary for the UltraScale FPGA
boundary-scan instruction set is loaded into the Instruction register. The IR is 6 bits wide for
monolithic UltraScale FPGAs. See Table 1-5, page 19 for IR length for other devices.
Table 6-3 describes the boundary scan instructions for UltraScale FPGAs. See the BSDL files
for commands and codes for UltraScale+ FPGAs. See Table 8-3, page 132 for eFUSE related
instructions.
Notes:
1. Instruction register is larger for devices based on SSI technology (see Table 1-5, page 19). See the BSDL files for
device-specific information.
Table 6-4 shows the instruction capture values loaded into the IR as part of an instruction
scan sequence.
Table 6-4: UltraScale FPGA Instruction Capture Values Loaded into IR as Part of an Instruction
Scan Sequence
          IR[5]    IR[4]      IR[3]          IR[2]       IR[1:0]
TDI →     DONE     INIT(1)    ISC_ENABLED    ISC_DONE    01         → TDO
Notes:
1. INIT is the INIT_B_INTERNAL_SIGNAL_STATUS bit.
2. Instruction register is larger for devices based on SSI technology (see Table 1-5, page 19). See the BSDL files for
device-specific information.
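As an illustration of Table 6-4, the following sketch decodes the six capture bits observed on TDO during an instruction scan. It assumes a monolithic device with a 6-bit Instruction register and that the bits are collected in the order they appear on TDO (LSB first, as described for the Instruction register). The function name decode_ir_capture is hypothetical.

```python
def decode_ir_capture(bits_from_tdo):
    """Decode the 6-bit IR capture pattern.

    bits_from_tdo lists the bits in the order they appear on TDO,
    so element 0 is IR[0] and element 5 is IR[5].
    """
    if len(bits_from_tdo) != 6:
        raise ValueError("expected 6 bits for a monolithic device")
    value = sum((bit & 1) << position for position, bit in enumerate(bits_from_tdo))
    if (value & 0b11) != 0b01:
        raise ValueError("IR[1:0] must read back as 01")
    return {
        "DONE": bool((value >> 5) & 1),
        "INIT": bool((value >> 4) & 1),
        "ISC_ENABLED": bool((value >> 3) & 1),
        "ISC_DONE": bool((value >> 2) & 1),
    }


if __name__ == "__main__":
    # Example pattern: IR[5:0] = 110001 (DONE and INIT set).
    print(decode_ir_capture([1, 0, 0, 0, 1, 1]))
```

Checking that IR[1:0] reads 01 is also a quick sanity test of the scan path before any status bit is trusted.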
Bypass Register
The other standard data register is the single flip-flop Bypass register. It passes data serially
from the TDI pin to the TDO pin during a BYPASS instruction. This register is initialized to
zero when the TAP controller is in the CAPTURE-DR state.
The least significant bit of the IDCODE register is always 1 (based on JTAG IEEE 1149.1). The
last three hex digits appear as 0x093 (see Table 1-5, page 19).
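A short sketch of how a 32-bit IDCODE value can be split into fields and checked against the two properties above. The field layout (version, part number, manufacturer identity, and a fixed 1 in bit 0) follows IEEE Std 1149.1; the sample value in the example is made up for illustration and is not a real device IDCODE.

```python
def split_idcode(idcode):
    """Split a 32-bit IDCODE into its IEEE Std 1149.1 fields."""
    if not 0 <= idcode <= 0xFFFFFFFF:
        raise ValueError("IDCODE must fit in 32 bits")
    return {
        "version": (idcode >> 28) & 0xF,         # bits [31:28]
        "part_number": (idcode >> 12) & 0xFFFF,  # bits [27:12]
        "manufacturer": (idcode >> 1) & 0x7FF,   # bits [11:1]
        "lsb": idcode & 1,                       # bit 0, always 1
    }


def looks_like_expected_idcode(idcode):
    """Check the LSB and the low three hex digits (0x093)."""
    return (idcode & 1) == 1 and (idcode & 0xFFF) == 0x093


if __name__ == "__main__":
    sample = 0x0ABCD093  # hypothetical value, for illustration only
    print(split_idcode(sample), looks_like_expected_idcode(sample))
```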
USERCODE Register
The USERCODE instruction is supported in the UltraScale FPGAs. This register allows a user
to specify a design-specific identification code. The USERCODE can be programmed into
the device and can be read back for verification later. The USERCODE is embedded into the
bitstream during bitstream generation (USERID property) and is valid only after
configuration. If the device is blank or the USERCODE was not programmed, the USERCODE
register contains 0xFFFFFFFF.
Figure: UltraScale FPGA JTAG configuration interface (UG570_c6_07_031915). Labels
recoverable from the figure:
• VBATT: tie to battery supply when needed, otherwise tie to VCCAUX or GND; see VBATT
section.
• POR_OVERRIDE: tie to VCCINT or GND; see Power On Reset section.
• CFGBVS: tie to VCCO_0 or GND; see Configuration Banks Voltage Select section.
• Supplies and pins shown: VCCINT, VCCAUX, VCCO_0 (bank 0), M2, M1, M0, INIT_B,
PROGRAM_B, and DONE, with three 4.7 kΩ resistors.
• Xilinx cable header (JTAG interface, pins 1 through 14): VREF, TMS, TCK, TDO, TDI, N.C.,
and GND.
Refer to the Notes following this figure for related information.
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
3. The FPGA PUDC_B pin is tied to GND to enable internal pull-ups or it can be tied to
VCCO_0 to 3-state the SelectIO pins after power-up and during configuration. For BSDL
compliance, PUDC_B should be tied to GND. See Table 1-9, page 27 for PUDC_B signal
details.
AMD offers proprietary USB programming cables and boundary-scan programming tools for
prototyping purposes. These are not intended for production environments but can be
highly useful for verifying FPGA implementations and JTAG chain integrity.
When trying to access other devices in the JTAG chain, it is important to know the
Instruction register length of each device to ensure that the correct device receives the
appropriate signals. This information can be found in the BSDL file for the device.
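The role of the Instruction register lengths can be illustrated with a sketch that builds the bit sequence for one instruction scan across a chain. It assumes that the device nearest TDO receives the first bits shifted in, that each instruction is shifted LSB first, and that BYPASS is the all-ones opcode (as defined by IEEE Std 1149.1). The function name, chain lengths, and opcode value are illustrative placeholders; real values come from the BSDL files.

```python
def build_ir_scan(ir_lengths, target_index, opcode):
    """Return the TDI bit sequence (first bit shifted first) for an IR scan.

    ir_lengths lists the IR length of each device from the TDI side of
    the chain to the TDO side. The target device receives opcode and
    every other device receives BYPASS (all ones).
    """
    bits = []
    # Bits shifted first travel farthest, so start with the device nearest TDO.
    for index in reversed(range(len(ir_lengths))):
        length = ir_lengths[index]
        if index == target_index:
            if opcode >> length:
                raise ValueError("opcode does not fit in the target IR")
            bits.extend((opcode >> position) & 1 for position in range(length))
        else:
            bits.extend([1] * length)
    return bits


if __name__ == "__main__":
    EXAMPLE_OPCODE = 0b000101  # placeholder opcode, for illustration only
    chain = [8, 6, 4]          # hypothetical IR lengths, TDI side first
    print(build_ir_scan(chain, target_index=1, opcode=EXAMPLE_OPCODE))
```

If any length in the list is wrong, every bit after that device is misaligned, which is why the lengths should always be taken from the BSDL files.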
JSTART and JSHUTDOWN are instructions specific to the FPGA device architecture and
configuration flow. The TAP controller is not reset by the PROGRAM_B pin and can only be
reset by bringing the controller to the TLR state. The TAP controller is reset on power up.
The configuration flow for FPGA configuration with JTAG is shown in Figure 6-8. A
configured device can be reconfigured by toggling the TAP and entering a CFG_IN
instruction after pulsing the PROGRAM_B pin or issuing the shutdown sequence. The JTAG
state machine must be in the Shift-DR state during configuration with the CFG_IN
instruction.
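Figure 6-8: Device configuration flow for JTAG (UG570_c6_08_08051). The flow proceeds as
follows:
• Power-Up, then Clear Configuration Memory (repeated while PROGRAM_B is Low).
• Wait until INIT_B is High.
• Load the CFG_IN instruction, go to Shift-DR, and load the bitstream.
• If the CRC is not correct, the flow ends in CRC Error. Otherwise, load the JSTART
instruction and run the startup sequence.
• The device is then operational. To reconfigure, load the JSHUTDOWN instruction, run the
shutdown sequence, and load the CFG_IN instruction again.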
Notes:
1. Instruction register is larger for devices based on SSI technology (see Table 1-5, page 19). See the BSDL files for
device-specific information.
2. Assumes a TPOR wait time of 20 ms is necessary until INIT_B is High and CFG_IN can be sent. In this example, the TCK cycle
value is based on a TCK frequency of 1 MHz.
3. In the Configuration register, data is shifted in from the right (TDI) to the left (TDO), MSB first. Shifts into the configuration
register are different from shifts into the other registers in that they are MSB first.
4. For 3D ICs based on Stacked Silicon Interconnect (SSI) technology, the JTAG TAP must stay in the SHIFT-DR state until the
entire bitstream is shifted into the device in step 16, with the only exception being bit 0 which is shifted in step 17.
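The TCK cycle count mentioned in Note 2 follows directly from the wait time and the TCK frequency. The sketch below shows the arithmetic with integer math; the function name is illustrative, and the 20 ms and 1 MHz figures are the ones stated in the note.

```python
def tck_cycles_for_wait(wait_us, tck_hz):
    """TCK cycles needed to cover wait_us microseconds, rounded up."""
    if wait_us < 0 or tck_hz <= 0:
        raise ValueError("wait time must be >= 0 and frequency must be > 0")
    numerator = wait_us * tck_hz
    return -(-numerator // 1_000_000)  # ceiling division on integers


if __name__ == "__main__":
    # 20 ms wait at a 1 MHz TCK, as assumed in Note 2.
    print(tck_cycles_for_wait(20_000, 1_000_000))
```

If TCK runs at a different frequency, the cycle count must be rescaled so that the same wall-clock wait is preserved.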
Refer to the state diagram in Figure 6-5 for the following TAP controller steps:
1. On power-up, place a logic 1 on the TMS and clock the TCK five times. This ensures
starting in the TLR (Test-Logic-Reset) state.
2. Load the CFG_IN instruction into the target device (and BYPASS in all other devices).
3. Go through the RTI state (RUN-TEST/IDLE).
4. Load in the configuration bitstream per steps 13 through 17 in Table 6-5.
5. Repeat step 2 and step 3 for each device.
6. Load the JSTART command into all devices.
7. Go to the RTI state and clock TCK 2,000 times.
Design Entry
Introduction
Although most aspects of configuration happen before any design elements can have an
effect on the process, there are some design elements that provide access to
configuration-related features during device operation, after the initial configuration is
complete. These design elements must be instantiated in the design. Most should only be
used in specific situations that require them. User options that impact configuration are
specified through properties (see Design Tools, page 22).
The following design primitives are related to configuration features and described in this
chapter. Information on these and other primitives is also found in the UltraScale
Architecture Libraries Guide (UG974) [Ref 17]. These primitives all require instantiation in a
design. Instantiation templates for each are found in the Libraries Guide and in the Vivado
Language Templates.
RECOMMENDED: The FRAME_ECCE3 and FRAME_ECCE4 (for UltraScale+) primitives should only be
instantiated through the Soft Error Mitigation (SEM) IP.
• BSCANE2
• DNA_PORTE2
• EFUSE_USR
• FRAME_ECCE3
• FRAME_ECCE4
• ICAPE3
• MASTER_JTAG
• STARTUPE3
• USR_ACCESSE2
BSCANE2
The BSCANE2 primitive allows access between the internal FPGA logic and the JTAG
boundary scan logic controller. This allows for communication between the internal running
design and the dedicated JTAG test access port (TAP) pins of the FPGA. The BSCANE2
primitive must be instantiated to gain internal access to the JTAG pins. The BSCANE2
primitive is not needed for normal JTAG operations that use direct access from the JTAG
pins to the TAP controller. The BSCANE2 is automatically added to a design when using the
Vivado Logic Analyzer, or when using indirect flash programming in the Vivado Device
Programmer.
For more details on boundary scan and usage of the BSCANE2 primitive, see Chapter 6,
Boundary-Scan and JTAG Configuration.
Primitive
Figure 7-1 shows the primitive.
Figure 7-1: BSCANE2 primitive (ug570_c7_01_111413). Input: TDO. Outputs: CAPTURE, DRCK,
RESET, RUNTEST, SEL, SHIFT, TCK, TDI, TMS, and UPDATE.
Pin Descriptions
Table 7-1 defines the BSCANE2 pin connections.
Attributes
Table 7-2 describes the BSCANE2 primitive attributes.
Applications
A typical user application requiring instantiation of the BSCANE2 is to create internal,
private scan registers in the FPGA logic. These scan registers propagate through the FPGA
logic, not through the boundary I/O as is true with standard JTAG boundary scan. Each
instance of this primitive supports one JTAG USER instruction, with multiple instantiations
differentiated with the JTAG_CHAIN attribute. To handle all four USER instructions (USER1
through USER4), instantiate four BSCANE2 primitives and set the JTAG_CHAIN attribute
uniquely on each.
For 3D ICs based on SSI technology, the BSCANE2 can only be instantiated in the master
SLR. The tools automatically place the element in the correct SLR. Only the JTAG port on the
master SLR can be accessed by the BSCANE2 primitive. For more details on SSI technology,
see UltraScale Architecture and Product Overview (DS890) [Ref 5].
The BSCANE2 primitive can also be used to control or monitor activity on the JTAG TAP port.
When a USER instruction is active, a signal on the TDO input of the primitive is registered
on the falling edge of TCK in an output timing register and then passed to the external TDO
output pin. The associated primitive's SEL
output goes High to indicate which USER1–USER4 instruction is active. The DRCK output
provides access to the data register clock generated by the TAP controller.
The RESET, UPDATE, SHIFT, and CAPTURE pins represent the decoding of the corresponding
state of the boundary scan internal state machine. The TDI port provides access from the
external TDI pin of the JTAG TAP in order to shift data into an internal scan chain. The TCK
and TMS pins are similarly monitored through the BSCANE2 primitive.
The BSCANE2 primitive can be used to disable the external JTAG port by instantiating it and
setting DISABLE_JTAG=TRUE. This prevents re-configuration through JTAG, including with
the Vivado Device Programmer, by breaking the JTAG chain. The design property
set_property disable_jtag yes [current_design] can also be used. These
methods are preferred over the write_bitstream option
Bitstream.general.disable_jtag:Yes (default is No). The primitive has priority;
JTAG cannot be enabled by write_bitstream if it is disabled in the BSCANE2 attribute.
AMD UltraScale+ devices add special internal pin names for the timing of some of the
BSCANE2 pins. For constraints on BSCANE2 timing paths, use the following internal pin
names for listed BSCANE2 pins.
• BSCANE2.TDO
• BSCANE2.TDI
• BSCANE2.TMS
DNA_PORTE2
Each device contains a single unique, 96-bit, embedded, device identifier (device DNA). The
identifier is nonvolatile, permanently programmed by AMD into the device via eFUSE bits,
and is unchangeable, making it tamper resistant. The Device DNA is primarily used to
identify the specific device.
External applications can access the DNA value through the JTAG port. FPGA designs can
access the DNA internally through a Device DNA Access Port, which requires instantiation of
the DNA_PORTE2 primitive. The DNA_PORTE2 primitive controls a dedicated 96-bit shift
register for capturing and shifting the Device DNA value. The DNA_PORTE2 also allows for
the inclusion of supplemental bits of user data, or allows for the DNA data to roll over
(repeat DNA data after initial data has been shifted out).
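The read, shift, and rollover behavior can be pictured with a simple behavioral model. This is only a sketch under stated assumptions: the class DnaShiftModel and the helper read_bits are hypothetical, bit 0 is assumed to appear first on DOUT, and each method call stands for one CLK edge with READ or SHIFT asserted. The real bit order and timing should be taken from the pin descriptions and the simulation model.

```python
class DnaShiftModel:
    """Behavioral model of a 96-bit read-and-shift register."""

    WIDTH = 96

    def __init__(self, dna_value):
        if not 0 <= dna_value < (1 << self.WIDTH):
            raise ValueError("DNA value must fit in 96 bits")
        self._dna = dna_value
        self._reg = 0

    def read(self):
        """One CLK edge with READ asserted: load the DNA value."""
        self._reg = self._dna

    @property
    def dout(self):
        """Bit currently presented on DOUT (assumed to be bit 0)."""
        return self._reg & 1

    def shift(self, din):
        """One CLK edge with SHIFT asserted: DIN enters at the far end."""
        self._reg = (self._reg >> 1) | ((din & 1) << (self.WIDTH - 1))


def read_bits(model, count, rollover):
    """Load the register, then collect count bits from DOUT.

    With rollover=True, DOUT is fed back to DIN so the data repeats.
    With rollover=False, DIN is held at 0.
    """
    model.read()
    bits = []
    for _ in range(count):
        bit = model.dout
        bits.append(bit)
        model.shift(bit if rollover else 0)
    return bits


if __name__ == "__main__":
    model = DnaShiftModel(0x123456789ABCDEF012345678)  # made-up value
    first_pass = read_bits(model, 96, rollover=True)
    print(hex(sum(bit << i for i, bit in enumerate(first_pass))))
```

Feeding DOUT back into DIN models the rollover case. Driving DIN from user logic instead would model the supplemental user data that follows the 96 DNA bits.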
Primitive
Figure 7-2 shows the DNA_PORTE2 primitive. The DNA_PORTE2 primitive is similar to the
DNA_PORT primitive of earlier families except for the increase in the number of bits from 57
to 96. As a result, the DNA_PORTE2 cannot be directly migrated from the DNA_PORT
primitive.
Figure 7-2: DNA_PORTE2 primitive (ug570_c7_02_111413). Inputs: DIN, READ, SHIFT, and
CLK. Output: DOUT.
Pin Descriptions
Table 7-4 describes the DNA_PORTE2 primitive pins. Connect all inputs and outputs to the
design to ensure proper operation. All functions are synchronous to the CLK input.
Attributes
The attribute SIM_DNA_VALUE can be optionally set to allow for simulation of a possible
DNA data sequence. By default, the Device DNA data bits are all zeros in the simulation
model. Table 7-5 describes the DNA_PORTE2 primitive attributes.
For more details on the Device DNA and using the DNA_PORTE2 primitive, see Chapter 8,
Bitstream Security, eFUSEs, and Device DNA.
EFUSE_USR
Each Kintex® UltraScale™ and Virtex® UltraScale device has one set of 32 nonvolatile,
user-defined, one-time-programmable eFUSE bits. These bits are commonly programmed
by the user to define a custom user design ID. Programming is done through the JTAG port.
Programming can be done using the AMD configuration tools and cables, or on a
third-party programmer.
These 32 bits define the values in the FUSE_USER configuration register. Depending on the
read/write access bits in the CNTL register, the 32 bits can be programmed and read
through the JTAG port, with bit 0 shifted out first.
For internal access, the EFUSE_USR primitive must be instantiated. EFUSE_USR provides
asynchronous parallel access to all 32 bits.
The EFUSE_USR primitive is identical to that in the 7 series and will directly migrate.
Primitive
Figure 7-3 shows the EFUSE_USR primitive.
Figure 7-3: EFUSE_USR primitive (ug570_c7_03_081414). Output: EFUSEUSR[31:0].
Pin Descriptions
Table 7-6 describes the EFUSE_USR primitive pins.
Attributes
Table 7-7 describes the EFUSE_USR primitive attributes.
FRAME_ECCE3
The FRAME_ECCE3 primitive is reserved. This primitive is used when implementing the Soft
Error Mitigation (SEM) IP.
The FRAME_ECCE3 in the Kintex UltraScale and Virtex UltraScale FPGAs is significantly
different than the FRAME_ECCE2 of the 7 series devices. The FRAME_ECCE2 does not
migrate directly to the FRAME_ECCE3, but the SEM IP automatically adapts to the new
architecture. Note that the SEM IP is not supported in the KU025 device.
FRAME_ECCE4
The FRAME_ECCE4 primitive is reserved. This primitive is used when implementing the Soft
Error Mitigation (SEM) IP.
The FRAME_ECCE4 primitive in the UltraScale+ devices is different than the FRAME_ECCE3
primitive in the UltraScale FPGA devices. The FRAME_ECCE3 primitive does not migrate
directly to the FRAME_ECCE4 primitive, but the SEM IP generated for UltraScale+ devices is
designed to use the FRAME_ECCE4 primitive automatically.
ICAPE3
The ICAPE3 provides post-configuration access to the configuration functions of the FPGA
from the FPGA logic. Using this component, commands and data can be written to and read
from the configuration logic. The ICAPE3 interface is similar to that for the external slave
SelectMAP parallel 32-bit interface, including bit swapping (see Parallel Bus Bit Order,
page 142). However, the ICAPE3 has independent input and output buses; the CSIB input
ignores the input bus but the output bus can continue to toggle.
Because users can send an unencrypted partial bitstream and can perform readback
through the ICAPE3 interface, users concerned about security should not connect the
ICAPE3 component to external device pins. Because the improper use of this function can
have a negative effect on the functionality and reliability of the FPGA, you should not use
this element unless you are very familiar with its capabilities.
ICAPE3 can be useful in MultiBoot and active partial reconfiguration applications. ICAPE3
instantiation is required to issue an IPROG command that triggers the device to reload itself
from the address specified in the WBSTAR (Warm Boot Starting Address) register. The
WBSTAR register holds the address that the configuration controller uses after an IPROG
command is issued.
Similar functionality is provided by the MCAP. Like ICAP, the MCAP can only be used after
initial configuration, but it does not support readback. If Persist is set, the ICAPE3 is
disabled. In addition, JTAG and MCAP have priority over ICAPE3.
The ICAPE3 for the UltraScale architecture-based FPGAs supports higher frequency and
more output signals than were available in the ICAPE2 for the 7 series FPGAs. ICAPE3 only
supports 32-bit interfaces, and does not have the ICAP_WIDTH attribute from the ICAPE2.
ICAPE2 instantiations from 7 series designs are automatically migrated to ICAPE3 for the
UltraScale FPGAs. The ICAPE3 is automatically used by some AMD IP, including the Soft
Error Mitigation (SEM) IP.
Primitive
Figure 7-4 shows the ICAPE3 primitive.
Figure 7-4: ICAPE3 primitive (ug570_c7_05_112513). Inputs: I[31:0], CLK, CSIB, and RDWRB.
Outputs: O[31:0], AVAIL, PRDONE, and PRERROR.
Pin Descriptions
Table 7-8 describes the ICAPE3 primitive pins.
Attributes
Table 7-9 describes the ICAPE3 primitive attributes.
ICAPE3 Resources
RECOMMENDED: Only one ICAPE3 resource should be instantiated for an UltraScale FPGA, and it
should be automatically placed by the tools.
Each FPGA should be used as if it had one ICAPE3 resource, although there are actually two
ICAPE3 resources per die for improved SEU protection. The tools automatically use the top
ICAPE3 by default. Advanced users can use the write_bitstream option
BITSTREAM.Readback.ICAP_Select to select the bottom resource. Control register 0 Bit 30
(ICAP_SELECT) enables the top ICAPE3 site when set to 0 (default), and enables the bottom
ICAPE3 site when set to 1. User switching can be done by toggling CTL0<30> using the
currently active ICAPE3. The device can automatically switch between the two ICAPE3 sites
if enabled using the ICAP_AUTO_SWITCH attribute, using a sync word on 8 LSBs.
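The ICAP_SELECT encoding can be expressed as simple bit manipulation on a 32-bit control register value. The sketch below only models bit 30 as described above; the helper names are hypothetical, and the configuration packets needed to actually write control register 0 through the ICAPE3 are outside its scope.

```python
ICAP_SELECT_BIT = 30  # Control register 0, bit 30 (ICAP_SELECT)


def selected_icap(ctl0):
    """Return which ICAPE3 site a CTL0 value enables."""
    return "bottom" if (ctl0 >> ICAP_SELECT_BIT) & 1 else "top"


def with_icap_select(ctl0, use_bottom):
    """Return ctl0 with ICAP_SELECT set (bottom) or cleared (top)."""
    mask = 1 << ICAP_SELECT_BIT
    if use_bottom:
        return (ctl0 | mask) & 0xFFFFFFFF
    return ctl0 & ~mask & 0xFFFFFFFF


if __name__ == "__main__":
    value = with_icap_select(0x00000000, use_bottom=True)
    print(hex(value), selected_icap(value))
```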
For 3D ICs based on Stacked Silicon Interconnect (SSI) technology, the ICAPE3 of one Super
Logic Region (SLR) is defined as the master, with the ability to read from and write to all
other SLRs. The tools automatically place an instantiated ICAPE3 in the correct master SLR.
Note that when the mode pins are set to JTAG mode, the master SLR ICAP cannot access the
slave SLRs. Because JTAG mode is always available, the mode pins do not need to be set to
JTAG mode for configuration.
MASTER_JTAG
MASTER_JTAG provides control of the JTAG port from the FPGA logic, overriding the
external pins. This is a new feature in the UltraScale architecture-based FPGAs. When
MASTER_JTAG is instantiated, the external JTAG port is disabled at the end of configuration
startup (EOS). Therefore MASTER_JTAG should not be instantiated except for a design
requiring internal access to the JTAG port. This is intended only for eFUSE programming in
advanced secure applications that cannot use the standard eFUSE programming
methodologies. This component can be used for AES key programming (BBRAM or eFUSE),
USER eFUSE programming during runtime, or where external JTAG access is prohibited.
Because the external JTAG port is disabled, MASTER_JTAG prevents the use of the Vivado®
device programmer and the Vivado logic analyzer. For 3D ICs, MASTER_JTAG provides
access only to the SLR in which it is instantiated, with an instruction register length of 6 bits.
Primitive
Figure 7-5 shows the MASTER_JTAG primitive.
Figure 7-5: MASTER_JTAG primitive (ug570_c7_06_112513). Inputs: TCK, TMS, and TDI.
Output: TDO.
Pin Descriptions
Table 7-10 describes the MASTER_JTAG primitive pins.
STARTUPE3
The STARTUPE3 design element is used to connect to selected dedicated configuration pins
(located in bank 0). Control of dedicated configuration pins allows post-configuration
access to the flash. When the flash is only used for configuration, the FPGA design does not
require the STARTUPE3.
For multi-purpose configuration pins located in bank 65, standard user logic can be
implemented to connect to the pins required for access to the flash, with appropriate
location constraints. For example, when using x8 or wider configuration modes, the
STARTUPE3 is only used for the four LSBs of the configuration bus, D[03:00] located within
bank 0. The higher order pins D[xx:04] can be directly connected as part of the user design.
The STARTUPE3 primitive for the UltraScale architecture-based FPGAs does not provide
specification of the startup clock as was done in the STARTUPE2 for the 7 series. Otherwise,
STARTUPE3 is a superset of STARTUPE2, and designs are retargeted automatically. The
STARTUPE3 adds the ability to control the D00-D03 pins and the FCS_B pin as these pins are
now in the dedicated configuration bank. The bidirectional D00-D03 pins have separate
input and output connections to the STARTUPE3. Additional configuration pins can be
controlled after configuration as standard I/O, including bidirectional I/O.
For devices based on Stacked Silicon Interconnect (SSI) technology, a single STARTUPE3 in
the design is implemented in the master SLR and is automatically replicated to the other
SLRs to provide global control of the device.
Primitive
Figure 7-6 shows the STARTUPE3 primitive.
Figure 7-6: STARTUPE3 primitive (ug570_c7_07_073114). Inputs: GSR, GTS, KEYCLEARB, PACK,
USRCCLKO, USRCCLKTS, USRDONEO, USRDONETS, FCSBO, FCSBTS, DO[3:0], and DTS[3:0].
Outputs: CFGCLK, CFGMCLK, EOS, PREQ, and DI[3:0].
Pin Descriptions
Table 7-11 describes the STARTUPE3 primitive pins. The three-state controls default to 1 to
disable the outputs, KEYCLEARB defaults to a 1, and the other inputs default to 0.
Attributes
Table 7-12 describes the STARTUPE3 primitive attributes.
Figure (ug570_c7_10_080615): STARTUPE3 connections to the D03, D02, D01_DIN, and D00_MOSI
pins through DO[3:0], DTS[3:0], and DI[3:0]. USRCCLKTS, USRDONETS, and FCSBTS are also
shown.
The STARTUPE3 does not support input or output delay constraints. As a result, the
performance requirements must be considered carefully. The performance calculations are
similar to those provided for calculating configuration frequency (see Equation 2-1 for SPI
mode and Equation 4-1 through Equation 4-4 for BPI mode), but with the additional delays
to and through the STARTUPE3 block. The delays between the STARTUPE3 ports and the
device pins are noted in the data sheets [Ref 9] and [Ref 10]. Because timing constraints are
not supported for STARTUPE3, constrain the routing connected to the STARTUPE3 ports.
The flash clock Low to output valid time (TSPITCO for SPI mode) must take into account the
CCLK delay through the STARTUPE3, TUSRCCLKO. For parallel NOR flash where transfers are
done synchronously, TUSRCCLKO must be added to TCHQV.
Similarly, the FPGA data setup time (TSPIDCC for SPI mode) on D[03:00] is delayed by the
setup time from the pins to the STARTUPE3 DI ports (TDI) plus the routing delays from the
STARTUPE3 DI port outputs to the slice flip-flops used. For the three types of asynchronous
transfers for parallel NOR flash used in BPI configuration, the output delays for address
(TBPICCO) need to be added to the input delay constraints and any flash delays (e.g., TAPA,
TACC). The output delay for address can be obtained by timing analysis.
Higher-order data pins D[xx:04] are routed directly to general-purpose I/O pins, so the
delays can be constrained using standard input and output timing constraints. When
setting input delays for serial NOR flash used in SPI mode, the clock polarity of the FPGA
design must be taken into account. Data from the serial NOR flash device is launched off the
falling edge of the clock.
USR_ACCESSE2
The USR_ACCESSE2 design element enables access to the 32-bit AXSS register within the
configuration logic. This enables FPGA logic to access static data that can be set from the
bitstream. The primitive and functionality for the UltraScale architecture-based FPGAs are
identical to that for the 7 series.
The USR_ACCESSE2 register AXSS can be used to provide a single 32-bit constant value to
the FPGA logic. The register contents can be defined during bitstream generation, avoiding
the need to re-compile the design as would be required if distributed RAM was used to hold
the constant. A constant can be used to track the version of the design, or any other
information you require. This is an alternative to the JTAG USERCODE instruction, which
reads a 32-bit value defined by the write_bitstream option BITSTREAM.Config.UserID.
USR_ACCESSE2 has the advantage of being directly accessible by the FPGA logic, and can
store an automatically generated timestamp.
The contents of the USR_ACCESSE2 register AXSS can be defined with the write_bitstream
option BITSTREAM.Config.USR_ACCESS, which can be set to NONE (default all zeroes), any
8-character hex value, or TIMESTAMP.
TIMESTAMP inserts the current timestamp into the AXSS register in this format:
ddddd_MMMM_yyyyyy_hhhhh_mmmmmm_ssssss
(bit 31) ……………………………………………………… (bit 0)
Where:
For more details on USR_ACCESSE2, see Bitstream Identification with USR_ACCESS using the
Vivado Design Suite (XAPP1232) [Ref 18].
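The TIMESTAMP layout can be unpacked with shifts and masks. The sketch below takes the field widths directly from the format string (5, 4, 6, 5, 6, and 6 bits from bit 31 down to bit 0) and assumes the fields are day, month, year, hour, minute, and second in that order. The year is returned as the raw 6-bit field, because the base year it is counted from is not stated here. The function names are illustrative.

```python
# Field names and widths, listed from bit 31 down to bit 0.
TIMESTAMP_FIELDS = (
    ("day", 5),
    ("month", 4),
    ("year", 6),
    ("hour", 5),
    ("minute", 6),
    ("second", 6),
)


def unpack_timestamp(axss):
    """Split a 32-bit AXSS TIMESTAMP value into its raw fields."""
    if not 0 <= axss <= 0xFFFFFFFF:
        raise ValueError("AXSS value must fit in 32 bits")
    fields = {}
    shift = 32
    for name, width in TIMESTAMP_FIELDS:
        shift -= width
        fields[name] = (axss >> shift) & ((1 << width) - 1)
    return fields


def pack_timestamp(fields):
    """Inverse of unpack_timestamp, for building test values."""
    value = 0
    for name, width in TIMESTAMP_FIELDS:
        field = fields[name]
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value = (value << width) | field
    return value


if __name__ == "__main__":
    example = {"day": 17, "month": 3, "year": 24,
               "hour": 13, "minute": 45, "second": 59}
    word = pack_timestamp(example)
    print(hex(word), unpack_timestamp(word))
```

In a design, the 32-bit word would come from the DATA[31:0] output of USR_ACCESSE2, and the same field split would be done in logic or in software that reads the value.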
Primitive
Figure 7-8 shows the USR_ACCESSE2 primitive.
Figure 7-8: USR_ACCESSE2 primitive (ug570_c7_08_112513). Outputs: DATA[31:0], CFGCLK,
and DATAVALID.
Pin Descriptions
Table 7-13 describes the USR_ACCESSE2 primitive pins.
The following is an example of constraints for the special internal USR_ACCESSE2 timing
pin, specifically an example for creating a primary clock on the internal pin. For a 200 MHz
(5 ns period), a 50% duty cycle CCLK clock constraint on the USR_ACCESSE2_inst.CCLK clock
source pin covers the timing paths from DATAVALID or DATA[31:0] to the destination
register clocked by USR_ACCESSE2.CFGCLK.
In the following example, the path requirement is changed from 5 ns (the clock period
defined on CCLK) to 3.5 ns:
Introduction
This chapter discusses the available types of FPGA bitstream security including:
• Readback Security
• Bitstream Encryption and Authentication
• eFUSE
° For a user ID
Readback Security
By default, an active FPGA configuration can be read back or reconfigured through the JTAG
port, through the SelectMAP port if Persist is selected, or through the ICAPE3 primitive if it
is instantiated in a design. A basic form of security is to prevent access to the configuration
logic, such as by not allowing the configuration port to persist and not enabling ICAP
connections to external pins. In addition, the bitstream readback security setting
(BITSTREAM.READBACK.SECURITY) can be set to Level1 (disables readback), or Level2
(disables both readback and reconfiguration). The only way to remove a readback security
setting in a configured FPGA is to clear the FPGA program by asserting PROGRAM_B or
cycling power. If the user design is sensitive, bitstream encryption should be considered.
Use of encryption automatically prevents readback via hardware gates and not just
bitstream settings. It is the strongest method to prevent readback and protect your IP. The
bitstream readback security setting does not affect readback for SEU detection. Refer to
Vivado Design Suite User Guide Programming and Debugging (UG908) [Ref 8] for details on
the readback security options.
The FPGA AES system consists of software-based bitstream encryption and on-chip
bitstream decryption with dedicated memory for storing the encryption key. Using the AMD
Vivado® tools, the user generates the encryption key and the encrypted bitstream.
UltraScale architecture-based FPGAs store the encryption key internally in either dedicated
RAM, backed up by a small externally connected battery, or in the nonvolatile,
one-time-programmable eFUSE. The selected option is defined with
BITSTREAM.ENCRYPTION.ENCRYPTKEYSELECT set to BBRAM or EFUSE. The encryption key
can only be programmed onto the device through the external JTAG port or through the
internal MASTER_JTAG primitive. The encryption key cannot be read back. Refer to
XAPP1283 Internal Programming of BBRAM and eFUSEs for more information on using the
internal MASTER_JTAG primitive option.
During configuration, the FPGA device performs the reverse operation, decrypting the
incoming bitstream. The FPGA AES encryption logic uses a 256-bit encryption key.
The on-chip AES decryption logic cannot be used for any purpose other than bitstream
decryption. The AES decryption logic is not available to the user design and cannot be used
to decrypt any data other than the configuration bitstream.
For the step-by-step process to generate an encrypted bitstream and encryption keys using
the Vivado Design Suite, see Using Encryption and Authentication to Secure an
UltraScale/UltraScale+ FPGA Bitstream (XAPP1267) [Ref 20].
The FPGA AES encryption system uses a 256-bit encryption key to encrypt or decrypt blocks
of 128 bits of data at a time. According to NIST, there are 1.1 x 10^77 possible key
combinations for a 256-bit key.
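The key-space figure can be checked with a short calculation. This is only an arithmetic illustration of the number of distinct 256-bit values, not a statement about the AES implementation in the device.

```python
KEY_BITS = 256      # AES key length used by the FPGA encryption system
BLOCK_BITS = 128    # data block size processed per AES operation

key_space = 2 ** KEY_BITS          # number of distinct 256-bit keys
block_bytes = BLOCK_BITS // 8      # one AES block expressed in bytes

# Scientific notation with three decimal places
key_space_text = f"{key_space:.3e}"
print(key_space_text)              # 1.158e+77
print(block_bytes)                 # 16
```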
Symmetric encryption algorithms such as the AES algorithm use the same key for
encryption and decryption. The security of the data is therefore dependent on the secrecy
of the key.
Rolling Keys
UltraScale FPGAs allow you to break up the bitstream into multiple AES encryption
modules, each encrypted with its own unique key. The initial key is stored on-chip, while
keys for each successive module are encrypted (wrapped) in the previous module. This
feature, known as rolling keys, increases security against side-channel attacks such as
differential power analysis (DPA). The bitstream option BITSTREAM.ENCRYPTION.KEYLIFE
defines the number of encryption blocks per key. An encryption block is 128 bits (four
32-bit words). Fewer encryption blocks per key offer greater security but significantly
increase bitstream size and therefore configuration time. A value of 1,024 or higher
increases bitstream size by about 15%, a value of 64 can increase bitstream size
by 50%, and a value of 32 can double the bitstream size.
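To get a feel for how the key life setting relates to the number of rolling keys, the following sketch counts 128-bit encryption blocks and divides by the blocks-per-key value. The bitstream size used here is a hypothetical round number, and the calculation ignores headers and the wrapped-key overhead, so it does not reproduce the size increases quoted above.

```python
BLOCK_BITS = 128  # one encryption block is four 32-bit words

def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)

def rolling_keys_needed(bitstream_bits: int, keylife: int) -> int:
    """Approximate number of keys when each key covers `keylife` blocks."""
    blocks = ceil_div(bitstream_bits, BLOCK_BITS)
    return ceil_div(blocks, keylife)

# Hypothetical 100,000,000-bit bitstream (illustrative only)
for keylife in (32, 64, 1024):
    print(keylife, rolling_keys_needed(100_000_000, keylife))
```

A smaller key life means each key protects less data, which is the source of the improved resistance to side-channel analysis and of the larger bitstream.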
When using RSA authentication, certain block RAMs might be used to hold interim rolling
keys, which impacts the ability to initialize those blocks. For a given block RAM column,
each 36K block that resides in the bottom of a clock region is affected; essentially the first
36K block RAM starting at the bottom of a device and then every 12th 36K block RAM after
that in a column (BRAM36_X*Y0, BRAM36_X*Y12, BRAM36_X*Y24, etc.). Those block RAMs
cannot be initialized to user-defined values when using RSA authentication. Those block
RAMs are always initialized to 0 after configuration.
To program the key, the device enters a special key-access mode. In this mode, all FPGA
memory, including the encryption key and configuration memory, is cleared. After the key
is programmed and the key-access mode is exited, the key cannot be read out of the device
by any means, and the RAM key cannot be reprogrammed without clearing the entire
device. The key-access mode is transparent to most users.
The key can be programmed into the battery-backed RAM (BBRAM), which is powered by
VCCAUX or VBATT, or into nonvolatile, one-time-programmable eFUSE bits. After
programming, a CRC can be applied to verify proper programming of the key, but the key
itself cannot be read back.
The encryption key itself can be encrypted using a fixed key that is never visible in the
device. Encrypting the key is known as black key store (BKS) or key obfuscation. This option
is disabled by default, and is set with the bitstream property
BITSTREAM.ENCRYPTION.OBFUSCATEKEY ENABLE. When you set the
BITSTREAM.ENCRYPTION.OBFUSCATEKEY property, the Vivado tool bitstream software
creates a new key, ObfuscateKey, in the output NKY file. This obfuscated key is created by
encrypting your AES-256 key with a metalized family key stored in the silicon. All FPGAs in
the UltraScale family share the same family key. All FPGAs in the UltraScale+ family share
the same family key, which is different than the UltraScale family key.
AMD does not provide the family key as part of the Vivado tools. Customers must send a
request and must specify either the UltraScale family key or the UltraScale+ family key to
[email protected]. The corresponding family key will then be distributed to
qualified customers through the Product Licensing site on www.xilinx.com.
To specify the location of the family key you must set the following write_bitstream
property: set_property BITSTREAM.ENCRYPTION.FAMILY_KEY_FILEPATH
C:/<anyDirectory>/familyKey_us.cfg [current_design].
While the device holds an encryption key, a non-encrypted bitstream can be used to
configure the device only after PROGRAM_B or power-on reset (after a power cycle) is
asserted, thus clearing out the configuration memory. In this case the key is ignored. After
configuring with a non-encrypted bitstream, readback is possible (if allowed by the
readback security setting). The encryption key still cannot be read out of the device,
preventing the use of Trojan Horse bitstreams to defeat the FPGA encryption scheme.
An encrypted bitstream can be delivered through any configuration interface: JTAG, serial,
SPI, BPI, SelectMAP, and ICAP. For encrypted bitstreams using an obfuscated key with the
JTAG interface, do not pause bitstream loading by temporary excursion from the JTAG
Shift-DR state to the JTAG Pause-DR state. Instead, stay within the JTAG Shift-DR state and
stop the JTAG TCK clock to pause bitstream loading. For encrypted bitstreams using an
obfuscated key with the SelectMAP or ICAP interfaces, do not pause bitstream loading by
temporary de-assertion of the configuration interface chip-select (CSI_B). Instead, keep
CSI_B asserted and stop the CCLK to pause bitstream loading. See Answer 73656 for details.
Bitstreams can be created with both compression and encryption. After configuration, the
device cannot be reconfigured without toggling the PROGRAM_B pin, cycling power, or
issuing the JPROGRAM instruction. Fallback reconfiguration and IPROG reconfiguration are
enabled even when encryption is turned on. Fallback and IPROG reconfiguration images
loaded from the external configuration port or through ICAP can be encrypted or
unencrypted images, and they do not have to match the original image encryption option.
Partial reconfiguration images loaded from the external configuration port must match the
original image encryption option. For example, if the original image is encrypted the partial
reconfiguration image must be encrypted and if the original image is unencrypted the
partial reconfiguration image must be unencrypted. Readback is available through the
ICAPE3 primitive. None of these events resets the BBRAM key if VBATT or VCCAUX is
maintained.
A mismatch between the key in the encrypted bitstream and the key stored in the device
causes configuration to fail with the INIT_B pin pulsing Low and then back High if fallback
is enabled, and the DONE pin remaining Low.
IMPORTANT: Clear or program the BBRAM to a known state before attempting to configure with an
encrypted bitstream that uses the BBRAM as the key source. If you attempt to download an encrypted
bitstream on power-up before the BBRAM key is programmed, the FPGA device might lock up. You must
power-cycle the device and then load the BBRAM key before configuring with an encrypted bitstream.
Users concerned about the security of their design should not wire the ICAPE3 interface to
user I/O. Connecting the ICAPE3 clock does not impact security.
Like the other configuration interfaces, the ICAP interface does not provide access to the
key register.
VBATT
When an encryption key is stored in the FPGA battery-backed RAM (BBRAM), the
encryption key memory cells are volatile and must receive continuous power to retain their
contents. During normal operation, these memory cells are powered by the auxiliary
voltage input (VCCAUX), although a separate VBATT power input is provided for retaining the
key when VCCAUX is removed. Because VBATT draws very little current (on the order of
nanoamperes), a small watch battery is suitable for this supply. To estimate the battery life,
refer to VBATT DC Characteristics in the respective data sheet ([Ref 9] or [Ref 10]) and the
battery specifications.
VBATT does not draw any current and can be removed while VCCAUX is applied. VBATT cannot
be used for any purpose other than retaining the encryption keys when VCCAUX is removed.
Bitstream Authentication
The AES-GCM encryption standard also supports built-in authentication, enhancing
security and eliminating the need to specify a separate HMAC key as in the 7 series FPGAs.
Without knowledge of the AES-GCM key, the bitstream cannot be loaded, modified,
intercepted, or cloned. Encryption provides the basic design security to protect the design
from copying or reverse engineering, while authentication provides assurance that the
bitstream provided for the configuration of the FPGA was the unmodified bitstream
allowed to load. Authentication verifies both data integrity and authenticity of the
bitstream. Authentication covers the entire bitstream for all types of control and data. Any
bitstream tampering, including a single bit flip, is detected.
If authentication passes, the configuration goes to completion through the startup cycle. If
authentication fails and fallback is enabled, the fallback bitstream is loaded after the entire
device configuration has been cleared. If fallback is not enabled, the configuration logic
disables the configuration interface, blocking any access to the FPGA. Pulsing the
PROGRAM_B signal or power-on reset is required to reset the configuration interface.
RSA Authentication
The AES-GCM algorithm implements authentication and decryption at the same time.
However, an alternative security method is to authenticate the bitstream data before it is
sent to the decryptor. This method can be used to help prevent attacks on the decryption
engine itself by making sure the data is authentic before performing any decryption.
UltraScale architecture-based FPGAs support RSA-2048 authentication for this purpose.
RSA authentication is not supported in the Kintex UltraScale KU025 device, or when using
serial or selected other configuration modes in the Kintex UltraScale and Virtex UltraScale
FPGAs (see Table 8-1). For RSA authentication there are no configuration mode limitations
in the Artix UltraScale+, Kintex UltraScale+, and Virtex UltraScale+ FPGAs.
Table 8-1: UltraScale Devices and Configuration Modes Supporting RSA Authentication
Column groups: Kintex UltraScale FPGAs (KU025 through KU095), Virtex UltraScale FPGAs (VU065 through VU440),
and Artix UltraScale+, Kintex UltraScale+, and Virtex UltraScale+ FPGAs (UltraScale+).

Interface  Width  KU025  KU035/   KU060/KU085/  KU095   VU065/VU080/  VU125/VU160/  VU440   UltraScale+
                         KU040    KU115                 VU095         VU190
SelectMAP  32     N/A    Yes(1)   Yes(1)        Yes(1)  Yes(1)        Yes(1)        Yes(1)  Yes
SelectMAP  16     N/A    Yes(1)   Yes(1)        Yes(1)  Yes(1)        Yes(1)        Yes(1)  Yes
SelectMAP  8      N/A    No       No            Yes(1)  Yes(1)        No            Yes(1)  Yes
BPI        16     N/A    Yes      Yes(2)        Yes     Yes           Yes           Yes     Yes
BPI        8      N/A    No       No            Yes(2)  Yes(2)        No            Yes     Yes
SPI        8      N/A    No       No            Yes     Yes           No            Yes     Yes
SPI        4      N/A    No       No            No      No            No            Yes     Yes
SPI        2      N/A    No       No            No      No            No            No      Yes
SPI        1      N/A    No       No            No      No            No            No      Yes
Notes:
1. Not supported if non-continuous SelectMAP data loading is implemented by deasserting the CSI_B signal.
2. Not supported if asynchronous page read is used.
The actual time increase is dependent upon the mode of configuration. There are two steps
required before loading the RSA bitstream:
1. Load phase: Configuration data is loaded into the FPGA device’s configuration memory
from the selected configuration interface.
2. Read-Decrypt-Write (RDW) phase: Internal operation reads the configuration memory,
optionally decrypts the data, and writes the final data into the configuration memory.
The load phase time is based on the size of the image and the configuration interface
bandwidth. The RSA signature verification is done in parallel, so no additional time is
required for that step. The RDW phase time is based on an internal bus that is always 32 bits
wide and runs on the configuration clock. The time the RDW phase takes is
approximately: 2.5 * (bitstream_size_in_bits / 32 bits) * the configuration clock period / # of
SLRs in device.
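The two phases can be estimated numerically. The sketch below applies the approximate RDW expression given above and, as an added assumption, models the load phase as image size divided by interface bandwidth (bus width times clock frequency). The bitstream size, clock frequency, and function names are hypothetical; actual values come from the device data sheet and the generated bitstream.

```python
def load_time_s(bitstream_bits: int, bus_width_bits: int, cclk_hz: float) -> float:
    """Load phase: image size divided by interface bandwidth (assumed model)."""
    return bitstream_bits / (bus_width_bits * cclk_hz)

def rdw_time_s(bitstream_bits: int, cclk_hz: float, num_slrs: int = 1) -> float:
    """RDW phase: 2.5 * (bits / 32) * clock period / number of SLRs."""
    words = bitstream_bits / 32          # the internal bus is always 32 bits wide
    clock_period = 1.0 / cclk_hz
    return 2.5 * words * clock_period / num_slrs

# Hypothetical example: 128,000,000-bit image, x8 interface, 100 MHz clock
bits = 128_000_000
print(load_time_s(bits, 8, 100e6))   # load phase estimate in seconds
print(rdw_time_s(bits, 100e6, 1))    # RDW estimate, single-SLR device
print(rdw_time_s(bits, 100e6, 2))    # RDW estimate, two-SLR device
```

Because the RDW term does not depend on the external bus width, a wider interface shortens only the load phase.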
eFUSE
eFUSEs are nonvolatile one-time-programmable (OTP) cells used for some device settings,
the factory-programmed Device DNA, and these user-programmable elements:
The fuse link is programmed (or burned or blown) by flowing a large current for a specific
amount of time. The resistance of a programmed fuse link is typically a few orders of
magnitude higher than that of a pristine or unprogrammed fuse. A programmed fuse is
assigned a logic value of 1, and a pristine fuse has a logic value of 0. User-programmable
eFUSEs can be programmed with the AMD configuration tools (see UG908 [Ref 8]). eFUSE
must not be programmed during device configuration activity. When in-system
programming eFUSE, apply the following to minimize system activity and FPGA interface
activity to reduce risks from system noise on JTAG signals or eFUSE circuits:
The Device DNA is a 96-bit eFUSE value that is factory-programmed and unique for each
device. JTAG or the DNA_PORTE2 primitive is used to access the value. See Device Identifier
(Device DNA), page 135 for more details.
The JTAG interface can be used to program the FUSE_USER 32-bit value. JTAG or the
EFUSE_USR primitive is then used to access the data. See EFUSE_USR in Chapter 7, Design
Entry.
The FPGA logic can access only the FUSE_USER register and the Device DNA. All other
eFUSE bits are not accessible from the FPGA logic.
JTAG Instructions
eFUSE registers can be read through JTAG ports. eFUSE programming can be done only via
JTAG. Table 8-3 lists eFUSE-related JTAG instructions.
The FUSE_CNTL and FUSE_SEC control registers are described in Table 8-4 and Table 8-5,
respectively. All register bits are defined by an eFUSE and therefore each selection is
permanent.
Notes:
1. IMPORTANT! When FUSE_SHAD_SEC[0] or RSA_AUTH is programmed, only AES encrypted or RSA authenticated bitstreams,
respectively, can be used to configure the FPGA through external configuration ports. This precludes device configuration
from AMD test bitstreams and AMD pre-built bitstreams. Thus, AMD does not accept return material authorization (RMA)
requests or support indirect flash programming for devices that have the FUSE_SHAD_SEC[0] or RSA_AUTH bit programmed.
2. IMPORTANT! If this bit is programmed, return material authorization (RMA) returns are limited in device analysis and
debug.
TIP: Zynq UltraScale+ devices have their own eFUSE registers, distinct from the UltraScale architecture FPGA
eFUSE registers. The UltraScale architecture FPGA eFUSE registers are not supported in Zynq
UltraScale+ devices. For Zynq UltraScale+ device eFUSE registers, see the PS eFUSE section in the Zynq
UltraScale+ Device Technical Reference Manual (UG1085) [Ref 28].
External applications can access the DNA value through the JTAG port and FPGA designs
can access the DNA through a Device DNA Access Port (DNA_PORTE2).
The FPGA application accesses the identifier value using the Device DNA Access Port
(DNA_PORTE2) design primitive, shown in Figure 8-1.
[Figure 8-1: DNA_PORTE2 primitive with inputs DIN, READ, SHIFT, and CLK, and output DOUT]
[Figure: DNA_PORTE2 operation. READ = 1 loads the 96-bit factory-programmed, unchangeable Device DNA (bits 95 to 0) into the 96-bit loadable shift register. SHIFT = 1 with READ = 0 shifts the register from DIN (bit 95 side) toward DOUT (bit 0 side) on CLK.]
immediately after the load. The READ operation overrides a SHIFT operation, so READ
should be asserted for at least one clock cycle and then removed.
Notes:
1. X = Don’t care.
2. ↑ = Rising clock edge.
To continue reading the identifier values, assert SHIFT followed by a rising edge of CLK, as
shown in Table 8-6. This action causes the output shift register to shift its contents toward
the DOUT output. The value on the DIN input is shifted into the shift register. All shift
register functionality is synchronous to the CLK.
[Figure: DNA_PORTE2 primitive with a constant 0 or 1 applied to the DIN input]
As shown in Figure 8-4, the length of the identifier can be extended by feeding the DOUT
serial output port back into the DIN serial input port. This way, the identifier can be
extended to any possible length.
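The READ, SHIFT, and feedback behavior described above can be modeled in a few lines of Python. This is a behavioral sketch only: it assumes that DOUT presents bit 0 of the shift register and that DIN enters at bit 95, and the class and function names are hypothetical rather than part of any AMD library.

```python
DNA_WIDTH = 96

class DnaPortModel:
    """Behavioral model of a 96-bit loadable shift register."""

    def __init__(self, dna: int):
        self._dna = dna & ((1 << DNA_WIDTH) - 1)  # factory value (fixed)
        self._reg = 0                             # output shift register

    @property
    def dout(self) -> int:
        # Assumption: DOUT shows the bit at position 0
        return self._reg & 1

    def clock(self, read: int = 0, shift: int = 0, din: int = 0) -> None:
        """Apply one rising CLK edge. READ overrides SHIFT."""
        if read:
            self._reg = self._dna
        elif shift:
            self._reg = (self._reg >> 1) | ((din & 1) << (DNA_WIDTH - 1))

def read_dna(port: DnaPortModel) -> int:
    """Load the identifier, then shift out 96 bits with DOUT fed back to DIN."""
    port.clock(read=1)
    value = 0
    for position in range(DNA_WIDTH):
        bit = port.dout
        value |= bit << position
        port.clock(shift=1, din=bit)   # feedback keeps the identifier circulating
    return value
```

With DOUT fed back to DIN, the register holds the original value again after 96 shifts, so the identifier can be read repeatedly without asserting READ.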
[Figure 8-4: DNA_PORTE2 primitive with the DOUT output fed back to the DIN input]
[Figure: DNA_PORTE2 timing waveform showing READ, SHIFT, and CLK]
The Vivado Device Programmer also supports reading the Device DNA by viewing the
eFUSE registers in the Hardware Device Properties window, or by using the following Tcl
command:
The user-defined eFUSE register FUSE_USER can be read similarly to FUSE_DNA, using the
JTAG FUSE_USER command, the Hardware Device Properties, or reporting the
REGISTER.EFUSE.FUSE_USER property. See Vivado Design Suite User Guide
Programming and Debugging (UG908) [Ref 8] for more details.
Configuration Details
Introduction
You generally do not need to know the details of the configuration format and commands.
However, this detail can be useful for debugging purposes. After initial configuration, you
can send configuration commands to the device through the permanent JTAG interface,
through the SelectMAP port if Persist is selected, or through the Internal Configuration
Access Port if the ICAPE3 primitive is included in the design.
BIT
    Bit swapping: Not bit swapped
    Tcl command: write_bitstream (generated by default)
    Description: Binary configuration data file containing header information that does not
    need to be downloaded to the FPGA. Used to program devices from the Vivado® device
    programmer tool with a programming cable. Proprietary format for Vivado design tool use only.
RBT
    Bit swapping: Not bit swapped
    Tcl command: write_bitstream -raw_bitfile
    Description: ASCII equivalent of the BIT file containing a text header and ASCII 1s and 0s.
    Eight bits per configuration bit. Proprietary format for Vivado design tool use only.
BIN
    Bit swapping: Not bit swapped (write_bitstream -bin_file) or bit swapped (write_cfgmem -format BIN)
    Tcl command: write_bitstream -bin_file or write_cfgmem -format BIN
    Description: Binary configuration data file with no header information. Can be used for
    custom configuration solutions (for example, microprocessors), or in some cases to program
    third-party memories.
MCS
    Bit swapping: Bit swapped(3)
    Tcl command: write_cfgmem -format MCS or Vivado device programmer
    Description: ASCII file format containing address and checksum information in addition to
    configuration data. Used mainly for device programmers and the Vivado device programmer tool.
HEX
    Bit swapping: Determined by user
    Tcl command: write_cfgmem -format HEX or Vivado device programmer
    Description: ASCII file format containing only configuration data. Used mainly in custom
    configuration solutions.
Notes:
1. Bit swapping is discussed in the Bit Swapping section.
2. For complete write_bitstream and write_cfgmem Tcl command syntax, refer to the Vivado Design Suite Tcl Command
Reference Guide (UG835) [Ref 20].
3. MCS files are generally bit-swapped except in SPI or serial configuration mode. The write_cfgmem -interface SPIx1/2/4/8
option is used for serial NOR flash and creates a file that is not bit swapped.
The output from write_cfgmem is typically used to program the selected third-party flash
memory device. The output format supported by your third-party programmer should be
chosen. The write_cfgmem command -interface argument specifies the planned
configuration interface. Valid values include SMAPx8 (default), SMAPx16, SMAPx32,
SERIALx1, SPIx1, SPIx2, SPIx4, SPIx8, BPIx8, and BPIx16. This also determines if bit swapping
is enabled or disabled (see Bit Swapping, page 141). Some parallel flash devices for BPI
configuration require endian swapping to be enabled when creating the file. Refer to the
flash vendor documentation.
Note: A daisy chain that includes AMD UltraScale devices must be composed only of devices that
are supported by the Vivado tools, from the 7 series and later.
Bit Swapping
Bit swapping is the swapping of the bits within a byte. The MCS file format is always
bit-swapped unless the write_cfgmem -interface SPIx1|SPIx2|SPIx4|SPIx8 option is used. The
HEX file format can be bit-swapped or not bit-swapped, depending on user options. The
bitstream files (BIT, RBT, BIN) are never bit-swapped.
The HEX file format contains only configuration data. The other memory file formats
include address and checksum information that should not be sent to the FPGA. The
address and checksum information is used by some third-party device programmers, but is
not programmed into the memory device.
Figure 9-1 shows how two bytes of data (0xABCD) are bit-swapped.
[Figure 9-1: Bit swapping of two bytes of data (0xABCD)]
Hex:                  A     B     C     D
Binary:               1010  1011  1100  1101   (SelectMAP data pins D0 to D7, left to right, per byte)
Bit-swapped binary:   1101  0101  1011  0011   (SelectMAP data pins D7 to D0, left to right, per byte)
Bit-swapped hex:      D     5     B     3
• In the bit-swapped version of the data, the bit that goes to D0 is the right-most bit.
• In the non bit-swapped data, the bit that goes to D0 is the left-most bit.
Whether or not data must be bit swapped is entirely application dependent. Bit swapping is
applicable for serial, SelectMAP, or BPI files, and for the ICAPE3 interface.
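Bit swapping as described above reverses the bit order within each byte while leaving the byte order unchanged. The following sketch shows one way to express that; the function names are hypothetical.

```python
def bit_swap_byte(value: int) -> int:
    """Reverse the order of the 8 bits in one byte."""
    return int(f"{value:08b}"[::-1], 2)

def bit_swap(data: bytes) -> bytes:
    """Bit-swap every byte; the order of the bytes is not changed."""
    return bytes(bit_swap_byte(b) for b in data)

# The two-byte example 0xABCD becomes 0xD5B3
swapped = bit_swap(bytes.fromhex("ABCD"))
print(swapped.hex().upper())  # D5B3
```

Applying bit_swap() twice returns the original data, so the same routine converts in either direction.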
Table 9-2 and Table 9-3 show examples of a Sync word 0xAA995566 inside a bitstream (see
Sync Word). These examples illustrate what is expected at the FPGA data pins when using
parallel configuration modes, such as slave SelectMAP, master SelectMAP, and BPI modes,
and when using the ICAPE3 interface.
Notes:
1. [31:24] changes from 0xAA to 0x55 after bit swapping.
Table 9-3: Sync Word Data Sequence Example for x8, x16, and x32 Modes
CCLK Cycle 1 2 3 4
D[7:0] pins for x8 0x55 0x99 0xAA 0x66
D[15:0] pins for x16 0x5599 0xAA66 ... ...
D[31:0] pins for x32 0x5599AA66 ... ... ...
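The data-pin sequence for the sync word can be derived by bit-swapping each byte of 0xAA995566 and then grouping the bytes by bus width. This is a sketch for checking a byte stream against the table; the function name is hypothetical.

```python
def data_pin_sequence(word32: int, bus_bytes: int) -> list:
    """Return the per-CCLK values seen on the data pins for one 32-bit word.

    Each byte is bit-swapped, then bytes are grouped by bus width
    (1 byte for x8, 2 bytes for x16, 4 bytes for x32).
    """
    raw = word32.to_bytes(4, "big")
    swapped = bytes(int(f"{b:08b}"[::-1], 2) for b in raw)
    return [swapped[i:i + bus_bytes].hex().upper()
            for i in range(0, 4, bus_bytes)]

SYNC_WORD = 0xAA995566
print(data_pin_sequence(SYNC_WORD, 1))  # x8:  ['55', '99', 'AA', '66']
print(data_pin_sequence(SYNC_WORD, 2))  # x16: ['5599', 'AA66']
print(data_pin_sequence(SYNC_WORD, 4))  # x32: ['5599AA66']
```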
Configuration Sequence
While each of the configuration interfaces is different, the basic steps for configuring a
device are the same for all modes. Figure 9-2 shows the FPGA configuration process. The
following subsections describe each step in detail, where the current step is highlighted in
gray at the beginning of each subsection.
[Figure 9-2: FPGA configuration process. Setup: (1) Device Power-Up, (2) Clear Configuration Memory, (3) Sample Mode Pins. Bitstream Loading: (4) Synchronization, (5) Device ID Check, (6) Load Configuration Data, (7) CRC Check, (8) Startup Sequence.]
The setup steps are critical for proper device configuration. The steps include:
All JTAG and serial configuration pins are located in a separate, dedicated bank with a
dedicated voltage supply (VCCO_0). None of the I/O voltage supplies except VCCO_0 needs to
be powered for FPGA configuration in JTAG or serial modes (up to SPI x4) when RS[1:0] is
not used. All dedicated input pins operate at the VCCO_0 LVCMOS level. All active dedicated
output pins operate at the VCCO_0 voltage level with the output standard set to LVCMOS,
12 mA drive, fast slew rate.
The multi-function pins are located in bank 65. For all modes that use multi-function I/O
(for example, master BPI, SPI x8, SelectMAP), the associated VCCO_65 must be connected to
the appropriate voltage to match the I/O standard of the configuration device. The pins are
also LVCMOS, 12 mA drive, fast slew rate during configuration. If the Persist option is used
(see Persist Option, page 179), the multi-function I/O for the selected configuration mode
remain active after configuration, with the I/O standard set to the default of LVCMOS,
12 mA drive, fast slew rate.
Table 9-4 shows the power supplies required for configuration. Table 9-5 shows the timing
for power-up. Refer to the data sheet for voltage ratings. Standard I/O voltage levels
supported for configuration are 1.5V, 1.8V, 2.5V, and 3.3V. None of the I/O voltage supplies
except VCCO_0 needs to be powered for configuration in JTAG mode. When configuration
modes are selected that use the multi-function pins (i.e., serial, master BPI, SPI, SelectMAP),
VCCO_65 must also be supplied. In the Virtex UltraScale devices, and in the Kintex UltraScale
KU095, bank 65 is an HP I/O bank, and therefore configuration interfaces requiring bank 65
must operate at 1.5V or 1.8V.
Notes:
1. VBATT is required only when an AES key is stored in the FPGA battery-backed RAM for
decryption of an encrypted bitstream.
Notes:
1. See the data sheet for power-up timing characteristics.
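[Figure 9-4: device power-up timing waveform; labels recovered from the figure include INIT_B (see Note 1) and TICCK]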
Notes:
1. In UltraScale+ devices the INIT_B pin might be seen as High (because of external resistors on board tied to INIT_B) for a period of
time after power ON. (The initial High time depends on the POR_OVERRIDE setting. With POR_OVERRIDE Low, the High time is
approx. 40 ms. With POR_OVERRIDE High, the High time is approx. 9 ms.)
2. Can be either 0 or 1, but must not toggle during and after configuration.
operating voltages. The TPOR specification begins when the last of the monitored supplies
(VCCINT, VCCAUX, VCCBRAM, VCCO_0) reaches 95% of its recommended operating condition
voltage. The actual TPOR delay begins earlier depending on the thresholds of the monitored
voltages, resulting in a smaller minimum specification with a slower ramp. Note that the
recommended power-on sequence in the data sheet, to achieve minimum current draw and
ensure that the I/Os are 3-stated at power-on, has VCCO applied last. The TPOR time includes
a built-in delay to allow for voltages to stabilize before beginning configuration. For
applications where power-on time is important, the POR_OVERRIDE pin can be tied to
VCCINT, which shortens the built-in delay. See the data sheet for the resulting TPOR time
when the supplies are ramped quickly and POR_OVERRIDE is tied to VCCINT. Note that VCCINT
is recommended to ramp first. For the standard TPOR delay, tie POR_OVERRIDE to ground.
See Power-On Reset, page 38.
UltraScale devices with multiple SLRs (this does not apply to UltraScale+ devices) can have
the weak pull-up temporarily enabled on I/Os in the Slave SLR during the configuration
sequence (between power on and assertion of the INIT_B configuration signal). In some
boards, this can cause an undesired 0-1-0 transition on I/O in the slave SLR. It is
recommended that any I/O pins in the slave SLR sensitive to a 0-1-0 transition during
configuration be connected to I/Os in the Master SLR or include external pull-downs of
1 kΩ or stronger to the pin.
Delaying Configuration
To delay configuration, the INIT_B or PROGRAM_B pin should be held Low during
initialization (see Figure 9-4). When INIT_B has gone High, configuration cannot be delayed
subsequently by pulling INIT_B Low.
The signals relating to initialization and delaying configuration are defined in Table 9-6.
Notes:
1. Information on the FPGA status register is available in Table 9-25. Information on accessing the device status register
through SelectMAP is available in Chapter 10, Readback Verification and CRC.
2. The status type is an internal status signal without a corresponding pin.
After power-up, the device can be re-configured by toggling the PROGRAM_B pin Low (see
Figure 9-5).
Figure 9-5: Re-configuring the Device by Toggling the PROGRAM_B Pin Low
During this time, I/O drivers are disabled, except for the configuration and JTAG pins,
through the use of the global three-state (GTS). User I/O pins are High-Z or pulled up
depending on whether PUDC_B is High or Low, respectively. Both the dedicated
configuration bank 0 and the multi-function bank 65 are enabled during configuration,
independent of the mode pins. In devices based on SSI technology, banks 60 and 70 are
also enabled during configuration, although they do not have configuration functions.
INIT_B is internally driven Low during initialization, then released after TPOR (Figure 9-4) for
the power-up case, and TPL for MultiBoot and fallback cases. If the INIT_B pin is held Low
externally, the device waits in the initialization process until the pin is released, and the TPOR
or TPL delay is met.
The minimum Low pulse time for PROGRAM_B is defined by the TPROGRAM timing
parameter.
• Synchronization (Step 4)
• Device ID Check (Step 5)
• Load Configuration Data (Step 6)
• CRC Check (Step 7)
Synchronization (Step 4)
Notes:
1. Information on the FPGA status register is available in Table 9-25. Information on accessing the device status
register through JTAG or SelectMAP is available in Chapter 10, Readback Verification and CRC.
The device ID check is built into the bitstream, making this step transparent to most
designers. The device ID check is performed through commands in the bitstream to the
configuration logic, not through the JTAG IDCODE register in this case.
vvvv:dddddddddddddddd:ccccccccccc1
where:
v = version
d = 16-bit device code
c = company code
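The 32-bit identifier layout above (4-bit version, 16-bit device code, 11-bit company code, and a final bit that is always 1) can be unpacked as follows. The example value is illustrative only and should not be taken as the ID of a specific device; consult the device ID table in this guide for real values.

```python
def decode_device_id(idcode: int) -> dict:
    """Split a 32-bit ID of the form vvvv:dddddddddddddddd:ccccccccccc1."""
    return {
        "version": (idcode >> 28) & 0xF,     # vvvv
        "device": (idcode >> 12) & 0xFFFF,   # 16-bit device code
        "company": (idcode >> 1) & 0x7FF,    # 11-bit company code
        "lsb": idcode & 0x1,                 # fixed to 1 in this format
    }

def same_device(id_a: int, id_b: int) -> bool:
    """Compare two IDs while ignoring the 4-bit version field."""
    return (id_a & 0x0FFFFFFF) == (id_b & 0x0FFFFFFF)

example = 0x03822093  # illustrative value, not taken from this guide
fields = decode_device_id(example)
print({name: hex(value) for name, value in fields.items()})
```

Masking out the version field, as same_device() does, is a common way to compare IDs across silicon revisions; whether the configuration logic itself ignores the version bits is not stated here.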
After the synchronization word is loaded and the device ID has been checked, the
configuration data frames are loaded (Figure 9-10). This process is transparent to most
users.
If a CRC error occurs during configuration from a mode where the FPGA is the configuration
master, the device can attempt to do a fallback reconfiguration. In BPI and SPI modes, if
fallback reconfiguration fails again, the BPI/SPI interface can only be resynchronized by
pulsing the PROGRAM_B pin and restarting the configuration process from the beginning.
The JTAG interface is still responsive and the device is still active, only the BPI/SPI interface
is inoperable. In SelectMAP modes, either the PROGRAM_B pin can be pulsed Low or an
ABORT sequence can be initiated (see Chapter 5, SelectMAP Configuration Modes).
AMD devices use a 32-bit CRC check. The CRC check is designed to catch errors in
transmitting the configuration bitstream. There is a scenario where errors in transmitting
the configuration bitstream can be missed by the CRC check: certain clocking errors, such
as double-clocking, can cause loss of synchronization between the 32-bit bitstream packets
and the configuration logic. After synchronization is lost, any subsequent commands are
not understood, including the command to check the CRC, and the device does not
complete configuration. In this situation, configuration fails with DONE Low and INIT_B
High because the CRC was ignored. In BPI Mode asynchronous read, the address counter
eventually overflows or underflows to cause wraparound, which triggers fallback
reconfiguration. BPI synchronous read mode does not support the wraparound error
condition.
[Figure: configuration process, Start to Finish. Steps: 1 Device Power-Up, 2 Clear Configuration Memory, 3 Sample Mode Pins, 4 Synchronization, 5 Device ID Check, 6 Load Configuration Data, 7 CRC Check, 8 Startup Sequence. Phase label: Bitstream Loading. (UG570_c9_11_012714)]
The specific order of start-up events (except for EOS assertion) is user-programmable
through bitstream options controlled by the BITSTREAM.STARTUP properties (refer to the
Vivado Design Suite User Guide: Programming and Debugging (UG908) [Ref 8]). Table 9-9
shows the default sequence of events. The specific phase for each of these
start-up events is user-programmable, except that EOS is always asserted in the last phase.
The start-up sequence can be forced to wait for the MMCMs to lock or for DCI to match with
the appropriate bitstream options. These options are typically set to prevent DONE, GTS,
and GWE from being asserted (preventing device operation) before the MMCMs have
locked and/or DCI has matched.
Note: Using DCI with the Multi-function Configuration Pins. If any of the multi-function
configuration pins in I/O bank 65 are assigned DCI I/O standards in the user design, DCI
calibration does not happen until after the pins are released from their configuration functions at the
end of start-up. If DCIUpdateMode is set to AsRequired, there is a nondeterministic delay
after start-up until those pins are calibrated. If DCIUpdateMode is set to Quiet, the pins never
have their DCI values set. To avoid these issues, include the DCIRESET primitive. The
design should pulse the RST input of DCIRESET and then wait for the LOCKED signal to be asserted
before using any user inputs or outputs on the multi-function pins with DCI standards. For more
details on DCI, see the UltraScale Architecture SelectIO Resources User Guide (UG571) [Ref 21].
The DONE signal is released by the start-up sequencer on the cycle indicated by the user
options, but the start-up sequencer does not proceed until the DONE pin actually sees a
logic High. The DONE pin is an open-drain bidirectional signal. By releasing the DONE pin,
the device stops driving a logic Low, and the pin is pulled up by a default internal pull-up
resistor. There is no setup or hold requirement for the DONE register. Table 9-10 shows
signals relating to the start-up sequencer. Figure 9-13 shows the waveforms relating to the
start-up sequencer.
Notes:
1. Information on the FPGA status register is available in Table 9-25. Information on accessing the device status
register through JTAG or SelectMAP is available in Chapter 10, Readback Verification and CRC.
2. Open-drain output.
3. GWE is asserted synchronously to the configuration clock (CCLK) and has a significant skew across the part.
Therefore, sequential elements are not released synchronously to the user system clock, and timing violations can
occur during start-up. It is recommended that you reset the design after start-up and/or apply some other
synchronization technique.
[Figure: start-up sequencer waveforms for PROGRAM_B, INIT_B, DONE, GTS, GWE, EOS, and CCLK]
Table 9-11: I/O Transition at End of Startup in Kintex UltraScale Family (Except KU095)
VCCO_0          VCCO_65 or VCCO_70   Pin State       Input Transition
2.5V or 3.3V    1.8V or lower        0 or floating   0-1-0
1.8V or lower   Any                  Any             None
Any             2.5V or 3.3V         Any             None
Any             Any                  1               None
Configuration Bitstream
The FPGA bitstream contains commands to the FPGA configuration logic as well as
configuration data.
The bitstream data in Table 9-12 shows the 32-bit configuration word for an unswapped
bitstream. For swapped and unswapped formats, see Configuration Data File Formats.
For the x8 bus, the configuration bus width detection logic first finds 0xBB on the D[0:7]
pins, followed by 0x11. For the x16 bus, the configuration bus width detection logic first
finds 0xBB on D[0:7], followed by 0x22. For the x32 bus, the configuration bus width
detection logic first finds 0xBB on D[0:7], followed by 0x44. See Table 9-13.
If the immediate byte after 0xBB is not 0x11, 0x22, or 0x44, the bus width state machine
is reset to search for the next 0xBB until a valid sequence is found. Then, it switches to the
appropriate external bus width and starts looking for the Sync word. When the bus width is
detected, the SelectMAP interface is locked to that bus width until a power cycle,
PROGRAM_B pulse, JPROGRAM reset, or IPROG reset is issued.
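The detection rule described above can be modeled as a two-state search. The following is a minimal software sketch, not the device logic: it assumes the caller supplies the successive byte values seen on D[0:7], already in the bit order in which the device interprets them. Treating a repeated 0xBB as a new search start is a modeling choice made here, and the function name is hypothetical.

```python
WIDTH_BY_CODE = {0x11: 8, 0x22: 16, 0x44: 32}


def detect_bus_width(d_low_bytes):
    """Return 8, 16, or 32 once 0xBB is immediately followed by a width code.

    Returns None when no valid sequence is present in the input.
    """
    armed = False  # True when the previous byte was 0xBB
    for byte in d_low_bytes:
        if armed and byte in WIDTH_BY_CODE:
            return WIDTH_BY_CODE[byte]
        # Any other byte resets the search; a fresh 0xBB re-arms it.
        armed = (byte == 0xBB)
    return None
```

For example, `detect_bus_width([0xBB, 0x00, 0xBB, 0x22])` skips the first, invalid pair and reports a x16 bus from the second.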
Sync Word
A special Sync word is used to allow configuration logic to align at a 32-bit word boundary.
No packet is processed by the FPGA until the Sync word is found. The bus width must be
detected successfully for parallel configuration modes before the Sync word can be
detected. Table 9-14 shows the Sync word in an unswapped bitstream format.
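As an illustration of word alignment, the sketch below searches a byte stream for the Sync word and then splits the remaining data into 32-bit words. It assumes an unswapped stream with the most significant byte first and uses the Sync word value 0xAA995566 that appears in the example bitstreams of this guide. The search is a byte-level software model only, and the function names are hypothetical.

```python
SYNC_WORD = bytes.fromhex("AA995566")


def find_sync(stream: bytes) -> int:
    """Return the offset of the first byte after the Sync word, or -1."""
    idx = stream.find(SYNC_WORD)
    return -1 if idx < 0 else idx + len(SYNC_WORD)


def words_after_sync(stream: bytes) -> list:
    """Return the 32-bit words that follow the Sync word (big-endian)."""
    start = find_sync(stream)
    if start < 0:
        return []  # nothing is processed until the Sync word is found
    payload = stream[start:]
    usable = len(payload) - len(payload) % 4  # drop a trailing partial word
    return [int.from_bytes(payload[i:i + 4], "big") for i in range(0, usable, 4)]


# Five pad bytes put the Sync word at an odd offset on purpose.
sample = bytes.fromhex("FFFFFFFFFF" "AA995566" "20000000" "30008001")
```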
(words) plus configuration overhead (words). Bitstream length (bits) is roughly equal to the
bitstream length in words times 32. See Table 1-4, page 18.
Bitstream Composition
After synchronization, the configuration logic processes each 32-bit data word as a
configuration packet or component of a multiple word configuration packet.
Table 9-15 shows the composition of a sample KU040 bitstream, generated using default
settings.
Configuration Packets
The configuration logic consists of a packet processor, a set of registers, and global signals
that are controlled by the configuration registers. The packet processor controls the flow of
data from the configuration interface to the appropriate register. The registers control all
other aspects of configuration. All FPGA bitstream commands are executed by reading or
writing to the configuration registers.
Packet Types
The FPGA bitstream consists of two packet types: Type 1 and Type 2. These packet types and
their use are described in this section.
Type 1 Packet
The Type 1 packet is used for register reads and writes. Only five out of 14 register address
bits are used. The header section is always a 32-bit word.
Following the Type 1 packet header is the Type 1 data section, which contains the number
of 32-bit words specified by the word count portion of the header. See Table 9-16 and
Table 9-17.
Notes:
1. "R" means the bit is not used and is reserved for future use. Reserved bits should be written as 0s.
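A Type 1 header can be decoded with shifts and masks. The sketch below assumes the following field positions, which should be checked against Table 9-16: header type in bits [31:29] (001), opcode in bits [28:27], the 14-bit register address in bits [26:13], and the word count in bits [10:0]. The opcode names and the function name are labels chosen for this example.

```python
OPCODES = {0b00: "NOOP", 0b01: "READ", 0b10: "WRITE", 0b11: "RESERVED"}


def decode_type1(header: int) -> dict:
    """Decode a 32-bit Type 1 packet header (assumed field positions)."""
    if (header >> 29) & 0x7 != 0b001:
        raise ValueError("not a Type 1 packet header")
    return {
        "opcode": OPCODES[(header >> 27) & 0x3],
        "address": (header >> 13) & 0x3FFF,  # 14-bit field, 5 bits in use
        "word_count": header & 0x7FF,        # number of data words that follow
    }
```

Applied to two words from the example bitstreams in this guide, `decode_type1(0x30008001)` yields a one-word write to register address 4 (the CMD register), and `decode_type1(0x20000000)` yields a NOOP with no data words.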
Type 2 Packet
The Type 2 packet, which must follow a Type 1 packet, is used to write long blocks. No
address is presented here because it uses the previous Type 1 packet address. The header
section is always a 32-bit word.
Following the Type 2 packet header is the Type 2 Data section, which contains the number
of 32-bit words specified by the word count portion of the header. See Table 9-18.
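The Type 2 header carries no address, which leaves room for a much larger word count. The sketch below assumes the header type occupies bits [31:29] (010), the opcode bits [28:27], and the word count bits [26:0]; verify these positions against Table 9-18. The function names are hypothetical.

```python
TYPE2 = 0b010
OP_WRITE = 0b10
MAX_TYPE2_WORDS = (1 << 27) - 1  # largest value of the assumed 27-bit field


def type2_write_header(word_count: int) -> int:
    """Build a Type 2 write header for word_count data words."""
    if not 0 <= word_count <= MAX_TYPE2_WORDS:
        raise ValueError("word count does not fit in the Type 2 header")
    return (TYPE2 << 29) | (OP_WRITE << 27) | word_count


def decode_type2(header: int) -> dict:
    """Decode a Type 2 header; the register address comes from the
    preceding Type 1 packet and is not present in this word."""
    if (header >> 29) & 0x7 != TYPE2:
        raise ValueError("not a Type 2 packet header")
    return {"opcode": (header >> 27) & 0x3, "word_count": header & MAX_TYPE2_WORDS}
```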
Configuration Registers
Table 9-19 summarizes the Type 1 packet registers. A detailed explanation of selected
registers follows.
The frame address register (FAR) is divided into four fields: block type, row address, column
address, and minor address (see Table 9-20 for UltraScale FPGAs and Table 9-21 for
UltraScale+ FPGAs). The address can be written directly or can be auto-incremented at the
end of each frame. The typical bitstream starts at address 0 and auto-increments to the final
count.
without respecifying the SBITS and PERSIST bits). The name of each bit position in the CTL0
register is given in Table 9-23 and described in Table 9-24.
CTL0 register bit fields, listed from bit 31 down to bit 0:
EFUSE_KEY, ICAP_SELECT, Reserved, OverTempShutDown, Reserved, ConfigFallback, Reserved,
GLUTMASK_B, Reserved, DEC, SBITS[1:0], PERSIST, Reserved, GTS_USR_B
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  0  x  0  x  1  0  0  0  0  0  x  x  1
STAT register bit fields, listed from bit 31 down to bit 0:
Reserved, CFG_BUS_WIDTH_DETECTION, Reserved, CFG_STARTUP_STATE_MACHINE_PHASE,
SYSTEM_MONITOR_OVER_TEMP, SECURITY_ERROR, IDCODE_ERROR, DONE_PIN,
DONE_INTERNAL_SIGNAL_STATUS, INIT_B_PIN, INIT_B_INTERNAL_SIGNAL_STATUS, MODE_PIN_M[2:0],
GHIGH_B_STATUS, GWE_STATUS, GTS_CFG_B_STATUS, END_OF_STARTUP_(EOS)_STATUS,
DCI_MATCH_STATUS, MMCM_PLL_LOCKED, DECRYPTOR_ENABLED, CRC_ERROR
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x
COR0 register bit fields, listed from bit 31 down to bit 0:
Reserved, ECLK_EN, Reserved, DRIVE_DONE, Reserved, OSCFSEL, Reserved, DONE_CYCLE,
MATCH_CYCLE, LOCK_CYCLE, GTS_CYCLE, GWE_CYCLE
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  1  1  1  1  1  1  1  1  0  1  1  0  0
COR1 register bit fields, listed from bit 31 down to bit 0:
Reserved, RBCRC_ACTION, Reserved, RBCRC_NO_PIN, RBCRC_EN, Reserved,
BPI_1ST_READ_CYCLE, BPI_PAGE_SIZE
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
WBSTAR register bit fields, listed from bit 31 down to bit 0:
RS[1:0], RS_TS_B, START_ADDR
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
TIMER register bit fields, listed from bit 31 down to bit 0:
TIMER_USR_MON, TIMER_CFG_MON, TIMER_VALUE
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
BOOTSTS register bit fields, listed from bit 31 down to bit 0:
Reserved, WRAP_ERROR_1, CRC_ERROR_1, ID_ERROR_1, WATCHDOG_TIMEOUT_ERROR_1,
INTERNAL_PROG_1, FALLBACK_1, STATUS_VALID_1, Reserved, WRAP_ERROR_0, CRC_ERROR_0,
ID_ERROR_0, WATCHDOG_TIMEOUT_ERROR_0, INTERNAL_PROG_0, FALLBACK_0, STATUS_VALID_0
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Notes:
1. The default power-up state for all fields in the BOOTSTS register is 0, indicating no error, fallback, or valid configuration
detected. After configuration, a 1 in any bit indicates an error case, fallback, or completed configuration has been
detected.
CTL1 register bit fields, listed from bit 31 down to bit 0:
Reserved, CAPTURE (bit 23), Reserved
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      x  x  x  x  x  x  x  x  0  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x  x
BSPI register bit fields, listed from bit 31 down to bit 0:
Reserved, BPI_SYNC_MODE, BPI_SYNC_RCR, Reserved, SPI_32BIT_ADDR, SPI_BUSWIDTH,
SPI_READ_OPCODE
Bit Index 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
Value      0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1  1
Introduction
The AMD UltraScale™ architecture-based FPGAs allow you to read configuration memory
through the SelectMAP, ICAP, and JTAG interfaces. During readback, all configuration
memory cells are read by default, including the current values on all user memory elements
(LUT RAM, SRL, and block RAM).
To read configuration memory, you must send a sequence of commands to the device to
initiate the readback procedure. You can send the readback command sequence from a
microprocessor, CPLD, or FPGA-based system, or use the configuration tools to perform
JTAG-based readback verify. After configuration memory is read from the device, the next
step is to determine if there are any errors by comparing the readback bitstream to the
configuration bitstream. The Verifying Readback Data section explains how this is done.
AMD device programming tools can automatically perform all readback and comparison
functions and report whether there were any configuration errors.
There are two mandatory bitstream settings for readback through the SelectMAP or JTAG
interfaces: the bitstream security setting must not prohibit readback, and bitstream
encryption must not be used. Additionally, if readback is to be performed through the
SelectMAP interface, the port must be set to retain its function after configuration by
setting the persist option in the bitstream generator; otherwise, the SelectMAP data pins
revert to user I/O, precluding further configuration operations. Beyond these security and
encryption requirements, no special considerations are necessary to enable readback.
Readback capture provides the ability to read the current user state of internal CLB
registers, block RAM, distributed RAM, and SRL contents to check for proper design
functionality. The feature provides easy access for observing the design state with little
pre-planning and no added design logic resources. The readback capture flow is
demonstrated in Configuration Readback Capture in UltraScale FPGAs (XAPP1230) [Ref 22].
Persist Option
The persist bitstream option (BITSTREAM.CONFIG.PERSIST YES) maintains the configuration
logic access to the multi-function configuration pins after configuration. The persist option
is primarily used to maintain the SelectMAP port after configuration for readback access,
but persist can be used with any configuration mode. Persist is not needed for JTAG
configuration as the JTAG port is dedicated and always available. The persist option can
also be used to reconfigure the device from an external controller without pulsing the
PROGRAM_B pin or using the JTAG port. Persist and ICAP cannot be used at the same time.
Persist is also not recommended for standard Master SPI/BPI configuration mode setups.
The multi-function pins that persist depend on the configuration mode pin settings.
Table 10-1 shows which UltraScale or UltraScale+ multi-function configuration pins persist
on bank 65 when the persist bitstream option is selected. Any I/O pins that persist cannot
be used as I/O in the user design. Use the CONFIG_MODE constraint to reserve the correct
pins during implementation of the design. Persisted I/O use the standard default of
LVCMOS, 12 mA drive, fast slew rate.
A[15:00]_D[31:16] - - - - - A[15:00] A[15:00] - - - - D[31:16]
EMCCLK (1) EMCCLK (1) EMCCLK(1) EMCCLK (1) EMCCLK (1) EMCCLK (1) EMCCLK (1) EMCCLK (1) EMCCLK (1) - - -
CSI_ADV_B (1) - - - - - ADV_B (1) ADV_B(1) CSI_B CSI_B CSI_B CSI_B CSI_B
DOUT_CSO_B DOUT - - DOUT DOUT CSO_B CSO_B CSO_B CSO_B CSO_B CSO_B CSO_B
Notes:
1. EMCCLK and ADV_B pins are not persisted for UltraScale devices, but they are persisted for UltraScale+ devices.
The procedure for changing the SelectMAP interface between Write and Read Control is:
1. Deassert CSI_B.
2. Toggle RDWR_B.
3. Assert CSI_B.
4. CSI_B and RDWR_B are synchronous to CCLK.
5. Readback data is valid deterministically three clock cycles after the CSI_B pin is asserted
during readback. Controllers can capture valid readback data on the fourth rising clock
edge.
Figure 10-1: Changing the SelectMAP Port from Write to Read Control
interface, although not all registers offer read access. The procedure for reading the STAT
register through the SelectMAP interface follows:
1. Write the bus width detection sequence and synchronization word to the device
followed by at least one NOOP.
2. Write the read STAT register packet header to the device.
3. Write two NOOP commands to the device to flush the packet buffer.
4. Read one word from the SelectMAP interface; this is the Status register value.
5. Write the DESYNC command to the device.
6. Write two dummy words to the device to flush the packet buffer.
You must change the SelectMAP interface from write to read control between steps 9 and
10 of Table 10-2, and back to write control after step 10.
To read registers other than STAT, the address specified in the Type 1 packet header in step
7 of Table 10-2 should be modified and the word count changed if necessary. Reading from
the FDRO register is a special case that is described in Configuration Memory Read
Procedure (SelectMAP).
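The STAT register read procedure can be expressed as a list of write and read operations for a SelectMAP controller. The following sketch makes several assumptions that should be checked against Table 9-13, Table 9-16, Table 9-19, and Table 10-2: the bus width detection words are 0x000000BB and 0x11220044, the STAT register address is 00111, the Type 1 read opcode is 01, and NOOP words are used as the final flush words. The operation tuples and function names are hypothetical.

```python
DUMMY = 0xFFFFFFFF
BUS_WIDTH_SYNC = 0x000000BB    # assumed bus width detection words
BUS_WIDTH_DETECT = 0x11220044
SYNC = 0xAA995566
NOOP = 0x20000000
WRITE_CMD_1 = 0x30008001       # Type 1 write, 1 word to CMD
DESYNC = 0x0000000D
STAT_ADDR = 0b00111            # assumed STAT register address


def type1_read_header(address: int, word_count: int) -> int:
    """Build a Type 1 read header (assumed field positions)."""
    return (0b001 << 29) | (0b01 << 27) | ((address & 0x3FFF) << 13) | (word_count & 0x7FF)


def register_read_script(address: int = STAT_ADDR, word_count: int = 1) -> list:
    """Return ('write', word) and ('read', count) operations in order."""
    return [
        ("write", DUMMY),
        ("write", BUS_WIDTH_SYNC),
        ("write", BUS_WIDTH_DETECT),
        ("write", DUMMY),
        ("write", SYNC),
        ("write", NOOP),
        ("write", type1_read_header(address, word_count)),
        ("write", NOOP),        # two NOOPs flush the packet buffer
        ("write", NOOP),
        ("read", word_count),   # switch the port to read control for this step
        ("write", WRITE_CMD_1), # back in write control
        ("write", DESYNC),
        ("write", NOOP),        # flush words (NOOP is an assumption here)
        ("write", NOOP),
    ]
```

To read a different register, pass its address and word count to `register_read_script`; only the read header and the read length change.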
1. Write the bus width detection sequence and synchronization word to the device.
2. Write at least one NOOP command.
3. Write the Shutdown command, and write one NOOP command.
4. Write the RCRC command to the CMD register, and write one NOOP command.
5. Write five NOOP instructions to ensure the shutdown sequence has completed. DONE
goes Low during the shutdown sequence.
6. Write the RCFG command to the CMD register, and write one NOOP command.
7. Write the starting frame address to the FAR (typically 0x00000000).
8. Write the read FDRO register packet header to the device. The FDRO read length is:
FDRO read length (words) = (words per frame) x (frames to read + 1) + (pipeline words)
One extra frame is read to account for the frame buffer. The frame buffer produces one
dummy frame at the beginning of the read. 10 or 25 extra words are read to account for
pipelining.
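Reading the description above as one dummy frame plus a fixed number of pipeline words, the read length can be computed as shown in this sketch. The function name is hypothetical, and the frame size used in the usage note is a made-up number; take the real words-per-frame value and the applicable pipeline length (10 or 25) from the tables for the target device.

```python
def fdro_read_length(words_per_frame: int, frames_to_read: int, pipeline_words: int) -> int:
    """Return the FDRO read length in 32-bit words."""
    if pipeline_words not in (10, 25):
        raise ValueError("pipeline_words is expected to be 10 or 25")
    if words_per_frame <= 0 or frames_to_read <= 0:
        raise ValueError("frame size and frame count must be positive")
    # One extra frame covers the dummy frame produced by the frame buffer.
    return words_per_frame * (frames_to_read + 1) + pipeline_words
```

With a hypothetical frame size of 100 words, reading 4 frames with a 10-word pipeline gives `fdro_read_length(100, 4, 10)`, which is 100 x 5 + 10 = 510 words.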
Readback commands are written to the configuration logic by going through the CFG_IN
register; configuration memory is read through the CFG_OUT register. The JTAG state
transitions for accessing the CFG_IN and CFG_OUT registers are described in Table 10-4.
Table 10-4: Shifting in the JTAG CFG_IN and CFG_OUT Instructions (Cont’d)
Step  Description                                Set and Hold TDI  Set and Hold TMS  Number of Clocks (TCK)
10    Shift the LSB while exiting SHIFT-DR       X                 1                 1
11    Reset the TAP by clocking five 1s on TMS   X                 1                 5
The MSB of every configuration packet sent through the CFG_IN register must be sent
first. The LSB is shifted while moving the TAP controller out of the SHIFT-DR state.
4. Shift the CFG_OUT instruction into the JTAG Instruction register through the Shift-IR
state. The LSB of the CFG_OUT instruction is shifted first; the MSB is shifted while
moving the TAP controller out of the SHIFT-IR state.
5. Shift 32 bits out of the Status register through the Shift-DR state.
6. Reset the TAP controller.
The packets shifted into the JTAG CFG_IN register are identical to the packets shifted in
through the SelectMAP interface when reading the STAT register through SelectMAP.
The MSB of all configuration packets sent through the CFG_IN register must be sent
first. The LSB is shifted while moving the TAP controller out of the SHIFT-DR state.
8. Shift the CFG_OUT instruction into the JTAG Instruction register through the Shift-IR
state. The LSB of the CFG_OUT instruction is shifted first; the MSB is shifted while
moving the TAP controller out of the SHIFT-IR state.
9. Shift frame data from the FDRO register through the Shift-DR state.
10. Reset the TAP controller.
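The bit ordering rules in these procedures (instructions shifted LSB first, configuration packets shifted MSB first, and the final bit shifted while leaving the shift state) can be captured in one helper. This is a minimal sketch that only generates the TDI and TMS values per TCK cycle; it does not drive hardware. The function name, the 6-bit instruction width, and the instruction value in the usage note are hypothetical.

```python
def shift_sequence(value: int, width: int, msb_first: bool) -> list:
    """Return (TDI, TMS) pairs, one per TCK cycle, for shifting `value`.

    TMS is 0 while shifting and 1 on the last bit, so the final bit is
    shifted while the TAP controller exits the SHIFT-IR or SHIFT-DR state.
    """
    if width <= 0 or value < 0 or value >> width:
        raise ValueError("value does not fit in the given width")
    order = range(width - 1, -1, -1) if msb_first else range(width)
    bits = [(value >> i) & 1 for i in order]
    return [(bit, 1 if n == width - 1 else 0) for n, bit in enumerate(bits)]
```

A hypothetical 6-bit instruction 0b000101 would be shifted with `shift_sequence(0b000101, 6, msb_first=False)`, while a configuration packet word such as 0x30008001 would use `shift_sequence(0x30008001, 32, msb_first=True)`.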
[Figure: readback data stream, showing the 10 words of pipeline and the frame data for the total number of device frames. (UG570_c10_01_112014)]
The RBD and MSD files contain an ASCII representation of the readback and mask data
along with a file header that lists the file name, etc. This header information should be
ignored or deleted. The ASCII 1s and 0s in the RBD and MSD files correspond to the binary
readback data from the device. Take care to interpret these files as text, not binary sources.
Users can convert the RBD and MSD files to a binary format using a script or text editor, to
simplify the verify procedure for some systems and to reduce the size of the files by a factor
of eight. See Figure 10-3.
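A conversion script along these lines can be quite small. The sketch below assumes that every data line consists only of the characters 0 and 1, and that header lines contain at least one other character so that they can be skipped. Both assumptions should be checked against the actual files. The function name and the sample header text are hypothetical.

```python
def ascii_bits_to_bytes(lines) -> bytes:
    """Pack the ASCII 1s and 0s of an RBD or MSD file into bytes."""
    data_lines = []
    for line in lines:
        text = line.strip()
        # Keep only lines made purely of '0' and '1'; skip header and blanks.
        if text and set(text) <= {"0", "1"}:
            data_lines.append(text)
    bits = "".join(data_lines)
    if len(bits) % 8:
        raise ValueError("bit count is not a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


sample_lines = [
    "Hypothetical header line: design.rbd",
    "10101010100110010101010101100110",
    "",
    "00100000000000000000000000000000",
]
packed = ascii_bits_to_bytes(sample_lines)
```

The 64 ASCII characters of data in `sample_lines` become 8 bytes, which is the factor-of-eight reduction mentioned above.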
[Figure content: the readback data stream (10 words pipeline; frame data for the total number of device frames) is compared against the frame data in the RBD file, with the frame data in the MSD file acting as the device mask. The MSD and RBD files each begin with a file header. (UG570_c10_02_112014)]
Figure 10-3: Comparing Readback Data Using the MSD and RBD Files
The drawback to this approach is that in addition to storing the initial configuration
bitstream and the MSD file, the golden RBD file must be stored somewhere, increasing the
overall storage requirement.
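Once the readback data, golden data, and mask are available in binary form and aligned to the same starting point, the comparison reduces to a masked XOR. The following sketch assumes that a mask bit of 1 means the corresponding bit is ignored and a mask bit of 0 means it is compared; confirm this polarity for the mask file in use. The function name is hypothetical.

```python
def masked_mismatches(readback: bytes, golden: bytes, mask: bytes) -> list:
    """Return the byte offsets where unmasked bits differ."""
    if not (len(readback) == len(golden) == len(mask)):
        raise ValueError("all three inputs must have the same length")
    return [
        offset
        for offset, (r, g, m) in enumerate(zip(readback, golden, mask))
        if (r ^ g) & ~m & 0xFF  # differences that are not masked out
    ]
```

For example, with readback `F0 0F`, golden `F1 0E`, and mask `01 00`, the difference in the first byte is masked and only offset 1 is reported.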
After sending readback commands to the device, comparison begins by aligning the
beginning of the readback frame data to the beginning of the FDRI write in the BIT and MSK
files. The comparison ends when the end of the FDRI write is reached.
This approach requires the least in-system storage space, because only the BIT, MSK, and
readback commands must be stored.
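[Figure content: the readback data stream (10 words pipeline; 1 frame pad frame; frame data for the total number of device frames) is compared against the frame data in the BIT file, with the frame data in the MSK file acting as the device mask. In the MSK and BIT files, a file header and commands precede the frame data, and further commands follow it. (UG470_c6_03_112110)]
Figure 10-4: Comparing Readback Data Using the MSK and BIT Files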
RECOMMENDED: Use the AMD Soft Error Mitigation (SEM) IP, which automates the implementation of
SEU detection and correction. Note that readback CRC is not supported outside of the SEM IP, and that
the SEM IP is not supported in the KU025 device.
After readback CRC is enabled, the dedicated configuration logic reads back continuously in
the background to check the CRC of the configuration memory content. As some
configuration bits can change during operation, such as distributed RAM, the readback CRC
function masks these bits so that variable locations are ignored. These dynamically
changeable memory locations are masked during background readback:
When enabled, the readback CRC logic automatically runs in the background after
configuration is DONE, and when these conditions hold:
• The FPGA is configured successfully, as indicated by the DONE pin going High.
• The configuration interface has been parked correctly. A normal bitstream has a
DESYNC command at the end that signals to the configuration interface that it is no
longer being used. The DESYNC command clears the CRC_ERROR flag.
• The JTAG interface is not controlling the internal configuration bus via the JTAG CFG_IN
instruction, CFG_OUT instruction, or ISC_ENABLE function.
The SEM IP uses the ICAPE3 and runs the readback CRC on the ICAPE3 CLK source.
In UltraScale devices, any use of configuration readback (Readback CRC, SEM IP, internal or
external SEU scrubbing, and other configuration readback activities) or partial
reconfiguration exercises data lines that have a small amount of coupling into the VCO of
the MMCM and PLLs. As a result, varying amounts of increased time interval error jitter can
be observed. See AR#71314 for guidance and mitigation techniques. See the Vivado Design
Suite User Guide: Programming and Debugging (UG908) [Ref 8] for write_bitstream
properties information.
Readback Capture
Readback capture uses the same readback process but requires additional commands to be
issued during the readback sequence to read the user state of the internal CLB registers.
UltraScale FPGAs do not have dedicated capture memory cells, so the CAPTUREE2 primitive
and GCAPTURE commands available in 7 series FPGAs are no longer required or supported.
Instead, writes to the mask (MSK) register and the control 1 (CTL1) register (CAPTURE
bit[23]) are required to enable the readback capture of the CLB registers. You must also
stop or disable the clock associated with the user state elements being targeted for the
duration of the readback capture sequence. The write_bitstream option -logic_location
creates an ASCII format .ll file for bit mapping the locations of the block RAM,
distributed RAM, SRLs, and CLB registers within the readback data.
Introduction
This chapter focuses on full bitstream reconfiguration methods available in AMD
UltraScale™ architecture-based FPGAs.
Fallback MultiBoot
The FPGA MultiBoot and fallback features support updating bitstream images dynamically
in the field. The FPGA MultiBoot feature enables switching between images on the fly. When
an error is detected during the MultiBoot configuration process, the FPGA can trigger a
fallback feature that ensures a known good design can be loaded into the device. The
MultiBoot and fallback feature can be used with all master configuration modes.
When fallback occurs, an internally generated pulse resets the entire configuration logic,
except for the dedicated MultiBoot logic, the WBSTAR (warm boot start address), the BSPI,
and the BOOTSTS (boot status) registers. This reset pulse pulls INIT_B and DONE Low, clears
the configuration memory, and restarts the configuration process from address 0 with the
revision select (RS) pins driven to 00 (in BPI mode). After the reset, the bitstream overwrites
the WBSTAR starting address.
• An IDCODE error
• A CRC error
• A Watchdog Timer timeout error
• A BPI address wraparound error
Figure 11-1 shows the flow for the initial setup of the golden image.
[Figure 11-1: flowchart. Labels: Golden Image BIT File Stored at Address 0; Trigger MultiBoot Upper Address; decision "Configuration Passes?" with Yes and No branches.]
• There are no hardware specific requirements, except when using RS[1:0] pins for
address control in the BPI mode; see the BPI – Hardware RS Pin Design Considerations
section.
• The IPROG command is embedded in the golden image by the bitstream setting for the
next configuration address (BITSTREAM.CONFIG.NEXT_CONFIG_ADDR), or is issued by
code through the ICAP primitive ICAPE3 instantiated within the golden image design.
The IPROG command can be turned off in a bitstream with
BITSTREAM.CONFIG.NEXT_CONFIG_REBOOT DISABLE, so that the golden image is
loaded on power up.
• The WBSTAR (warm boot start address) register is set to jump to an address in the
bitstream settings or through ICAP.
• The MultiBoot image must be stored in flash at the address in the WBSTAR register.
• The Watchdog Timer is enabled in the bitstream options to recover from incompletely
programmed flash bitstreams.
Figure 11-2 shows the flow for the initial setup of the MultiBoot image.
[Figure 11-2: flowchart. Labels: Upper Address; Golden Image; decision "Configuration Passes?" with Yes and No branches.]
• The WBSTAR setting in the bitstream options points to the MultiBoot location.
• An IPROG command is inserted through the bitstream options to trigger loading of
MultiBoot at power-up.
• The Watchdog Timer is enabled in the bitstream options.
• ICAPE3 instantiated with code to issue an IPROG command can also be included if the
golden image can repair the flash and trigger another loading from the MultiBoot
image.
• The WBSTAR setting in the bitstream options points to the MultiBoot location.
• The Watchdog Timer is enabled in the bitstream options.
• ICAPE3 instantiated with code to issue an IPROG command can also be included if the
MultiBoot image can upgrade the flash and trigger another loading from the upgraded
MultiBoot image.
power-up bitstream to be located in the flash. With this hardware implementation, the
system is exclusive of the WBSTAR address, and the bitstream options are the same for each
image. Refer to RS Pins for further details.
IPROG
The internal PROGRAM (IPROG) command is a subset of the functionality of pulsing the
PROGRAM_B pin. The fundamental difference is that the IPROG command does not erase
the WBSTAR, TIMER, BSPI, and BOOTSTS registers used to initiate MultiBoot and fallback.
The IPROG command triggers an initialization: both INIT_B and DONE go Low when the
IPROG command is issued, and the device then attempts to configure.
This command can be issued in one of two ways. In the first way, the IPROG command can be
issued through the ICAP, which is controlled by user logic. This allows user logic to initiate
device reconfiguration. In the second way, the IPROG command can be embedded during
bitstream generation. In this scenario, the WBSTAR and IPROG commands are set at the
beginning of the golden bit file. At power-up, the device starts reading the BIT file from the
flash and reads in the WBSTAR register and IPROG command. The IPROG command triggers
the device to reload from the address specified. If there is an issue with the upper image,
the base address is loaded again. Now, the IPROG command is skipped by the configuration
controller because the device saw an error. A fallback condition blocks the IPROG command
from being processed, and the device continues to load the golden image. After a
successful configuration, the IPROG command can be issued to the device, which enables
the golden image to trigger configuration from a MultiBoot image.
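The embedded IPROG flow described above can be summarized as a small behavioral model. This sketch is an illustration of the sequence only, not of the configuration controller: it assumes the golden image at address 0 carries WBSTAR and an embedded IPROG, and it represents each flash image by a flag that says whether it configures without error. The function name, the dictionary input, and the example address are hypothetical.

```python
def simulate_power_up(image_ok: dict, wbstar: int):
    """Return (loaded_address, trace) for one power-up.

    image_ok maps a flash address to True when the image stored there
    configures without error.
    """
    trace = ["power-up: read golden image at 0x0, load WBSTAR, see IPROG"]
    trace.append(f"IPROG: reload from WBSTAR address {wbstar:#x}")
    if image_ok.get(wbstar, False):
        trace.append("MultiBoot image configured")
        return wbstar, trace
    # Error in the upper image: the base address is loaded again, and the
    # embedded IPROG is skipped because a fallback condition is active.
    trace.append("error detected: fallback to 0x0, embedded IPROG skipped")
    if image_ok.get(0, False):
        trace.append("golden image configured")
        return 0, trace
    trace.append("golden image failed")
    return None, trace
```

With `image_ok = {0: True, 0x400000: False}` and `wbstar = 0x400000`, the model ends with the golden image loaded from address 0, which matches the fallback behavior described above.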
WBSTAR Register
The WBSTAR (Warm Boot Start Address) register holds the address that the configuration
controller uses after an IPROG command is issued. This can be either in the form of an
address, or values for the RS pins in BPI mode. This register can be loaded from bitstream
options or from the ICAP. If the register is not set in the bitstream options, it is loaded with
a default value of 0s. Therefore, after the golden image sets the WBSTAR value and initiates
a MultiBoot configuration, the MultiBoot image resets the WBSTAR to 0 by default.
At power-up, the device issues the read command to the flash followed by a start address
of 0. After the WBSTAR command has been loaded and the IPROG command is issued, the
configuration controller issues the read command from the address specified by the
WBSTAR address.
Watchdog Timer
The Watchdog Timer has two modes (configuration monitor and user logic monitor), which
are mutually exclusive. In the more common configuration monitor mode,
when the Watchdog Timer times out, the configuration logic loads the fallback bitstream.
For situations where configuration does not begin, or begins properly but does not
complete, such as for an invalid or a partially corrupted configuration source, the Watchdog
timer allows the device to automatically re-attempt configuration after a reasonable delay.
The Fallback MultiBoot section provides more details.
In configuration monitor mode, the TIMER register is set in the BIT file by the bitstream
generator. This timer value is then used for both the configuration of the bitstream, which
sets the value, as well as any subsequent loads triggered by an IPROG command. The TIMER
register needs to be set in all BIT files.
The TIMER register counts down from the start of the bitstream and is disabled by the end
of the start-up sequence. If the count reaches 0, a fallback is triggered. The start-up
sequence can be delayed by the MMCM wait or DCI match settings; these delays need to be
taken into account. The TIMER register runs at the CCLK frequency.
The Watchdog Timer can be enabled in the bitstream or through any configuration port by
writing to the TIMER register. The Watchdog Timer is disabled during and after fallback
reconfiguration. A successful IPROG reconfiguration initiated by a successful fallback
reconfiguration is necessary to re-enable the Watchdog Timer.
In user logic monitor mode, the Watchdog Timer uses a dedicated internal clock, CFGMCLK,
which has a nominal frequency of 50 MHz. The clock is pre-divided by 256, so that the
Watchdog Timer clock period is about 5,120 ns. Given the watchdog counter is 30 bits wide,
the maximum possible watchdog value is about 5,500 seconds. The time value can be set
using the bitstream options.
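The arithmetic behind these figures can be reproduced directly. The sketch below uses the nominal values stated above (50 MHz CFGMCLK, divide by 256, 30-bit counter); because the clock frequency is nominal, the results are approximate. The function names are hypothetical.

```python
import math

CFGMCLK_HZ = 50_000_000   # nominal frequency
PREDIVIDER = 256
COUNTER_BITS = 30

TICK_SECONDS = PREDIVIDER / CFGMCLK_HZ     # 5.12e-6 s, that is 5,120 ns
MAX_TIMER_VALUE = (1 << COUNTER_BITS) - 1  # largest 30-bit count


def max_timeout_seconds() -> float:
    """Longest timeout the 30-bit counter can represent."""
    return MAX_TIMER_VALUE * TICK_SECONDS


def timer_value_for(timeout_seconds: float) -> int:
    """Smallest timer value that waits at least timeout_seconds."""
    ticks = math.ceil(timeout_seconds * CFGMCLK_HZ / PREDIVIDER)
    if not 0 < ticks <= MAX_TIMER_VALUE:
        raise ValueError("timeout is outside the range of the 30-bit counter")
    return ticks
```

For example, a one-second timeout needs `timer_value_for(1.0)`, which is 195,313 counts, and `max_timeout_seconds()` evaluates to roughly 5,498 seconds, consistent with the figure of about 5,500 seconds.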
After it is enabled, the Watchdog Timer starts to count down. If the timer reaches 0 and the
FPGA has not reached the final state of start-up, a watchdog timeout error occurs and
triggers a fallback configuration.
Table 11-1 shows an example bitstream for reloading the Watchdog Timer using the LTIMER
command.
Table 11-1: Example Bitstream for Reloading the Watchdog Timer with LTIMER
Configuration Data (hex)   Explanation
FFFFFFFF                   Dummy word
AA995566                   Sync word
20000000                   Type 1 NOOP
30008001                   Type 1 Write 1 words to CMD
00000000                   NULL
20000000                   Type 1 NOOP
30008001                   Type 1 Write 1 words to CMD
00000011                   LTIMER command
20000000                   Type 1 NOOP
30008001                   Type 1 Write 1 words to CMD
0000000D                   DESYNC
20000000                   Type 1 NOOP
Table 11-2 shows an example bitstream for directly accessing the TIMER register.
Table 11-2: Example Bitstream for Accessing the TIMER Register (Cont’d)
Configuration Data (hex)   Explanation
20000000                   Type 1 NOOP
30008001                   Type 1 Write 1 words to CMD
0000000D                   DESYNC
20000000                   Type 1 NOOP
RS Pins
The dual-purpose RS pins are disabled by default. The RS pins drive Low during a fallback
for Master BPI configuration mode. For initial MultiBoot systems, the RS pins are wired to
upper address bits of the flash and strapped High or Low with a pull-up or pull-down
resistor, respectively. At power-up, the system boots to the upper address space defined by
the pull-up resistors on the RS and address line connections. During a fallback, the RS pins
drive Low and the device boots from address space 0. The RS pins should be tied to upper
addresses defined by the system to allow for full bit files to be stored in each memory
segment.
Tying the RS pins to the flash upper address pins allows for easy selection between up to
four images. When using this feature with the basic BPI asynchronous read, the bitstream
size must be equal to or less than one fourth of the flash size, because the RS pins are
held at a static value that gives access to only one fourth of the flash for each
selection.
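The one-fourth constraint can be illustrated with a short Python sketch. It assumes RS[1:0] is tied to the two uppermost flash address lines and that the flash has a 16-bit data bus; the function names and the 26-address-line flash used in the example are hypothetical.

```python
def rs_segment(rs, addr_bits, word_bytes=2):
    """Return (base word address, segment size in bytes) for an RS[1:0] value.

    rs         -- value strapped or driven on RS[1:0] (0 to 3)
    addr_bits  -- total flash address lines, including the two tied to RS
    word_bytes -- bytes per flash word (2 for an assumed x16 device)
    """
    if not 0 <= rs <= 3:
        raise ValueError("RS[1:0] is a 2-bit value")
    if addr_bits < 3:
        raise ValueError("need at least one address line below RS[1:0]")
    segment_words = 1 << (addr_bits - 2)  # one fourth of the flash
    return rs * segment_words, segment_words * word_bytes


def image_fits(bitstream_bytes, addr_bits, word_bytes=2):
    """True if one bitstream fits inside a single RS-selected segment."""
    return bitstream_bytes <= rs_segment(0, addr_bits, word_bytes)[1]


# Hypothetical flash with 26 address lines: RS = 2'b11 selects the top quarter.
top_base, quarter_bytes = rs_segment(3, 26)
```

With these assumptions, each of the four images must fit in `quarter_bytes`, and the fallback image selected by RS = 2'b00 starts at word address 0.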
IPROG Reconfiguration
The internal PROGRAM_B (INTERNAL_PROG or IPROG) command has an effect similar to
pulsing the PROGRAM_B pin, except that IPROG does not reset the dedicated reconfiguration
logic. The start address set in WBSTAR (see Warm Boot Start Address Register (10000) in
Chapter 9) is used during SPI or BPI reconfiguration instead of the default address. The
default is zero. The IPROG command can be sent through ICAP or the bitstream. The IPROG
Using ICAP and IPROG Embedded in the Bitstream sections describe these two procedures.
Note that the ICAP interface is similar to the SelectMAP interface, and therefore the input
configuration bus needs to be bit-swapped (see Parallel Bus Bit Order, page 142). For more
information on ICAPE3, see the UltraScale Architecture Libraries Guide (UG974) [Ref 17].
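The bit swap mentioned above can be sketched in Python. This is a minimal illustration that assumes the swap reverses the bit order within each byte while leaving the byte order unchanged, as the Parallel Bus Bit Order section is understood to describe; the function name `bit_swap_bytes` is illustrative.

```python
def bit_swap_bytes(word):
    """Reverse the bit order inside each byte of a 32-bit word."""
    swapped = 0
    for byte_index in range(4):
        byte = (word >> (8 * byte_index)) & 0xFF
        reversed_byte = int(f"{byte:08b}"[::-1], 2)  # mirror the 8 bits
        swapped |= reversed_byte << (8 * byte_index)
    return swapped


swapped_sync = bit_swap_bytes(0xAA995566)
```

Applying the function twice returns the original word, so the same helper converts in both directions.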
Table 11-3 shows an example bitstream for the IPROG command using ICAP.
Table 11-3: Example Bitstream for IPROG through ICAP
Configuration Data (hex)(1)   Explanation
Notes:
1. See Parallel Bus Bit Order, page 142.
After the configuration logic receives the IPROG command, the FPGA resets everything
except the dedicated reconfiguration logic, and the INIT_B and DONE pins go Low. After the
FPGA clears all configuration memory, INIT_B goes High again. Then, the value in WBSTAR
is used for the bitstream starting address. The configuration mode determines which pins
are controlled by WBSTAR. See Table 11-4.
Table 11-4: WBSTAR Controlled Pins According to Configuration Mode
Configuration Mode Pins Controlled by WBSTAR
Master SPI START_ADDR is sent to the flash device serially.
Master BPI RS[1:0], A[28:00]
RS[1:0] is controllable by WBSTAR in BPI mode only. The START_ADDR field is only
meaningful for the BPI and SPI modes.
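As a rough illustration of how a WBSTAR value combines the RS[1:0] and START_ADDR fields, the following Python sketch packs a register word. The bit positions are an assumption (RS[1:0] in bits [31:30], an RS output enable in bit 29, START_ADDR in bits [28:0]) and must be confirmed against the Warm Boot Start Address Register description in Chapter 9; the function and parameter names are hypothetical.

```python
def pack_wbstar(start_addr, rs=0, rs_enable=0):
    """Pack WBSTAR fields into one 32-bit word (field positions assumed)."""
    if not 0 <= start_addr < (1 << 29):
        raise ValueError("START_ADDR must fit in 29 bits")
    if not 0 <= rs <= 3:
        raise ValueError("RS[1:0] is a 2-bit value")
    if rs_enable not in (0, 1):
        raise ValueError("rs_enable must be 0 or 1")
    return (rs << 30) | (rs_enable << 29) | start_addr


# Example: drive RS[1:0] = 2'b11 and start at address 0x400000.
wbstar = pack_wbstar(0x400000, rs=3, rs_enable=1)
```

In Master SPI mode only the START_ADDR portion is meaningful, because RS[1:0] is controllable by WBSTAR in BPI mode only.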
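[Figure: BPI flash connection example. The FPGA RS[1:0] pins connect to the parallel flash ADDR[28:27] pins, with 2.4 kΩ resistors and VCCO shown on the RS lines. FWE_B, FOE_B, FCS_B, A[26:00], and D[15:00] connect to WE_B, OE_B, CS_B, ADDR[26:0], and DATA[15:0]. The FPGA actively drives RS[1:0] = 2'b11. (UG570_c11_03_080615)]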
1. All BPI pins, except the CCLK, FCS_B, and D[03:00] pins, are multi-function I/Os. After
configuration is finished (the DONE pin goes High), these pins become user I/Os and
can be controlled by user logic to access flash for user data storage and programming.
2. In this example, RS[1:0] is set to 2'b11. During IPROG reconfiguration, the RS[1:0] pins
override the external pull-up and pull-down resistors. You can specify any RS[1:0] value
in the WBSTAR register using BITSTREAM.CONFIG.REVISIONSELECT.
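[Figure: IPROG embedded in the bitstream. The first bitstream contains a dummy word, the sync word, WBSTAR = A1, and the IPROG command. The FPGA then loads the final bitstream from address A1; it contains a dummy word, the sync word, WBSTAR = 0, and a NULL command. (UG570_c11_04_012714)]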
Table 11-5, Table 11-6, and Table 11-7 show the BOOTSTS values in some common
situations.
Table 11-7: IPROG Embedded in First Bitstream, Second Bitstream CRC Error, Fallback Successfully
          Reserved  WRAP_ERROR  CRC_ERROR  ID_ERROR  WTO_ERROR  IPROG  FALLBACK  VALID
Status_1  0         0           1          0         0          1      0         1
Status_0  0         0           0          0         0          1      1         1
1. Status_1 shows IPROG was attempted, and a CRC_ERROR was detected for that
bitstream.
2. Status_0 shows a fallback bitstream was loaded successfully. The IPROG bit was also set
in this case, because the fallback bitstream contains an IPROG command. Although the
IPROG command is ignored during fallback, the status still records this occurrence.
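The two status rows of Table 11-7 can be decoded with a small Python sketch. It assumes that the table columns are listed from the most significant bit to the least significant bit of each 8-bit status field; the names `FIELDS` and `decode_status` are illustrative.

```python
# Column order of Table 11-7, assumed to run from bit 7 down to bit 0.
FIELDS = ("Reserved", "WRAP_ERROR", "CRC_ERROR", "ID_ERROR",
          "WTO_ERROR", "IPROG", "FALLBACK", "VALID")


def decode_status(value):
    """Return the names of the flags set in one 8-bit status field."""
    if not 0 <= value <= 0xFF:
        raise ValueError("status field is 8 bits wide")
    return [name for bit, name in zip(range(7, -1, -1), FIELDS)
            if (value >> bit) & 1]


status_1 = decode_status(0b00100101)  # Status_1 row of Table 11-7
status_0 = decode_status(0b00000111)  # Status_0 row of Table 11-7
```

Decoded this way, Status_1 reports CRC_ERROR, IPROG, and VALID, and Status_0 reports IPROG, FALLBACK, and VALID, which agrees with the two notes above.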
For an example design, see MultiBoot and Fallback with SPI Flash in UltraScale FPGAs
(XAPP1257) [Ref 24].
Introduction
In applications requiring multiple FPGAs, all of the devices can be configured from a single
configuration source. FPGAs that use the same configuration file can be gang loaded at the
same time. FPGAs that use different configuration files can be loaded sequentially, either
through built-in FPGA logic in a daisy chain, or using external logic. This chapter covers the
following topics:
The device closest to the configuration data source is considered the most upstream device, while the device
furthest from the configuration data source is considered the most downstream device.
In a serial daisy chain, the configuration clock is typically provided by the most upstream
device in SPI mode. All other devices are set for slave serial mode. Figure 12-1 illustrates
this configuration.
[Figure: a flash device connected to an FPGA in Master SPI mode, followed by an FPGA in Slave Serial mode. Each device has M[2:0] mode pins, PROGRAM_B is common to the devices, and 4.7 kΩ resistors are shown at callouts (1) and (2). (UG570_c12_01_081122)]
Figure 12-1: Master/Slave Serial Mode Daisy Chain Configuration Interface Example
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
3. See Figure 2-2, page 44 for a more detailed view of the master SPI connections.
4. Fallback MultiBoot is not supported in this configuration.
The first device in a serial daisy chain is the last to be configured. CRC checks only include
the data for the current device, not for any others in the chain. (See CRC Check (Step 7) in
Chapter 9.)
After the last device in the chain finishes configuration and passes its CRC check, it enters
the start-up sequence. At the release DONE pin phase in the start-up sequence, the device
places its DONE pin in a high-Z state while the next to the last device in the chain is
configured. After all devices release their DONE pins, the common DONE signal is pulled
High externally. On the next rising CCLK edge, all devices move out of the release DONE pin
phase and complete their start-up sequences.
It is important that all DONE pins in a slave serial daisy chain be connected.
• Select a CCLK frequency that is compatible with all devices in the daisy chain.
• UltraScale architecture-based FPGAs should always be at the beginning of the serial
daisy chain, with 7 series devices located at the end of the chain.
• All UltraScale architecture-based FPGAs have similar bitstream options. The guidelines
provided for UltraScale architecture-based FPGAs bitstream options should be applied
to all devices in a serial daisy chain, when possible.
• The number of configuration bits that a device can pass through its DOUT pin is
limited. For UltraScale architecture-based FPGAs and 7 series FPGAs, the limit is
4,294,967,264 bits. The sum of the bitstream lengths for all downstream devices must
not exceed this number.
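The DOUT pass-through limit in the last guideline lends itself to a simple check. The following Python sketch sums the bitstream lengths of the downstream devices and compares the total with the 4,294,967,264-bit limit stated above; the function name and the example lengths are hypothetical.

```python
DOUT_LIMIT_BITS = 4_294_967_264  # limit stated in the guideline (2**32 - 32)


def downstream_fits(downstream_bitstream_bits):
    """Return (fits, remaining_bits) for the devices downstream of one FPGA.

    downstream_bitstream_bits -- bitstream length in bits of every device
    located after the FPGA whose DOUT pin is being checked.
    """
    total = sum(downstream_bitstream_bits)
    return total <= DOUT_LIMIT_BITS, DOUT_LIMIT_BITS - total


# Hypothetical chain with two downstream devices.
fits, remaining = downstream_fits([100_000_000, 200_000_000])
```

The check applies to each device in the chain separately, using only the devices located downstream of it.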
GTS should be released before DONE or during the same cycle as DONE to ensure the
device is operational when all DONE pins have been released.
It is important to connect the DONE pins for all devices in a serial daisy chain. Failing to
connect the DONE pins can cause configuration to fail. For debugging purposes, it is
often helpful to have a way of disconnecting individual DONE pins from the common
DONE signal, so that devices can be individually configured through the serial or JTAG
interface.
One device is typically set for master SPI mode (to drive CCLK) while the others are set for
slave serial mode. For ganged serial configuration, all devices must be identical.
[Figure: ganged serial configuration example. Labels shown: Processor; FPGA (Slave Serial) with M[2:0], DIN, DOUT, CCLK, PROGRAM_B, DONE, and INIT_B. (UG570_c12_02_020515)]
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
3. All devices must be identical (same IDCODE) and must be configured with the same
bitstream.
4. See Figure 3-2, page 55 for a more detailed view of the slave serial connections.
GTS should be released before DONE or during the same cycle as DONE to ensure all
devices are operational when all DONE pins have been released.
It is important to connect the DONE pins for all devices in ganged serial configuration
if one FPGA is used as the master device. Failing to connect the DONE pins can cause
configuration to fail for individual devices in this case. If all devices are set for slave
serial mode, the DONE pins can be disconnected (if the external CCLK source continues
toggling until all DONE pins go High).
After all DONE pins are released, the DONE pin should rise from logic 0 to logic 1 in one
CCLK cycle. If additional time is required for the DONE signal to rise, the DonePipe
option can be set for all devices in the serial daisy chain.
The CCLK signal is relatively slow, but the edge rates on the UltraScale FPGA’s input
buffers are very fast. Even minor signal integrity problems on the CCLK signal can cause
the configuration to fail. (Typical failure mode: DONE Low and INIT_B High.) Therefore,
design practices that focus on signal integrity, including signal integrity simulation with
IBIS, are recommended.
• Signal fanout: Designers must focus on good signal integrity when using ganged serial
configuration. Signal integrity simulation is recommended.
Files for ganged serial configuration are identical to the files used to configure single
devices. There are no special file considerations.
If Readback is going to be performed on the device after configuration, the RDWR_B signal
must be handled appropriately. (For details, refer to Chapter 10, Readback Verification and
CRC.)
Otherwise, RDWR_B can be tied Low. Refer to Bitstream Loading (Steps 4-7) in Chapter 9.
[Figure: a DATA[7:0], CCLK, and WRITE bus connected to the D[7:0], CCLK, and RDWR_B pins of two FPGAs in Slave SelectMAP mode. DONE, INIT_B, and PROGRAM_B are also shown. (UG570_c12_03_031915)]
Figure 12-3: Multiple Slave Device Configuration Interface Example on an 8-Bit SelectMAP Bus
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
3. An external controller such as a microprocessor or CPLD is needed to control
configuration.
4. The data bus can be x8, x16, or x32 (for slave SelectMAP).
5. See Figure 5-2, page 76 for a more detailed view of the slave SelectMAP connections.
[Figure: parallel daisy chain example. A flash device (A[28:00], D[15:00], CS_B, OE_B, WE_B) connects to the first FPGA in BPI mode (M[2:0] = 010) through FCS_B, FOE_B, and FWE_B, followed by two FPGAs in Slave SelectMAP mode (M[2:0] = 110). Each FPGA shows D[15:00], CCLK, INIT_B, DONE, and CSO_B pins; the last CSO_B is marked No Connect. 4.7 kΩ and 330 Ω resistors are shown. (UG570_c12_04_031915)]
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up is required.
3. The FCS_B, FWE_B, FOE_B, CSO_B weak pull-up resistors should be enabled, otherwise
external pull-up resistors are required for each pin. By default, all dual-mode I/Os have
weak pull-downs after configuration.
4. The first device in the chain can be master SelectMAP, slave SelectMAP, or BPI. See
Figure 4-2, page 61 for a more detailed view of the BPI connections.
5. Readback in the parallel daisy chain scheme is not supported.
All devices can be set for slave SelectMAP mode if an external oscillator is available as
illustrated in Figure 12-5, or one device can be designated as the master device.
[Figure 12-5: a processor connected to two FPGAs in Slave SelectMAP mode. Signals shown: DATA[0:7] to D[0:7], CCLK, PROGRAM_B, DONE, INIT_B, RDWR_B, and CSI_B, with 4.7 kΩ resistors. (UG570_c12_05_031915)]
1. The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE
signal details.
2. The INIT_B pin is a bidirectional, open-drain pin. An external pull-up resistor is required.
3. See Figure 5-2, page 76 for a more detailed view of the slave SelectMAP connections.
If one device is designated as the master, the DONE pins of all devices must be connected.
The DONE pin is by default an open-drain output. See Table 1-9, page 27 for DONE signal
details. Designers must carefully focus on signal integrity due to the increased fanout.
Signal integrity simulation is recommended.
Readback is not possible if the CSI_B signals are tied together, because all devices
simultaneously attempt to drive the data signals.
Configuration Debugging
Introduction
This chapter discusses best practices that help resolve issues that might be
encountered when implementing a configuration solution. Topics discussed include:
This command will display all properties applied to a design. Where there are no values
displayed, the default is applied. Also, review the flash programming file generation
options. Verify that the proper data widths and data ordering options are used for the flash
programming file generation. All DRC warnings received during configuration file
generation should be reviewed and corrected.
In addition to the status signals, there are key configuration pins that provide helpful
information and should be handled carefully to prevent problems during configuration.
These pins are listed below:
PROGRAM_B Pin
The PROGRAM_B pin re-configures the FPGA and is often tied to a push button for easy
access. The pin must be held High during the configuration process.
CFGBVS Pin
The configuration bank voltage select pin (AMD UltraScale FPGAs only) must be tied
appropriately to GND or VCCO to support the 1.8V or 3.3V maximum range required by your
design.
PUDC_B
The PUDC_B pin determines whether or not I/O pull-ups are enabled during configuration.
Configuration Sequence
During the configuration process, there are some basic checks that can be performed to
help isolate an issue. AMD FPGA bitstreams have a unique header. The header includes a
synchronization word and can include an auto detect, a configuration clock type, and a rate
setting. For UltraScale™ architecture-based FPGAs, the sync word is:
AA995566
The sync word is a valuable debug reference. When you probe the data pins and see the
synchronization word, you know that the bitstream header is present on the bus. Shortly after this,
there should be transitions of the increased configuration clock rate if the configuration
rate speed-up or external master CCLK (EMCCLK) options are used.
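When the configuration data has been captured to a file or a logic analyzer export, the sync word can also be located in software. The following Python sketch searches a byte sequence for AA995566; the optional bit-swapped pattern 5599AA66 assumes the capture comes from a parallel bus where the bits of each byte are reversed. The names used are illustrative.

```python
SYNC_WORD = bytes.fromhex("AA995566")
# Same word with the bits of each byte reversed (assumed parallel-bus view).
SYNC_WORD_BIT_SWAPPED = bytes.fromhex("5599AA66")


def find_sync_word(capture, bit_swapped=False):
    """Return the byte offset of the sync word, or -1 if it is absent."""
    pattern = SYNC_WORD_BIT_SWAPPED if bit_swapped else SYNC_WORD
    return capture.find(pattern)


# Example capture: dummy word, sync word, then a NOOP packet.
capture = bytes.fromhex("FFFFFFFF" "AA995566" "20000000")
offset = find_sync_word(capture)
```

A result of -1 suggests that the header never reached the bus, or that the data width or data ordering of the programming file does not match the interface.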
modes or the master mode EMCCLK option, ensure enough clock cycles are supplied to
complete the start-up sequence.
If you do not clock the start-up completely, some of the following symptoms can be
observed:
• Multi-function configuration and I/O pins operate in LVCMOS rather than the specified
I/O standard.
• ICAP interface cannot be accessed from the FPGA logic because the configuration logic
is locked.
This will occur if the device has not reached the end of start-up state. The device can be
fully operational before the device reaches this end of start-up state. This can lead to
ICAP read and write failures or multi-function pins not operating in the correct I/O
standard. This event is indicated by the EOS signal being driven High. This can be
observed in the STATUS register or detected in the FPGA using the STARTUPE3 primitive.
For designs accessing the ICAP, it is good design practice to instantiate the STARTUPE3
primitive. This primitive has an EOS pin, which will indicate when the configuration
process has completed and the ICAP is available for read and write access.
• Verify that you have the latest version of the tools. Even if you must use an earlier
version of the implementation tools, the latest configuration tools can be downloaded
for free and used independently by going to the download center and selecting "Lab
Tools."
Documentation Navigator
Documentation Navigator (DocNav) is an installed tool that provides access to AMD
Adaptive Computing documents, videos, and support resources, which you can filter and
search to find information. To open DocNav:
Design Hubs
Design Hubs provide links to documentation organized by design tasks and other topics,
which you can use to learn key concepts and address frequently asked questions. To access
the Design Hubs:
Support Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see
Support.
References
1. UltraScale Architecture Gen3 Integrated Block for PCI Express LogiCORE IP Product Guide
(PG156)
2. UltraScale+ Devices Integrated Block for PCI Express Product Guide (PG213)
3. Integrated Logic Analyzer Product Guide (PG172)
4. SPI Flash Programming Including Bitstream Revision Selection (XAPP1191)
5. UltraScale Architecture and Product Overview (DS890)
6. 3D ICs website
7. Vivado Design Suite Properties Reference Guide (UG912)
8. Vivado Design Suite User Guide: Programming and Debugging (UG908)
9. Kintex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics (DS892)
10. Virtex UltraScale FPGAs Data Sheet: DC and AC Switching Characteristics (DS893)
11. Vivado Design Suite User Guide: I/O and Clock Planning (UG899)
12. SPI Configuration and Flash Programming in UltraScale FPGAs (XAPP1233)
13. UltraScale FPGA Post-Configuration Access of SPI Flash Memory using STARTUPE3
(XAPP1280)
14. UltraScale FPGA BPI Configuration and Flash Programming (XAPP1220)
15. Using a Microprocessor to Configure 7 Series FPGAs via Slave Serial or Slave SelectMAP
Mode (XAPP583)
16. UltraScale Architecture System Monitor User Guide (UG580)
17. UltraScale Architecture Libraries Guide (UG974)
18. Bitstream Identification with USR_ACCESS using the Vivado Design Suite (XAPP1232)
19. Using Encryption and Authentication to Secure an UltraScale/UltraScale+ FPGA Bitstream
(XAPP1267)
20. Vivado Design Suite Tcl Command Reference Guide (UG835)
21. UltraScale Architecture SelectIO Resources User Guide (UG571)
Revision History
The following table shows the revision history for this document.