
Module 3 - Full

This document discusses different architectures for field programmable gate arrays (FPGAs), including matrix-based, row-based, hierarchical, and sea-of-gates architectures. It also describes the granularity of FPGA logic blocks from fine-grained to coarse-grained. Fine-grained logic blocks resemble basic cells from mask-programmable gate arrays and can implement simple functions. Coarse-grained blocks contain more logic like look-up tables and flip-flops and can implement complex functions but require more routing resources. The granularity of logic blocks impacts the density and performance of the FPGA.

Uploaded by

suma_hari6244
Copyright
© All Rights Reserved

ECT 393: FPGA BASED SYSTEM DESIGN

S5 HONOURS
MODULE 3
Typical architectures for FPGAs
• FPGA architecture or organization refers to the manner or topology in which the
logic blocks and interconnect resources are distributed inside the FPGA.
• FPGAs can be classified into four different basic architectures or
topologies:

• Matrix-based (symmetrical array) architectures

• Row-based architectures

• Hierarchical PLD architectures

• Sea-of-gates architecture
Matrix based (Symmetrical arrays) architecture:
• Logic blocks in this type of FPGA are organized in a matrix-like fashion
• Most Xilinx FPGAs belong to this category
• This architecture consists of logic elements (called CLBs) arranged in rows and
columns of a matrix, with interconnect laid out between them.
• These architectures typically contain 8x8 arrays in the smaller chips and 100x100
or larger arrays in the bigger chips.
• This symmetrical matrix is surrounded by I/O blocks which connect it to outside
world.
• The routing resources are interspersed between the logic blocks.
• The routing in these architectures is often called two-dimensional channeled
routing since routing resources are generally available in horizontal and vertical
directions.
Row based architecture:
• These architectures were inspired by traditional gate arrays
• Row-based architecture consists of alternating rows of logic modules and
programmable interconnect tracks.

• Input/output blocks are located on the periphery of the rows.

• Routing tracks are divided into smaller segments connected by anti-fuse elements
between them.
• One row may be connected to adjacent rows via vertical interconnect.
• Traditional mask-programmable gate arrays use very similar architectures.
• The routing in these architectures is often called one-dimensional channeled
routing, because the routing resources are located as a channel in between rows of
logic resources.

• Some Microsemi FPGAs employ this architecture


Hierarchical PLDs:
• In some FPGAs, blocks of logic cells are grouped together by a local interconnect,
and several such groups are interconnected by another level of interconnect.
• Thus, there is a hierarchy in the organization of these FPGAs

• In Altera APEX20 and APEX II FPGAs, 10 or so logic elements are connected to
form what Altera calls a Logic Array Block (LAB), and then several LABs are
connected to form a MEGALAB.
• These FPGAs contain clusters of logic blocks with localized resources for
interconnection.
• The global interconnect network is used for the interconnections between the
clusters of logic blocks in these FPGAs.
• Each logic module has combinatorial as well as sequential functional
elements.
• Each of these functional elements is controlled by the programmed memory.
• Input output blocks surround this scheme of logic blocks and interconnects
Sea of gates:
• The sea-of-gates architecture is yet another manner to organize the logic blocks
and interconnect in an FPGA.
• The general FPGA fabric consists of a large number of gates, and then there is an
interconnect superimposed on the sea of gates as illustrated

• Plessey, a manufacturer that was in the FPGA market in the mid-1990s, made
FPGAs of this architecture.
• The basic cell used was a NAND gate, in contrast to the larger basic cells used
by manufacturers such as Xilinx.
• While the terminology sea of gates is the most popular, there are also the
terminologies sea of cells and sea of tiles to indicate the topology of FPGAs
with a large number of fine-grain logic cells.
• The Microsemi Fusion FPGAs contain a sea of tiles, where each tile can be
configured as a 3-input logic function or a flip-flop/latch.
Granularity
● FPGA logic blocks differ greatly in their size and implementation capability.
● The two-transistor logic block used in the Crosspoint FPGA can only implement an
inverter but is very small in size.
● The look-up table logic block used in the Xilinx 3000 series FPGA can implement
any five-input logic function but is significantly larger.
● To capture these differences, logic blocks are classified by their granularity.
● Granularity can be defined in various ways, for example, as the number of Boolean
functions that the logic block can implement, the number of equivalent two-input
NAND gates, the total number of transistors, total normalized area, or the number of
inputs and outputs.
Fine-Grained Logic Blocks
● Fine-grain logic blocks closely resemble MPGA basic cells.

● The finest-grain logic block would be identical to a basic cell of an MPGA
and would consist of a few transistors that can be programmably interconnected.
Each logic block can be used to implement only a very simple function.

● For example, the logic block might be configured to act as any 3 input
function such as primitive logic gates (AND, OR, NAND etc) or a storage
element (D FF, D latch etc).
Fine grained FPGA families are given below.
● The Crosspoint FPGA
● The Plessey FPGA
The Crosspoint FPGA:
● The FPGA from Crosspoint Solutions uses a single transistor pair in
the logic block.
● In addition to the transistor pair tiles, the cross-point FPGA has a second type of logic
block, called a RAM logic tile, that is tuned for the implementation of random access
memory, but can also be used to build random logic functions
The Plessey FPGA
● A second example of a fine-grain FPGA architecture is the FPGA from Plessey
● Here the basic block is a two-input NAND gate and the logic is formed in the usual
way by connecting the NAND gates
● Algotronix: uses a two-input function block which can perform
any function of two inputs. This is implemented using a
configurable set of multiplexers.
● Concurrent Logic: the logic block of Concurrent Logic's FPGA
contains a two-input AND gate and a two-input EXCLUSIVE-OR gate.
ADVANTAGES:
● The useable blocks are fully utilized since it is easier to use small logic gates
efficiently and the logic synthesis techniques for such blocks are very similar
to those for conventional mask-programmed gate arrays and standard cells.
DISADVANTAGES:
● They require a relatively large number of wire segments and programmable
switches.
● Such routing resources are costly in delay and area.
● As a result, FPGAs employing fine-grain blocks are in general slower and
achieve lower densities than those employing coarse grain blocks.
Coarse-grained Logic block

● In a coarse-grained architecture, each block contains a relatively large
amount of logic compared to its fine-grained counterparts.
● For example, a logic block might contain four 4-input LUTs, four multiplexers,
four D-type flip-flops and some fast carry logic.
● ACTEL
● QUICK LOGIC
● XILINX
● The Actel logic block is based on the ability of a multiplexer to implement
different logic functions by connecting each of its inputs to a constant or to a
signal
● By connecting together a number of multiplexers and basic logic gates, a
logic block can be constructed which can implement a large number of
functions in this manner.
Quick logic

● The logic block in the FPGA from QuickLogic is similar
to the Actel logic blocks in that it employs a four-to-one
multiplexer.
● Each input of the multiplexer (not just the select inputs) is
fed by an AND gate, as illustrated.
● Note that alternating inputs to the AND gates are inverted.
This allows input signals to be passed in true or
complement form, thus eliminating the need to use extra
logic blocks to perform simple inversions.
● Multiplexer-based logic blocks have the advantage of providing a large degree
of functionality for a relatively small number of transistors. This is, however,
achieved at the expense of a large number of inputs (eight in the case of Actel
and 14 in the case of QuickLogic), which when utilized place high demands on
the routing resources.
● Such blocks are, therefore, more suited to FPGAs that use programmable
switches of small size such as antifuses.
● The Xilinx Logic Block: The basis for the Xilinx logic block is an SRAM
functioning as a look-up table (LUT).
● The Altera Logic Block: The architecture of the Altera FPGA has evolved
from the PLA-based architecture of traditional PLDs with its logic block
consisting of wide fanin (20 to over 100 inputs) AND gates feeding into an
OR gate with three to eight inputs.

● Some coarse-grained architectures use a bus interconnect and processing elements (PEs) that perform more
than just bitwise operations, such as ALUs and multipliers.
● Some coarse-grained architectures comprise an array of nodes, where each
node is a highly complex processing element, ranging from an algorithmic
structure such as an FFT all the way up to a general-purpose microprocessor core.
● These are called medium grained architectures.
● Medium grained architectures are classified as LUT based and MUX based.
Effect of Logic Block Granularity on FPGA Density
and Performance
● Effect of Granularity on Density
○ As the granularity of a logic block increases, the number of blocks needed to
implement a design should decrease.
○ On the other hand a more functional (larger granularity) logic block requires
more circuitry to implement it, and therefore occupies more area.
○ This tradeoff suggests the existence of an “optimal” logic block granularity for
which the FPGA area devoted to logic implementation is minimized.

● Effect of Granularity on performance


○ The granularity of the logic block has a significant effect on the performance of
an FPGA
● Figure a gives the implementation of the logic function f = abd + abc + acd using two-input
NAND gate logic blocks.
○ The longest path requires four levels of blocks.
● Figure b shows an implementation of the same function using three-input lookup tables requiring
only two levels of blocks.
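● Assuming a 1.2 µm CMOS technology, a two-input NAND gate delay is estimated at 0.7 ns while
a three-input LUT delay is estimated at 1.4 ns.
● The logic delay alone is therefore the same in both cases (4 × 0.7 ns = 2 × 1.4 ns = 2.8 ns), but the
NAND implementation passes through more routed connections between blocks.
● Clearly, for a nonzero routing delay between the blocks, the higher granularity of the 3-LUT will
result in a faster implementation.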
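The comparison above can be sketched numerically. This is a minimal sketch, assuming every level of logic is followed by exactly one routed connection of equal delay; the routing delay values and the function name `path_delay_ns` are hypothetical, while the block delays and level counts are the ones quoted in the text.

```python
# Block delays quoted in the text (1.2 µm CMOS estimates).
NAND2_DELAY_NS = 0.7   # two-input NAND logic block
LUT3_DELAY_NS = 1.4    # three-input LUT logic block


def path_delay_ns(levels, block_delay_ns, routing_delay_ns):
    """Delay of a path through `levels` blocks, each followed by one routed hop.

    ASSUMPTION: one routing hop per level, all hops with the same delay.
    """
    return levels * (block_delay_ns + routing_delay_ns)


# Hypothetical per-hop routing delays in ns; 0.0 shows the logic-only case.
for routing in (0.0, 0.5, 1.0):
    nand = path_delay_ns(4, NAND2_DELAY_NS, routing)  # four levels of NAND blocks
    lut = path_delay_ns(2, LUT3_DELAY_NS, routing)    # two levels of 3-LUT blocks
    print(f"routing={routing:.1f} ns  NAND2 path={nand:.1f} ns  LUT3 path={lut:.1f} ns")
```

With zero routing delay the two paths tie; any positive routing delay favours the implementation with fewer levels.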
● FPGA logic block functionality Vs Area efficiency
● An important characteristic of a logic block is its functionality, which is defined as the number
of different boolean logic functions that the block can implement. For example, a two-input
NAND gate can implement five different functions: the basic function f=(ab)’, as well as f=a’
and f=b’, and the constants 0 and 1, if the inputs are set appropriately. In contrast, a three input
lookup table can implement any function of its three inputs.
● Different blocks are likely to have different amounts of functionality, and varying costs in
terms of chip area and delay. Also, the functionality of the logic block will affect the amount
of routing resources that are needed in the FPGA.
● The functionality of the logic block has a major effect on the amount of area required to
implement circuits in FPGAs. As functionality increases, the number of blocks needed to
implement a circuit will decrease, but the area per block will increase, because higher
functionality requires more logic.
● The total chip area needed for an FPGA consists of the logic block area plus the
routing area.
● Since routing area typically takes from 70 to 90 percent of the total area, the effect of
logic block functionality on the routing area can be very important.
● Functionality affects the routing area in a number of ways: as functionality increases,
the number of pins per block will likely increase, the number of connections between
logic blocks will decrease and, because there will be fewer blocks, the distance that
each connection travels will decrease.
● Depending on the relative effects on each of these factors, the total routing area will
either go up or down.
● Figure illustrates how this function could be implemented with 2-input, 3-input, or
4-input lookup tables.
● As shown, the 2-LUT implementation requires eight logic blocks, while the 3-LUT
needs only four blocks.
● As an area measure, consider the number of memory bits required for each
implementation. A 2-LUT requires half as many memory bits as a 3-LUT (recall that
the number of bits in a K-LUT is 2^K ), but twice as many 2-LUTs are required.
● Hence, the total block area for both the 2-LUT and 3-LUT cases is the same.
● The 4-LUT case requires only half the number of memory bits compared to the other
two, because the function can be implemented in just one block.
● However, any one of the three alternatives may result in the lowest total chip area,
depending on the amount of routing resources that each one implies.
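The memory-bit comparison above can be checked with a few lines of arithmetic. This is only a sketch of the logic-area measure described in the text (2^K bits per K-LUT); the function name is illustrative and routing area is deliberately left out.

```python
def lut_memory_bits(k, num_blocks):
    """Total configuration bits for `num_blocks` K-input LUTs (2^K bits each)."""
    return num_blocks * 2 ** k


# Block counts for the 2-LUT, 3-LUT and 4-LUT cases as stated in the text.
blocks_needed = {2: 8, 3: 4, 4: 1}

for k, blocks in blocks_needed.items():
    print(f"{k}-LUT: {blocks} block(s), {lut_memory_bits(k, blocks)} memory bits")
```

The 2-LUT and 3-LUT cases come out equal, and the 4-LUT case needs half as many bits, matching the statements above.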
FPGA logic block architecture

● An FPGA contains logic cells replicated in a regular array across the chip.
● The logic blocks vary in the basic components they use.
● Look-Up Table (LUT) based logic blocks (Xilinx)
● Multiplexers and logic gates to build their logic blocks (Microsemi/Actel)
● PLD blocks (Altera FPGA).
● Simple building blocks consisted of
● Transistor pairs (e.g., Crosspoint FPGAs).
● NAND gates (e.g., Plessey).
Look-Up-Table–Based Programmable Logic Blocks
● Many look-up-table–based FPGAs use a 4-variable look-up table (often denoted by
the short form LUT4) plus a flip-flop as the basic element and then combine several
of them in various topologies.
● The LUT4 can also be called a 4-variable function generator since it can generate
any function of four variables.
● The inputs to the X-function generator are called X1, X2, X3, and X4.
● The function can be steered to the output of the block (X) in combinational or latched
form.
● The D flip-flop can have clock enable, direct set, and direct reset inputs.
● A multiplexer selects between the combinatorial output and the latched version of the
output.
● The memory cell beneath the multiplexer provides the appropriate select signal to choose
between the latched and unlatched form of the function.
● Examples are
○ Xilinx Spartan/Virtex
○ Altera Cyclone II/APEX II
○ QuickLogic Eclipse/PolarPro
○ Lattice Semiconductor ECP
Logic Blocks Based on Multiplexers and Gates
● Any combinational function can be implemented using multiplexers alone.
● A 4-to-1 multiplexer can generate any 2-input function.

● Logic blocks similar to these were used in early Microsemi (formerly Actel) FPGAs
such as the ACT 1 and ACT 2.
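The claim that a 4-to-1 multiplexer can generate any 2-input function can be illustrated as follows: the two variables drive the select lines, and the four data inputs are tied to the constants of the truth table. This is a minimal sketch; the names `mux4` and `as_mux_constants` are illustrative and not taken from any vendor library.

```python
from itertools import product


def mux4(d0, d1, d2, d3, s1, s0):
    """4-to-1 multiplexer: selects data input number 2*s1 + s0."""
    return (d0, d1, d2, d3)[2 * s1 + s0]


def as_mux_constants(func):
    """Constants for d0..d3 that make mux4 compute func(a, b) with s1=a, s0=b."""
    return tuple(func(a, b) for a, b in product((0, 1), repeat=2))


# Example: XOR needs the data inputs tied to 0, 1, 1, 0.
xor_constants = as_mux_constants(lambda a, b: a ^ b)

for a, b in product((0, 1), repeat=2):
    print(a, b, mux4(*xor_constants, a, b))
```

Since any of the 16 possible constant patterns can be applied to the data inputs, all 16 functions of two variables are reachable.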
FPGA logic cells

In this section, the basic logic cell architectures of three major FPGA vendors are
discussed:
❖ Xilinx LCA (Xilinx 2000, 3000 and 4000)
❖ Altera Max
❖ Actel ACT 1,2 and 3
Xilinx LCA
● Xilinx LCA (a trademark, denoting logic cell array) basic logic cells, configurable logic
blocks or CLBs , are bigger and more complex than the Actel or QuickLogic cells.
● The Xilinx LCA basic logic cell is an example of a coarse-grain architecture .
● The Xilinx CLBs contain both combinational logic and flip-flops.
● The basis for the Xilinx logic block is an SRAM functioning as a look-up table (LUT).
● The truth table for a K-input logic function is stored in a 2^K x 1 SRAM.
● The address lines of the SRAM function as inputs and the output of the SRAM provides the
value of the logic function.
● The advantage of look-up tables is that they exhibit high functionality:
○ a K-input LUT can implement any function of K inputs.
● The disadvantage is that they are unacceptably large for more than about five inputs, since the
number of memory cells needed for a K-input lookup table is 2^K.
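The SRAM-as-LUT idea above can be modelled in a few lines: the inputs form the address and the stored bit is the function value. This is a behavioural sketch only; the class name `LUT` and the 3-input majority function used to fill it are illustrative choices, not part of any Xilinx tool.

```python
class LUT:
    """Behavioural model of a K-input look-up table backed by a 2^K x 1 memory."""

    def __init__(self, k, func):
        self.k = k
        # One stored bit per input combination; the address is the input vector.
        self.sram = [func(*self._address_to_bits(addr)) & 1 for addr in range(2 ** k)]

    def _address_to_bits(self, addr):
        # Most significant address bit corresponds to the first input.
        return [(addr >> i) & 1 for i in reversed(range(self.k))]

    def read(self, *inputs):
        """Apply the inputs as address lines and return the stored bit."""
        addr = 0
        for bit in inputs:
            addr = (addr << 1) | bit
        return self.sram[addr]


# Illustrative content: a 3-input majority function.
majority = LUT(3, lambda a, b, c: (a & b) | (a & c) | (b & c))
print(len(majority.sram), majority.read(1, 1, 0), majority.read(1, 0, 0))
```

Doubling of `len(sram)` with each extra input is the 2^K growth that limits practical LUT size.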
XC 3000
● The XC3000 CLB has five logic inputs (A–E), a common clock input (K), an asynchronous direct-
reset input (RD), and an enable (EC)
● Using programmable MUXes connected to the SRAM programming cells, you can independently
connect each of the two CLB outputs (X and Y) to the output of the flip-flops (QX and QY) or to the
output of the combinational logic (F and G).

• A 32-bit look-up table (LUT), stored in 32 bits of SRAM, provides the ability to
implement combinational logic.
XC 4000
● There are two significant architectural changes from the 3000 series block.
○ Two differently sized LUTs are used: a four-input LUT and a three-input LUT.
○ Another change in the Xilinx 4000 logic block is the use of two nonprogrammable
connections from the two four-input LUTs to the three-input LUT.
• If proper use can be made of these fast connections, FPGA performance can be greatly
improved.
XC 5200
● A Logic Cell, or LC, is used in the XC5200 family of Xilinx LCA FPGAs.
● The LC is similar to the CLBs in the XC2000/3000/4000 families, but simpler.
• The XC5200 LC contains
• a four-input LUT
• a flip-flop
• MUXes to handle signal switching.
• The arithmetic carry logic is separate from the LUTs.
• A limited capability to cascade functions is provided to gang two LCs in parallel to
provide the equivalent of a five-input LUT.
Altera MAX
● The architecture of the Altera FPGA has evolved from the PLA-based architecture of
traditional PLDs with its logic block consisting of wide fan-in (20 to over 100 inputs)
AND gates feeding into an OR gate with three to eight inputs.
● In addition to the wide AND-OR logic block, the MAX 5000 employs one other type of
logic block, called a logic expander.
● This is a wide-input NAND gate whose output can be connected to the AND-OR logic
block.
● While a logic expander incurs the same delay as the logic block, it takes up less area and
can be used to increase its effective number of product terms.
● Consider the function
F = A' · C · D + B' · C · D + A · B + B · C'
● This function has four product terms, and thus we cannot implement F using a macrocell
that has only a three-wide OR array.
● If we rewrite F as
F = (A' + B') · C · D + (A + C') · B
  = (A · B)' · (C · D) + (A' · C)' · B
● we can use logic expanders to form the expander terms (A · B)' and (A' · C)'.
● These extra product terms can even be shared with other macrocells if needed.
● The extra logic gates that form these shareable product terms are called a shared logic expander, or
just shared expander.
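The rewriting of F with expander terms can be verified exhaustively over all 16 input combinations. This is a small check sketch; the function names are illustrative.

```python
from itertools import product


def f_sum_of_products(a, b, c, d):
    """Original form: F = A'CD + B'CD + AB + BC' (four product terms)."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & c & d) | (nb & c & d) | (a & b) | (b & nc)


def f_with_expanders(a, b, c, d):
    """Rewritten form: F = (AB)'(CD) + (A'C)'B (two product terms)."""
    expander_1 = 1 - (a & b)          # (A · B)'
    expander_2 = 1 - ((1 - a) & c)    # (A' · C)'
    return (expander_1 & c & d) | (expander_2 & b)


equivalent = all(
    f_sum_of_products(*bits) == f_with_expanders(*bits)
    for bits in product((0, 1), repeat=4)
)
print("forms are equivalent:", equivalent)
```

The rewritten form needs only two product terms in the macrocell, which is why it fits a three-wide OR array.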
Microsemi (Actel) ACT

● The basic logic cells in the Actel ACT family of FPGAs are called logic
modules.
● Actel ACT 1 uses one type of logic module, and ACT 2 and 3 use two different
types of logic modules.
● It is possible to build a logic function by connecting logic signals to some or all
of the Logic Module inputs, and by connecting any remaining Logic Module
inputs to VDD or GND.
● The figure also shows the implementation of a logic function F = A.B+B.C'+D
in an ACT 1 logic module with the help of Shannon's expansion theorem
(Assignment 1).
● ACT 2 and ACT 3 use two different types of logic modules: the C-module and the
S-module.
● ACT 2 C-module
● Combinational module
● 5 input functions
● The S-module (Sequential module)
● same combinational function capability as the C-module but with an
additional sequential element that can be configured as a flip-flop.
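The Shannon expansion step mentioned above for F = A.B+B.C'+D can be sketched as a 2-to-1 multiplexer selecting between the two cofactors of F. Expanding about B is an arbitrary choice made here for illustration, and the sketch does not reproduce the actual pin assignment on the ACT 1 logic module, which is shown only in the figure.

```python
from itertools import product


def f(a, b, c, d):
    """F = A·B + B·C' + D as given in the text."""
    return (a & b) | (b & (1 - c)) | d


def mux2(sel, d0, d1):
    """2-to-1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def f_shannon(a, b, c, d):
    """Shannon expansion about B: F = B'·F(B=0) + B·F(B=1)."""
    cofactor_b0 = f(a, 0, c, d)   # simplifies to D
    cofactor_b1 = f(a, 1, c, d)   # simplifies to A + C' + D
    return mux2(b, cofactor_b0, cofactor_b1)


matches = all(f(*bits) == f_shannon(*bits) for bits in product((0, 1), repeat=4))
print("Shannon expansion matches F:", matches)
```

Repeating the expansion on each cofactor reduces the function to multiplexers fed by variables and constants, which is the form a multiplexer-based logic module implements.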
FPGA timing model
● Many FPGA and CPLD vendors provide a timing model in their data sheets that
allows estimation of path delays.
● Some example path delays that are of interest:
○ Minimum pin-to-pin (combinational) delay
■ (Through one input pin, through one combinational logic element, through one output pin.)
○ Minimum register-to-register delay
■ (From the clock input pin, through the clock-to-Q delay of the DFF of a logic element, through one
combinational logic element, to the setup time on the DFF input.)

● These timing models allow estimates of maximum attainable performance


● Routing delays will always complicate the timing model
● After a design is mapped to an FPGA or CPLD, use a static timing analysis
program to estimate the timing performance.
● Static timing analysis (STA) is a method of validating the timing performance of a design by
checking all possible paths for timing violations. STA breaks a design down into timing paths,
calculates the signal propagation delay along each path, and checks for violations of timing
constraints inside the design and at the input/output interface.

● Another way to perform timing analysis is to use dynamic simulation, which determines the full
behavior of the circuit for a given set of input stimulus vectors. Compared to dynamic
simulation, static timing analysis is much faster because it is not necessary to simulate the
logical operation of the circuit. STA is also more thorough because it checks all timing paths,
not just the logical conditions that are sensitized by a set of test vectors. However, STA can
only check the timing, not the functionality, of a circuit design.


● After breaking down a design into a set of timing paths, an STA tool calculates
the delay along each path. The total delay of a path is the sum of all cell and net
delays in the path.
● The path with the largest total delay is the “critical path”.
● It is required for calculating the maximum operating frequency and speed
performance of the device.
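The path-delay bookkeeping described above can be sketched as follows. This is not how a production STA tool is structured; it only shows the sum-per-path and pick-the-maximum idea. All path names and delay values are hypothetical.

```python
def critical_path(paths):
    """Return (name, total_delay_ns) of the slowest path.

    `paths` maps a path name to the list of cell and net delays along it.
    """
    totals = {name: sum(delays) for name, delays in paths.items()}
    slowest = max(totals, key=totals.get)
    return slowest, totals[slowest]


def max_frequency_mhz(critical_delay_ns):
    """Maximum clock frequency if the clock period equals the critical delay."""
    return 1000.0 / critical_delay_ns


# Hypothetical timing paths: each list holds cell and net delays in ns.
example_paths = {
    "reg_a -> reg_b": [1.0, 2.5, 0.5, 1.0],
    "reg_a -> reg_c": [1.0, 4.0, 1.5, 1.5],
    "reg_b -> reg_c": [1.0, 1.0, 0.5],
}

name, delay = critical_path(example_paths)
print(f"critical path: {name}, {delay:.1f} ns, fmax = {max_frequency_mhz(delay):.1f} MHz")
```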
Basic timing parameters
● TPD : Propagation delay is the amount of time it takes for a signal to pass through a
combinational element.

● TSUD : Setup time is the amount of time required for the input to a sequential
device to be stable before a clock edge.

● TH : Hold time is similar to setup time, but it deals with events after a clock
edge occurs. Hold time is the minimum amount of time required for the input
to a sequential device to be stable after a clock edge.

● TCO : Clock to output delay (clock-Q)


Actel Timing Model
● The above figure shows a simple timing model, since it deals only with the
logic inside the chip. It gives only an estimate of the delay, because the exact
delay cannot be predicted until the place-and-route step of the design process
is complete.
● As shown in the figure, the internal signal I1 may be the output of a register
(flip-flop); it then passes through combinational logic C1, then through a
register S1, and finally through S2. The register-to-register delay consists of a
clock-to-Q delay, the combinational delay between registers, and the setup time of the next
flip-flop.
● The speed of the system then depends on the slowest register-to-register delay,
that is, the critical path between registers. The clock period cannot be shorter
than this delay, or the signal will not reach the second register in time
to be clocked.
● For the standard speed grade ACT 3, the delay between the input of a C module
and the output is specified in the datasheet as a parameter tPD with a max value of
3 ns.
● The output of the C module is the input to the S1 module, which is configured to implement
combinational logic and a D flip-flop. The minimum setup time for this D flip-flop is
specified in the data sheet as tSUD, with a value of 0.8 ns.
● Consider the parameters inside the S module. The setup time and hold time
measured inside the S module are denoted t'SUD and t'H respectively, and the
clock-to-Q propagation delay is denoted t'CO. All these parameters are
measured using the internal clock CLKi. The propagation delay of the
combinational logic inside the S module is t'PD, and the delay of the combinational
logic that drives the flip-flop clock signal is denoted t'CLKD.
● Most FPGA vendors sort their chips according to their speed; the process is
known as speed grading or speed binning.

● For Actel family, this sorting is done according to the Logic Module
propagation delay tPD.

● The propagation delay is defined as the average of rising (tPLH) and falling (tPHL)
propagation delays of a Logic Module.
● If the designer is using fully synchronous design techniques, then one more
timing parameter needs to be considered: the worst-case timing, which is
the maximum delay the design may encounter.
● The critical path delay between registers is given below.
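The register-to-register sum described in this section (clock-to-Q delay plus combinational delay plus setup time) can be sketched as below. The tPD and tSUD values are the ones quoted in the text for the standard speed grade ACT 3; the tCO value is a placeholder assumption, since the text does not state it, so the printed result is illustrative only.

```python
T_PD_NS = 3.0    # C-module propagation delay, quoted in the text
T_SUD_NS = 0.8   # S-module flip-flop setup time, quoted in the text
T_CO_NS = 3.0    # ASSUMPTION: clock-to-Q delay placeholder, not given in the text


def register_to_register_delay_ns(t_co_ns, t_pd_ns, t_sud_ns):
    """Clock-to-Q delay + combinational delay + setup time of the next flip-flop."""
    return t_co_ns + t_pd_ns + t_sud_ns


delay = register_to_register_delay_ns(T_CO_NS, T_PD_NS, T_SUD_NS)
print(f"minimum clock period (before routing delays): {delay:.1f} ns")
```

Routing delays are not included, so the real critical path after place and route will be longer.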
Xilinx LCA Timing Model
● The above figure shows timing model of Xilinx LCA FPGAs.
● Xilinx FPGAs use two speed grade systems.
● The first uses the maximum guaranteed toggle rate of a CLB flip-flop,
measured in MHz, as a suffix, so the higher the toggle rate, the faster the device.
● The second uses the approximate delay time of the combinational logic in a
CLB in nanoseconds, so the lower the delay, the faster the device.
● For example, an XC4010-6 has tILO (combinational logic delay) equal to 6.0 ns
(the correspondence between speed grade and tILO is fairly accurate for
XC2000, XC4000 and XC5200 but it is less accurate for XC3000).
Altera MAX timing model
● The above figure shows the Altera MAX timing model for local signals.
● In figure (a), an internal signal I1 enters the local array (the LAB interconnect, with a
fixed delay t1 = tLOCAL = 0.5 ns), then passes through the AND array (delay t2 = tLAD =
4.0 ns) and to the macrocell flip-flop (with setup time t3 = tSU = 3.0 ns and clock-to-Q
or register delay t4 = tRD = 1.0 ns). Thus the total path delay = 0.5 + 4.0 + 3.0 + 1.0
= 8.5 ns.
● Figure (c) shows the use of parallel logic expander and figure (e) with a shared
expander.
● With the parallel logic expander, the extra product term is generated in parallel
with the other product terms, whereas with the shared expander it is
generated in series.
● The parallel expander delay, tPEXP = 1.0 ns, is then added to the total path delay,
making it 9.5 ns in this case.
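The two path-delay totals above follow directly from the quoted parameters. A minimal sketch, with an illustrative function name and the values taken from the text:

```python
# Altera MAX timing parameters quoted in the text, in ns.
T_LOCAL = 0.5   # local array (LAB interconnect) delay
T_LAD = 4.0     # AND array delay
T_SU = 3.0      # macrocell flip-flop setup time
T_RD = 1.0      # register (clock-to-Q) delay
T_PEXP = 1.0    # parallel expander delay


def max_local_path_delay_ns(use_parallel_expander=False):
    """Total delay of the local-signal path, optionally with a parallel expander."""
    delay = T_LOCAL + T_LAD + T_SU + T_RD
    if use_parallel_expander:
        delay += T_PEXP
    return delay


print(max_local_path_delay_ns(), max_local_path_delay_ns(use_parallel_expander=True))
```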
Power Dissipation (Actel)
The power dissipation of FPGAs depends on such factors as utilization, average
operating frequency, and load conditions, unlike most PALs and PLDs, which
have a fixed power consumption.

General Power Equation:


Static and Active power components
● Actel FPGAs have small static power components that result in lower
power dissipation than PALs or PLDs. By integrating multiple PALs/PLDs into
one FPGA, an even greater reduction in board-level power dissipation can be
achieved.
● The power due to standby current is typically a small component of the
overall power. For an ACT 3 device, the standby power is specified as 5
mWatts, worst case
● The static power dissipated by TTL loads depends on the number of outputs
driving high or low and the DC load current. Again, this value is typically
small. For instance, a 32-bit bus sinking 4 mA at 0.33 V will generate 42
mWatts with all outputs driving low and 140 mWatts with all outputs driving
high. The actual dissipation will average somewhere in between, as the I/Os switch
states with time.
● Power dissipation in CMOS devices is usually dominated by the active
(dynamic) power dissipation.
● This component is frequency dependent, a function of the logic and the
external I/O.
● Active power dissipation results from charging internal chip capacitances
of the interconnect, unprogrammed antifuses, module inputs, and module
outputs, plus external capacitance to PC board traces and load device
inputs.
● An additional component of the active power dissipation is the totem-
pole current in CMOS transistor pairs.
● The net effect can be associated with an equivalent capacitance that
can be combined with frequency and voltage to represent active power
dissipation.
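One way to reproduce the roughly 42 mW figure quoted above for the 32-bit bus is to multiply the number of outputs by the sink current and the output-low voltage. The formula and function name here are assumptions made for illustration. The 140 mW figure for outputs driving high depends on a voltage drop the text does not state, so it is not reproduced.

```python
def ttl_low_static_power_mw(num_outputs, sink_current_ma, output_low_voltage_v):
    """Static power of outputs driving low: N x I_OL x V_OL (mA x V = mW).

    ASSUMPTION: this simple product is the intended calculation; it matches
    the figure quoted in the text for the all-outputs-low case.
    """
    return num_outputs * sink_current_ma * output_low_voltage_v


bus_power = ttl_low_static_power_mw(num_outputs=32, sink_current_ma=4.0,
                                    output_low_voltage_v=0.33)
print(f"32-bit bus, all outputs low: {bus_power:.1f} mW")
```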
Equivalent Capacitance:

The power dissipated by a CMOS circuit can be expressed by the equation:

Power (µW) = CEQ × VCC^2 × F    (2)

where CEQ is the equivalent capacitance expressed in pF, VCC is the power supply voltage in volts, and F is
the switching frequency in MHz.

Equivalent capacitance is calculated by measuring ICCactive at a specified frequency and voltage
for each circuit component of interest. Measurements have been made over a range of
frequencies at a fixed value of VCC. Equivalent capacitance is frequency independent, so the
results may be used over a wide range of operating conditions.
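Equation (2) can be evaluated directly, since pF × V^2 × MHz comes out in µW. The sketch below uses hypothetical values for CEQ, VCC and F purely to show the units working out; they are not taken from any Actel data sheet.

```python
def active_power_uw(c_eq_pf, vcc_v, f_mhz):
    """Active power in µW from equation (2): CEQ (pF) x VCC^2 (V^2) x F (MHz)."""
    return c_eq_pf * vcc_v ** 2 * f_mhz


# Hypothetical example: 10 pF equivalent capacitance, 5 V supply, 20 MHz switching.
power_uw = active_power_uw(c_eq_pf=10.0, vcc_v=5.0, f_mhz=20.0)
print(f"{power_uw:.0f} µW = {power_uw / 1000:.1f} mW")
```

Because the supply voltage enters squared, halving VCC reduces this component to one quarter, while halving the frequency only halves it.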
I/O block architecture

Xilinx I/O Block:


● The Xilinx I/O cell is the Input/Output Block (IOB).
● It provides an interface between the LCA and the outside world.
● There is one IOB for every programmable pin on the LCA
package.
● Each IOB has input and output buffers to facilitate compatibility
with TTL and CMOS threshold levels.
● The IOB can serve as an input, output or tristated bidirectional
path.
Figure: Xilinx 3000 IOB
Input characteristics:

● Inputs from the pad can be brought into the interior of the chip either directly
or registered or both.
● Polarity of each clock line is programmable.
● Input clamping diodes are provided for electrostatic protection.
● Both direct input (from IOB pin I) and registered input (from IOB pin Q) signals
are available for interconnect.
● For reliable operation, inputs should have transition times of less than 100 ns
and should not be left floating.
● Each user IOB includes a programmable high-impedance pull-up resistor,
which may be selected by the program to provide a constant High for
otherwise undriven package pins.
Output characteristics:

● Outputs can be connected directly to pad or registered or both.


● The output buffer is provided with 3-state and slew-rate control.
● Configuration options allow each IOB an inversion, a controlled slew rate
and a high-impedance pull-up.
Figure: Xilinx 4000 and 4000A IOB
Input characteristics:

● Inputs are routed to an input register that can be programmed
as either an edge-triggered flip-flop or a level-sensitive
transparent latch.
● The data input to the register can be delayed by several
nanoseconds to compensate for the delay on the clock signal.
● The I1 and I2 signals that exit the block can each carry either
the direct or registered input signal.
Output Characteristics:

● Output signals can be inverted or not inverted, and can pass directly to
the pad or be stored in an edge-triggered flip-flop.
● Optionally, an output enable signal can be used to place the output buffer
in a high-impedance state, implementing 3-state outputs or bidirectional
I/O.
● Under configuration control, the output (OUT) and output enable (OE)
signals can be inverted.
● The slew rate of the output buffer can be reduced to minimize power bus
transients when switching non-critical signals.
● Programmable pull-up and pull-down resistors are useful for tying unused
pins to VCC or ground to minimize power consumption.
Input characteristics:

● The XC5200 inputs can be globally configured for either TTL (1.2V) or CMOS
thresholds, using an option in the bitstream generation software.
● The inputs of XC5200-Series 5-Volt devices can be driven by the outputs of any
3.3-Volt device, if the 5-Volt inputs are in TTL mode.
● The data input to the register can optionally be delayed by several nanoseconds.
● The XC5200 IOB has a one-tap delay element: either the delay is inserted
(default), or it is not. The delay guarantees a zero hold time with respect to clocks
routed through any of the XC5200 global clock buffers. For a shorter input register
setup time, with non-zero hold, attach a NODELAY attribute or property to the flip-
flop or input buffer.
Output characteristics:

● Output signals can optionally be inverted within the IOB, and can pass directly to the
pad or be registered.
● An active-High 3-state signal can be used to place the output buffer in a high-
impedance state, implementing 3-state outputs or bidirectional I/O. Under
configuration control, the output (OUT) and output 3-state (T) signals can be
inverted. The polarity of these signals is independently configured for each IOB
● The XC5200 devices provide a guaranteed output sink current of 8 mA.
● An output can be configured as open-drain (open-collector) by placing an OBUFT
symbol in a schematic or HDL code, then tying the 3-state pin (T) to the output
signal, and the input pin (I) to Ground.
