Xcell 38
Issue 38
Winter 2000
PRODUCTS
New PROMS Simplify
FPGA Designs
APPLICATIONS
Creating High-Performance
Digital Down Converters
SOFTWARE
New Xilinx Foundation
Series ISE Software with
Integrated Design Flows
NEWS
Xilinx Launches
Platform-FPGA Initiative
Cover Story
Synopsys Chief Technology
Officer Discusses Platform-Base R
Business is booming,
and Xilinx is growing rapidly...
Programmable logic continues to grow faster than other segments of the semiconductor market, and
Xilinx continues to grow along with it; there is no end in sight. To keep up with this unprecedented
expansion, we are building several new facilities, acquiring new companies, and incorporating
the best available complementary technologies.
We are leading the industry, not only with the most advanced device and software technologies but
also with the most ambitious plans for future developments. Here are some of our most recent
activities:
• Xilinx purchases two new buildings in San Jose. The buildings will provide approximately 200,000
additional square feet and are expected to house up to 700 new Xilinx employees. The purchase of the
new buildings is the latest in a string of new construction projects we have undertaken in the last few
years. In 1999, we completed construction of a fourth building at our San Jose headquarters, and other
projects are also underway at Xilinx locations in Colorado, Ireland, and California.
• Xilinx acquires Visual Software Solutions, Inc. (VSS). Their expertise will help us further extend our software leadership and allow us to deliver a variety of customized tools that facilitate HDL-based design using our new Virtex-II FPGAs, thus improving your time-to-market. Included in the acquisition are the VSS HDL Bencher™ and StateCAD™ design tools.
• Xilinx acquires RocketChips, a leading developer of ultra-high-speed CMOS mixed-signal transceivers serving the networking, telecommunications, and enterprise storage markets. The RocketChips gigabit and multi-gigabit serial CMOS transceiver technologies provide solutions for a wide range of serial system architectures, and this technology will be a key feature of our next-generation FPGA families.
• Xilinx acquires Tornado, a full-function formal verification application deploying state-of-the-art circuit equivalence checking techniques. Based on many years of research and development efforts by Veriphia, this new software adds significant value to our advanced development tools. We plan to develop this technology even further and focus it on the Virtex FPGA architectures, in alliance with key EDA partners.
• Xilinx acquires Integral Design, a privately held design services firm headquartered in Dublin, Ireland. The acquisition enhances our professional design services capabilities in the communications and multimedia market segments. Recent advances in FPGA performance and capabilities continue to drive customer needs for additional design resources. Design services enable you to use dedicated designers with experience in Xilinx solutions to augment your own internal expertise and improve your time-to-market.

These developments continue to enhance our capability to offer you the best programmable logic devices, development tools, and services in the industry.

Our current capabilities already give you a significant ease-of-use and time-to-market advantage. As the market expands, costs decrease, and many new applications become possible, thus fueling even more growth. You can see why programmable logic is quickly becoming the technology of choice for many more applications, from low-cost consumer devices to high-performance switching systems; there simply is no faster or easier way to create the systems of the future. And Xilinx is well prepared to continue leading the way.

Carlis Collins
Editor

EDITOR Carlis Collins
[email protected]
408-879-4519
SENIOR DESIGNER Andy Larg
BOARD OF ADVISORS Dave Stieg, Mike Seither, Peter Alfke

Xcell journal
Xilinx, Inc.
2100 Logic Drive
San Jose, CA 95124-3450
Phone: 408-559-7778
FAX: 408-879-4780
©2000 Xilinx Inc. All rights reserved.

Xcell is published quarterly. XILINX, the Xilinx logo, and CoolRunner are registered trademarks of Xilinx, Inc. Virtex, LogiCORE, IRL, Spartan, SpartanXL, Alliance Series, Foundation Series, CORE Generator, IP Internet Capture, IP Remote Interface, MultiLinx, QPRO, SelectI/O, SelectI/O+, True Dual-Port, WebFITTER, WebPACK, ChipViewer, Select RAM, Block Ram, Xilinx Online, and all XC-prefix products are trademarks, and The Programmable Logic Company is a service mark of Xilinx, Inc. Other brand or product names are trademarks or registered trademarks of their respective owners.

The articles, information, and other materials included in this issue are provided solely for the convenience of our readers. Xilinx makes no warranties, express, implied, statutory, or otherwise, and accepts no liability with respect to any such articles, information, or other materials or their use, and any use thereof is solely at the risk of the user. Any person or entity using such information in any way releases and waives any claim it might have against Xilinx for any loss, damage, or expense caused thereby.
Contents Fall 2000

Platform FPGA - The Future of Logic Design ... 4
Designing with FPGA Platforms ... 8
Inferring Multiplexers in FPGA Compiler II and FPGA Express ... 11
Spartan-II PCI Development Kit ... 13
Choosing the ARC User-Configurable Processor ... 14
Re-thinking Your Verification Strategies for Multimillion-gate FPGAs ... 18
Foundation ISE - What's In a Name ... 21
Xilinx Foundation Series ISE Software - Delivering the Benefits of HDL Design ... 22
StateCAD XE for Optimizing State Machine Design ... 24
HDL Bencher XE for Fast Behavioral FPGA Verification ... 26
Guided Design Using BLIS ... 28
New High-Density Virtex PROMS and Cost-Effective Spartan-II PROMS ... 30
Create Efficient FIR Filters Using Virtex and Spartan FPGAs ... 32
LogiCORE PCI Module Is a Key Element in Voice over IP Applications ... 35
Creating Finite State Machines ... 36
Design a Low-Power SMBus System Using CoolRunner CPLDs ... 39
CoolRunner Power-Saving Tips and Tricks ... 40
Creating a Low Power Serial Peripheral Interface ... 42
CoolRunner CPLDs Beat the Heat ... 43
FPGAs - The Solution to Ultra-Deep Sub-Micron Design ... 44
Implementing a Histogram for Image Processing Applications ... 46
High Performance Digital Down-Converters for FPGAs ... 48
Digital Image Processing with LogiCOREs ... 52
Stackable Development Boards for Spartan-II, Virtex, and Virtex-E FPGAs ... 54

Cover Story, Page 8
Designing With FPGA Platforms - page 8
The Chief Technology Officer at Synopsys discusses the need for platform-based design in an era of system-on-a-chip FPGAs.

Perspective, Page 14
Choosing the ARC User-Configurable Processor - page 14
ARC Cores and Xilinx provide everything you need to develop custom processor applications.

New Products, Page 22
New Xilinx Foundation Series ISE Software - page 22
Integrated design flows increase your productivity and accelerate your time to market.

New Products, Page 30
New High-Density and Cost-Effective PROMS - page 30
Xilinx announces the addition of the XC17V00 and XC17S00A families to its existing line of PROMS.

Applications, Page 48
High-Performance Digital Down Converters
Platform FPGA-
The Future of
Logic Design
Xilinx and its partners are building
the high-performance technology
platform on which the designs of
the future will emerge.
View from the top
Gigabit Serial I/O Capability

Many new systems today require much faster data transfer between systems, boards, and devices, due primarily to the ever-increasing demand for faster networks. Very high speed (gigabit per second) serial I/O capability promises to solve this difficult problem.

Traditionally, data has been shared by using parallel busses such as PCI (Peripheral Component Interconnect). However, there are inherent limitations with shared busses. To increase the speed of a shared bus, you can either increase the speed of each wire (which is very difficult to do because there are many of them), or you can increase the number of wires (which takes more and more I/O pins). For example, PCI was once just 32 bits wide; now you also have 64-bit PCI, and that's not enough. The problem with this approach to increasing bandwidth is that at some point you reach a level of decreasing return; the extra pins and the need for shared bus protocols limit the performance and make it prohibitively expensive.

The best solution we have for this bandwidth bottleneck is to use point-to-point connections, over a single pair of wires, operating at very high speeds. Currently, with this technology, you can achieve a data rate of two to three gigabits per second. The big advantage of this method is that you use fewer wires and less power, and the total amount of data you can move is in fact higher than with a typical parallel bus.

To create a gigabit serial I/O channel, a hard core is needed; you cannot achieve these speeds with soft cores in an FPGA. The hard core performs several functions: it receives and transmits the data, and it also recovers the clock (because you can recover the clock from the data, a single pair of wires is all you need for data transfer). The hard core must also serialize and de-serialize the data. By de-serializing the data to a 16- or 32-bit internal bus, the data speed is then reduced by a factor of 16 or 32, which an FPGA can easily handle.

With gigabit serial I/O, all the de-serialization is done within the chip. When the work is done, you can serialize the data

The high-speed SkyRail transceiver is compliant with industry standards such as Gigabit Ethernet and Fibre Channel in addition to the emerging 10-Gigabit Ethernet (IEEE 802.3ae) standard. By integrating quad transceivers, which are used to create 10-gigabit attachment unit (XAUI) interfaces, a single FPGA can interface to both 10-Gigabit Ethernet and OC-192c. The high-speed transceiver is also compliant with the 2.5 Gbps InfiniBand™ architecture standard being created by the InfiniBand Trade Association.

The RocketChips Acquisition

High-speed serial I/O capability is so important that we decided not to stop at the 3.125 Gbps speed offered by the Conexant core; we are developing the technology further. That's why we recently acquired a company called RocketChips, which is very active in creating high-speed serial I/O cores. RocketChips already has a product that is very similar to the Conexant core, and they plan to develop even higher speed cores operating at 5 to 10 Gbps.
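The de-serialization arithmetic in the Gigabit Serial I/O discussion can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are hypothetical, the 3.125 Gbps figure is the Conexant core speed quoted in the text, and the 32-bit, 33 MHz PCI figure is a theoretical peak used for comparison. Protocol overhead and line-encoding overhead are ignored.

```python
def word_rate_mhz(line_rate_gbps, bus_width_bits):
    """Internal word rate (MHz) after de-serializing one serial lane
    onto a parallel bus of the given width. Hypothetical helper."""
    return line_rate_gbps * 1000.0 / bus_width_bits


def parallel_bus_peak_gbps(width_bits, clock_mhz):
    """Theoretical peak throughput (Gbps) of a parallel bus,
    ignoring arbitration and protocol overhead. Hypothetical helper."""
    return width_bits * clock_mhz / 1000.0


if __name__ == "__main__":
    serial_gbps = 3.125  # serial lane speed quoted in the text
    for width in (16, 32):
        print(width, "bit internal bus:", word_rate_mhz(serial_gbps, width), "MHz")
    # Assumed comparison point: 32-bit PCI clocked at 33 MHz
    print("32-bit/33 MHz PCI peak:", parallel_bus_peak_gbps(32, 33), "Gbps")
```

Under these assumptions a single 3.125 Gbps lane becomes a word stream of roughly 195 MHz on a 16-bit bus or roughly 98 MHz on a 32-bit bus, while the whole 32-bit PCI bus peaks at about 1.06 Gbps.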
Cover Story
Platform-based Design

Designing with FPGA Platforms

The Chief Technology Officer at Synopsys discusses the need for platform-based design in an era of system-on-a-chip FPGAs.

by Raul Camposano
Chief Technology Officer, Synopsys, Inc.
[email protected]

As FPGAs move into million-gate densities, a world of new possibilities and potential applications is opening up for programmable logic devices. And along with these new opportunities come many new challenges. The pairing of a low-cost, high-performance PowerPC processor (and other hard cores) with the soft cores and programmable logic circuitry in Xilinx Virtex-II™ FPGAs means that you will now be confronting challenges similar to those ASIC designers encountered when they made the transition to system-on-a-chip (SoC) ASICs.

Everyone who participates in the multimillion-gate segment of the FPGA market is wrestling with the same issues: increased
ture that is geared towards a specific application, such as cell phone base stations or set-top boxes, among others; it is customized through software and by adding customized logic and IP.

An FPGA platform enables you to differentiate your products by adding customized logic and IP using the tightly integrated FPGA fabric. Platforms are important in the era of multimillion-gate FPGAs because they enable you to focus on adding value through custom IP rather than wasting time and resources by recreating standard components.

Figure 1 - Platform FPGA: Xilinx Virtex-II FPGA with embedded PowerPC processor

Platform-Based Design

A central piece of any platform is the embedded processor, such as the IBM PowerPC processor core in the Xilinx Virtex-II platform. A typical platform might also include a bus, DSP, input/output channels, mixed signal functions, memory, and some configurable logic such as shown in Figure 1. FPGA design thus becomes platform design; rather than simply designing with gates, you must now focus on designing entire systems.

For you to effectively exploit a platform by designing at the system level, four primary design considerations must be addressed:
• Hardware design.
• Software design.
• Integration of hardware, software, and IP.
• Verification of the complete system (on a chip).

Synopsys delivers solutions in all four of these areas. The design of hardware has been our traditional domain, and we offer a suite of FPGA synthesis tools for this purpose. FPGA Express™ addresses the push-button, fast turn-around market, while FPGA Compiler II™ addresses more complex designs and compatibility with the ASIC design flow. Looking further into the future, other synthesis technologies such as Synopsys' Physical Synthesis will enable full timing closure for platform FPGAs.

Open SystemC

One of the most difficult aspects of software design involves how to interface software effectively with hardware. Open SystemC, a set of C++ class libraries that enables electronic design at the system level, provides an important tool for designing software and hardware in a common language framework. Based on C and C++ (the languages of choice for most algorithm developers, system architects, and software developers) SystemC also includes all the language elements necessary to effectively address hardware design. In this way, trade-offs between hardware and software can be addressed dynamically, even including reconfiguration in the field.

Figure 2 - SystemC tools

SystemC helps you create both systems and chips; the suite of tools and methodologies Synopsys has developed around SystemC significantly accelerate the design of elec-
Applications Software

Inferring Multiplexers in FPGA Compiler II and FPGA Express

How to get better results by automatically inferring multiplexers that fully utilize architecture-specific FPGA resources.

by Alan Ma
Senior Corporate Applications Engineer, Synopsys, Inc.
[email protected]

In general, multiplexers can be implemented by using Look Up Tables (LUTs). To obtain the best quality of results (QoR), Synopsys FPGA Compiler II™ and FPGA Express™ (FCII/FE) take it one step further by utilizing the built-in multiplexer resources in high-density FPGAs, which produces significantly better results in both area and speed.

The Process

During elaboration, the process of translating the text-based description of a design to an architecture-independent gate-level representation, FCII/FE infers a generic primitive called MUX_OP when it encounters multiplexers in the Hardware Description Language (HDL). It is during optimization where MUX_OPs are mapped to architecture-specific multiplexer resources. The following sections describe the requirements for MUX_OP to be inferred.

General Implementation

Our research indicates that using architecture-specific multiplexer resources is only beneficial when the number of inputs meets certain requirements. Table 1 illustrates the multiplexer sizes and the primitives FCII/FE utilizes for Xilinx Virtex-II, Virtex, and XC4000 FPGAs (and their derivatives). FCII/FE automatically maps to these hardware resources (primitives) when you follow the recommended coding guidelines.

Architecture   Min. Inputs   Max. Inputs   Primitives Used
Virtex-II      4             256           LUT, MUXF5, MUXF6
Virtex         4             256           LUT, MUXF5, MUXF6
XC4000         4             256           FMAP, HMAP

Table 1 - Multiplexer size requirements for automatic inference

Coding Guidelines

Synopsys recommends the use of CASE statements to describe multiplexer logic. When the requirements on the number of inputs for the target architecture are met (as shown in Table 1), FCII/FE maps the design to architecture-specific multiplexer resources if at least 75% of all possible cases are specified.

Figure 1 shows an example of an eight-to-one multiplexer in Verilog. Figure 2 illustrates its VHDL equivalent. Note that the control signal sel has three bits so there can be as many as eight possible cases. As a result, at least six (75% of eight) cases need to be specified for multiplexers to be inferred.
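The 75% rule and the Table 1 size limits can be restated as a small check. The Python sketch below only models the rule as the article describes it; it is not the algorithm FCII/FE actually runs, and the names `MUX_LIMITS`, `min_cases_required`, and `maps_to_mux_resources` are hypothetical.

```python
# Min/max input counts taken from Table 1; the dictionary name is illustrative.
MUX_LIMITS = {
    "Virtex-II": (4, 256),
    "Virtex": (4, 256),
    "XC4000": (4, 256),
}


def min_cases_required(sel_bits):
    """Smallest number of specified cases that reaches 75% of all
    2**sel_bits possible cases (ceiling, computed with integers)."""
    total_cases = 2 ** sel_bits
    return (3 * total_cases + 3) // 4


def maps_to_mux_resources(arch, num_inputs, sel_bits, cases_specified):
    """True when both conditions stated in the article hold:
    the input count is within the Table 1 range for the architecture,
    and at least 75% of the possible cases are specified."""
    lo, hi = MUX_LIMITS[arch]
    size_ok = lo <= num_inputs <= hi
    coverage_ok = cases_specified >= min_cases_required(sel_bits)
    return size_ok and coverage_ok


if __name__ == "__main__":
    # Eight-to-one multiplexer with a three-bit select, as in the article
    print(min_cases_required(3))
    print(maps_to_mux_resources("Virtex", 8, 3, 6))
    print(maps_to_mux_resources("Virtex", 8, 3, 5))
```

For the eight-to-one example, the check reproduces the article's figure of six required cases: six specified cases pass, five do not.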
New Products Development Tools

Spartan-II PCI Development Kit

by Jim Beneke
Technical Marketing Manager, Insight Electronics
[email protected]

When Xilinx introduced the Spartan™-II FPGA family in January 2000, they not only offered the lowest cost FPGA devices with system-level features, they also enabled programmable logic to effectively replace off-the-shelf ASSP devices for 32-bit PCI applications. Combined with the proven PCI32 LogiCORE™ interface from Xilinx, the Spartan-II PCI solution was the common-sense choice for most PCI designs.

Unfortunately, designers wishing to target a Spartan-II device for a PCI project were not able to prototype their design with an off-the-shelf PCI platform. Insight Electronics recognized the need for this type of development board, and introduced the Spartan-II PCI Development Kit. The kit includes a Spartan-II prototype board, single-use Spartan PCI32 LogiCORE license, Windows driver development software, one day (eight hours) of Insight Design Services support, reference designs, Windows-based applications, example Windows 98/NT drivers, source code, and hardware documentation. The demonstration board is based on the 150K-gate Spartan-II FPGA, in a 208-pin plastic quad flat package (PQFP).

Implementing the full initiator/target PCI interface in the FPGA only consumes about ten percent of the logic resources, leaving approximately 135K gates for custom user back ends. Unlike other PCI prototype cards, the Spartan-II PCI board does not contain back-end application circuits to complicate your custom design. Instead, all user I/Os are brought out to expansion connectors for easy access and interfacing. This allows your designs to be quickly implemented, configured, and tested. Figure 1 shows a block diagram of the Spartan-II PCI card included in the kit.

Figure 1 - Spartan-II PCI development board

The New Reference Design Center

In addition to the Spartan-II FPGA, the PCI board also includes the new Xilinx XC18V01 in-system programmable configuration PROM. This allows PCI application designs to be quickly downloaded multiple times to the board and saved in non-volatile memory.

With this re-configurable feature, Insight is including access to its new Reference Design Center. At the Reference Design Center, owners of the Spartan-II PCI kit can download pre-configured PCI application designs and run them on the demonstration board. Developed by Insight Design Services, these off-the-shelf application designs can be used as is, or can be customized to meet certain application needs. In addition to providing reference design bit streams and their associated source code files, the Reference Design Center also provides example Windows drivers and Windows-based application programs. Both drivers and application programs are provided with C++ source code so you can understand how the examples work.

To assist in the development and debugging of Windows device drivers, the Spartan-II PCI Kit includes Compuware's NuMega driver development software. The NuMega package simplifies the task of writing and configuring Windows drivers through a series of GUI windows.

The Xilinx Spartan 32/33 PCI Core

The Insight Spartan-II PCI Development Kit includes the new single-use version of the Xilinx Spartan-only 32-bit, 33 MHz PCI core. The single-use license allows the kit owner to support a single production PCI core implementation. If multiple PCI core solutions are required, then the core license can be upgraded to an unlimited license for a nominal fee. The 32-bit Spartan PCI core is configured and downloaded through the Xilinx PCI Lounge. The downloadable core netlist is fully PCI v2.2 compliant and supports initiator and target functions with zero-wait-state burst operation.

Conclusion

By providing exactly what is needed to complete a PCI design, the Spartan-II PCI Kit meets the demands of both experienced and new designers of programmable logic-based PCI interfaces. Several versions of the Spartan-II PCI Development Kit are available from Insight Electronics. Prices range from $145 for a PCI card only kit, to $3,995 for the complete Spartan-II PCI Development Kit. For more information, go to www.insightelectronics.com/solutions/kits/xilinx/spartan-iipci.html.
Perspective Configurable Processors
Integrated Software Environment

Because of the typical complexity of the software code, ARC offers the Metaware development environment. This professional set of software development tools includes a C/C++ compiler, assembler/linker, and the SeeCode™ source-level debugger. Most importantly, it offers you the ability to debug the embedded software running on the processor in the FPGA. It is critical that the core and its host interface include execution control capabilities like breakpoint checking so you can break the program execution or monitor reads and writes to program variables.

As the software content of a design increases, another important factor is the range of applications supported and the available systems software. For example, if a design requires several hundred Kbytes of code along with standard communications software, such as TCP/IP protocols, you can save several months or more of design time by purchasing a real-time operating system (RTOS) that includes prepackaged protocols. ARC supports a large variety of commercially distributed RTOS from leading vendors and is constantly increasing their ease of integration.

In addition to the software tools and applications described above, another critical factor in choosing the ARC core is its level of flexibility. Unlike other configurable processors available today, which sometimes require you to manually "hack" the HDL code, the ARC processor core enables you to easily select special options for configuring the processor. Hacking the HDL code after configuring the processor core might break the core, or even make it incompatible with the software development tools.

ARC provides the flexible ARChitect Graphic User Interface (GUI) that can be used to safely create your custom configured processor. This is very helpful when using a soft processor in an FPGA, and it allows you to experiment with different options and configurations within minutes.

[Figure: "ARC, Third Generation IP"]
• The ARC IP is deeply embedded with the rest of the logic and interfaces directly with other customer logic functions in the Xilinx FPGA.
• "Gate-hungry" complex system buses and associated logic are no longer needed to reach high performance because of the tight integration.

Instruction Set Flexibility

The instruction set is one of the most important aspects to consider when choosing a configurable processor. One potential disadvantage of soft processors is that they cannot attain the high clock speeds of a hard processor. For a conventional processor design, the clock speed is essentially the key determinant of performance. The ARC processor changes this equation by offering a configurable instruction set and the ability to add custom instructions. This enables you to accelerate an algorithm by selecting or adding a few appropriate (but powerful) instructions specifically needed for the application that is being executed. Thus, you can get the best of both RISC (Reduced Instruction Set Computer) and CISC (Complex Instruction Set Computer) processor design architectures. This approach provides high performance at lower clock speeds, while still maintaining a software programmable solution.

Instruction extensions are available from ARC and some third parties. Plug-ins can be used and implemented directly in the design. For additional capability, you can also create your own specific instructions. Custom instruction extensions offer you a particularly powerful way to accelerate application performance while retaining programmability. Consider the example of a DES (Data Encryption Standard) encryption application: by adding specialist bit-permutation and cipher instructions, and additional registers to hold the keys, it is possible to greatly accelerate a range of encryption algorithms.

To provide a truly configurable instruction set, it is also important that the number of clock cycles for an instruction extension is configurable. For example, the ARC processor enables the addition of multi-cycle instructions to the pipeline where desired, and single-cycle operations to proceed in parallel with long latency ones. This is an advantage over architectures that enforce a strict RISC paradigm where
every instruction must execute in a single cycle. Such restrictions may make it impossible to add very powerful, complex instructions that require multiple cycles to execute.

Interaction with Other Logic Functions

The ARC processor can further improve performance by enabling tight integration between the processor core and other logic on the FPGA. Traditional processor cores typically communicate with peripheral hardware via a system bus. To send data to the processor, the peripheral interrupts the processor, which then processes the interrupt using a software routine known as an ISR (Interrupt Service Routine). In addition to supporting this approach, the ARC processor enables you to add new core extension registers. If desired, the new registers can be directly accessed by peripheral logic, enabling such devices to communicate with the processor directly. These alternative approaches can improve performance and reduce gate count by eliminating the need to duplicate a complex system bus and its arbitration logic in an FPGA.

It is no longer necessary to pass data via a bus or to interrupt the processor to have it load data from a memory-mapped register. Since the special registers are unique to a particular piece of peripheral logic, there is no need for any decoding or arbitration logic. The firmware simply selects the special purpose registers to communicate with the peripheral.

In addition to providing extension registers, configurable processors like the ARC core can also simplify integration with additional logic by providing multiple buses. This approach lets operations such as instruction fetches, load/stores, and communication with peripheral logic reside on separate buses. As a result, the bus protocols of each bus can be relatively simple since there is no need to arbitrate between multiple devices attempting to control one bus. The ARC processor has four buses, consisting of instruction and data buses (Harvard architecture), a bus directly into the processor registers (primarily used for debugging), and an auxiliary bus (typically used to connect peripheral logic). The auxiliary bus has a very simple interface that virtually enables peripherals to be connected with just a few wires. This is well suited to FPGAs where there is no actual bus, allowing peripherals to be efficiently connected in a point-to-point manner.

Tool Configurability

Any processor that offers a high degree of configurability must also offer equally configurable software tools and a debugging environment that work in coordination. It is of no use to add new instructions to the processor if there is no way of telling the compiler and assembler about them so that actual software programs can take advantage of them. In a similar vein, the compiler must let you specify which instructions will be present in the processor, as well as be able to take advantage of features such as multipliers or barrel shifters when they are included. In fact, software tool configurability is one of the greatest challenges in providing a truly configurable processor solution.

ARC and Xilinx are responding to this challenge by offering a complete "plug and play" solution to FPGA designers. In addition, the ARC tools suite allows you to enhance the original configurations offered in a simple manner.

Conclusion

Soft processor cores give you the ability to include processors in standard FPGAs. Configurable cores can help you achieve higher performance at lower clock rates through instruction extension and peripheral logic integration. ARC and Xilinx offer the perfect combination of a configurable core with powerful extensions and third party "plug-ins," in addition to a complete development environment and operating system support, ready to use with Xilinx FPGA technology.

[Figure: ARChitect creates an HDL description of the CPU (VHDL, Verilog) and a complete software tool chain to program it (C, C++, ASM, profiler, linker, simulator, debugger, etc.)]
Tools configurability: ARChitect, the ARC Graphic User Interface that makes it all possible
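To see why the article singles out bit permutation as a candidate for a custom instruction, it helps to look at what the operation costs in plain software. The Python sketch below is illustrative only: `TOY_TABLE` is a made-up 8-bit permutation, not one of the DES tables, and `permute_bits` and `invert_table` are hypothetical helper names. Each output bit needs its own shift, mask, and OR, whereas in hardware a fixed permutation is just wiring.

```python
def permute_bits(value, table, width):
    """Software bit permutation: output bit i (counting from the MSB)
    is taken from input bit table[i] (also counting from the MSB).
    One shift, mask, and OR per output bit."""
    out = 0
    for src in table:
        bit = (value >> (width - 1 - src)) & 1
        out = (out << 1) | bit
    return out


def invert_table(table):
    """Build the table that undoes the permutation described by `table`."""
    inverse = [0] * len(table)
    for out_pos, src in enumerate(table):
        inverse[src] = out_pos
    return inverse


# Hypothetical 8-bit permutation used only for illustration (not a DES table)
TOY_TABLE = [1, 5, 2, 0, 3, 7, 4, 6]

if __name__ == "__main__":
    scrambled = permute_bits(0b10000000, TOY_TABLE, 8)
    restored = permute_bits(scrambled, invert_table(TOY_TABLE), 8)
    print(format(scrambled, "08b"), format(restored, "08b"))
```

A 64-bit block processed this way takes 64 loop iterations per permutation, which is the kind of work a single dedicated permutation instruction would replace.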
Perspective Design Verification
Re-thinking Your
Verification Strategies
for Multimillion-gate
FPGAs.
How do you alter your verification
techniques to meet today's high gate
count requirements? It depends on
your background and experience.
by Thomas D. Tessier
President, t2design Incorporated
[email protected]
FPGA verification is essential for successful on-time product delivery, and today's million-gate FPGAs require you to re-think your old verification strategies. Many engineers continue to use simulator-specific approaches for verification; the simulation tools are primarily used for module testing, while the lab is used for system-level integration. This approach requires the engineer to manually stimulate signals and view the resulting waveform responses. Because this process is time consuming, error prone, and difficult to repeat, engineers often spend minimal time in simulation and quickly move to debugging in the lab. Multimillion-gate FPGAs implement functions far too complex to rely on this ad-hoc method.

Designers are choosing million-gate FPGAs because they are fast enough and large enough to handle the design complexity that was previously achievable only with an ASIC. When ASIC engineers begin to use high density FPGAs, they take their verification approaches with them. Those who use a validation process with robust tools and a complete self-checking testbench environment find that continuing to use their familiar testing approaches now causes them to lose valuable design cycle time. ASIC
Designers can benefit from a carefully defined and executed verification plan that takes FPGA reprogramability into consideration. Time that was once well spent in exhaustive verification at the RTL level with an ASIC, now becomes costly for a high density FPGA.

What is Verification?

Verification is not synonymous with simulation. It is a strategy to make sure all parts of the system conform to the specification document; simulation is a tool used in the verification effort. The basic components of verification are shown in Figure 1.

Specification

A detailed and complete specification is essential for producing working products, on schedule. The specification document is the foundation of the verification plan, and describes the features to be implemented, under what conditions they occur, and what their expected outputs should be. This documentation should

list offers examples of the type of information you need to identify:
• External interfaces
  - Stimulus and response
  - Transaction level, such as Read vs. Write operations
  - Timing requirements
• HDL models available to assist in testbench development
  - Packaged with proposed Intellectual Property (IP)
• Tools available to the project
  - Simulators
  - Static Analysis
  - Lab-based tools
• Performance Requirements, such as: need 32 block data write @ 66 MHz with a latency of less than 300 ns.

Execution

A verification strategy that best suits your

tions that are essential to simulate and those that can be tested during in-system test. The execution of the Verification Plan requires simulation and in-system test on the target PC board, the final stages of the pyramid.

Verification Simulation

Simulation has two components:
• Dynamic simulation describes behavioral HDL, RTL, and gates.
• Static analysis encompasses Static Timing Analysis (STA), Formal Verification and Signal Integrity Analysis.

In-System Test

During in-system test you have a distinct advantage when using FPGAs over ASICs. An obvious benefit is the ability to reprogram the FPGA until the desired functionality is achieved. You also have an additional advantage with the Xilinx ChipScope Integrated Logic Analyzer which enables you to observe internal
not determine implementation–that is left design means breaking out those func- nodes of the chip, on your PC board,
to the experience of the RTL designers. while running at system speeds.
Verification Plan
RTL engineers and verification engineers
share the responsibility for implementing
the test plan. The level of test granularity
(or detail) is outlined at: transactions, pro-
Execution
tocol, interfaces and timing. Essential
functions are identified. A determination
of the number of testbenches needed, Simulation & In-System
their complexity, and test module depend- Test
encies is made. Static: Dynamic:
• Signal Integrity • Behavioral
Any discrepancies in design implementa- • Static Timing • RTL
tion versus testbench results should be • Formal Verification • Gates
referred back to the specification for clari-
fication. This is not a new concept but
Verification Plan
often overlooked in the rush to produce a
Breadth of Detail
product. When all elements described
• Transaction
within the test plan are checked off, the • Protocol
verification effort has been completed to • Interfaces
the required level of confidence. To opti- • Timing
mize your verification effort the following • Correct by Construction
Specification
Foundation ISE –
Year 2000 European Event Schedule
Dec 5 Embedded Computing/Real Time Show, Tel Aviv, Israel
Applications Software
one, or more of these design modules are Integrated Environment to efficiently transfer design data automati-
modified. for Design Optimization cally. What’s more, front to back design flow
strategies are used, enabling the individual
The Foundation Series ISE software will You usually have some overall design strategy
tool’s features to be leveraged to their greatest
manage all modules in the design for you. that you are looking to optimize in your
benefits. In a non-integrated environment
For example, it knows about all of the HDL design flow. For example, your strategy may
these communications tasks and decisions
code in your design, and it knows when the place highest priority on fitting the design in
are left to you.
code has changed; therefore it will
know, and can tell you, when Integrated Environment
HDL-generated netlists must be for Collaboration
updated, and processes re-run. To facilitate the efficient flow of design data
Then it will clearly display all constraints and strategies, it is far more effi-
design sources and implementation cient if teams of software developers work in
results, and provide easy access to collaboration. An integrated environment
the appropriate editing tool for makes possible, and enhances, collaborative
every source file. work, which is critical during the project
Many HDL compilers, as well as development phase. However, collaboration
schematic entry tools, require that presents a new challenge.
you specify a device family library Designers, working with an integrated tool,
up front, to provide appropriate in an integrated environment, depend on
library symbols and components Figure 1 - Foundation Series ISE –well-integrated HDL solu- software quality. When your in-house
for a given architecture. tion designers collaborate with third party part-
Additionally, if your design is retar- ners for example, and use different tools,
geted to a new device architecture in the the smallest possible device, or on getting the interoperability problems may occur; you
middle of your design project, then you must fastest performance. A synthesis tool can be can only hope solutions are available from
change the project libraries to match the new used to optimize the design’s performance each tool’s vendor.
architecture. The Foundation Series ISE soft- based on timing requirements, but for the
When you use the Foundation Series ISE
ware makes the changes for you. You’re left best results, the place and route tools
software, you are assured of software quality
with nothing to do but select must then receive
because it has been tested thoroughly for tool
the device family, once. Your the same informa-
interoperability, across the project creation
selection will set the appropri- tion to complete
lifecycle.
ate device libraries for design the design. This
entry. And automatically pass can mean setting Conclusion
device information forward to requirements Foundation Series ISE
place and route tools. twice. However, provides you with a com-
In the course of a design plete HDL design
cycle, it’s highly likely a environment.
design will be implemented Now you can
many times. For example, manage and opti-
revisions may be made to mize your design projects, and your
timing constraints, target engineers can work collaboratively,
device, and place and route Figure 2 - Foundation Series ISE with confidence in Xilinx quality
options, in pursuit of the best project snapshots for effective project management and technical support.
overall design implementa- Learn more about how Xilinx
tion. The Foundation Series ISE software Foundation Series ISE meets your require-
with the Foundation Series ISE software, you
provides revision control by archiving each ments for integrated design automation. See
only have to define the settings once, so you
implementation, along with all design flow and hear the Xilinx internet presentation,
can optimize your design strategy faster and
control files and design constraint files, for “Xilinx Foundation Series ISE: Delivering
more reliably.
future reference or use. With this informa- the Benefits of HDL Design to
tion, you can consult or deploy an archived The Foundation Series ISE software ensures Programmable Logic Designers,”
implementation anytime, without recompil- that the software tools work well together; by going to www.netseminar.com/tbd/tbd.
ing your entire design (see Figure 2). the tools must communicate with each other
New Products Software
StateCAD XE for Optimizing State Machine Design

Now you can implement faster, more compact state machines, with ease.

by Andy Bloom
Director of Engineering, Visual Software Solutions
[email protected]

Ricky Escoto
Director of Marketing, Visual Software Solutions
[email protected]

Control logic is usually implemented as finite state machines (FSMs), which usually require you to work through multiple levels of design and optimization, often within tight development schedules. And, as designs grow larger, the complexity of implementing control logic increases correspondingly, forcing you to migrate from schematics to hardware description languages (HDLs). StateCAD® XE automates the state machine development process, saving you a lot of time and trouble.

Manual FSM Design
Until recently, you had to specify control logic manually; you had to draw state diagrams by hand (or with a graphics package), and then manually translate them to schematics or to an HDL. Timing and logic problems identified during simulation resulted in modifications to the original design, which then needed to be re-verified, step-by-step.

This approach tends to be slow, repetitive, and error-prone. Translation errors invariably creep in and require substantial effort to eliminate.

Hardware Description Languages (HDLs) allow more logic to be specified and maintained with less effort, and they can be synthesized in numerous ways. You can control how synthesis operates, allowing you to create your design in the manner best suited to your target application.

The way an HDL is structured dramatically impacts the speed, area, and power consumption of the synthesized device. When doing finite state machine design, the best results can only be achieved by careful consideration of the resources available, and by having the flexibility to experiment with different alternatives.

Automated FSM Design Using StateCAD XE
A quicker way to implement state machines optimized for Xilinx devices is to use the Xilinx ISE software, which includes StateCAD XE. This tool allows you to draw complex state diagrams, choose design specific optimizations, and generate synthesizable VHDL, Verilog, or Abel-HDL. StateCAD allows you to change optimizations (including state assignment mode, registering output, and signal loading), then reproduce the HDL automatically.

One advantage of automatic state machine translation is the ability to change optimizations and regenerate code in seconds. By trying different code styles, state assignment modes, and optimizations, you can find which combination yields the optimal solution for your design.

State Machine Example
By comparing implementations of a simple state machine, we can see the impact on state machine design. The small state machine in Figure 1 will be implemented with both registered and combinatorial outputs, illustrating the impact of output optimization on implementation:

Figure 1 - Example state machine (states S0, S1, S2; RESET leads to S0; output EVEN)
Output Optimization
Outputs can be optimized for speed (registered) or for area (combinatorial decode). Combinatorial decoded outputs become active by decoding state registers (Moore) or by decoding state registers and inputs (Mealy). Registered outputs are calculated prior to the active edge of the clock, and typically improve speed because a level of propagation delay is removed, but usually require more area than combinatorial implementations. Registered outputs are insensitive to input glitches or to multiple state bit changes.

Table 1 - Comparison of output styles

REGISTERED OUTPUTS

    PROCESS (sreg, RESET) BEGIN
      next_EVEN <= '0'; next_sreg <= S0;
      IF ( RESET='1' ) THEN
        next_sreg <= S0; next_EVEN <= '1';
      ELSE
        CASE sreg IS
          WHEN S0 => next_sreg <= S1;
          WHEN S1 => next_sreg <= S2; next_EVEN <= '1';
          WHEN S2 => next_sreg <= S0; next_EVEN <= '1';
        END CASE;
      END IF;
    END PROCESS;

COMBINATORIAL OUTPUTS

    PROCESS (sreg, RESET) BEGIN
      EVEN <= '0'; next_sreg <= S0;
      IF ( RESET='1' ) THEN
        next_sreg <= S0; EVEN <= '1';
      ELSE
        CASE sreg IS
          WHEN S0 => next_sreg <= S1; EVEN <= '1';
          WHEN S1 => next_sreg <= S2;
          WHEN S2 => next_sreg <= S0; EVEN <= '1';
        END CASE;
      END IF;
    END PROCESS;

Design Results
In Table 1 you can see the registered design has outputs that change at the same time as the state bits, and are stable between clocks. The output delay time is the clock to output delay of the register. All decoding necessary for the output occurs before the clock, at the same time as the decoding for the next state. The decode time is effectively “buried” in the state decode time, producing a faster design.

In comparison, the combinatorial design requires time to decode the state bits, yielding a slower implementation. The advantage for the combinatorial design is the smaller area: 5 logic elements compared to 8 for the registered design.

Additional StateCAD Benefits
StateCAD provides additional benefits to Xilinx customers:
• By automating the complete state machine development process, the Xilinx ISE software and StateCAD eliminate manual coding, translation errors, stale documentation, and logic bugs.
• StateCAD includes wizards tailored for designing concurrent state machines and associated logic. State diagrams can include states, transitions, Mealy and Moore outputs, resets, counters, shifters, multiplexers, and much more. No HDL knowledge is required to specify control flow.
• StateCAD exhaustively analyzes state diagrams for inconsistencies, automatically identifying more than 200 problems, such as stuck-at-states, conflicting outputs, and non-deterministic control flow.
• StateCAD includes a built-in simulator called StateBench, for behavioral verification and identification of problems at the state diagram level.
• StateCAD automatically translates state diagrams to synthesizable VHDL and Verilog. Optimizations include one-hot state assignment, registered outputs, and prioritized transitions.
• StateCAD is fully integrated within the Xilinx ISE software, and produces HDL optimized for Xilinx devices, guaranteeing you the best possible results.
• StateCAD can import FSMs created with previous releases of the Xilinx Foundation Series software.

Conclusion
Using StateCAD XE you can quickly implement state machines optimized for Xilinx devices. As design parameters change, just select a new set of optimizations, then regenerate code suited for the new requirements.

StateCAD XE is available at no charge to Xilinx customers, and is included with the Xilinx ISE software or can be downloaded from www.xilinx.com (download StateCAD from the WebPack BackPack section).
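The difference between registered and combinatorial outputs can be illustrated with a small behavioral model. The Python sketch below is not StateCAD output. It assumes the three-state machine of Figure 1 cycles through S0, S1, and S2 and asserts EVEN in S0 and S2, as the VHDL fragments suggest; the function names are hypothetical.

```python
# Behavioral sketch of the two output styles (assumed machine: S0 -> S1 -> S2 -> S0,
# EVEN asserted in S0 and S2). Names are illustrative, not part of StateCAD.
S0, S1, S2 = range(3)
NEXT_STATE = {S0: S1, S1: S2, S2: S0}
EVEN_STATES = {S0, S2}


def combinatorial(cycles):
    """EVEN is decoded from the state register after each clock edge."""
    sreg = S0  # state after reset
    trace = []
    for _ in range(cycles):
        even = int(sreg in EVEN_STATES)  # decode happens after the clock
        trace.append((sreg, even))
        sreg = NEXT_STATE[sreg]
    return trace


def registered(cycles):
    """EVEN is computed before the clock edge and stored in its own register."""
    sreg, even_reg = S0, 1  # reset values, as in the RESET branch
    trace = []
    for _ in range(cycles):
        trace.append((sreg, even_reg))
        next_sreg = NEXT_STATE[sreg]
        next_even = int(next_sreg in EVEN_STATES)  # decode happens before the clock
        sreg, even_reg = next_sreg, next_even      # both registers update together
    return trace
```

Both functions produce the same cycle-by-cycle values; the model only shows where the decode sits relative to the clock edge, which is what moves the output delay in hardware.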
New Products Software
HDL Bencher XE
FPGA Verification

Now you can develop complete, timing constrained VHDL and Verilog testbenches in minutes.

Bencher links the error reported to the offending line in the HDL source.

Create Self-Checking Testbenches
The testbenches include component instan-
are flagged automatically. All the necessary timing constraints are faithfully represented in the resulting testbench.

Verify Timing
By adding timing constraints, you can generate VHDL or Verilog testbenches for post-synthesis verification. Synthesized netlists differ from behavioral HDL because data types are remapped, I/O modes are changed, unused signals are dropped, and generics are flattened. HDL Bencher automatically re-maps behavioral testbenches to simulate with synthesized netlists.
Within the Xilinx ISE software, you start by selecting the HDL file from the source window, then you choose HDL Bencher from the process window. The design is automatically imported, and you are given the opportunity to select worst-case global timing parameters:

Figure 1 - Initial timing dialog

Draw Expected Behavior
A waveform is created next (Figure 1), which includes all the signals for the unit under test (UUT). Individual waveforms are then modified directly on the screen by clicking on the signals to show the expected behavior, or by using the built-in pattern generator.

Figure 2 - Stimulus for the design example

Next, HDL Bencher automatically exports a self-checking testbench. The testbench includes all stimulus (Figure 2), output assertions, timing constraints, and check routines needed to verify the operation of the design. The testbench is added to the ISE project, then auto-simulated through the Xilinx ISE software and ModelSim (Figure 3).

Figure 3 - ModelSim running the testbench and design

An advanced version of HDL Bencher is now available which automatically back-annotates the expected response into the waveform. If no expected response was specified, HDL Bencher back annotates the response obtained by ModelSim. Otherwise, expected and actual responses are compared, and discrepancies are highlighted.

Once your design is synthesized, its behavioral testbench may be incompatible with the resulting VHDL netlist generated during the post-route process. In this case, the resulting netlist uses std_logic_vector instead of integers. To make the synthesized netlist simulate, you would switch back to HDL Bencher, re-associate the waveform with the synthesized netlist, and re-export the testbench. Finally you would switch back to the ISE software and re-simulate.

The Resulting Testbench
The exported testbench in this example is 183 lines of code, and took under 1 minute to create and simulate. The following portions of the testbench highlight some of the aspects of automatic testbench generation:

Automatically Commented
    -- VHDL TestBench created by
    -- Visual Software Solution's HDL Bencher 2.00
Libraries Extracted
    LIBRARY IEEE;
    USE IEEE.std_logic_1164.all;
Log File Created
    FILE RESULTS: TEXT IS OUT "results.txt";
Components Instantiated
    COMPONENT counter
    PORT (
      CLK : in std_logic;
      :
      COUNT : inout integer RANGE 0 TO 7
Test Signals Defined
    SIGNAL CLK : std_logic;
    :
    SIGNAL COUNT : integer RANGE 0 TO 7;
Instantiates Unit Under Test
    UUT : counter PORT MAP (
      CLK => CLK,
      :
      COUNT => COUNT
Clock Process Created
    BEGIN
    CLOCK_LOOP : LOOP
      CLK <= transport '0';
      WAIT FOR 10 ns;
      CLK <= transport '1';
Creates Check Procedures
    PROCEDURE CHECK_COUNT(
      NEXT_COUNT : INTEGER
Reports Errors In Expected Values
    IF (COUNT /= NEXT_COUNT) THEN
      write(TX_LOC,string'("Error at time="));
Applies Input Stimulus
    RESET <= transport '1';
    CE <= transport '0';
Validates Timing
    WAIT FOR 100 ns; -- Time=820 ns
Verifies Outputs
    CHECK_COUNT(7,820); -- 7
Reports Success/Failure
    ASSERT (FALSE) REPORT
      "Simulation successful. No problems detected. "

Conclusion
With HDL Bencher you can verify the operation of VHDL and Verilog designs in minutes; no HDL scripting is needed. The resulting testbenches are self-checking, and are compatible with the Xilinx ISE software. HDL Bencher XE is available at no charge to all Xilinx customers, and is included with the ISE software or can be downloaded from www.xilinx.com (download the HDL Bencher “BackPack” from the WebPack section).
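The idea of a self-checking testbench, applying stimulus, comparing each response with an expected value, and logging discrepancies with a timestamp, can be sketched in a few lines. The Python model below is only an analogy for the generated VHDL. It assumes the counter is a 3-bit up-counter with synchronous RESET and clock enable CE, and a 20 ns clock period; neither is stated in the article, and all function names are hypothetical.

```python
# Sketch of a self-checking testbench loop. The counter behavior and the 20 ns
# clock period are assumptions made for illustration only.

def counter_model(count, reset, ce):
    """Hypothetical unit under test: 3-bit up-counter (COUNT in 0..7)."""
    if reset:
        return 0
    if ce:
        return (count + 1) % 8
    return count


def run_testbench(stimulus, expected, uut=counter_model, period_ns=20):
    """Apply (reset, ce) pairs clock by clock and compare COUNT with expectations.

    Returns a list of error messages; an empty list means the run passed.
    """
    errors = []
    count = 0
    for cycle, ((reset, ce), want) in enumerate(zip(stimulus, expected), start=1):
        count = uut(count, reset, ce)
        if count != want:
            errors.append(
                f"Error at time={cycle * period_ns} ns: COUNT={count}, expected {want}"
            )
    return errors


# One reset cycle followed by nine enabled clocks; the count wraps from 7 to 0.
stimulus = [(1, 0)] + [(0, 1)] * 9
expected = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1]
errors = run_testbench(stimulus, expected)
print("Simulation successful. No problems detected." if not errors else "\n".join(errors))
```

Passing a different `uut` function (for example, one that fails to wrap at 7) makes the same stimulus report the first mismatching clock, which is the property that makes such a testbench repeatable.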
New Products Software
Guided Design Using BLIS

With Block Level Incremental Synthesis (BLIS), your design implementation times will improve dramatically.

by Karen Fidelak
Technical Marketing Engineer, Xilinx
[email protected]

Incremental design changes (due to ECOs, specification changes, and repeated design iterations) can cause significant delays if you have to synthesize and place and route your entire design after each change. Ideally, your synthesis and place-and-route software tools should recognize where changes have been made in your overall design and recompile just those portions that have changed. That's what you get with BLIS, a unique synthesis and place-and-route capability, developed by Synopsys for Xilinx, that provides a guided synthesis methodology. Used in conjunction with Xilinx High-Level Floorplanning, BLIS provides the most robust incremental design capability ever offered.

BLIS, a part of the Synopsys FPGA Express/FPGA Compiler II v3.4 software (FE/FCII), is now available in the Xilinx ISE 3.2i development tools.

Block Level Incremental Synthesis
As you make design changes, BLIS recognizes “blocks” of the design which have been changed at the source, and intelligently synthesizes only those portions of the design. In this flow, a block is defined as a module/entity and any hierarchy tree beneath it. To enable BLIS, you choose blocks in your design that you want to denote as “Block Roots” through the FE/FCII Constraint Editor GUI or scripting language, as shown in Figure 1.

Figure 1 - Constraint Editor, specifying Block Roots
A Block Root is a block which is intelligently updated by FE/FCII in incremental synthesis runs, and has the following characteristics:

• A separate netlist is created by FE/FCII for each Block Root.
• Only those Block Roots whose corresponding source has been modified are re-synthesized.
• The Block Root has hard boundaries around it; no optimization occurs with neighboring modules.

The Advantages of BLIS
There are two main advantages to using this type of incremental flow.

• Runtime for both synthesis and place-and-route will be improved because only the modified portion of your design will be re-synthesized and re-netlisted. The remain-

Benchmarks
We compared the results of incremental design flows using BLIS against the more traditional methodology of re-synthesizing and re-routing the entire design. With the BLIS flow, incremental changes are made to a small number of design blocks (Block Roots). With the traditional flow, incremental changes are made to the same design blocks; however, they are not specified as Block Roots.

After our example design synthesis was completed, the design was placed and routed using the Guide feature of the Xilinx implementation tools, which allows you to specify an existing placed-and-routed design to be used as a “Guide” when implementing a design. The existing placed-and-routed design was used as a template when re-implementing the design. Any portions of the design which existed in both the “Guide” design and the new modified design (determined by matching net and component names) were placed in the same location in the new implementation as they were in the “Guide” design. New or changed logic was implemented around existing, “Guided” logic.

Runtime Improvements
Runtime improvements of up to 50% (with an average of 47%) were observed when using BLIS with Xilinx Guided Place-and-Route in an incremental design flow; Figure 2 shows averaged design results.

Figure 2 - Runtime Reductions (% runtime reduction; labeled values 50% and 47%)

Because FE/FCII does not re-elaborate or re-optimize unchanged blocks of the design, synthesis runtime was reduced.

Implementation runtime was improved due to increased design component matching during guided placement and increased signal matching during routing. Additionally, the synthesis tool does not rewrite the EDIF netlists for the unchanged blocks, further reducing runtime, because no file re-translation is needed.

Guide Improvements
When a design is placed-and-routed using the Guide feature, the success of the Guide can be determined by the “Design Components Matched” statistics available in the Place-and-Route report. The higher the percentage of matched components, the closer the incremental design is to the original results, leading to better predictability of timing and placement results.

When using the BLIS incremental design flow, Guide success rates reached levels of at least 95%, and averaged 97%. When BLIS was not used to guide the design,
ponent names being changed in the final netlist.)

Figure 3 - BLIS design efficiency (bars: Without BLIS, With BLIS; labeled value 66%)

Conclusion
When utilizing FE/FCII Block Level Incremental Synthesis in a Xilinx guided design, runtimes as well as timing and placement consistency exhibit significant improvements over a more traditional design flow. These enhancements help you achieve a higher level of productivity by allowing you to synthesize and implement incremental design changes, with a significantly reduced runtime, while preserving the unchanged portions of your design. This new design flexibility allows you to realize the productivity necessary to complete large or small FPGA designs faster.
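The core idea of an incremental flow, re-synthesizing only the Block Roots whose source has changed, can be sketched generically. The Python example below is an illustration of change detection by content hashing; it is an assumption made for this sketch, not a description of how FE/FCII detects changes, and the function and block names are hypothetical.

```python
# Generic change-detection sketch: decide which blocks need re-synthesis by
# comparing a digest of each block's source with the digest from the last run.
# This is an illustrative technique, not the FE/FCII implementation.
import hashlib


def source_digest(text):
    """Return a stable fingerprint of one block's HDL source."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def blocks_to_resynthesize(sources, previous_digests):
    """Return (changed block names, digests to store for the next run).

    sources: mapping of block-root name -> HDL source text
    previous_digests: mapping of block-root name -> digest from the last run
    """
    current = {name: source_digest(text) for name, text in sources.items()}
    changed = sorted(
        name for name, digest in current.items()
        if previous_digests.get(name) != digest
    )
    return changed, current


# First run: every block is new, so every block is synthesized.
sources = {"uart": "entity uart ...", "fifo": "entity fifo ...", "ctrl": "entity ctrl ..."}
changed, stored = blocks_to_resynthesize(sources, {})

# Second run: only the edited block is selected; the others keep their netlists.
sources["fifo"] = "entity fifo ... -- widened data bus"
changed_again, stored = blocks_to_resynthesize(sources, stored)
print(changed_again)
```

Because each Block Root has hard boundaries and its own netlist, a per-block decision of this kind is sufficient; no neighboring block needs to be revisited when one block changes.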
New Products PROMs
families to its existing line of one-time programmable (OTP) PROMs.

specifically for use with Xilinx FPGAs, therefore we offer a complete, pre-engineered, drop-in configuration solution that works perfectly the first time; and you are spared the time-consuming task of designing your own. We recently introduced two new families, one for our Virtex FPGAs and one for our Spartan FPGAs
Virtex Configuration PROMs
Our low-cost XC17V00 PROMs support Virtex and Virtex-E FPGAs, up to 3.2 million system gates, and are offered in 1-Mb to 16-Mb densities. The available packages are shown in Table 1.

Table 1 - Virtex PROM packages
Package columns: 8-pin VOIC, 20-pin SOIC, 20-pin PLCC, 44-pin PLCC, 44-pin VQFP
Device     Density   Packages
XC17V01    1.6Mb     ✔ ✔ ✔
XC17V02    2Mb       ✔ ✔ ✔
XC17V04    4Mb       ✔ ✔ ✔
XC17V08    8Mb       ✔ ✔
XC17V16    16Mb      ✔ ✔

The 16-Mb 17V16 PROM, a four-fold increase in maximum bit density, extends the Xilinx leadership in configuration memories and provides a one-chip configuration solution for our entire line of Virtex FPGAs.

Key Features
The XC17V00 serial/parallel PROM family is based on our proven, OTP architecture that provides a stable, low-cost, highly-reliable one-chip configuration solution with the following features:
• 1-Mb to 16-Mb densities.
• Simple, fast, serial FPGA interface that requires only one user I/O pin.
• Parallel configuration up to 264 Mbps (17V16 and 17V08 only).
• Available in SOIC, VOIC, VQFP, and PLCC packages.
• Low-power CMOS floating gate process.
• Programming support by leading programmer manufacturers.
• Cascadable for storing longer or multiple bitstreams.

The Most Cost-effective Solution
The new XC17V00 family also offers significant savings in board space, design time, and cost. Using one 17V16 to configure the new 3.2 million system-gate Virtex XCV3200E FPGA requires less than one fourth the board space of any previous Xilinx configuration PROM solution. To get the equivalent functionality from our nearest competitor would require 14 chips and more than 2x the board space, as illustrated in Figure 1.

Figure 1 - The Xilinx solution beats the competition

Xilinx process expertise has also allowed us to use smaller packages, further reducing the need for board space.

Configuration of Multiple FPGAs
The XC17V16 can also be used to configure multiple, daisy-chained FPGAs. This allows you to store configuration data for up to eight FPGAs in a single PROM, as illustrated in Table 2.

Table 2 - Number of Virtex-E FPGAs configurable by one 16Mb PROM
Devices listed: XCV300E, XCV400E, XCV405E, XCV600E, XCV812E, XCV1000E, XCV1600E, XCV2000E, XCV2600E, XCV3200E

Spartan-II Configuration PROMs
Our XC17S00A PROM Family provides a high-performance, low-cost configuration solution, optimized for use with Spartan-II FPGAs. This family offers a dedicated PROM for each gate density in the Spartan-II family for ease-of-selection and guaranteed compatibility, as shown in Table 3. This family also offers extended availability of the smallest package offered by Xilinx, the 8-pin VOIC.

Table 3 - Spartan-II PROM packages and device compatibility
Package columns: 8-pin PDIP, 8-pin VOIC, 20-pin SOIC, 44-pin VQFP
FPGA       PROM Solution   Packages
XC2S15     XC17S15A        ✔ ✔ ✔
XC2S30     XC17S30A        ✔ ✔ ✔
XC2S50     XC17S50A        ✔ ✔ ✔
XC2S100    XC17S100A       ✔ ✔ ✔
XC2S150    XC17S150A       ✔ ✔ ✔
XC2S200    XC17S200A       ✔ ✔ ✔

Key Features
• Simple, fast, serial Spartan FPGA interface that requires only one user I/O pin.
• Available in DIP, VOIC, SOIC, and VQFP packages.
• Advanced, low-cost CMOS process.
• Programming support by leading programmer manufacturers.

Conclusion
With the new XC17V00 and XC17S00A PROMs, there is no easier, faster, or less expensive way to configure Xilinx FPGAs. For more information see: www.xilinx.com.
Applications FPGAs
by Rotem Gazit
Design Engineer, MystiCom LTD.
[email protected]

A Finite Impulse Response (FIR) filter works by multiplying a vector of the most recent N data samples by a vector of coefficients and summing the elements of the resulting vector. In every cycle the filter receives a new sample of data and shifts out the oldest sample. FIR filters are very common in FPGA-based Digital Signal Processing applications.

Figure 1 - FIR filter block diagram

The design concept described here is suitable for systems with relatively low input rates (0.5 to 8 MHz), which require a FIR filter implementation with hundreds of taps; this is common in modem and demodulation applications.

FIR Filter Design Concepts
By examining the FIR block diagram in Figure 1, you can see that if the filter is implemented in a straightforward manner, a multiplier will be required for every filter tap (N multipliers for an N-tap filter). In addition, an adder with N inputs will be needed to sum all multiplier outputs. However, if the data input rate is slower than the performance capability of the FPGA, the filter can be implemented much more efficiently.

Serial FIR Filters
Assuming that the performance capability of the FPGA is M times faster than the data input rate, we will examine the case where M ≥ N (where N is the required number of filter taps).

Implementing a serial N-tap filter requires only one multiplier, a 2-input adder, and storage for the partial results and the filter input samples. The input sample storage holds the last N input samples. For every new sample entering the filter, N multiply operations will be performed, each multiplying the filter coefficient by the respective input sample.

The result of each multiply operation is added to the partial result storage to produce a new partial result. This newly calculated partial result is then saved in the partial result storage by replacing the previous partial result. After N such multiply and add operations, the partial result storage content is driven out of the filter. The partial result storage content is then cleared to begin processing a new data sample. A block diagram of serial FIR filter structure is shown in Figure 2.

Figure 2 - Serial FIR filter structure

The hardware responsible for the combination of multiplying, adding, and storing is called a MAC (Multiply Accumulate) unit. Due to the serial nature of the filter, the MAC will operate on M taps of the filter. In the case where N is greater than M, several serial filters can be chained together. The oldest data sample leaving the first filter in the chain is used as the new data sample in the next filter, and so on. The results of all the chain filters must be added together.
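The serial structure and the chaining rule described above can be checked with a short behavioral model. The Python sketch below is not the author's implementation; it models one MAC unit per serial filter, chains filters of at most M taps, and compares the summed result with a direct N-tap computation. The class and function names are hypothetical, and integer samples are assumed for simplicity.

```python
# Behavioral model of a serial (single-MAC) FIR filter and of a chain of such
# filters. Names and integer data are assumptions made for illustration.
from collections import deque


class SerialFir:
    """One serial filter: sample storage plus a single multiply-accumulate unit."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        n = len(self.coeffs)
        self.samples = deque([0] * n, maxlen=n)  # newest sample first

    def step(self, x):
        """Accept one sample; return (filter output, oldest sample shifted out)."""
        oldest = self.samples[-1]
        self.samples.appendleft(x)      # a full deque drops its oldest entry
        acc = 0                         # partial result storage, cleared per sample
        for c, s in zip(self.coeffs, self.samples):
            acc += c * s                # one MAC operation per tap
        return acc, oldest


def chained_fir(coeffs, m, xs):
    """Split an N-tap filter into serial filters of at most m taps and chain them."""
    stages = [SerialFir(coeffs[i:i + m]) for i in range(0, len(coeffs), m)]
    outputs = []
    for x in xs:
        total, feed = 0, x
        for stage in stages:
            y, feed = stage.step(feed)  # oldest sample feeds the next filter
            total += y                  # results of all chain filters are added
        outputs.append(total)
    return outputs


def direct_fir(coeffs, xs):
    """Reference: y[n] = sum over k of coeffs[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [
        sum(c * xs[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(xs))
    ]
```

For example, `chained_fir([1, 2, 3, 4, 5], 2, xs)` builds three serial filters (2, 2, and 1 taps) and returns the same sequence as `direct_fir([1, 2, 3, 4, 5], xs)`.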
The MAC unit consists of an adder, a multiplier, and result storage. Careful design of the adder and multiplier is very important for area efficiency.

    output [21:0] out;      // MAC output.
    reg    [21:0] out;      // MAC output changes whenever new data is being processed.
    wire   [16:0] mul_out;  // mac_multiplier output.
    wire   [21:0] add_out;  // mac_adder output.
Applications FPGAs
Creating Finite State Machines Using True Dual-Port Fully Synchronous SelectRAM Blocks
Create very dense, high-performance, highly efficient designs that require no logic resources.
by Edgard Garcia
Senior Engineer, Multi Video Designs
[email protected]

The latest Virtex, Virtex-E, and Spartan-II FPGA families offer a broad range of unique features, including block RAM, that give you dramatic speed and density improvements. The dedicated RAM blocks allow you to build fast and dense bidirectional data buffers and FIFOs, with built-in data width conversion. This RAM can also be used to implement very fast and efficient sequencers and Finite State Machines (FSMs), which frees your logic gates for other tasks.

A well known approach to building sequencers consists of a ROM-based design with output registers. The same method can apply to the Virtex 4K-bit block SelectRAM™ which incorporates output registers. You can use a single 4K-bit RAM block as a 512 x 8 clocked ROM, to implement a very fast FSM working at more than 150 MHz, and it uses no CLBs. You can implement the following, for example:

• 16 states + 4 additional outputs and 5 inputs + Enable and Synchronous Reset.
• 32 states + 3 additional outputs and 4 inputs + Enable and Synchronous Reset.

Design Example

The following example shows how to implement FSMs or sequencers with a single 4K-bit block SelectRAM. The same method can easily be expanded to more complex designs that could require two or more blocks.

Synchronous FSMs and sequencers have some important characteristics in common:

• They are clocked by a single clock.
• A feedback path allows you to (partially) define what the next step will be.
• They may need a clock enable to suspend the operations.
• They must have a reset to go back to a predefined state.

Figure 1 shows a typical FSM or sequencer logic diagram.
Simulation
Converting the FSM behavior to a truth
table can be a very tedious and time con-
suming task; debugging and modifying the
design can turn into a nightmare.
The method used here takes advantage of the
modern design entry, simulation, synthesis,
and implementation tools, combining their
complementary respective power. The goal is
to use a VHDL simulator to automatically
generate the constraint file for initialization
of the block SelectRAM used as ROM.
Figure 1 - Synchronous FSM simplified diagram

One of the problems encountered when simulating a design with VHDL is that an output can’t be forced to any logic level. An easy way to avoid this inconvenience consists of breaking the feedback loop, which allows you to enter patterns into the inputs, including current and illegal states. Figure 2 illustrates one way of designing the FSM VHDL code so it can be simulated more easily. A top-level file can provide the feedback for a classical simulation of the FSM.
Optimization
If the number of inputs (including binary
encoding states feedback) is nine or less, and
the number of outputs (including binary
encoding states) is eight or less, a single 4K-
bit block SelectRAM can be applied by using
structural VHDL with the scheme shown in
Figure 3.
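To make the single-block scheme concrete, here is a minimal Python model of a 512 x 8 ROM used as an FSM. It is a sketch under stated assumptions, not the article's VHDL: the address split (state on ADDR[8:5], inputs on ADDR[4:0]) and data split (state on DO[7:4], outputs on DO[3:0]) follow the labels of Figure 3, while the FSM itself (`next_state_and_outputs`, a 0-to-9 counter enabled by input bit 0) is a hypothetical example. The ROM is filled by sweeping all 512 addresses, the same exhaustive sweep a 9-bit counter in a testbench would perform.

```python
STATE_BITS, INPUT_BITS = 4, 5  # 16 states + 5 inputs = 9 address bits


def next_state_and_outputs(state, inputs):
    """Hypothetical behavioral FSM (feedback loop broken): a 0..9 counter
    that advances while input bit 0 is 1. Illegal states 10..15 go to 0."""
    if state > 9:
        return 0, 0
    nxt = (state + 1) % 10 if inputs & 1 else state
    outputs = 0b0001 if nxt == 9 else 0  # flag on DO[3:0] when state 9 is reached
    return nxt, outputs


def build_rom():
    """Sweep every address, including illegal states, and record the result."""
    rom = []
    for addr in range(1 << (STATE_BITS + INPUT_BITS)):
        state = addr >> INPUT_BITS
        inputs = addr & ((1 << INPUT_BITS) - 1)
        nxt, outs = next_state_and_outputs(state, inputs)
        rom.append((nxt << 4) | (outs & 0xF))  # DO[7:4] = state, DO[3:0] = outputs
    return rom


def run(rom, input_stream, reset_state=0):
    """Close the loop: the registered DO[7:4] drives ADDR[8:5] on the next clock."""
    state, trace = reset_state, []
    for inputs in input_stream:
        word = rom[(state << INPUT_BITS) | inputs]
        state, outs = word >> 4, word & 0xF
        trace.append((state, outs))
    return trace
```

Calling `run(build_rom(), [1] * 10)` steps the counter through states 1 to 9 and back to 0. Converting the 512 words into an initialization file is a separate formatting step that depends on the tool's file syntax, which this sketch does not attempt.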
Figure 2 - Modified FSM simplified block diagram

Figure 3 - Using fully synchronous RAM blocks for FSM implementation (RAMB4_S8, 512 x 8 primitive. Labels: ADDR[8:5] and DO[7:4], binary encoded state; ADDR[4:0], inputs; DO[3:0], outputs; WE and DI[7:0], tied to GND; CK, clock; EN, enable; RST, reset)

RAM Initialization
By initializing the contents of the memory with the appropriate values, the behavior of any synchronous FSM can be reproduced. The initialization of the RAM block is done by an NCF (constraint) file that will be used by synthesis and Xilinx implementation tools. A very easy way to initialize the memory with the correct values is to make an automatic generation of the NCF file, by using a VHDL simulator and another testbench.

Consider the behavioral VHDL code of the FSM without feedback. A simple 9-bit pseudo counter (generated by a testbench) can provide all the 512 possible states of the inputs, including illegal states. Each associated result (state output and FSM outputs) can
thus be converted to a text file obeying the NCF file format. See Figure 4.

Figure 4 - Testbench for automatic NCF file generation

Resources Required
Table 1 summarizes some examples of typical FSMs or sequencers in terms of performance and logic resources. All these designs can be implemented with a single RAM block, but the same method can easily be expanded to more complex functions, providing similar improvement. As you can see, a RAM-based FSM is much faster and uses no logic resources.

True Dual-Port RAM Advantages
Block SelectRAM provides true dual port capability; a single location can be read at the same time by the two ports. Therefore, a single block can be used to implement two identical synchronous FSMs, with separate inputs, synchronous reset, and clock enable; you can also implement separate clocks, if needed. Figure 5 shows the architecture of a

by taking advantage of the innovative features such as block SelectRAM. By combining the power of the Xilinx architecture and implementation tools, with the associated VHDL synthesizers and simulators, your design productivity is greatly improved, and complex designs can be easily implemented or modified.

For more information, please e-mail: [email protected]
Applications CoolRunner CPLDs
CoolRunner Power-Saving Tips and Tricks
These techniques can lower your CoolRunner power consumption by 40%.
by Frank Wirtz
Staff Applications Engineer, Xilinx Inc.
[email protected]

With the advent of Fast Zero Power technology and CoolRunner CPLDs, you can now create portable, high-performance, low-power, programmable devices, effortlessly. And, with some additional effort, you can reduce your power consumption by as much as 40%. However, to accomplish ultra low power reductions, you must first understand the mechanics of CPLD logic generation.

Xilinx has published a new application note, “Low Power Tips for CPLD Design” (XAPP346), that describes design techniques you can use to further reduce power consumption in CoolRunner CPLDs, which are already the lowest power-consuming CPLDs in the world. Here are some highlights from that application note.

Tips and Tricks for Reducing Power
The CoolRunner XPLA architecture gives you a flexible logic allocation tool that allows you to decrease power consumption by placing your logic in optimum locations. To use this tool, you need to understand the basic architecture of the device, and you need to know how the fixed geometry of the device determines both the speed-sensitive paths and the power-sensitive paths.

The following design implementation techniques are just a sampling of the information you can use to slash power consumption to a minimum:

Terminate!
You must properly terminate all inputs to a CMOS buffer. A single floating pin can result in an increase of quiescent current by 13mA. Slow input transitions will also cause unnecessary power use. Test data shows that input buffer power consumption doubles if input rise time increases from 800ps to 5ns per input.

Congregate!
You can see how your design is implemented by reviewing your fitter report, and then adjust the fit to constrain your high frequency signals to a single logic block. This will decrease the distribution of high-speed nets and further decrease power consumption.

Modulate!
The application note details special clock considerations and explains how asynchronous clocking can provide low power benefits. Typically, asynchronous clocking increases power consumption. Modulation in this instance refers to only applying a clock signal to a register when it is required. Many designs have registers that infrequently change state, yet the clock signal is continually present and applied to the register. While asynchronous design techniques are usually discouraged, they do provide designers with additional flexibility when low power (or sometimes high speed) characteristics are required.

As an example of this technique, consider a counter circuit. In the case of a binary counter, not all of the registers change state on each significant clock edge. Designers can use a high speed clock for the LSBs of the counter, and then use a prescaled clock for the higher order bits, so the total amount of power required by the clock buffers is decreased.

Mixed Voltage Interfacing
When interfacing devices that have different VCC levels, consider the impact caused by under driving a CMOS input. Because a CMOS input buffer is comprised of at least two primary transistors, a P-channel pull up and an N-channel pull down, there exists a region of input voltage where both transistors are slightly on, and current flows from VCC to GND through these buffers. This causes power to be wasted, and since the output of this buffer may also be in the linear region, it can cause problems because other internal devices depend upon the output voltage of the buffer for driving their inputs.

In some cases, mixed voltage interfacing is necessary. Slight modifications to differing VCCs can drastically reduce power consumption in these instances. For example, the XPLA3 devices may be powered at 3.3V +10%, and 5V devices may be powered at 5V -10%. This changes the differential between Voh and Vih by 800mV per input, and will significantly reduce wasted power. However, examine the data sheet to ensure safe and reliable operating conditions.

Default System Conditions
Attention to default system operating conditions may provide an insight into ways you can further decrease power consumption. As an example, a CPLD may be interfaced to a CMOS microcontroller with programmable (polarity sensitive) interrupts. If it is necessary to interface a 3.3V CPLD to a 5V interrupt, system power can be saved by programming the microcontroller interrupt such that the system operates with the interrupt level normally low. This decreases the amount of time that the interrupt is active (high) which will reduce the overall amount of power consumed when under driving a CMOS input.

The Effects of Implementation Style
Implementation style affects power consumption. For example, consider how different types of counters are implemented; binary, Gray, and LFSR counters are created in different ways and require different amounts of resources. Keep in mind that a minimal number of changing signals will always deliver the lowest dynamic power solution.

Binary Counters
Typical binary counters will have their outputs changing state at a rate of:

$$\text{Percent of bits toggling} = \frac{2^{n+1} - 2}{n \cdot 2^{n}} \times 100$$

Where the number of bits of the counter = n

So, a typical 8-bit binary counter would have approximately 25% of its bits changing state for any single clock edge.

LFSR Counters
LFSR (Linear Feedback Shift Register) counters are wonderful solutions for FPGA users who need to keep look-up table fan-in to a minimum. However, because of the internal hardwired feedback of CPLDs, this type of counter consumes much more power than the other counter examples described here.

For example, an 8-bit LFSR counter has approximately 50% of its bits changing on average for any single clock edge. In comparison, an 8-bit binary counter changes at a 25% bit rate.

Gray Code Counters
Because of their characteristic step pattern of a single changing bit, Gray code counters offer designers the lowest power consumption of these three counter methods. The average bit change rate for an 8-bit Gray code counter is approximately 13% as defined by the equation:

$$\text{Percent of bits toggling} = \frac{1}{n} \times 100$$

The Gray code design implementation is the most difficult, however, because next-state information must be coded for each count value.

The Bottom Line
Take full advantage of the CoolRunner low power benefits by downloading the free “Low Power Tips for CPLD Design” application note (in PDF format) from the Xilinx website at: www.xilinx.com/xapp/xapp346.pdf.
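The three toggle rates quoted for 8-bit counters can be reproduced by counting bit changes over one full period of each sequence. The short Python sketch below does that; it is an illustration, not material from XAPP346. The LFSR tap positions (8, 6, 5, 4) are an assumption chosen because they give a maximal-length 255-state sequence, and the function names are illustrative.

```python
def toggles(a, b):
    """Number of bit positions that differ between two register values."""
    return bin(a ^ b).count("1")


def average_toggle_fraction(sequence, n_bits):
    """Average fraction of bits changing per clock over one cyclic period."""
    total = sum(toggles(sequence[i], sequence[(i + 1) % len(sequence)])
                for i in range(len(sequence)))
    return total / (n_bits * len(sequence))


def binary_sequence(n):
    return list(range(1 << n))


def gray_sequence(n):
    return [i ^ (i >> 1) for i in range(1 << n)]


def lfsr_sequence(seed=1):
    """8-bit shift-register LFSR with XOR feedback from bits 7, 5, 4, 3
    (taps 8, 6, 5, 4 counted from 1; an assumed maximal-length tap set)."""
    seq, state = [], seed
    while True:
        seq.append(state)
        fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | fb) & 0xFF
        if state == seed:
            return seq


def binary_formula(n):
    """Closed form from the text: (2^(n+1) - 2) / (n * 2^n)."""
    return (2 ** (n + 1) - 2) / (n * 2 ** n)
```

For n = 8 the measured fractions come out at about 24.9% for the binary counter (matching the formula), exactly 12.5% for the Gray code counter, and about 50.2% for the LFSR, in line with the approximate 25%, 13%, and 50% figures in the text.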
Applications Software
by Anita Schreiber
Staff Applications Engineer, Xilinx
[email protected]

The CoolRunner implementation of a Serial Peripheral Interface (SPI) Master described here can be used to add an SPI controller to microprocessors or microcontrollers that do not provide this interface. It will permit direct inter-processor communication and communication with numerous commercially available peripherals.

Serial Peripheral Interface Protocol
SPI is a full-duplex, synchronous, serial data link. A single SPI device is configured as a master; all other SPI devices on the SPI bus are configured as slaves.

The SPI bus consists of four wires:

• Serial Clock (SCK) - Driven by the SPI master and regulates the flow of data bits. The SPI specification allows a selection of clock polarity and a choice of two fundamentally different clocking protocols on an 8-bit oriented data transfer.
• Master Out Slave In (MOSI) - Data output from the SPI Master and input to the SPI Slaves.
• Master In Slave Out (MISO) - Data input to the SPI Master and output from the selected SPI Slave. Only one selected slave device can drive data out from its MISO pin.
• Slave Select (SS) - Selects a particular slave via hardware control. Slave devices that are not selected do not interfere with SPI bus activities. The SS control line can be used as an input to the SPI master indicating a multiple-master bus contention (SS_IN). If the SS signal to the master is asserted, it indicates that some other device on the bus is attempting to be a master and address this device as a slave. Assertion of SS automatically disables SPI output drivers in the master device if more than one device attempts to become master.

The SCK, MOSI, and MISO pins of all SPI devices on the SPI bus are connected together in parallel.

CoolRunner SPI Master Implementation
This SPI master design supports the following features:

• Microcontroller interface.
• Multi-master bus contention detection and interrupt.
• Eight external slave selects.
• Four transfer protocols available with selectable clock polarity and clock phase.
• SPI transfer complete interrupt.
• Four different bit rates available for SCK.

A high-level block diagram is shown in Figure 1. The microcontroller (µC) interface is a VHDL module that you can easily modify to support other microcontrollers.

Figure 1 - CoolRunner SPI master

The Address Decode/Bus Interface logic interprets the bus cycles of the microcontroller and performs the read/write operations to the Register File. The Register File is the interface between the µC and the SPI master logic, and allows the µC to configure and control the operation of the SPI master. Status of the current transfer is provided to the µC via a status register in the Register File. Registers are also included to contain the µC data to be transmitted on the SPI bus and data received from the SPI bus. The SPI Control State Machine controls the shifting and loading of SPI data in the SPI shift registers, and the generation of the slave select signals. The SCK clock logic generates an internal SCK based on the settings in the control register for clock phase, division, and polarity.

Conclusion
CoolRunner CPLDs operate at the lowest standby power (<100µA) of any CPLD available today, and they are an ideal programmable logic solution for providing interface controllers in portable or power sensitive applications. See www.xilinx.com/apps/epld.htm#CoolRunner for an SPI reference design which contains a detailed application note (XAPP348), VHDL source code, and VHDL testbenches.
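The four transfer protocols come from the two clock settings, polarity (CPOL) and phase (CPHA). The Python sketch below models one 8-bit full-duplex exchange between a master and a single selected slave for any of the four combinations. It is a behavioral illustration only and does not reflect the XAPP348 VHDL; MSB-first bit order and the names `sck_edges` and `transfer` are assumptions.

```python
def sck_edges(cpol, cpha, bits=8):
    """Yield (sck_level, action) for each SCK edge of one transfer.

    SCK idles at the CPOL level. With CPHA = 0 data is sampled on the first
    edge of each clock period; with CPHA = 1 it is sampled on the second.
    """
    level = cpol
    for _ in range(bits):
        for edge in (0, 1):  # leading edge, then trailing edge
            level ^= 1
            yield level, ("sample" if edge == cpha else "shift")


def transfer(master_byte, slave_byte, cpol=0, cpha=0):
    """Exchange one byte, MSB first (assumed), and return
    (byte received by master, byte received by slave, SCK levels)."""
    tx = {"m": master_byte & 0xFF, "s": slave_byte & 0xFF}
    rx = {"m": 0, "s": 0}
    line = {"mosi": 0, "miso": 0}

    def shift():
        # Each side drives its current MSB, then shifts its register left.
        line["mosi"], line["miso"] = tx["m"] >> 7, tx["s"] >> 7
        tx["m"], tx["s"] = (tx["m"] << 1) & 0xFF, (tx["s"] << 1) & 0xFF

    def sample():
        rx["m"] = (rx["m"] << 1) | line["miso"]
        rx["s"] = (rx["s"] << 1) | line["mosi"]

    if cpha == 0:
        shift()  # first bit is already on the lines before the first edge
    waveform = []
    for level, action in sck_edges(cpol, cpha):
        waveform.append(level)
        if action == "shift":
            shift()
        else:
            sample()
    return rx["m"], rx["s"], waveform
```

In all four modes the two bytes simply swap places, for example `transfer(0xA5, 0x3C, cpol=1, cpha=1)` returns `0x3C` to the master and `0xA5` to the slave; only the SCK idle level and the edge used for sampling differ.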
Perspective Power Consumption
Perspective Reliability
by Austin Lesea
The Solution
[email protected]
Hitesh Patel,
Manager Alliance EDA Marketing, Xilinx
[email protected]
Substrate Bounce
Substrate bounce is caused by the switching of fast, high-current I/O transistors. It can cause double-clocking, as well as indeterminate and invalid logic states. The substrate bounce effects on Xilinx FPGAs are modeled precisely, and our designs are guaranteed to provide adequate isolation. For all of our FPGAs, Xilinx models and minimizes substrate noise in the design prior to making the device masks for fabrication.

Interconnect Crosstalk
Interconnect crosstalk between the tiny wires, on multiple metal layers, now requires a 3D field solver for extraction; anything less is not accurate enough to completely model the interconnections on the chip. Models must take into account the potential for crosstalk induced delay so that any possible user circuit will behave predictably and reliably, regardless of process and silicon variation. Xilinx IC designers perform extensive interconnect modeling using field modeling to ensure that our FPGAs do not have crosstalk problems.

Device Fabrication
IC Designers must also model devices carefully to avoid yield problems, speed grading issues, and design failures. Xilinx manufactures millions of devices for each FPGA family, and the manufacturing process utilizes device monitors, test structures, and other process related structures that are measured on every wafer. The models are incrementally refined so that all process corners are modeled. The use of highly accurate device models enables Xilinx IC designers to rigorously verify and characterize all of our devices.

The combination of extensive and accurate modeling, simulations, and verification requires hundreds of engineer-years. Xilinx has made the investment and pre-engineered all of our devices so you don’t have to worry about whether the silicon will work. ASIC IC designers must also expend the same effort, but often the expense is too high and resources are inadequate, resulting in designs that are not reliable.

The Software Solution
Using the current ultra-deep sub-micron design rules, interconnect delays account for approximately 75% of the path delays. Therefore, design modeling and timing closure become significant factors, and development tools are critical.

For FPGAs, interconnect delays have always been a significant portion of path delays because of the existence of switches in the routing paths. However, FPGAs have specific, fully characterized routing resources and, therefore, accurate delay models can be achieved using exhaustive empirical methods and functional analysis.

FPGA software tools are extremely mature in the area of handling interconnect delays. Therefore the ultra-DSM technology does not pose a new problem for FPGA place and route tools. Specifically, the delay estimation methods are mature and sophisticated, resulting in accurate delay prediction. The placement and routing tools are smart enough to determine when to update the timing/slack information dynamically, based on changing interconnect delays during layout.

The newer Virtex and Spartan-II generation FPGAs are co-developed by the Xilinx software and hardware teams. This process naturally results in a highly predictable architecture, and predictable interconnect delays. This allows the software to make correct placement and routing decisions early in the design process, even during the synthesis phase where there is maximum flexibility to influence the design performance.

The quality of the synthesis wireload models plays a key role in the timing predictability after synthesis. Advancements in FPGA synthesis technology (such as improved wire delay estimation by synthesis-driven placement tools) enable highly accurate timing predictability, which is on average 20% to 25% more accurate than ASIC technology. In addition, re-synthesis capabilities for critical path optimization reduce the number of design iterations for faster time to timing closure.

Conclusion
With FPGAs, you can now implement multi-million gate designs without being plagued by the DSM problems inherent in ASICs. Our advanced FPGA architectures shelter you from physical design issues such as crosstalk and ground bounce, and the latest synthesis and implementation software delivers timing predictability early in the design flow, giving you a significant reduction in design closure time, and a shorter time to market.
Implementing a Histogram for Image Processing Applications
Use the Virtex™ or Spartan™-II True Dual-Port™ RAM and DLLs to create a real-time histogram.
by Edgard Garcia
Engineer, Multi Video Designs
[email protected]
Image processing is key for many automated industrial inspection applications. However, even the most sophisticated algorithms can’t extract the right information if the image contents are not available in a convenient format. By using a histogram, you can ensure that the image content can be easily processed.

What is a Histogram?
For each possible pixel value, the histogram algorithm counts the number of times the value was encountered in the current image. For example, the histogram of an 8-bit-per-pixel image will contain 256 values (2^8), each one representing the number of pixels found at this value. This allows a microprocessor or DSP to quickly get the profile of the image, and take the appropriate decisions, by analyzing just those 256 precomputed values. You can do this easily, in real time and at low cost, in a Virtex or Spartan-II FPGA.

A Basic Hardware Implementation
For an 8-bit-per-pixel image, 256 different values are possible for each pixel, so 256 16-bit counters would be necessary to complete the real time histogram. However, only one of the 256 counters will be active at each valid pixel clock (only one value will be updated). Therefore, the registers of the 256 x 16 bit counters can be replaced by a memory array, such as a 4K-bit block SelectRAM™ organized as 256 x 16 (RAMB4_S16).

A 16-bit incrementer will allow you to update the RAM contents during a Read-Modify-Write operation, where the video data inputs are used as the address of the memory block. Figure 1 shows the block diagram of a basic hardware implementation.

Figure 1 - Basic hardware implementation
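The read-modify-write scheme can be expressed in a few lines of Python. This is a behavioral sketch of the idea only, with a list standing in for the 256 x 16 memory; the function name is illustrative, and saturating the count at the 16-bit maximum (rather than wrapping) is an assumption, since the text does not say how overflow is handled.

```python
def histogram_rmw(pixels, depth=256, width=16):
    """Build a histogram with one read-modify-write per valid pixel."""
    ram = [0] * depth              # stands in for a 256 x 16 block RAM
    max_count = (1 << width) - 1
    for p in pixels:               # one valid pixel per pixel clock
        count = ram[p]             # READ: the pixel value is the address
        if count < max_count:      # assumption: saturate instead of wrapping
            count += 1             # MODIFY: 16-bit incrementer
        ram[p] = count             # WRITE: back to the same address
    return ram
```

For example, `histogram_rmw([0, 0, 3, 255, 3, 3])` leaves 2 at address 0, 3 at address 3, and 1 at address 255, with every other location still at zero.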
Figure 3 - Timing (signals shown: VIDEO_CLOCK, CLK2X, CLK90, BLANKING#, and DOA, with alternating READ and WRITE phases in each pixel cycle)
Applications Digital Radio
High Performance Digital Down-Converters for FPGAs
Virtex FPGAs surpass off-the-shelf ASSPs in design flexibility and system integration.
Noise is generated by an imperfect rendition of the sinusoid at the output of the NCO. That noise can be phase errors (angular distortions) or amplitude errors. The phase accumulator generates only a phase angle, so there is no amplitude error. Errors caused by quantization of the phase increment can cause a frequency error, but not a changing phase error.

Waveform Synthesis
The phase accumulator produces a “wrapped” phase angle that must be converted to a sampled complex sinusoid. The accuracy of the conversion directly affects the noise performance of the DDC. The noise introduced by the NCO is caused by amplitude and phase errors, which manifest themselves as reduced signal-to-noise-ratio (SNR) and degraded spurious free dynamic range (SFDR) respectively. Each additional bit of phase improves the SFDR by about 6dB and extra amplitude resolution adds to the SNR by about 6dB.

The most obvious conversion circuit is a simple lookup table of sine values by phase angle, which is addressed directly by the phase accumulator. The phase resolution determines the depth of the table, while the amplitude precision determines the width. To keep the size of the table reasonable without sacrificing frequency resolution, we must truncate the phase accumulator output, using only the MSBs at the cost of degrading the SFDR. The size of a table grows exponentially with phase resolution, so for even moderate SFDR requirements, the table becomes larger than what we would like to use in an FPGA.

Simple amplitude and phase symmetry allows us to reduce the table size by a factor of 4 by reusing the first quadrant data for the other quadrants. The same table is used for both the sine and cosine values, so if clock cycles per sample permit, the same ROM can be read twice per sample. In Virtex devices, you can use the dual-port feature of the block RAM to simultaneously obtain both the sine and cosine values from a shared ROM. Large ROMs in FPGAs are expensive in terms of resources used so, for phase resolutions of more than 8 to 10 bits, other methods should be used.

The large ROMs can be avoided by algorithmically generating the sine and cosine on the fly. While that sounds difficult, there is a simple shift-add algorithm based on vector rotation called CORDIC (COordinate Rotation DIgital Computer) that makes this task fairly easy in hardware. (See www.andraka.com/cordic.htm for details on CORDIC.) The algorithm simultaneously generates a sine and cosine value by rotating a unit vector from the “I” axis to the desired phase angle using a series of successively smaller elemental rotations. The angles of those elemental rotations are specifically selected for a shift-and-add implementation. The “I” (real or in-phase) and “Q” (imaginary or quadrature) components of the rotated vector are proportional to the cosine and sine of the phase angle respectively.

The Mixer
The function of the mixer is to multiply the incoming signal by the locally generated sinusoid to shift the spectrum of the signal. A straightforward implementation uses two multipliers, one each for the sine and the cosine. The multipliers produced by the CORE Generator tool can easily be used for this application.

If we use CORDIC for the wave shape conversion, however, we can obtain the mixer function for free. The combination of the NCO and the mixer multiplies the incoming signal by cos(t) - j sin(t) = e^{-jt}. Because the NCO and mixer generate a complex phasor, the net effect is to rotate
the incoming signal by a constantly changing phase angle. Rather than rotating a unit vector to get I and Q scale values, we can use the CORDIC to directly rotate the input signal. This eliminates the two multipliers and avoids the potential for additional quantization noise.

A more subtle advantage to using CORDIC is that it actually rotates the vector rather than multiplying the components separately. This means it does not add noise to the signal other than the spectral spurs caused by the phase quantization. The CORDIC hardware occupies about the same area as a pair of multipliers with the same input width in the Virtex architecture. Thus, in effect, we have a net area savings about equal to what we would have used for the sine and cosine wave shape conversion. The CORDIC rotator also accepts a complex input, so no additional hardware is needed for applications requiring a complex signal input.

The Filter and Decimator
The mixed signal has to be filtered to isolate the portion of the spectrum containing the signal of interest. The filter typically has to be a narrow-band filter with a fairly high rejection of unwanted spectrum. This translates to an expensive filter if it is done at the input sample rate. Instead, we can use a multi-rate approach in which the signal is first decimated to a much lower sample rate using a less computationally intensive filter. Then the signal is cleaned up with a second more complex filter working at the decimated sample rate.

High Ratio Decimator
A high-ratio decimation can be performed very efficiently using a cascaded integrator-comb (CIC) filter. The CIC filter is a recursive implementation of the “boxcar” or moving average filter. The spectral response of such a filter is the sinc (sin x / x) function. In a CIC filter, the number of effective taps is an integer multiple of the decimation ratio, so the filter nulls alias onto the passband when the spectrum is folded by decimation. If the passband is sufficiently narrow, the rejection of the aliased image is quite good, much better than might be expected otherwise. We can also cascade several sections to lower the amplitude of the side lobes. The passband of this filter does exhibit a pronounced roll-off that usually must be corrected by the clean-up filter. Keeping the passband of the final filter narrow not only improves the alias rejection, but also makes the roll-off compensation easier.

The advantages of using a CIC filter in this implementation are:

• It is a computationally easy filter to realize.
• The same filter structure works for a very wide range of decimation ratios by simply changing the timing of the clock enables on the comb section.
• The filter response referred to the output sample rate is nearly independent of the decimation ratio, so one clean-up filter can be used for all decimation ratios.

The gain of the CIC filter is a function of the decimation ratio. Therefore, a barrel shifter is required after the CIC filter in applications where the decimation ratio has to be changeable without changing the circuit. This is an issue in an ASSP DDC, as it is a one-size-fits-all solution. Most of the time in FPGAs, we can hardwire the shift, or at worst, use a limited barrel shift, because we can customize the DDC for our application.

“Clean-Up” Filter
The output of the CIC filter has a sinc shape, which is not suitable for most applications. A “clean-up” filter can be applied at the CIC output to correct for the passband droop, as well as to achieve the desired cut-off frequency and filter shape. This filter typically decimates by a factor of 2 or 4 to minimize the output sample rate after the passband has been limited and shaped. An application-specific filter response, such as a raised cosine Nyquist filter, can either be combined into the correction filter or be applied at a subsequent filter stage. The clean-up filter is compactly implemented using serial distributed arithmetic (see www.andraka.com/distribu.htm for a tutorial on distributed arithmetic).

Identical filters must be applied to both the I and Q channels. Even using the slowest speed grade Virtex FPGAs, the DDC design described here can be clocked at more than 130 MHz if the design is carefully executed and floor planned. This high potential clock rate permits us to time multiplex the I and Q data through the same filters by interleaving the I and Q samples on a clock-to-clock basis. Thus for very little additional overhead, we can handle both the I and Q data in the same filter. We can also use the same technique to handle several independently tuned channels with a single instance of the DDC design.

An advantage of using an FPGA for the DDC is that we can customize the filter chain to exactly meet our requirements. With an off-the-shelf chip, we would have to either fit our requirements to the chip’s features or add additional post-processing to modify the output to our needs.

Conclusion
We’ve briefly discussed implementation of a high performance DDC in an FPGA. If we apply these techniques to a 16-bit DDC with a 64 MS/sec input and a 100 dB SFDR requirement, we come up with a design that occupies about 550 Virtex CLBs (configurable logic blocks). The occupied area is heavily influenced by specific requirements of the application. The cited design, shown in Figure 2, consists of an NCO and mixer implemented as a CORDIC rotator and a programmable decimating filter. The filter is a 4th order CIC filter followed by a 63-tap symmetric Finite Impulse Response (FIR) filter. Backing off on any of the requirements can substantially reduce the area occupied by the DDC. Because we are using an FPGA, we have the luxury of picking the features and performance to match our application. If we were to use an ASSP component, we would have to mold our requirements and design around the capabilities of the selected device.
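The combined NCO and mixer described in this article can be modeled in software as a phase accumulator feeding a CORDIC rotator. The Python sketch below is an illustration under several assumptions and is not the cited 550-CLB design: it uses 16 iterations, keeps the residual angle in floating point (hardware would use a fixed-point angle word), and leaves the constant CORDIC gain of about 1.647 uncorrected. The names `cordic_rotate` and `ddc_mix` are illustrative.

```python
import math

ITERATIONS = 16
# Elemental angles atan(2^-i): each rotation then needs only shifts and adds.
ANGLES = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]
# Every elemental rotation stretches the vector by sqrt(1 + 2^-2i).
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def cordic_rotate(i_in, q_in, angle):
    """Rotate the integer vector (i_in, q_in) by `angle` radians.
    The result is scaled by GAIN."""
    angle = (angle + math.pi) % (2.0 * math.pi) - math.pi  # fold into [-pi, pi)
    x, y = i_in, q_in
    # Start with a 180 degree turn when the angle is outside +/- 90 degrees.
    if angle > math.pi / 2:
        x, y, angle = -x, -y, angle - math.pi
    elif angle < -math.pi / 2:
        x, y, angle = -x, -y, angle + math.pi
    for i in range(ITERATIONS):
        if angle >= 0:
            x, y, angle = x - (y >> i), y + (x >> i), angle - ANGLES[i]
        else:
            x, y, angle = x + (y >> i), y - (x >> i), angle + ANGLES[i]
    return x, y


def ddc_mix(samples, phase_increment, acc_bits=32):
    """NCO plus mixer: rotate each real sample by minus the NCO phase,
    which corresponds to multiplying by cos - j sin."""
    phase, out = 0, []
    mask = (1 << acc_bits) - 1
    for s in samples:
        angle = -2.0 * math.pi * phase / (1 << acc_bits)
        out.append(cordic_rotate(s, 0, angle))
        phase = (phase + phase_increment) & mask  # "wrapped" phase angle
    return out
```

A phase increment of `1 << 30` with a 32-bit accumulator steps the phase by a quarter turn per sample, so a constant input comes out on the +I, -Q, -I, and +Q axes in turn.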
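The high-ratio decimator stage of the digital down-converter can be sketched the same way. The following Python model of a CIC decimator is a minimal illustration, assuming a differential delay of one output sample in each comb and unbounded integers in place of the wrap-around arithmetic a hardware register would use. The helper `gain_shift` is a hypothetical stand-in for the hardwired shift mentioned in the text and is exact only for power-of-two decimation ratios.

```python
def cic_decimate(samples, ratio, order=4):
    """CIC decimator: `order` integrators at the input rate followed by
    `order` combs running at the decimated output rate."""
    integrators = [0] * order
    comb_delays = [0] * order
    out = []
    for n, x in enumerate(samples):
        # Integrator chain: runs on every input sample.
        for k in range(order):
            integrators[k] += x
            x = integrators[k]
        # Comb chain: enabled once every `ratio` input samples.
        if (n + 1) % ratio == 0:
            for k in range(order):
                x, comb_delays[k] = x - comb_delays[k], x
            out.append(x)
    return out


def gain_shift(ratio, order=4):
    """Right shift that removes the DC gain ratio**order
    (assumes the ratio is a power of two)."""
    return order * (ratio.bit_length() - 1)
```

With a constant input of 1, a ratio of 8, and order 4, the output settles at 8^4 = 4096 after the first three output samples, and `gain_shift(8)` gives the 12-bit shift that brings it back to 1. Changing the ratio changes the gain, which is why a programmable design needs a barrel shifter while a fixed one can hardwire the shift.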
New Technology Cores
Processing with LogiCOREs
that solve difficult design problems.
by David Mann
Multimedia ASVC Marketing, Integrated Silicon Systems
[email protected]
The real-time manipulation of high-resolution images (either moving-picture video or still-frame image streams) usually demands custom digital video processing in hardware. But why use a bunch of different ASSPs for common video/image processing tasks, such as Color Space Conversion (RGB to YCrCb or vice versa) or Discrete Cosine Transform, when you can do it all in a Virtex, Virtex-E, or Spartan device? And the performance of these FPGA-optimized “Application-Specific Virtual Components” is very attractive.
There are several standard video processing functions that are common to many vision systems; these systems include video broadcast, machine vision, and image filtering applications. Now, thanks to Integrated Silicon Systems’ ASVC technology, the IP cores powering these applications can be implemented in FPGAs, and Xilinx can supply all of the vital links in your customized video compression/decompression chain (as shown in Figure 1).
Color Space Conversion
Color Space Conversion (CSC) is one of the standard image processing techniques; it’s a trick that allows you to use the digital image data associated with a color pixel more efficiently by switching color domains. Processing an image in the Red-Green-Blue color space with a set of (R, G, B) values for each and every pixel really isn’t very efficient. The RGB representation has a significant downside: although it’s the natural paradigm for rendering full-color pictures using display technologies that emit mixtures of the three primary colors (such as CRTs, LCDs, LEDs, etc.), it is not as efficient as special alternative schemes.
The standard alternative representations use de-correlated components–luminance and chrominance. Thus CSC only comes into play whenever it’s time to present a picture to the human visual cortex, or after a real-world image is captured using a scanner or camera followed by processing in the digital domain.
Color Space Converter LogiCOREs
Xilinx presently offers a family of four different Color Space Converter LogiCOREs, as shown in Table 1.
Other useful pre-processing functions such as Gamma Correction are also incorporated in these particular ISS-designed LogiCOREs, so you spend less time developing your CSC design using these solutions.
Here are just a few application areas for the Color Space Conversion family of LogiCOREs:
• Video output conversion to digital RGB.
• Image filtering.
• Machine vision.
• Video and still-image processing.
DCT Engine LogiCORE
Figure 1 shows where one of the RGB2YUV or RGB2YCrCb Color Space Converter LogiCOREs fits into a typical video/image processing flow. This example
data path incorporates a Discrete Cosine Transform block–a necessary element in image compression algorithms.
Now, there’s a new LogiCORE which combines both Forward DCT and Inverse DCT functions in one, and it’s ISO/IEC 10918-1 JPEG compliant. This high-performance DCT/iDCT engine offers 1-symbol/cycle processing power thanks to its fully pipelined architecture. The design is highly-tuned for optimal performance across the various Xilinx FPGA technologies. It requires only 1756 slices in Virtex, 1759 in Virtex-E, or 1728 in Spartan devices; and only 48 IOBs are needed for interfacing.
This design is very efficient in the Xilinx architecture because the 2-D architecture uses row-column decomposition to separate the transform into two distinct 1-D operations. Each operation generates a set of intermediate results that are written into transpose memory. Data is “burst” into the DCT/iDCT core as blocks of 64 values, and the results of the transform are presented in the same format.
When in Forward DCT mode, this LogiCORE takes 8-bit input data words and produces an 11-bit output. In the Inverse mode, the converse is true. You’ve got 14-bit cosine coefficients, and a 15-bit representation in transpose memory, so there’s no need to worry about precision.
Using the Combined Forward/Inverse DCT LogiCORE makes it very easy to create your own design, even if you don’t have the engineering bandwidth or DCT expertise. And, the Xilinx software tools make it easy.
Designing with LogiCOREs
If you’re familiar with HDL-based design and simulation, component instantiation, script-based logic synthesis, and the use of testbenches, then you’re all set to design using LogiCOREs. All the LogiCORE modules described here are available under a standard license agreement from Xilinx. You get the code and test vectors, together with installation and instantiation instructions as part of the LogiCORE deliverables.
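The row-column decomposition mentioned for the DCT engine can be illustrated with a small floating-point model. This is a sketch under stated assumptions, not the LogiCORE's fixed-point implementation: the function names are hypothetical, the 8 x 8 block size follows the article's blocks of 64 values, and the orthonormal DCT-II scaling is an assumption chosen because it matches the usual JPEG definition.

```python
import math

N = 8  # block dimension; 8 x 8 = 64 values per block


def dct_1d(vector):
    """Orthonormal 1-D DCT-II of an N-point vector."""
    result = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(x * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n, x in enumerate(vector))
        result.append(scale * total)
    return result


def transpose(block):
    """Swap rows and columns (the role played by the transpose memory)."""
    return [list(column) for column in zip(*block)]


def dct_2d(block):
    """2-D DCT computed as two passes of 1-D DCTs with a transpose in between."""
    row_pass = [dct_1d(row) for row in block]            # first 1-D operation
    column_pass = [dct_1d(row) for row in transpose(row_pass)]  # second 1-D operation
    return transpose(column_pass)                        # restore row/column order
```

Under this scaling, a block in which every sample is -128 (the most negative level-shifted 8-bit value) produces a DC coefficient of -1024, which is consistent with the 11-bit forward output quoted above. The real core's internal word lengths and rounding are not modeled here.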
Conclusion
Digital video/image processing applications can be very difficult to develop. However, the new Xilinx LogiCOREs offer feature-enhanced Color Space Conversion and Forward/Inverse Discrete Cosine Transforms that give you a time-to-market advantage.
There are more LogiCOREs in development for digital video applications, including standalone M-JPEG Codec solutions for Virtex and Virtex-E. Talk to an ISS representative or your local Xilinx FAE about your particular application.
To find out more, or to access the datasheets, visit the Xilinx dedicated IP Center at: www.xilinx.com/ipcenter.
LogiCORE        RGB to YUV   YUV to RGB   RGB to YCrCb   YCrCb to RGB
Slices used     230          147          211            186
IOBs used       50           50           49             49
System clock
  Virtex        >75 MHz      >75 MHz      >60 MHz        >75 MHz
  Spartan-II    >80 MHz      >65 MHz      >65 MHz        >70 MHz
  Virtex-E      >100 MHz     >100 MHz     >90 MHz        >90 MHz
Features used   Carry logic  Carry logic  Carry logic    Carry logic
Precision       10-bit       10-bit       10-bit         10-bit
Datasheet       Yes          Yes          Yes            Yes
Availability    Now          Now          Now            Now
Table 1 - LogiCORE specifications
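As a rough software illustration of what an RGB to YCrCb converter such as the one listed in Table 1 computes, the sketch below maps one 8-bit RGB pixel to luminance and chrominance. The article does not state which coefficients or ranges the LogiCOREs use, so the full-range ITU-R BT.601 (JFIF-style) coefficients, the 8-bit rounding, and the function names here are all assumptions made for the example.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb).

    Assumed coefficients: full-range ITU-R BT.601, chrominance offset by 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return clamp8(y), clamp8(cr), clamp8(cb)
```

For any gray pixel (R = G = B) the two chrominance values sit at the mid-point of 128 and only Y changes, which shows the de-correlation of luminance from chrominance that makes this representation more efficient to process than RGB.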
New Products Development Boards
Stackable Boards for Spartan-II, Virtex,
by Dr. Stefan Schafroth
Hardware and software development engineer,
ErSt Electronic GmbH
[email protected]
To help you increase your productivity
Virtex-based board, the new boards provide all the necessary basic components needed in most FPGA-based designs. In addition, we incorporated an optional large ZBT RAM to satisfy the needs of modern telecommunication and imaging applications. All I/Os are routed to header connectors where you connect your special purpose interfaces. By stacking several boards you can easily cope with
you can stack them together and reuse our power module (PWR3) as the supply for the various required supply and reference voltages.
Key Features
Each development board uses either a Spartan-II, Virtex, or Virtex-E FPGA in a PQ-208 package (Spartan-II) or an HQ-240 package (Virtex, Virtex-E). Vital components for a basic system are placed around the FPGA, including two crystal oscillators, three push buttons, eight DIP switches, and nine status LEDs. An optional ZBT RAM helps to support any memory demanding applications.
All configuration modes of the FPGA are supported, and you can provide configuration data either by using serial configuration PROMs (SCPs) sitting in onboard sockets, in-system programmable (ISP) PROMs, or by connecting a Xilinx MultiLINX, XChecker, or JTAG cable. The ISP PROMs and the FPGA form a single JTAG chain. A functional diagram detailing the building blocks of the prototyping boards is shown in Figure 1.
The crystal oscillators are housed in standard DIL-8 or DIL-14 size metal cans plugged into sockets, so you can easily change the frequency. To facilitate the distribution of very fast clocks, we mounted four SMB coaxial connectors, next to the clock pins of the FPGA, which may be terminated with optional resistors to ground. The synchronous clock input of the ZBT RAM is also connected to one of these connectors. Alternatively you can use an FPGA-generated clock driven on an I/O pin.
Push buttons, DIP switches, and LEDs form a user interface that allows you to provide configuration data and monitor display status information from the running system.
You can configure all eight I/O banks independently of each other, and you can select their VCCO and reference voltages individually with jumpers. Two different reference voltages (derived from the FPGA core voltage) can be generated onboard by means of trim potentiometers. Up to eight reference voltages can be connected from an external source, such as our PWR3 power module.
Figures 2 and 3 show the top and bottom view of a development board module equipped with an XCV1000E FPGA.
Figure 2 - Top view of the development board module
Figure 3 - Bottom view of the development board module
Applications
The board is very well suited to:
• Evaluate the larger members of the Spartan-II, Virtex, and Virtex-E FPGA families in the PQ-208 or HQ-240 packages, respectively.
• Experiment with different low voltage I/O standards.
• Implement custom designs using the full power of the Virtex architecture.
• Test algorithms under real-time conditions and watch the signals with a logic analyzer.
• Quickly and easily expand the complexity of the system by stacking several boards.
Conclusion
The EVALXC2S, EVALXCV, and EVALXCVE development board series gives you an ideal platform for evaluating, implementing, testing, and extending custom designs using Spartan-II, Virtex, or Virtex-E devices. Using the optional ZBT RAM you can even implement applications calling for large amounts of memory. You can also easily integrate the board into a larger system. Like their predecessor, the boards can be combined with the PWR3 power module to form a compact unit that runs from a single power supply. This makes it ideal for teaching, seminars, and courses.
For additional information on EVALXC2S/XCV/XCVE see: www.erst.ch, or contact us at [email protected].
Perspective Services
Xilinx Global Services
by Jannis McReynolds
Business Development Manager
Xilinx, Global Services Division
[email protected]
It’s a fact of life–the market waits for no one. Your new product idea won’t be nearly as successful if your competitors get to market first. But success is not just about moving quickly; design methodologies are getting more and more complex with each new advance in technology, so you must also move intelligently.
Xilinx Global Services is a portfolio of services and tools designed to keep you on the fast track. From technical support to education to design consulting, Xilinx Global
In addition, you receive eight Xilinx education credits you can use to further reduce design time.
Of course, our free services already give you access to a wealth of technical support, service packs, and software updates, but our Platinum Technical Service gives you top priority and is always available when you are: in North America between 7 a.m. and 5 p.m. Pacific Standard Time
“The Xilinx Design Services teams bring a lot to the table – unique expertise, experience with Xilinx tools and development, and the fruits of investments Xilinx continuously makes in R&D,” said Dave DeMarinis, the Business Development Manager for Global Services. “With our team, you pay
Recorded e-Learning allows you to log in from anywhere, listen to recorded education sessions and learn at your own pace. The e-Learning modules can be accessed anytime, day or night, and are less expensive than the live e-Learning classes.
All of our Education Services provide excellent, hands-on training, suitable for all skill levels, from the novice to the expert. Classes are led by instructors who are themselves experi-
At support.xilinx.com you’ll find:
• The Answers Database, with over 4,000 proven design solutions.
• Problem Solvers, to troubleshoot device configuration, software installation, and JTAG issues.
• A Web support interface, which allows you to open a Platinum priority case.
• Discussion forums through which you can interact with other designers.
As you might expect, support.xilinx.com is available seven days a week, 24 hours a day, 365 days a year, so you can troubleshoot your design when it’s convenient for you–anytime, anywhere, support.xilinx.com has your answers.
by Renne Ricciardi
Business Development Manager,
Xilinx Educational Services
[email protected]
Despite enormous technical advances since the days of chalkboards and spiral notebooks, traditional instructor-led classroom training is still the best way to learn when you need an in-depth understanding of a specialized topic. In the day-to-day race to market, however, time is a luxury few designers can afford.
Online Learning, On the Spot
With Xilinx e-Learning, you can choose from more than 70 online classes or modules covering a broad range of topics and skills involving Xilinx products and services. For example:
• Introduction to FPGA Design.
• Timing Constraints.
• Spartan-II Architecture.
• ModelSim XE.
• Virtex-EM Architecture.
Each module is an hour in length, and enrollment is quick and easy. Modules are taught weekly and presented at different times throughout the day to support worldwide access. Moreover, Xilinx e-Learning won’t interfere with your project timeline, because there’s no lost productivity due to travel time.
We can help you determine which e-learning modules are right for you through our brief self-assessment pre-tests. These tests help you gauge your knowledge and determine what you need to know so you won’t be spending money on training you’ve already had.
Live e-Learning Environment
Live instructors present classes and modules in real time. During each session, you will have the opportunity to interact with the instructor, as well as collaborate with online subject experts. You may pose questions to the instructor, view slides, share whiteboards, and discuss issues with other students in chat rooms. Pop quizzes appear periodically throughout the session–and you get instant feedback.
Participating in an e-learning session is simple. You access the e-learning module using your Web browser and a phone connection. No additional software is required. Ten minutes before the class, you call into the conference, log on to the URL, download training documents, and you’re ready to go.
Xilinx e-Learning classes are open to everyone, and each e-learning module costs $100 per session.
Customized e-Learning
If you have multiple designers who want to take the same course at the same time, or if you prefer a date or time that is more convenient for you, simply call us and we will schedule an instructor to deliver a module of your choice for your group only. The only requirement is that you have a minimum of six people who want to attend the session. The maximum number of people is 100.
A private session gives you more control over the pace of delivery. Just ask the instructor to speed up or slow down. In addition, the questions and discussion can be focused on issues and ideas important to you.
If you have designers located in remote locations, or if you have designers who are interested in different modules, call us and ask about a Bundle Package Program. With a bundle purchase, your people can complete modules when it’s most convenient for them. Any of the designers can sign in to attend a session at any time throughout the year, or until you have used up all the modules you purchased.
Conclusion
Xilinx e-Learning is the most cost-effective solution to help you keep your technical skills sharp and up-to-date. To learn more about Xilinx e-Learning, visit the Xilinx e-Learning website at www.support.xilinx.com/support/education-home.htm or call the registrar at 877-959-2527.
Perspective Partnerships
Reference Software
Software Solutions
Version 3 Development Systems
Quick Reference Guide
Xilinx development systems give you the speed you need. With the initial release of our version 3 solutions, Xilinx place and route times are as fast as two minutes for our 200,000 gate, XC2S200 Spartan™-II device, and 30 minutes for our one million gate, system-level XCV1000E Virtex™-E device. That makes Xilinx development systems the fastest in the industry.
And with the push of a button, our timing-driven tools are creating designs that support I/O speeds in excess of 800 Mbps, and internal clock frequencies in excess of 300 MHz. With each quarterly release, we are further accelerating your design process.
Xilinx desktop design solutions combine powerful technology with an easy to use interface to help you achieve the best possible designs within your project schedule, regardless of your experience level. For more information on any Xilinx products, visit www.xilinx.com
Alliance Series Solutions:
The Alliance Series Solutions contain powerful open systems implementation tools that are engineered to plug and play with your existing design flow. This combination of advanced features delivers high performance results on the toughest designs.
Foundation Series ISE Solutions:
Foundation Integrated Synthesis Environment (ISE) is the Xilinx next-generation design environment, optimized to deliver the benefits of an HDL methodology. Foundation ISE is packed with technologies that help you bring your product to market faster.
Foundation Series Solutions:
The Foundation Series solutions are complete, ready-to-use design environments for programmable logic design based on industry-standard schematic, HDL, and pushbutton design flows.
Base and Base Express Configurations
The Base and Base Express configurations provide push button design flows and support a broad array of FPGA and CPLD devices targeted for low density and high volume applications.
Standard and Express Configurations
The Standard and Express configurations combine push button flows with powerful auto-interactive tools. These tools give designers more influence and control over implementation while maintaining the benefits of design automation.
Elite Configurations
The Elite configurations are designed to support powerful design flows that deliver high-performance designs for even the highest density, multi-million gate FPGA devices from Xilinx.
Xilinx Web-based Design Solutions provide designers the ability to engage in digital design activities, on-line, using Xilinx application servers, or download design and implementation software modules for use in their own design environment. These applications include:
WebFITTER:
The WebFITTER is a free Web-based design tool that allows system designers to evaluate their designs using Xilinx XC9500 Series CPLDs.
WebFITTER URL:
Go to the Xilinx website http://www.xilinx.com and jump to "WebFITTER"
WebPACK:
The WebPACK is a collection of four free downloadable software modules including ABEL v7.1, VHDL and Verilog synthesis, design implementation tools, and device programming software. WebPACK now includes support of the entire Spartan-II FPGA family as well as the 300,000 system gate Virtex XCV300E FPGA.
WebPACK URL:
Go to the Xilinx website http://www.xilinx.com and jump to "WebPACK"
Reference Virtex
and supports 20 I/O standards including LVPECL, LVDS, and Bus LVDS differential signaling.
• The Virtex-EM Extended Memory family consists of two devices that have a high RAM-to-logic gate ratio that is targeted for specific applications such as gigabit per second network switches and high definition graphics.
Device     Logic  Logic  Gate           Max. RAM  CLB      CLBs   Flip-  Max.  Output      PCI        1.8   2.5   3.3   5.0
           Cells  Gates  Range          Bits      Matrix          Flops  I/O   Drive (mA)  Compliant  Volt  Volt  Volt  Volt
XC4000 Series: Density Leadership/High Performance/SelectRAM Memory
XC4013XLA  1368   13K    10K-30K        18K       24x24    576    1536   192   12/24       Y          –     –     –     X
XC4020XLA  1862   20K    13K-40K        25K       28x28    784    2016   224   12/24       Y          –     –     –     X
XC4028XLA  2432   28K    18K-50K        33K       32x32    1024   2560   256   12/24       Y          –     –     –     X
XC4036XLA  3078   36K    22K-65K        42K       36x36    1296   3168   288   12/24       Y          –     –     –     X
XC4044XLA  3800   44K    27K-80K        51K       40x40    1600   3840   320   12/24       Y          –     –     –     X
XC4052XLA  4598   52K    33K-100K       62K       44x44    1936   4576   352   12/24       Y          –     –     X     *
XC4062XLA  5472   62K    40K-130K       74K       48x48    2304   5376   384   12/24       Y          –     –     X     *
XC4085XLA  7448   85K    55K-180K       100K      56x56    3136   7168   448   12/24       Y          –     –     X     *
Virtex Family: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs
XCV50      1728   21K    34K-58K        56K       16x24    384    1536   180   2/24        Y          –     –     X     *
XCV100     2700   32K    72K-109K       78K       20x30    600    2400   180   2/24        Y          –     –     X     *
XCV150     3888   47K    93K-165K       102K      24x36    864    3456   260   2/24        Y          –     X     I/O   *
XCV200     5292   64K    146K-237K      130K      28x42    1176   4704   284   2/24        Y          –     X     I/O   *
XCV300     6912   83K    176K-323K      160K      32x48    1536   6144   316   2/24        Y          –     X     I/O   *
XCV400     10800  130K   282K-468K      230K      40x60    2400   9600   404   2/24        Y          –     X     I/O   *
XCV600     15552  187K   365K-661K      312K      48x72    3456   13824  512   2/24        Y          –     X     I/O   *
XCV800     21168  254K   511K-888K      406K      56x84    4704   18816  512   2/24        Y          –     –     X     *
XCV1000    27648  332K   622K-1,124K    512K      64x96    6144   24576  512   2/24        Y          –     –     X     *
Virtex-E Family: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O+, 8 DLLs, LVDS, BLVDS, LVPECL
XCV50E     1728   21K    47K-72K        88K       16x24    384    1536   176   2/24        Y          X     I/O   I/O   **
XCV100E    2700   32K    105K-128K      118K      20x30    600    2400   196   2/24        Y          X     I/O   I/O   **
XCV200E    5292   64K    215K-306K      186K      28x42    1176   4704   284   2/24        Y          X     I/O   I/O   **
XCV300E    6912   83K    254K-412K      224K      32x48    1536   6144   316   2/24        Y          X     I/O   I/O   **
XCV400E    10800  130K   413K-570K      310K      40x60    2400   9600   404   2/24        Y          X     I/O   I/O   **
XCV600E    15552  187K   679K-986K      504K      48x72    3456   13824  512   2/24        Y          X     I/O   I/O   **
XCV1000E   27648  332K   1,146K-1,569K  768K      64x96    6144   24576  660   2/24        Y          X     I/O   I/O   **
XCV1600E   34992  420K   1,628K-2,189K  1062K     72x108   7776   31104  724   2/24        Y          X     I/O   I/O   **
XCV2000E   43200  518K   1,857K-2,542K  1240K     80x120   9600   38400  804   2/24        Y          X     I/O   I/O   **
XCV2600E   57132  686K   2,221K-3,264K  1530K     92x138   12696  50784  804   2/24        Y          X     I/O   I/O   **
XCV3200E   73008  876K   2,608K-4,074K  1846K     104x156  16224  64896  804   2/24        Y          X     I/O   I/O   **
Virtex Extended Memory Capabilities
XCV405E    10800  130K   1,068K-1,307K  710K      40x60    2400   9600   404   2/24        Y          X     I/O   I/O   **
XCV812E    21168  254K   2,569K-3,062K  1414K     56x84    4704   18816  556   2/24        Y          X     I/O   I/O   **
* I/Os are 5V tolerant
** 5 Volt tolerant I/Os with external resistor
X = Core and I/O voltage
I/Os = I/O voltage supported
Reference Spartan
Column headings (device rows not recovered): Logic Cells, Logic Gates, Gate Range, Maximum, CLB Matrix, CLBs, Flip-Flops, Max. I/O, Output Drive (mA), PCI Compliant, 1.8 Volt, 2.5 Volt, 3.3 Volt, 5.0 Volt
100 microamps, operating currents 50-67% lower than traditional CPLDs, and pin-to-pin speeds of 5.0 ns.
CPLD Family
Device      Macrocells  Max. I/O  Pin-to-Pin  System  Individual  JTAG  Ultra
                                  Delay (ns)  (MHz)   OE Ctrl           Low-Power
2.5 Volt ISP, XC9500XV: Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XC9536XV    36          36        3.5         278     √           √     –
XC9572XV    72          72        4           250     √           √     –
XC95144XV   144         117       4           250     √           √     –
XC95288XV   288         192       5           222     √           √     –
3.3 Volt ISP, XC9500XL: Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XC9536XL    36          36        5           222     √           √     –
XC9572XL    72          72        5           222     √           √     –
XC95144XL   144         117       5           222     √           √     –
XC95288XL   288         192       6           208     √           √     –
3.3 Volt ISP, XPLA3: Ultra Low Power, JTAG, Increased Logic Flexibility
XCR3032XL   32          36        5           175     –           √     √
XCR3064XL   64          68        6           145     –           √     √
XCR3072XL   512         TBD       TBD         TBD     –           √     √
XCR3128XL   128         108       6           145     –           √     √
XCR3256XL   256         164       7.5         140     –           √     √
XCR3384XL   384         220       7.5         127     –           √     √
5 Volt ISP, XC9500: Best Pin-Locking, JTAG, High Endurance
XC9536      36          34        5           100     √           √     –
XC9572      72          72        7.5         83.3    √           √     –
XC95108     108         108       7.5         83.3    √           √     –
XC95144     144         133       7.5         83.3    √           √     –
XC95216     216         166       10          66.7    √           √     –
XC95288     288         192       10          66.7    √           √     –
Reference PROMs
XC17V
XC17S
Reference QPro
Device          Logic   Logic  Gate           Maximum   CLB      CLBs   Flip-   Max.  Output      PCI        1.8   2.5   3     5
                Cells   Gates  Range          RAM Bits  Matrix          Flops   I/O   Drive (mA)  Compliant  Volt  Volt  Volt  Volt
XC4000 Series: Density Leadership/High Performance/SelectRAM Memory
**XQR/XQ4013XL  1,368   13K    10K-30K        18K       24x24    576    1,536   192   12/24       Y          –     –     X     *
**XQR/XQ4036XL  3,078   36K    22K-65K        42K       36x36    1,296  3,168   288   12/24       Y          –     –     X     *
**XQR/XQ4062XL  5,472   62K    40K-130K       74K       48x48    2,304  5,376   384   12/24       Y          –     –     X     *
XQ4085XL        7,448   85K    55K-180K       100K      56x56    3,136  7,168   448   12/24       Y          –     –     X     *
Virtex Family: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs
XQV100          2,700   32K    72K-109K       78K       20x30    600    2,400   180   2/24        Y          –     X     I/O   *
**XQVR/XQV300   6,912   83K    176K-323K      160K      32x48    1,536  6,144   316   2/24        Y          –     X     I/O   *
**XQVR/XQV600   15,552  187K   365K-661K      312K      48x72    3,456  13,824  512   2/24        Y          –     X     I/O   *
**XQVR/XQV1000  27,648  332K   622K-1,124K    512K      64x96    6,144  24,576  512   2/24        Y          –     X     I/O   *
Reference IP
Introducing the Xilinx 3.2i software release . . . the fastest in the industry
Watch your designs go supersonic. With the new Xilinx 3.2i release, you can
place and route your next 100,000-gate design using a Spartan® FPGA in just
one minute, or your next one-million-gate design using
a Virtex™-E FPGA in only thirty minutes.
PN: 0010526-Q400 © 2000, Xilinx, Inc. All rights reserved. The Xilinx name, Xilinx logo and Spartan are registered trademarks, Virtex, and all XC and XS designated products are trademarks,
and The Programmable Logic Company is a service mark of Xilinx, Inc. All other trademarks and registered trademarks are the property of their respective owners.