
Xcell journal

Issue 38
Winter 2000

THE AUTHORITATIVE JOURNAL FOR PROGRAMMABLE LOGIC USERS

PRODUCTS
New PROMS Simplify
FPGA Designs

APPLICATIONS
Creating High-Performance
Digital Down Converters

SOFTWARE
New Xilinx Foundation
Series ISE Software with
Integrated Design Flows

NEWS
Xilinx Launches
Platform-FPGA Initiative
Cover Story
Synopsys Chief Technology
Officer Discusses Platform-Based
FPGA Design Issues


LETTER FROM THE EDITOR

Business is booming,
and Xilinx is growing rapidly...

Programmable logic continues to grow faster than other segments of the semiconductor market, and Xilinx continues to grow along with it; there is no end in sight. To keep up with this unprecedented expansion, we are building several new facilities, acquiring new companies, and incorporating the best available complementary technologies.

We are leading the industry, not only with the most advanced device and software technologies but also with the most ambitious plans for future developments. Here are some of our most recent activities:
• Xilinx purchases two new buildings in San Jose. The buildings will provide approximately 200,000 additional square feet and are expected to house up to 700 new Xilinx employees. The purchase of the new buildings is the latest in a string of new construction projects we have undertaken in the last few years. In 1999, we completed construction of a fourth building at our San Jose headquarters, and other projects are also underway at Xilinx locations in Colorado, Ireland, and California.

• Xilinx acquires Visual Software Solutions, Inc. (VSS). Their expertise will help us further extend our software leadership and allow us to deliver a variety of customized tools that facilitate HDL-based design using our new Virtex-II FPGAs, thus improving your time-to-market. Included in the acquisition are the VSS HDL Bencher™ and StateCAD™ design tools.

• Xilinx acquires RocketChips, a leading developer of ultra-high-speed CMOS mixed-signal transceivers serving the networking, telecommunications, and enterprise storage markets. The RocketChips gigabit and multi-gigabit serial CMOS transceiver technologies provide solutions for a wide range of serial system architectures, and this technology will be a key feature of our next-generation FPGA families.

• Xilinx acquires Tornado, a full-function formal verification application deploying state-of-the-art circuit equivalence checking techniques. Based on many years of research and development efforts by Veriphia, this new software adds significant value to our advanced development tools. We plan to develop this technology even further and focus it on the Virtex FPGA architectures, in alliance with key EDA partners.

• Xilinx acquires Integral Design, a privately held design services firm headquartered in Dublin, Ireland. The acquisition enhances our professional design services capabilities in the communications and multimedia market segments. Recent advances in FPGA performance and capabilities continue to drive customer needs for additional design resources. Design services enable you to use dedicated designers with experience in Xilinx solutions to augment your own internal expertise and improve your time-to-market.

These developments continue to enhance our capability to offer you the best programmable logic devices, development tools, and services in the industry.

Our current capabilities already give you a significant ease-of-use and time-to-market advantage. As the market expands, costs decrease, and many new applications become possible, thus fueling even more growth. You can see why programmable logic is quickly becoming the technology of choice for many more applications, from low-cost consumer devices to high-performance switching systems; there simply is no faster or easier way to create the systems of the future. And Xilinx is well prepared to continue leading the way.

Carlis Collins
Editor

EDITOR Carlis Collins
[email protected]
408-879-4519
SENIOR DESIGNER Andy Larg
BOARD OF ADVISORS Dave Stieg, Mike Seither, Peter Alfke

Xcell journal
Xilinx, Inc.
2100 Logic Drive
San Jose, CA 95124-3450
Phone: 408-559-7778
FAX: 408-879-4780
©2000 Xilinx Inc. All rights reserved.

Xcell is published quarterly. XILINX, the Xilinx logo, and CoolRunner are registered trademarks of Xilinx, Inc. Virtex, LogiCORE, IRL, Spartan, SpartanXL, Alliance Series, Foundation Series, CORE Generator, IP Internet Capture, IP Remote Interface, MultiLinx, QPRO, SelectI/O, SelectI/O+, True Dual-Port, WebFITTER, WebPACK, ChipViewer, Select RAM, Block Ram, Xilinx Online, and all XC-prefix products are trademarks, and The Programmable Logic Company is a service mark of Xilinx, Inc. Other brand or product names are trademarks or registered trademarks of their respective owners.

The articles, information, and other materials included in this issue are provided solely for the convenience of our readers. Xilinx makes no warranties, express, implied, statutory, or otherwise, and accepts no liability with respect to any such articles, information, or other materials or their use, and any use thereof is solely at the risk of the user. Any person or entity using such information in any way releases and waives any claim it might have against Xilinx for any loss, damage, or expense caused thereby.
Contents Fall 2000

Cover Story
Designing With FPGA Platforms
The Chief Technology Officer at Synopsys discusses the need for platform-based design in an era of system-on-a-chip FPGAs.

Perspective
Choosing the ARC User-Configurable Processor
ARC Cores and Xilinx provide everything you need to develop custom processor applications.

New Products
New Xilinx Foundation Series ISE Software
Integrated design flows increase your productivity and accelerate your time to market.

New Products
New High-Density and Cost-Effective PROMS
Xilinx announces the addition of the XC17V00 and XC17S00A families to its existing line of PROMS.

Applications
High-Performance Digital Down Converters for FPGAs
Virtex FPGAs surpass off-the-shelf ASSPs in design flexibility and system integration.

New Products
Stackable Development Boards
A new series of prototyping boards to help you quickly test and implement your FPGA designs.

In this issue:
Platform FPGA - The Future of Logic Design
Designing with FPGA Platforms
Inferring Multiplexers in FPGA Compiler II and FPGA Express
Spartan-II PCI Development Kit
Choosing the ARC User-Configurable Processor
Re-thinking Your Verification Strategies for Multimillion-gate FPGAs
Foundation ISE - What's In a Name
Xilinx Foundation Series ISE Software - Delivering the Benefits of HDL Design
StateCAD XE for Optimizing State Machine Design
HDL Bencher XE for Fast Behavioral FPGA Verification
Guided Design Using BLIS
New High-Density Virtex PROMS and Cost-Effective Spartan-II PROMS
Create Efficient FIR Filters Using Virtex and Spartan FPGAs
LogiCORE PCI Module Is a Key Element in Voice over IP Applications
Creating Finite State Machines
Design a Low-Power SMBus System Using CoolRunner CPLDs
CoolRunner Power-Saving Tips and Tricks
Creating a Low Power Serial Peripheral Interface
CoolRunner CPLDs Beat the Heat
FPGAs - The Solution to Ultra-Deep Sub-Micron Design
Implementing a Histogram for Image Processing Applications
High Performance Digital Down-Converters for FPGAs
Digital Image Processing with LogiCOREs
Stackable Development Boards for Spartan-II, Virtex, and Virtex-E FPGAs
Xilinx Global Services
e-learning Makes the Grade
Xilinx in the Community - Champions for Change
Product Reference

For a Free Subscription to the Xcell Journal
E-mail your request to: [email protected].
Please include:
1. Your full name and mailing address.
2. Your job title.
3. Your e-mail address.
4. Your company name.
5. Is this a new subscription or a renewal?
View from the top

Platform FPGA-
The Future of
Logic Design
Xilinx and its partners are building
the high-performance technology
platform on which the designs of
the future will emerge.


By: Wim Roelandts, CEO, Xilinx

The revolution in logic design continues, bringing dramatic performance improvements and new capabilities that help you create the systems of the future, and get them to market faster than ever before. FPGAs were once just interconnect routing and logic gates; then we added dedicated hard cores for memory, clock management, and I/O. Now, FPGAs are becoming the platform on which a combination of complex hard cores and flexible soft cores combine with an abundance of programmable logic gates to give you the best possible performance, along with the ease-of-use and time-to-market advantages for which FPGAs are well known. Plus, we can bring you these advantages at a lower cost than ever before.

Xilinx has entered a number of strategic partnerships and has acquired key technologies for creating the new programmable logic platform. Here is an overview of our recent activities.

The IBM Partnership

Our recent partnership with IBM brings us two immediate and dramatic benefits: the power of the PowerPC™ hard core, and the advanced CMOS manufacturing capability of IBM's state-of-the-art facilities. IBM gets intellectual property (IP) from Xilinx to help reduce defect densities and improve manufacturing productivity. This partnership has far-reaching implications, and gives both companies a significant competitive advantage.

The PowerPC Core

IBM's PowerPC module and CoreConnect™ bus will soon be integrated into Virtex-II FPGAs. With this powerful combination, you can achieve performance that was never possible before, and you can quickly develop unique system-level applications with greater ease. We found that the PowerPC was the most-used processor in high-end designs; our communication and computing customers use the PowerPC because it has good performance, and it has a lot of peripherals and other functions that make it easy to use.

Processors like the PowerPC are often used as logic engines for low speed, very complex logic; they allow you to write detailed programs that perform intricate condition checking and control functions. However, because a processor basically executes one instruction at a time, it is slower than actual gates, which can operate in parallel.

Now, in one Platform FPGA, you will have the best of both worlds; you have the dedicated PowerPC processor for complex control applications, and you have programmable logic gates for very-high-speed data paths. The big advantage of having this all on one chip is that you can very quickly move data from the PowerPC processor to on-chip peripherals or custom logic, which may be hard cores, soft cores, or unique designs created with programmable gates. This will give you much higher performance than you get using separate chips, which must pass signals through their slower I/O interfaces.

We are implementing the PowerPC processor and other dedicated functions (such as memory, clock management, multipliers, and I/O interfaces) as hard cores, to give you the best possible performance. We will complement these hard cores with over 50 soft core peripheral functions. By keeping most of the peripherals as soft cores, you can choose only those functions that you need, and create custom designs with ease.

Advanced CMOS Manufacturing Capability

IBM is one of the most advanced CMOS semiconductor companies in the world, with device manufacturing technologies that are typically a year ahead of most other companies. Our partnership with IBM gives Xilinx access to this manufacturing technology, and a tremendous competitive advantage. To be competitive in our marketplace we have to push manufacturing technology to the limits. By using the most advanced manufacturing processes, we can reduce the size and cost of transistors, which enables us to continue building bigger and bigger FPGAs and reduce the costs of existing devices.

For IBM, FPGAs are the ideal "process drivers" to test and refine their advanced manufacturing processes. Because FPGAs have very regular structures and allow us to address almost every square micron of space on a chip, they are ideal for troubleshooting problem process areas. So, Xilinx gets advanced manufacturing technology and IBM gets devices that help them drive their manufacturing process to maturity. Thus, we can achieve better yields, faster, and that means lower costs.

Xilinx has been the leader in developing programmable logic technology, and we have expanded the market dramatically; today, the PLD business is growing 40% faster than the regular IC business. Our current technologies, bigger densities, higher speeds, and lower costs, are expanding the market much faster than in the past; with IBM, we are pushing it even further.


Gigabit Serial I/O Capability

Many new systems today are requiring much faster data transfer between systems, boards, and devices, due primarily to the ever increasing demand for faster networks. Very high speed (gigabit per second) serial I/O capability promises to solve this difficult problem.

Traditionally, data has been shared by using parallel busses such as PCI (Peripheral Component Interconnect). However, there are inherent limitations with shared busses. To increase the speed of a shared bus, you can either increase the speed of each wire (which is very difficult to do because there are many of them), or you can increase the number of wires (which takes more and more I/O pins). For example, PCI was once just 32 bits wide; now you also have 64-bit PCI, and that's not enough. The problem with this approach to increasing bandwidth is that at some point you reach a level of decreasing return; the extra pins and the need for shared bus protocols limit the performance and make it prohibitively expensive.

The best solution we have for this bandwidth bottleneck is to use point-to-point connections, over a single pair of wires, operating at very high speeds. Currently, with this technology, you can achieve a data rate of two to three gigabits per second. The big advantage of this method is that you use fewer wires and less power, and the total amount of data you can move is in fact higher than with a typical parallel bus.

To create a gigabit serial I/O channel, a hard core is needed; you cannot achieve these speeds with soft cores in an FPGA. The hard core performs several functions; it receives and transmits the data, and it also recovers the clock (because you can recover the clock from the data, a single pair of wires is all you need for data transfer). The hard core must also serialize and de-serialize the data. By de-serializing the data to a 16- or 32-bit internal bus, the data speed is then reduced by a factor of 16 or 32, which an FPGA can easily handle.

With gigabit serial I/O, all the de-serialization is done within the chip. When the work is done, you can serialize the data again and send it out to other devices, over a single pair of pins. This is a very efficient and low cost way to transfer data.

We recently made several announcements regarding our commitment to gigabit serial I/O.

The Conexant™ Partnership

Xilinx recently entered into a strategic development and licensing partnership with Conexant Systems, to integrate their SkyRail™ 3.125 Gbps serial transceiver technology into our next generation Virtex-II FPGAs. This hard core is the fastest core available in CMOS today, and will be available in the second half of 2001 in a select offering of several different Virtex-II devices. In our Virtex-II architecture you will get more than 20 different I/O standards, plus several of these gigabit serial I/O channels.

The high-speed SkyRail transceiver is compliant with industry standards such as Gigabit Ethernet and Fibre Channel in addition to the emerging 10-Gigabit Ethernet (IEEE 802.3ae) standard. By integrating quad transceivers, which are used to create 10-gigabit attachment unit (XAUI) interfaces, a single FPGA can interface to both 10-Gigabit Ethernet and OC-192c. The high-speed transceiver is also compliant with the 2.5 Gbps InfiniBand™ architecture standard being created by the InfiniBand Trade Association.

The RocketChips Acquisition

High-speed serial I/O capability is so important that we decided not to stop at the 3.125 Gbps speed offered by the Conexant core; we are developing the technology further. That's why we recently acquired a company called RocketChips, which is very active in creating high speed serial I/O cores. RocketChips already has a product that is very similar to the Conexant core, and they plan to develop even higher speed cores operating at 5 to 10 Gbps.


RocketChips' gigabit and multi-gigabit serial CMOS transceiver technologies provide solutions for a wide range of serial system architectures in networking, telecommunications, and enterprise storage markets. Their products include serial backplane transceivers (Single and Quad 3.125 Gbps transceivers), telecom transceivers (SONET OC-48 and OC-192), enterprise storage transceivers (Fibre Channel, Ethernet), and networking transceivers (Gigabit Ethernet, 10 Gbps Ethernet, and InfiniBand).

PMC-Sierra Partnership

Xilinx recently announced the availability of POS-PHY™ Level 3 Link Layer and Physical Layer cores. These cores provide solutions for the emerging Packet Over SONET (POS-PHY) applications, and both cores are compatible with the POS-PHY Level 3 interface specified by the SATURN® Development Group. With these cores, broadband system designers can rapidly develop highly functional, scalable, and standards-based equipment to increase the speed of networks up to 2.5 Gbps, and support the exploding growth of IP traffic over SONET/SDH backbones.

Xilinx has also been active in the Optical Internetworking Forum (OIF) and the ATM Forum to drive POS-PHY Level 4 acceptance. And, we are the only FPGA company to demonstrate over 800 Mbps operation, confirming that we can provide the full speed capability to support the 10 Gbps OC-192 draft standard at the OIF (OIF2000.088.2).

Serial Protocol Standards

To use these high speed serial I/O channels effectively, you need well defined protocols and networking standards. Xilinx actively supports all of the emerging standards, including:

• Lightning Data Transport™ (LDT) - A chip-to-chip interconnect that provides much greater bandwidth per I/O channel. It can achieve a bandwidth of 6.4 Gbps per eight-wire link width, and can support up to 32 links.

• InfiniBand™ - This newly designed interconnect system utilizes a 2.5 Gbps wire speed connection with one, four, or twelve wire link widths. Promoted by an association comprising industry leaders such as Compaq, Dell, HP, IBM, Intel, Microsoft, and Sun Microsystems, InfiniBand intends to deliver a channel based, switched fabric technology.

• XAUI - A quad transceiver utilizing 3.125 Gbps serial links to create a 10 gigabit attachment unit interface (XAUI). Multiple XAUI interfaces can be implemented to allow a single chip to interface to both 10 Gigabit Ethernet and OC-192c.

• Fibre Channel - A high-bandwidth serial standard offering 1.06 Gbps data rates scalable to 2.12 or 4.24 Gbps. It is capable of carrying multiple existing interface command sets, including Internet Protocol (IP), SCSI, IPI, HIPPI-FP, and audio/video.

• Gigabit Ethernet + 10 Gbit Ethernet - This includes devices compliant with the IEEE 802.3 alliance.

• ATM (OC-12, OC-48, OC-192) - This includes support for OC-12 (622 Mbps), OC-48 (2.4 Gbps), and OC-192 (10 Gbps).

• RapidIO™ - A next-generation switched-fabric interconnect architecture for embedded systems that is optimized for both high bandwidth and low latency. Initial implementations are expected to exceed 1.0 Gbps throughput based on clock rates of 250 MHz and higher.

These standards all use the same physical interface, so you can use our hard I/O cores for all of them. Then, we implement the level-2 protocols in programmable logic (soft cores), so you can quickly create designs using any of these standards. This gives you a lot of flexibility and it helps you interface directly to on-site networks.

What Does It Mean?

An FPGA is no longer just gates and routing. Over the years we have added more and more hard cores, such as memory, clock management, and arithmetic functions. Now we are driving the technology a major step further by adding hard CPU cores and high speed serial I/O cores. Combine these dramatic technology advances with our high performance development tools, our unique Internet Reconfigurable Logic capability, our extensive training and support services, our state-of-the-art manufacturing capabilities, and our ongoing partnerships with other industry leaders, and you get a logic design solution that can breathe life into your new designs.

The role of the FPGA is changing; it is becoming a platform on which the combination of soft cores, programmable logic, and hard cores gives you the best possible design solution: the speed of a custom ASIC and the time-to-market advantages of a flexible FPGA.

Cover Story
Platform-based Design

Designing with FPGA Platforms

by Raul Camposano
Chief Technology Officer, Synopsys, Inc.
[email protected]
The Chief Technology Officer at Synopsys discusses the need for platform-based design in the era of system-on-a-chip FPGAs.

As FPGAs move into million-gate densities, a world of new possibilities and potential applications is opening up for programmable logic devices. And along with these new opportunities come many new challenges. The pairing of a low-cost, high performance PowerPC processor (and other hard cores), along with the soft cores and programmable logic circuitry in Xilinx Virtex-II™ FPGAs means that you will now be confronting challenges similar to what ASIC designers encountered when they made the transition to system-on-a-chip (SoC) ASICs.

Everyone who participates in the multimillion-gate segment of the FPGA market is wrestling with the same issues: increased complexity, escalating development costs, evolving standards, too few design engineers, increasingly compressed design cycles, and so on. In addition, as complexity increases, the time it takes you to get a product to market becomes dominated more by your design time than by manufacturing considerations, compromising one of the key advantages of FPGAs.
To help address these challenges, there is increasing pressure for designs to share a common architecture or platform, especially those that are targeted to similar applications. A platform is a basic system architecture that is geared towards a specific application, such as cell phone base stations or set-top boxes, among others; it is customized through software and by adding customized logic and IP.

An FPGA platform enables you to differentiate your products by adding customized logic and IP using the tightly integrated FPGA fabric. Platforms are important in the era of multimillion-gate FPGAs because they enable you to focus on adding value through custom IP rather than wasting time and resources by recreating standard components.

Figure 1 - Platform FPGA: Xilinx Virtex-II FPGA with embedded PowerPC processor

Platform-Based Design

A central piece of any platform is the embedded processor, such as the IBM PowerPC processor core in the Xilinx Virtex-II platform. A typical platform might also include a bus, DSP, input/output channels, mixed signal functions, memory, and some configurable logic such as shown in Figure 1. FPGA design thus becomes platform design; rather than simply designing with gates, you must now focus on designing entire systems.

For you to effectively exploit a platform by designing at the system level, four primary design considerations must be addressed:

• Hardware design.
• Software design.
• Integration of hardware, software, and IP.
• Verification of the complete system (on a chip).

Synopsys delivers solutions in all four of these areas. The design of hardware has been our traditional domain, and we offer a suite of FPGA synthesis tools for this purpose. FPGA Express™ addresses the push-button, fast turn-around market, while FPGA Compiler II™ addresses more complex designs and compatibility with the ASIC design flow. Looking further into the future, other synthesis technologies such as Synopsys' Physical Synthesis will enable full timing closure for platform FPGAs.

Open SystemC

One of the most difficult aspects of software design involves how to interface software effectively with hardware. Open SystemC, a set of C++ class libraries that enables electronic design at the system level, provides an important tool for designing software and hardware in a common language framework. Based on C and C++ (the languages of choice for most algorithm developers, system architects, and software developers), SystemC also includes all the language elements necessary to effectively address hardware design. In this way, trade-offs between hardware and software can be addressed dynamically, even including reconfiguration in the field.

SystemC helps you create both systems and chips; the suite of tools and methodologies Synopsys has developed around SystemC significantly accelerates the design of electronic systems from concept to implementation (see Figure 2). SystemC follows the community source-licensing model and can be downloaded from the Open SystemC Initiative's website at www.SystemC.org/.

Figure 2 - SystemC tools

IP Integration

One of the advantages of platform-based design is that it supports the integration of other pieces of proprietary logic and third-party IP. In fact, it is the customized portion of any system-on-a-chip ASIC or platform FPGA that provides the competitive differentiation from one device to the next.

But with the very large number of gates that can now be implemented on a single FPGA, the challenge is to become significantly more productive when creating with these gates. One obvious solution is to leverage existing gates through design reuse. Synopsys has been leading the way in this area and, with its DesignWare® libraries of reusable building blocks and methodology activities, offers several time-saving options for both ASIC and FPGA designers to leverage IP from a variety of sources.

System Verification

The challenge for any system-on-a-chip FPGA is to verify the complete system, including the processor core, and not just the individual blocks that make up the system. This requires not only a high-speed simulator, but also a complete array of advanced verification tools. In particular, testbench generation, coverage tools, formal verification, a simulation model of the processor and other IP, and static timing analysis tools are essential for platform-based design (see Figure 3).

Static timing analysis illustrates the verification challenges imposed by such a system. Synopsys' PrimeTime® static timing analysis tool can time and analyze a complete chip, offering the multimillion-gate capacity that is required by systems on a chip. It also offers analysis modes that handle the processor core in an effective way.

Figure 3 - Platform verification tools

Conclusion

FPGAs that contain embedded processor cores and application-specific components are creating a need for platform-based design, which requires not only the suite of RTL logic design tools that are already in use today (with the right capacity), but also a comprehensive suite of system-level design tools that will be new to most FPGA designers.

FPGA design has moved beyond the era of simple logic. With the advent of FPGAs that contain an embedded processor core, such as the PowerPC, FPGA designers will soon join their peers in the ASIC world by confronting the challenges of designing entire systems. Synopsys is helping you meet these challenges through the power of its system-level EDA tools optimized for platform-based design.

For more information on all Synopsys products, see www.synopsys.com

Applications Software

Inferring Multiplexers
in FPGA Compiler II
and FPGA Express
How to get better results by automatically inferring multiplexers that fully
utilize architecture-specific FPGA resources.
by Alan Ma
Senior Corporate Applications Engineer, Synopsys, Inc.
[email protected]
In general, multiplexers can be implemented by using Look Up Tables (LUTs). To obtain the best quality of results (QoR), Synopsys FPGA Compiler II™ and FPGA Express™ (FCII/FE) take it one step further by utilizing the built-in multiplexer resources in high-density FPGAs, which produces significantly better results in both area and speed.

The Process

During elaboration, the process of translating the text-based description of a design to an architecture-independent gate-level representation, FCII/FE infers a generic primitive called MUX_OP when it encounters multiplexers in the Hardware Description Language (HDL). It is during optimization that MUX_OPs are mapped to architecture-specific multiplexer resources. The following sections describe the requirements for MUX_OP to be inferred.

General Implementation

Our research indicates that using architecture-specific multiplexer resources is only beneficial when the number of inputs meets certain requirements. Table 1 illustrates the multiplexer sizes and the primitives FCII/FE utilizes for Xilinx Virtex-II, Virtex, and XC4000 FPGAs (and their derivatives). FCII/FE automatically maps to these hardware resources (primitives) when you follow the recommended coding guidelines.

Coding Guidelines

Synopsys recommends the use of CASE statements to describe multiplexer logic. When the requirements on the number of inputs for the target architecture are met (as shown in Table 1), FCII/FE maps the design to architecture-specific multiplexer resources if at least 75% of all possible cases are specified.

Figure 1 shows an example of an eight-to-one multiplexer in Verilog. Figure 2 illustrates its VHDL equivalent. Note that the control signal sel has three bits so there can be as many as eight possible cases. As a result, at least six (75% of eight) cases need to be specified for multiplexers to be inferred.

Architecture   Min. Inputs   Max. Inputs   Primitives Used
Virtex-II      4             256           LUT, MUXF5, MUXF6
Virtex         4             256           LUT, MUXF5, MUXF6
XC4000         4             256           FMAP, HMAP

Table 1 - Multiplexer size requirements for automatic inference


    module mux_8to1 (
        a, b, c, d, e, f, sel,
        mux_out
    );

    input a, b, c, d, e, f;
    input [2:0] sel;
    output mux_out;

    reg mux_out;

    // Combinational select: sel picks one of the six data inputs
    always @(sel or a or b or c or d or e or f)
        case (sel)
            3'b000  : mux_out = a;
            3'b001  : mux_out = b;
            3'b010  : mux_out = c;
            3'b011  : mux_out = d;
            3'b100  : mux_out = e;
            default : mux_out = f;
        endcase
    endmodule

Figure 1 - Using CASE statements for multiplexers in Verilog

    library ieee;
    use ieee.std_logic_1164.all;

    entity mux_8to1 is port (
        a, b, c, d, e, f: in std_logic;
        sel: in std_logic_vector(2 downto 0);
        mux_out: out std_logic
    );
    end mux_8to1;

    architecture rtl of mux_8to1 is
    begin
        -- Combinational select: sel picks one of the six data inputs
        process (sel, a, b, c, d, e, f)
        begin
            case sel is
                when "000" => mux_out <= a;
                when "001" => mux_out <= b;
                when "010" => mux_out <= c;
                when "011" => mux_out <= d;
                when "100" => mux_out <= e;
                when others => mux_out <= f;
            end case;
        end process;
    end rtl;

Figure 2 - Using CASE statements for multiplexers in VHDL

Using the infer_mux Directive

Figure 3 shows a similar eight-to-one multiplexer with the addition of several arithmetic operators; Figure 4 shows its VHDL counterpart. To allow operator sharing, multiplexers are generally not automatically inferred for CASE statements which contain more than one operator (regardless of the number of cases specified). However, you have the option to override FCII/FE by using the infer_mux directive.

The infer_mux directive forces FCII/FE to infer multiplexers as long as at least 50% of all possible cases are specified. It can be used when:
• The requirements on the number of inputs (as shown in Table 1) are not met.
• The CASE statement contains more than one arithmetic operator.

It is important to understand that FCII/FE generally makes intelligent decisions on multiplexer inference based on the cost of doing so. For example, it may choose not to infer multiplexers, to allow operator sharing for better performance. As a result, QoR is likely to suffer if you override that decision by using infer_mux. Please use this directive with caution.

    module mux_8to1 (
        a, b, c, d, e, f, sel,
        mux_out
    );

    input a, b, c, d, e, f;
    input [2:0] sel;
    output [1:0] mux_out;

    reg [1:0] mux_out;

    // The directive comment on the case line forces multiplexer inference
    always @(sel or a or b or c or d or e or f)
        case (sel) // synopsys infer_mux
            3'b000  : mux_out = a + b;
            3'b001  : mux_out = a + c;
            3'b010  : mux_out = d - e;
            default : mux_out = d - f;
        endcase
    endmodule

Figure 3 - Using infer_mux for multiplexer inference in Verilog

    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;  -- supplies "+" and "-" for unsigned operands

    entity mux_8to1 is port (
        a, b, c, d, e, f: in std_logic;
        sel: in std_logic_vector(2 downto 0);
        mux_out: out std_logic_vector(1 downto 0)
    );
    end mux_8to1;

    architecture rtl of mux_8to1 is
        -- Zero-extend a 1-bit input to 2 bits so a sum keeps its carry
        function ext (x : std_logic) return unsigned is
        begin
            return unsigned'('0' & x);
        end function;
    begin
        process (sel, a, b, c, d, e, f)
        begin
            case sel is -- synopsys infer_mux
                when "000" => mux_out <= std_logic_vector(ext(a) + ext(b));
                when "001" => mux_out <= std_logic_vector(ext(a) + ext(c));
                when "010" => mux_out <= std_logic_vector(ext(d) - ext(e));
                when others => mux_out <= std_logic_vector(ext(d) - ext(f));
            end case;
        end process;
    end rtl;

Figure 4 - Using infer_mux for multiplexer inference in VHDL

Conclusion
FPGA Compiler II and FPGA Express take advantage of Xilinx-specific multiplexer resources to deliver the best quality of results. The tools automatically infer multiplexers if the design complies with the coding guidelines and meets the requirements for the target architecture. You also have the option to force multiplexer inference by using the infer_mux directive.

Visit the Synopsys FPGA website at www.synopsys.com/fpga for other information on the latest FPGA synthesis technologies.
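The inference thresholds quoted in this article (75% of cases for automatic inference, 50% with infer_mux, and the 4 to 256 input range from Table 1) can be collected into a small decision sketch. This is a simplified Python model written for illustration only: the function name and parameters are hypothetical, it does not reproduce the cost-based decisions FCII/FE actually makes, and because the article does not say how a default branch is counted, the caller supplies the number of specified cases.

```python
def will_infer_mux(sel_bits, cases_specified, num_inputs,
                   operator_count=0, use_infer_mux=False,
                   min_inputs=4, max_inputs=256):
    """Simplified model of the inference rules quoted in the article.

    All names here are illustrative assumptions, not tool options.
    """
    total_cases = 2 ** sel_bits  # a 3-bit select gives 8 possible cases

    if use_infer_mux:
        # With the directive, at least 50% of all cases must be specified;
        # the input-count and operator restrictions no longer apply.
        return 2 * cases_specified >= total_cases

    # Without the directive, more than one operator blocks inference.
    if operator_count > 1:
        return False

    # The number of inputs must fall inside the Table 1 range.
    if not (min_inputs <= num_inputs <= max_inputs):
        return False

    # Automatic inference needs at least 75% of all cases specified.
    return 4 * cases_specified >= 3 * total_cases


# Three-bit select with six cases specified: meets the 75% rule.
print(will_infer_mux(sel_bits=3, cases_specified=6, num_inputs=6))
# Four cases plus several operators: only the directive allows inference.
print(will_infer_mux(3, 4, 6, operator_count=3, use_infer_mux=True))
```

Integer comparisons are used instead of floating-point percentages so that boundary values such as exactly six of eight cases are evaluated exactly.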

New Products Development Tools

Spartan-II PCI Development Kit


Insight Electronics has introduced a Spartan-II PCI Development Kit
to help you jumpstart your next 32-bit PCI design.

by Jim Beneke
Technical Marketing Manager, Insight Electronics
[email protected]

When Xilinx introduced the Spartan™-II FPGA family in January 2000, they not only offered the lowest cost FPGA devices with system-level features, they also enabled programmable logic to effectively replace off-the-shelf ASSP devices for 32-bit PCI applications. Combined with the proven PCI32 LogiCORE™ interface from Xilinx, the Spartan-II PCI solution was the common-sense choice for most PCI designs.

Unfortunately, designers wishing to target a Spartan-II device for a PCI project were not able to prototype their design with an off-the-shelf PCI platform. Insight Electronics recognized the need for this type of development board, and introduced the Spartan-II PCI Development Kit. The kit includes a Spartan-II prototype board, single-use Spartan PCI32 LogiCORE license, Windows driver development software, one day (eight hours) of Insight Design Services support, reference designs, Windows-based applications, example Windows 98/NT drivers, source code, and hardware documentation. The demonstration board is based on the 150K-gate Spartan-II FPGA, in a 208-pin plastic quad flat package (PQFP).

Implementing the full initiator/target PCI interface in the FPGA only consumes about ten percent of the logic resources, leaving approximately 135K gates for custom user back ends. Unlike other PCI prototype cards, the Spartan-II PCI board does not contain back-end application circuits to complicate your custom design. Instead, all user I/Os are brought out to expansion connectors for easy access and interfacing. This allows your designs to be quickly implemented, configured, and tested. Figure 1 shows a block diagram of the Spartan-II PCI card included in the kit.

Figure 1 - Spartan-II PCI development board

The New Reference Design Center
In addition to the Spartan-II FPGA, the PCI board also includes the new Xilinx XC18V01 in-system programmable configuration PROM. This allows PCI application designs to be quickly downloaded multiple times to the board and saved in non-volatile memory.

With this re-configurable feature, Insight is including access to its new Reference Design Center. At the Reference Design Center, owners of the Spartan-II PCI kit can download pre-configured PCI application designs and run them on the demonstration board. Developed by Insight Design Services, these off-the-shelf application designs can be used as is, or can be customized to meet certain application needs. In addition to providing reference design bit streams and their associated source code files, the Reference Design Center also provides example Windows drivers and Windows-based application programs. Both drivers and application programs are provided with C++ source code so you can understand how the examples work.

To assist in the development and debugging of Windows device drivers, the Spartan-II PCI Kit includes Compuware's NuMega driver development software. The NuMega package simplifies the task of writing and configuring Windows drivers through a series of GUI windows.

The Xilinx Spartan 32/33 PCI Core
The Insight Spartan-II PCI Development Kit includes the new single-use version of the Xilinx Spartan-only 32-bit, 33 MHz PCI core. The single-use license allows the kit owner to support a single production PCI core implementation. If multiple PCI core solutions are required, then the core license can be upgraded to an unlimited license for a nominal fee. The 32-bit Spartan PCI core is configured and downloaded through the Xilinx PCI Lounge. The downloadable core netlist is fully PCI v2.2 compliant and supports initiator and target functions with zero-wait-state burst operation.

Conclusion
By providing exactly what is needed to complete a PCI design, the Spartan-II PCI Kit meets the demands of both experienced and new designers of programmable logic-based PCI interfaces. Several versions of the Spartan-II PCI Development Kit are available from Insight Electronics. Prices range from $145 for a PCI card only kit, to $3,995 for the complete Spartan-II PCI Development Kit. For more information, go to www.insightelectronics.com/solutions/kits/xilinx/spartan-iipci.html.
designs to be quickly implemented, config- grams. Both drivers and application pro-

Perspective Configurable Processors

Choosing the ARC User-Configurable Processor

ARC Cores and Xilinx provide everything you need to develop custom processor applications.

by Emmanuel Benzaquen
Third Party Program Manager, ARC Cores
[email protected]

As FPGA capacity continues to increase, especially with the new Xilinx Virtex product family, it is becoming increasingly practical to implement complete systems in a single FPGA. A soft processor core represents an attractive solution for user-configurable System-on-Chip (SoC) applications.


• An ARC design can be turned from VHDL or Verilog into a configuration that runs on the Xilinx FPGA-based ARCangel prototype board in a few hours
• Both software and hardware can be tested and benchmarked at the same time

The ARC soft processor design and debug cycle

Processor Cores Complement Programmable Logic
Traditionally, an important motivation for adding microprocessors into a design has been that software programmable solutions are easy to change and upgrade. Since FPGAs are, by definition, programmable, you can always upgrade them. System designers know that it is much easier to design and implement certain parts of a system using software, while hardware implementations offer greater performance. For example, you may want to take advantage of a large amount of low-cost software intellectual property (IP) that is available in C or C++ code, for functions such as protocol stacks and modem algorithms. You may also want to implement high-speed co-processor functions in hardware. You can get the best of both worlds when you combine the hardware re-programmability of FPGAs with the software programmability of microprocessors.

The Hard Versus Soft Option
Major FPGA vendors typically provide two different approaches to including processor cores in an FPGA. One approach offers a soft processor core that is provided in a synthesizable HDL format. This processor core is then included in a generic FPGA using the same design process as the rest of the logic. The second approach embeds a specific hard processor core (such as the PowerPC) into the FPGA. The most appropriate choice will depend on the application.

As a general rule, a hard processor core will offer higher clock speeds than a soft core. However, since the hard processor solution will require a specialized FPGA with dedicated processor buses and routing, it will be less flexible than incorporating a soft processor in a generic FPGA. In addition to performance and flexibility trade-offs, the choice between a hard or a soft processor will also be influenced by the software applications you wish to run.

Since soft processors are available as synthesizable HDL, they inherently provide more design flexibility than hard processors because you can modify the core interface to fit better into a specific design. Some soft processors provide even greater flexibility by being configurable. A configurable processor core may include a graphical tool that enables certain functions to be included or excluded without having to manually modify HDL source code. As a result, you can create a processor core that is customized for your specific application by using the GUI.

The ARC processor is a perfect example of a soft and user-configurable core available for immediate use in Xilinx FPGAs.

Software Tools
The most important factor that will influence your choice of a soft processor is the software tool set that supports the code that will run on it. SoC designs can carry anywhere from 1 Kbyte to several megabytes of software code. For applications that have only a few Kbytes of code, basic software tools such as an assembler may be sufficient. However, once the amount of code starts increasing and becoming more complex, it becomes essential to use a high-level language like C or C++.

ARC Cores provides a complete set of high-level development tools customized for embedded applications, and offers both DSP (Digital Signal Processing) and general purpose control functions within the same processor architecture. There is no need to learn two different processor architectures and development tool environments.


Integrated Software Environment
Because of the typical complexity of the software code, ARC offers the Metaware development environment. This professional set of software development tools includes a C/C++ compiler, assembler/linker, and the SeeCode™ source-level debugger. Most importantly, it offers you the ability to debug the embedded software running on the processor in the FPGA. It is critical that the core and its host interface include execution control capabilities like breakpoint checking so you can break the program execution or monitor reads and writes to program variables.

As the software content of a design increases, another important factor is the range of applications supported and the available systems software. For example, if a design requires several hundred Kbytes of code along with standard communications software, such as TCP/IP protocols, you can save several months or more of design time by purchasing a real-time operating system (RTOS) that includes prepackaged protocols. ARC supports a large variety of commercially distributed RTOSs from leading vendors and is constantly increasing their ease of integration.

In addition to the software tools and applications described above, another critical factor in choosing the ARC core is its level of flexibility. Unlike other configurable processors available today, which sometimes require you to manually "hack" the HDL code, the ARC processor core enables you to easily select special options for configuring the processor. Hacking the HDL code after configuring the processor core might break the core, or even make it incompatible with the software development tools.

ARC provides the flexible ARChitect Graphic User Interface (GUI) that can be used to safely create your custom configured processor. This is very helpful when using a soft processor in an FPGA, and it allows you to experiment with different options and configurations within minutes.

Instruction Set Flexibility
The instruction set is one of the most important aspects to consider when choosing a configurable processor. One potential disadvantage of soft processors is that they cannot attain the high clock speeds of a hard processor. For a conventional processor design, the clock speed is essentially the key determinant of performance. The ARC processor changes this equation by offering a configurable instruction set and the ability to add custom instructions. This enables you to accelerate an algorithm by selecting or adding a few appropriate (but powerful) instructions specifically needed for the application that is being executed. Thus, you can get the best of both RISC (Reduced Instruction Set Computer) and CISC (Complex Instruction Set Computer) processor design architectures. This approach provides high performance at lower clock speeds, while still maintaining a software programmable solution.

Instruction extensions are available from ARC and some third parties. Plug-ins can be used and implemented directly in the design. For additional capability, you can also create your own specific instructions. Custom instruction extensions offer you a particularly powerful way to accelerate application performance while retaining programmability. Consider the example of a DES (Data Encryption Standard) encryption application: by adding specialized bit-permutation and cipher instructions and additional registers to hold the keys, it is possible to greatly accelerate a range of encryption algorithms.

To provide a truly configurable instruction set, it is also important that the number of clock cycles for an instruction extension is configurable. For example, the ARC processor enables the addition of multi-cycle instructions to the pipeline where desired, and single-cycle operations to proceed in parallel with long latency ones. This is an advantage over architectures that enforce a strict RISC paradigm where every instruction must execute in a single cycle. Such restrictions may make it impossible to add very powerful, complex instructions that require multiple cycles to execute.

Interaction with Other Logic Functions
The ARC processor can further improve performance by enabling tight integration between the processor core and other logic on the FPGA. Traditional processor cores typically communicate with peripheral hardware via a system bus. To send data to the processor, the peripheral interrupts the processor, which then processes the interrupt using a software routine known as an ISR (Interrupt Service Routine). In addition to supporting this approach, the ARC processor enables you to add new core extension registers. If desired, the new registers can be directly accessed by peripheral logic, enabling such devices to communicate with the processor directly. These alternative approaches can improve performance and reduce gate count by eliminating the need to duplicate a complex system bus and its arbitration logic in an FPGA.

• The ARC IP is deeply embedded with the rest of the logic and interfaces directly with other customer logic functions in the Xilinx FPGA
• "Gate-hungry" complex system buses and associated logic are no longer needed to reach high performance because of the tight integration

"ARC, Third Generation IP"

It is no longer necessary to pass data via a bus or to interrupt the processor to have it load data from a memory-mapped register. Since the special registers are unique to a particular piece of peripheral logic, there is no need for any decoding or arbitration logic. The firmware simply selects the special purpose registers to communicate with the peripheral.

In addition to providing extension registers, configurable processors like the ARC core can also simplify integration with additional logic by providing multiple buses. This approach lets operations such as instruction fetches, load/stores, and communication with peripheral logic reside on separate buses. As a result, the bus protocols of each bus can be relatively simple since there is no need to arbitrate between multiple devices attempting to control one bus. The ARC processor has four buses, consisting of instruction and data buses (Harvard architecture), a bus directly into the processor registers (primarily used for debugging), and an auxiliary bus (typically used to connect peripheral logic). The auxiliary bus has a very simple interface that enables peripherals to be connected with just a few wires. This is well suited to FPGAs where there is no actual bus, allowing peripherals to be efficiently connected in a point-to-point manner.

Tool Configurability
Any processor that offers a high degree of configurability must also offer equally configurable software tools and a debugging environment that work in coordination. It is of no use to add new instructions to the processor if there is no way of telling the compiler and assembler about them so that actual software programs can take advantage of them. In a similar vein, the compiler must let you specify which instructions will be present in the processor, as well as be able to take advantage of features such as multipliers or barrel shifters when they are included. In fact, software tool configurability is one of the greatest challenges in providing a truly configurable processor solution.

ARC and Xilinx are responding to this challenge by offering a complete "plug and play" solution to FPGA designers. In addition, the ARC tools suite allows you to enhance the original configurations offered in a simple manner.

ARChitect creates an HDL description of the CPU (VHDL, Verilog) … and a complete software tool chain to program it! (C, C++, ASM, profiler, linker, simulator, debugger, etc…)

Tools configurability: ARChitect, the ARC Graphic User Interface that makes it all possible

Conclusion
Soft processor cores give you the ability to include processors in standard FPGAs. Configurable cores can help you achieve higher performance at lower clock rates through instruction extension and peripheral logic integration. ARC and Xilinx offer the perfect combination of a configurable core with powerful extensions and third party "plug-ins," in addition to a complete development environment and operating system support, ready to use with Xilinx FPGA technology.
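The DES example in the Instruction Set Flexibility discussion rests on the cost of bit permutations in software. The following Python sketch is only an illustration of that cost and is not ARC code: the function name and the 8-bit reversal table are hypothetical, and real DES permutations use larger tables. It shows that a generic software permutation spends a shift, a mask, and an OR on every output bit, which is the kind of work a dedicated bit-permutation instruction could perform in hardware in a single step.

```python
def permute_bits(value, table, width):
    """Permute the bits of `value` according to `table`.

    table[i] is the 1-based position, counted from the most significant
    bit, of the input bit that becomes output bit i + 1. `width` is the
    number of bits in `value`.
    """
    result = 0
    for src_pos in table:
        # Extract one source bit, then append it to the result.
        bit = (value >> (width - src_pos)) & 1
        result = (result << 1) | bit
    return result


# Hypothetical 8-bit table that reverses the bit order.
REVERSE_8 = [8, 7, 6, 5, 4, 3, 2, 1]

print(format(permute_bits(0b11010010, REVERSE_8, 8), "08b"))  # prints 01001011
```

For a 64-bit block the loop body runs 64 times per permutation, so the software cost grows with block width while a hardwired permutation is simply a fixed routing of wires.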

Perspective Design Verification

Re-thinking Your
Verification Strategies
for Multimillion-gate
FPGAs.
How do you alter your verification
techniques to meet today's high gate
count requirements? It depends on
your background and experience.

by Thomas D. Tessier
President, t2design Incorporated
[email protected]
FPGA verification is essential for successful on-time product delivery, and today's million-gate FPGAs require you to re-think your old verification strategies. Many engineers continue to use simulator-specific approaches for verification; the simulation tools are primarily used for module testing, while the lab is used for system-level integration. This approach requires the engineer to manually stimulate signals and view the resulting waveform responses. Because this process is time consuming, error prone, and difficult to repeat, engineers often spend minimal time in simulation and quickly move to debugging in the lab. Multimillion-gate FPGAs implement functions far too complex to rely on this ad-hoc method.

Designers are choosing million-gate FPGAs because they are fast enough and large enough to handle the design complexity that was previously achievable only with an ASIC. When ASIC engineers begin to use high density FPGAs, they take their verification approaches with them. Those who use a validation process with robust tools and a complete self-checking testbench environment find that continuing to use their familiar testing approaches now causes them to lose valuable design cycle time. ASIC designers can benefit from a carefully defined and executed verification plan that takes FPGA reprogrammability into consideration. Time that was once well spent in exhaustive verification at the RTL level with an ASIC now becomes costly for a high density FPGA.

What is Verification?
Verification is not synonymous with simulation. It is a strategy to make sure all parts of the system conform to the specification document; simulation is a tool used in the verification effort. The basic components of verification are shown in Figure 1.

Specification
A detailed and complete specification is essential for producing working products on schedule. The specification document is the foundation of the verification plan, and describes the features to be implemented, under what conditions they occur, and what their expected outputs should be. This documentation should not determine implementation; that is left to the experience of the RTL designers.

Verification Plan
RTL engineers and verification engineers share the responsibility for implementing the test plan. The level of test granularity (or detail) is outlined for transactions, protocol, interfaces, and timing. Essential functions are identified. A determination of the number of testbenches needed, their complexity, and test module dependencies is made.

Any discrepancies in design implementation versus testbench results should be referred back to the specification for clarification. This is not a new concept, but it is often overlooked in the rush to produce a product. When all elements described within the test plan are checked off, the verification effort has been completed to the required level of confidence. To optimize your verification effort, the following list offers examples of the type of information you need to identify:

• External interfaces
  - Stimulus and response
  - Transaction level, such as Read vs. Write operations
  - Timing requirements
• HDL models available to assist in testbench development
  - Packaged with proposed Intellectual Property (IP)
• Tools available to the project
  - Simulators
  - Static Analysis
  - Lab-based tools
• Performance Requirements, such as: need 32 block data write @ 66 MHz with a latency of less than 300 ns.

Execution
A verification strategy that best suits your design means breaking out those functions that are essential to simulate and those that can be tested during in-system test. The execution of the Verification Plan requires simulation and in-system test on the target PC board, the final stages of the pyramid.

Verification Simulation
Simulation has two components:
• Dynamic simulation covers behavioral HDL, RTL, and gates.
• Static analysis encompasses Static Timing Analysis (STA), Formal Verification, and Signal Integrity Analysis.

In-System Test
During in-system test you have a distinct advantage when using FPGAs over ASICs. An obvious benefit is the ability to reprogram the FPGA until the desired functionality is achieved. You also have an additional advantage with the Xilinx ChipScope Integrated Logic Analyzer, which enables you to observe internal nodes of the chip, on your PC board, while running at system speeds.

Figure 1 - Verification pyramid (base to top: Specification; Verification Plan, with breadth of detail covering transaction, protocol, interfaces, timing, and correct by construction; Execution, comprising simulation and in-system test, where static analysis covers signal integrity, static timing, and formal verification, and dynamic simulation covers behavioral, RTL, and gates)
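The performance requirement quoted in the verification plan list (a latency of less than 300 ns at 66 MHz) is easier to check in simulation once it is expressed as a clock-cycle budget. The Python sketch below is a worked example of that arithmetic only; the function names are hypothetical, and it assumes the clock is exactly 66 MHz as the article states.

```python
def clock_period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def max_cycles_under_latency(freq_mhz, latency_ns):
    """Largest whole number of clock cycles lasting strictly less than
    latency_ns at freq_mhz. Both arguments are integers.

    n cycles last n * 1000 / freq_mhz ns, so the condition
    n * 1000 < latency_ns * freq_mhz is evaluated in integer arithmetic.
    """
    return (latency_ns * freq_mhz - 1) // 1000


period = clock_period_ns(66)                 # about 15.15 ns per cycle
budget = max_cycles_under_latency(66, 300)   # 19 cycles, about 287.9 ns
print(round(period, 2), budget)
```

A testbench can then count clock edges between the start of the write and its completion and flag any transaction that takes more than the computed budget.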


Interaction of Verification and Design Creation
Verification has many interactions with design creation, as shown in Figure 2. To prevent confusion and save time, the design and verification teams must work from the same thorough specification. In addition, the RTL design engineers and verification engineers must share the responsibility for implementing the test plan: testbenches are written to validate the design to the specification, not to verify the design implementation.

Once the executable specification (of the design) and testbench, both written in behavioral HDL, meet the requirements, the design is replaced with RTL code. The RTL is then verified with the system-level testbenches to make sure it meets the written specification conditions. After the RTL is validated, it is synthesized and processed by the Place and Route tools.

The resulting gates are plugged into the system verification testbenches, or into formal verification if it is available. This ensures the tools have correctly implemented the design. In addition, the generated gates are run through static timing analysis. This step verifies that the system-level timing is met.

System integration is typically referred to as "power on". This is the time when project teams come up with creative answers to the question "is it working yet?" Projects are ready for in-system test when they have validated RTL code, have been successfully placed and routed, and can create a bit stream to program the FPGA on the physical PCB. At this point it is expected that module-level partitions have been tested for functionality, and that module interfaces are stable and well defined. The design has been simulated, as a chip, at both RTL and gate levels with the minimum functionality necessary for power on. The simulation of the chip is often not achieved by FPGA design teams still using simulator-specific approaches.

Figure 2 - Interaction of the verification components (flow boxes: Specification, Interpretation, Verification Plan, Behavioral Coding (optional), Test Bench Coding, RTL Coding, Debug, Dynamic Simulation, Synthesis, P&R, Dynamic & Static Simulation, Bit Stream, Integration; each box is marked as design team, verification team, or shared responsibility)

Conclusion
A verification strategy combining simulation, static analysis, and in-system testing is key to success with high density FPGAs. You are bombarded with many different choices for verification of a design; to meet time-to-market pressures you need to leverage multiple approaches.

A detailed application note is available to guide you through the verification decision process, including an in-depth case study. It evaluates the design-specific trade-offs of choosing functions that are essential to simulate and those that can be tested during integration. Prepackaged IP testbenches are also evaluated for applicability in the system testbench. The full application note can be found at: www.hdl-design.com.

About t2design
t2design, Inc. provides HDL design and methodology process management solutions. We specialize in customizing project methodologies that create a cohesive, re-creatable design process from the architecture phase through verification. High density FPGAs, such as the Xilinx Virtex and Virtex-II families, require an "ASIC like" HDL design approach, which means HDL code, simulation, and synthesis. Data flow and process management planning need to accompany the HDL design approach to create a complete project solution. Our team of designers implements this ASIC and FPGA methodology strategy while leveraging EDA tools to their fullest level of effectiveness. Contact us at 303-665-6402 regarding t2design services.
Year 2000 Worldwide Xilinx Event Schedules

Year 2000 European Event Schedule
Dec 5             Embedded Computing/Real Time Show       Tel Aviv, Israel

Year 2001 Worldwide Xilinx Event Schedules

Year 2001 North American Event Schedule
Jan 30-31         Portable Design 2001                    Santa Clara, CA
Feb 2001          FPGA Conference 2001                    Monterey, CA
Feb 13-15         Wireless Symposium 2001                 San Jose, CA
March 5-7         Synopsys User's Group 2001              San Jose, CA
April 9-13        Embedded Systems Conference 2001        San Francisco, CA
April 23-26       NAB 2001                                Las Vegas, NV
April 30-May 2    FCCM 2001                               Rohnert Park, CA
May 8-10          ICASSP 2001                             Salt Lake City, UT
May 15-16         Applied Computing Conference 2001       Santa Clara, CA
June 18-20        38th Design Automation Conference       Las Vegas, NV
June 20-22        WITI Technology Summit 2001             Santa Clara, CA
June 24-27        ASEE Conference and Expo 2001           Albuquerque, NM

Year 2001 South East Asian Event Schedule
March 26-27       IIC 2001                                Shanghai, China
March 29-30       IIC 2001                                Beijing, China
April 2-3         IIC 2001                                Shenzhen, China

Year 2001 Japanese Event Schedule
Feb 1-2           Electronic Design and Solution Fair     Tokyo, Japan

For more information about Xilinx Worldwide Events, please contact one of the following Xilinx team members or see our website at: http://www.xilinx.com/company/events.htm
• North American Shows: Darby Mason-Merchant at: [email protected] or Jennifer Waibel at: [email protected]

Foundation ISE – What's In a Name?

Xilinx Integrated Synthesis Environment stirs Design Automation Conference debate.

by Craig N. Willert
Software Marketing Manager, Xilinx
[email protected]

The new 3.1i Foundation™ ISE software from Xilinx made its debut at this year's Design Automation Conference (DAC), leaving many with the question "What should ISE stand for?" Xilinx thought the name would speak for itself: Foundation ISE is an Integrated Synthesis Environment. But designers viewing the product for the first time at the DAC show excitedly came up with other ideas of what "ISE" should mean.

• "I" is for Ingenious, Intelligent, Internet-Enabled, Incremental, Innovative, Intriguing, Inspiring, Inventive, Imaginative, Insightful, Intuitive, and Interoperable.
• "S" is for Simple, Speedy, Sensible, State-of-the-art, Smart, Savvy, and Sexy.
• "E" is for Engineered, Easy, Efficient, Empowering (EDA partners), Expedient, Easy-to-Use, Extra-Special, Essential, and Eloquent.

What is ISE?
To understand the basis for the differing opinions, it's necessary to look at the current state of the design process.

Integrated design, synthesis, and implementation tools automatically handle all of the file dependency issues that any designer faces, by answering questions like "What tool do I need to run next," and "Have I re-synthesized all of the modified HDL

most optimal design implementation from all of the variables.

To simplify this approach, Xilinx has built in HDL optimization using Xilinx Synthesis Technology and the FPGA Express HDL synthesis tools from Synopsys. This ensures that every engineer using Xilinx Foundation ISE will have access to at least two HDL synthesis tools that are highly compatible and tightly integrated.

Furthermore, a design "environment" is distinguished by its ability to address all of your needs as a designer, not just a few specific design functions. Foundation ISE provides an environment that ensures a comprehensive, integrated design flow for any programmable logic designer looking for an integrated solution that is capable of delivering world-class results with push-button flows.

Conclusion
The Xilinx 3.1i Foundation ISE software is already being heralded as the industry's best programmable logic design tool. By integrating the HDL design flow, synthesis, and optimization, Xilinx Foundation ISE enables you to spend more time on the creative aspects of programmable logic design. This helps you focus your resources and increase your productivity so you can get to market faster and deliver a more robust product to your customers. Xilinx 3.1i development systems deliver superior push-button, interactive, state-of-the-art design methods.

The 3.1i release will begin shipping to all
blocks?” But time and time again, designers registered, in-maintenance customers this • European Shows: Andrea Fionda at: [email protected] or
are synthesizing their designs with two or Spring. To learn more, please visit the Andrew Stock at: [email protected]
• Japanese Shows: Yumi Homura at: [email protected]
more synthesis tools–trying to create the Xilinx website at: www.xilinx.com. • SouthEast Asian Shows: Mary Leung at: [email protected]
New Products Design Automation Tools

Xilinx Foundation Series ISE Software–Delivering the Benefits of HDL Design

Integrated design flows increase your productivity and accelerate your time to market.

by Justine Chen
Product Marketing Manager, Worldwide Software Marketing, Xilinx
[email protected]

Karen Fidelak
Product Marketing Manager, Design Software Division, Xilinx
[email protected]

Teams of software engineers from Synopsys, Synplicity, Model Technology, Visual Software Solutions, and Xilinx, working in close collaboration, have created the ultimate in design automation tools–Xilinx Foundation Series™ ISE (Integrated Synthesis Environment). The Foundation Series ISE software gives you the most advanced design automation tools, in a fully integrated, fast-working environment that increases your productivity and accelerates your time to market.

The Foundation Series ISE software includes:

• Synopsys FPGA Express - HDL synthesis software.
• Synplicity Synplify - HDL synthesis software.
• Model Technology ModelSim - HDL simulator.
• Visual Software Solutions HDL Bencher - Automatic testbench generation tools.
• Visual Software Solutions StateCAD - Automatic state machine generation tools.
• Xilinx XST synthesis technology - For further optimization.
• Xilinx implementation tools - For optimum use of device resources and the fastest place and route times in the industry.

The Keys to Increased Productivity

In the past, most large digital design companies relied on individual point tools, and were less concerned with managing the flow of data between the tools. Solving the problem of connecting point tools came later, and required customized design flows. This need to connect data flows between various point tools led to development of standard information exchange interfaces, such as HDL. But HDLs, including Verilog and VHDL, though useful as industry standards for hardware design, did not deliver a complete solution. For example, various simulation and synthesis tools might interpret and optimize differently, and produce undesirable results.

Today, there’s a new focus. As more and more competing companies address the problem of designing a “system on a chip,” they see more value in integrated tools that work together seamlessly than in individual point tools, because tool integration is the key to increased productivity.

Integrated Design Flow Management

Today, you need fast, reliable flows of design information between tools. And you want to specify common information just once for multiple tools; this includes the location of simulation libraries, macro libraries, and timing information. Though a homegrown, customized process for specification of common information can often be automated, updating a single point tool within a flow usually calls for a complete rewrite of setup information. And using various point tools within a design flow often requires creation of additional design data files. That additional design work and processing decreases your productivity, and slows time to market.

The Foundation Series ISE software automatically communicates common information to each tool and eliminates the need to create data file overhead. Unlike homegrown flow automation, an integrated design tool suite is aware of downstream tool requirements. For example, when you want to perform timing simulation after place and route, an integrated tool suite can instruct its place and route tools to produce the timing simulation netlist, so it can be read by the simulator. Today, winners in the race to market are focusing on design automation tools that are integrated (see Figure 1).

Figure 1 - Foundation Series ISE – well-integrated HDL solution

Integrated Project Management

Given the large number of source files, control files, and implementation files generated by today’s complex, time-pressed design projects, it is not merely desirable, but necessary, to have an automated, integrated software tool that can manage project files. For example, a design project may consist of HDL files, IP cores, netlists, user constraints, or any combination of these. You know it can be difficult to manage the project when one or more of these design modules is modified.

The Foundation Series ISE software will manage all modules in the design for you. For example, it knows about all of the HDL code in your design, and it knows when the code has changed; therefore it will know, and can tell you, when HDL-generated netlists must be updated, and processes re-run. Then it will clearly display all design sources and implementation results, and provide easy access to the appropriate editing tool for every source file.

Many HDL compilers, as well as schematic entry tools, require that you specify a device family library up front, to provide appropriate library symbols and components for a given architecture. Additionally, if your design is retargeted to a new device architecture in the middle of your design project, then you must change the project libraries to match the new architecture. The Foundation Series ISE software makes the changes for you. You’re left with nothing to do but select the device family, once. Your selection sets the appropriate device libraries for design entry and automatically passes device information forward to the place and route tools.

In the course of a design cycle, it’s highly likely a design will be implemented many times. For example, revisions may be made to timing constraints, target device, and place and route options, in pursuit of the best overall design implementation. The Foundation Series ISE software provides revision control by archiving each implementation, along with all design flow control files and design constraint files, for future reference or use. With this information, you can consult or deploy an archived implementation anytime, without recompiling your entire design (see Figure 2).

Figure 2 - Foundation Series ISE project snapshots for effective project management

Integrated Environment for Design Optimization

You usually have some overall design strategy that you are looking to optimize in your design flow. For example, your strategy may place highest priority on fitting the design in the smallest possible device, or on getting the fastest performance. A synthesis tool can be used to optimize the design’s performance based on timing requirements, but for the best results, the place and route tools must then receive the same information to complete the design. This can mean setting requirements twice. However, with the Foundation Series ISE software, you only have to define the settings once, so you can optimize your design strategy faster and more reliably.

The Foundation Series ISE software ensures that the software tools work well together; the tools must communicate with each other to efficiently transfer design data automatically. What’s more, front to back design flow strategies are used, enabling the individual tool’s features to be leveraged to their greatest benefits. In a non-integrated environment these communications tasks and decisions are left to you.

Integrated Environment for Collaboration

To facilitate the efficient flow of design data constraints and strategies, it is far more efficient if teams of software developers work in collaboration. An integrated environment makes possible, and enhances, collaborative work, which is critical during the project development phase. However, collaboration presents a new challenge.

Designers, working with an integrated tool, in an integrated environment, depend on software quality. When your in-house designers collaborate with third party partners, for example, and use different tools, interoperability problems may occur; you can only hope solutions are available from each tool’s vendor.

When you use the Foundation Series ISE software, you are assured of software quality because it has been tested thoroughly for tool interoperability, across the project creation lifecycle.

Conclusion

Foundation Series ISE provides you with a complete HDL design environment. Now you can manage and optimize your design projects, and your engineers can work collaboratively, with confidence in Xilinx quality and technical support.

Learn more about how Xilinx Foundation Series ISE meets your requirements for integrated design automation. See and hear the Xilinx internet presentation, “Xilinx Foundation Series ISE: Delivering the Benefits of HDL Design to Programmable Logic Designers,” by going to www.netseminar.com/tbd/tbd.
New Products Software

StateCAD XE for Optimizing State Machine Design

Now you can implement faster, more compact state machines, with ease.

by Andy Bloom
Director of Engineering, Visual Software Solutions
[email protected]

Ricky Escoto
Director of Marketing, Visual Software Solutions
[email protected]

Control logic is usually implemented as finite state machines (FSMs), which usually require you to work through multiple levels of design and optimization, often within tight development schedules. And, as designs grow larger, the complexity of implementing control logic increases correspondingly, forcing you to migrate from schematics to hardware description languages (HDLs). StateCAD® XE automates the state machine development process, saving you a lot of time and trouble.

Manual FSM Design

Until recently, you had to specify control logic manually; you had to draw state diagrams by hand (or with a graphics package), and then manually translate them to schematics or to an HDL. Timing and logic problems identified during simulation resulted in modifications to the original design, which then needed to be re-verified, step-by-step.

This approach tends to be slow, repetitive, and error-prone. Translation errors invariably creep in and require substantial effort to eliminate.

Hardware Description Languages (HDLs) allow more logic to be specified and maintained with less effort, and they can be synthesized in numerous ways. You can control how synthesis operates, allowing you to create your design in the manner best suited to your target application. The way an HDL is structured dramatically impacts the speed, area, and power consumption of the synthesized device.

When doing finite state machine design, the best results can only be achieved by careful consideration of the resources available, and by having the flexibility to experiment with different alternatives.

Automated FSM Design Using StateCAD XE

A quicker way to implement state machines optimized for Xilinx devices is to use the Xilinx ISE software, which includes StateCAD XE. This tool allows you to draw complex state diagrams, choose design specific optimizations, and generate synthesizable VHDL, Verilog, or Abel-HDL. StateCAD allows you to change optimizations (including state assignment mode, registering output, and signal loading), then reproduce the HDL automatically.

One advantage of automatic state machine translation is the ability to change optimizations and regenerate code in seconds. By trying different code styles, state assignment modes, and optimizations, you can find which combination yields the optimal solution for your design.

State Machine Example

By comparing implementations of a simple state machine, we can see the impact on state machine design. The small state machine in Figure 1 will be implemented with both registered and combinatorial outputs, illustrating the impact of output optimization on implementation:

Figure 1 - Example state machine (diagram labels: RESET, states S0, S1, and S2, output EVEN)
Output Optimization

Outputs can be optimized for speed (registered) or for area (combinatorial decode). Combinatorial decoded outputs become active by decoding state registers (Moore) or by decoding state registers and inputs (Mealy). Registered outputs are calculated prior to the active edge of the clock, and typically improve speed because a level of propagation delay is removed, but usually require more area than combinatorial implementations. Registered outputs are insensitive to input glitches or to multiple state bit changes.

REGISTERED OUTPUTS

    PROCESS (sreg, RESET) BEGIN
      next_EVEN <= '0'; next_sreg<=S0;
      IF ( RESET='1' ) THEN
        next_sreg<=S0; next_EVEN<='1';
      ELSE
        CASE sreg IS
          WHEN S0 => next_sreg<=S1;
          WHEN S1 => next_sreg<=S2; next_EVEN<='1';
          WHEN S2 => next_sreg<=S0; next_EVEN<='1';
        END CASE;
      END IF;
    END PROCESS;

COMBINATORIAL OUTPUTS

    PROCESS (sreg, RESET) BEGIN
      EVEN <= '0'; next_sreg<=S0;
      IF ( RESET='1' ) THEN
        next_sreg<=S0; EVEN<='1';
      ELSE
        CASE sreg IS
          WHEN S0 => next_sreg<=S1; EVEN<='1';
          WHEN S1 => next_sreg<=S2;
          WHEN S2 => next_sreg<=S0; EVEN<='1';
        END CASE;
      END IF;
    END PROCESS;

Table 1 - Comparison of output styles

Design Results

In Table 1 you can see the registered design has outputs that change at the same time as the state bits, and are stable between clocks. The output delay time is the clock to output delay of the register. All decoding necessary for the output occurs before the clock, at the same time as the decoding for the next state. The decode time is effectively “buried” in the state decode time, producing a faster design.

In comparison, the combinatorial design requires time to decode the state bits, yielding a slower implementation. The advantage for the combinatorial design is the smaller area: 5 logic elements compared to 8 for the registered design.

Additional StateCAD Benefits

StateCAD provides additional benefits to Xilinx customers:

• By automating the complete state machine development process, the Xilinx ISE software and StateCAD eliminate manual coding, translation errors, stale documentation, and logic bugs.
• StateCAD includes wizards tailored for designing concurrent state machines and associated logic. State diagrams can include states, transitions, Mealy and Moore outputs, resets, counters, shifters, multiplexers, and much more. No HDL knowledge is required to specify control flow.
• StateCAD exhaustively analyzes state diagrams for inconsistencies, automatically identifying more than 200 problems, such as stuck-at-states, conflicting outputs, and non-deterministic control flow.
• StateCAD includes a built-in simulator called StateBench, for behavioral verification and identification of problems at the state diagram level.
• StateCAD automatically translates state diagrams to synthesizable VHDL and Verilog. Optimizations include one-hot state assignment, registered outputs, and prioritized transitions.
• StateCAD is fully integrated within the Xilinx ISE software, and produces HDL optimized for Xilinx devices, guaranteeing you the best possible results.
• StateCAD can import FSMs created with previous releases of the Xilinx Foundation Series software.

Conclusion

Using StateCAD XE you can quickly implement state machines optimized for Xilinx devices. As design parameters change, just select a new set of optimizations, then regenerate code suited for the new requirements.

StateCAD XE is available at no charge to Xilinx customers, and is included with the Xilinx ISE software or can be downloaded from www.xilinx.com (download StateCAD from the WebPack BackPack section).
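The registered-output listing in Table 1 shows only the decode process. A minimal sketch of how that process might sit inside a complete design follows. It assumes a second, clocked process that loads sreg and EVEN from next_sreg and next_EVEN; the entity name even_fsm, the CLK port, and the state_type declaration are illustrative assumptions rather than StateCAD output.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical entity name and port list. Only sreg, next_sreg, EVEN,
-- next_EVEN, RESET, and the states S0..S2 come from the article.
entity even_fsm is
  port (
    CLK   : in  std_logic;
    RESET : in  std_logic;
    EVEN  : out std_logic
  );
end even_fsm;

architecture registered of even_fsm is
  type state_type is (S0, S1, S2);
  signal sreg, next_sreg : state_type := S0;
  signal next_EVEN       : std_logic  := '1';
begin
  -- Clocked process (assumed): state and output are registered together,
  -- so EVEN changes on the same clock edge as the state bits.
  process (CLK)
  begin
    if rising_edge(CLK) then
      sreg <= next_sreg;
      EVEN <= next_EVEN;
    end if;
  end process;

  -- Next-state and next-output decode, following the registered-output
  -- listing: next_EVEN is '1' whenever the next state is S0 or S2.
  process (sreg, RESET)
  begin
    next_sreg <= S0;
    next_EVEN <= '0';
    if RESET = '1' then
      next_sreg <= S0;
      next_EVEN <= '1';
    else
      case sreg is
        when S0 => next_sreg <= S1;
        when S1 => next_sreg <= S2; next_EVEN <= '1';
        when S2 => next_sreg <= S0; next_EVEN <= '1';
      end case;
    end if;
  end process;
end registered;
```

Because RESET is sampled in the decode process, this sketch resets synchronously; an asynchronous reset would instead be handled in the clocked process.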
HDL Bencher XE for Fast Behavioral FPGA Verification

Now you can develop complete, timing constrained VHDL and Verilog testbenches in minutes.

by Andy Bloom
Director of Engineering, Visual Software Solutions
[email protected]

Ricky Escoto
Director of Marketing, Visual Software Solutions
[email protected]

Validating FPGAs can require substantial effort, unless you have a high order software tool like HDL Bencher XE. The usual process requires you to write many testbenches, simulate them, check the results, and log all failures. To adequately test a design involves verifying all the possible cycle types available to the device, and may require several hundred test cases. And, as your design is revised, port definitions may change, making your existing testbenches obsolete, which results in unnecessary effort to update the HDL source code and the accompanying testbench.

To simplify FPGA and CPLD testing, Xilinx now includes Visual Software Solutions’ HDL Bencher XE in the Foundation ISE and WebPack ISE design tools. No knowledge of HDL or scripting is required.

HDL Bencher Overview

HDL Bencher accepts any HDL design, and then lets you select the unit under test (UUT), specify stimulus and response (using the pattern wizard and the WaveTable™ spreadsheet-based interface), and then export a complete, self-checking testbench automatically. If your HDL source contains external dependencies, HDL Bencher prompts you to compile them locally so that the whole design can be simulated.

HDL Bencher lets you manipulate waveforms in the same way you manipulate a spreadsheet. You can cut, insert, and paste rows (signals) or columns (time regions) with ease, and HDL Bencher automatically readjusts timing.

Interactive Simulation

The testbenches are automatically updated when the HDL source changes, eliminating stale test cases. To facilitate design retargeting, HDL Bencher allows testbenches to be moved between VHDL and Verilog with one simple command. When compilation errors are found during simulation, HDL Bencher links the error reported to the offending line in the HDL source.

Create Self-Checking Testbenches

The testbenches include component instantiations, generic specifications, stimulus, output check procedures, and assertions. You can create “golden models” for regression testing and future design validation; mismatches in expected and actual output values are flagged automatically. All the necessary timing constraints are faithfully represented in the resulting testbench.

Verify Timing

By adding timing constraints, you can generate VHDL or Verilog testbenches for post-synthesis verification. Synthesized netlists differ from behavioral HDL because data types are remapped, I/O modes are changed, unused signals are dropped, and generics are flattened. HDL Bencher automatically re-maps behavioral testbenches to simulate with synthesized netlists.

Demonstration Design

As an example, the following HDL code is used as input into HDL Bencher:

    library IEEE;
    use IEEE.std_logic_1164.all;
    use IEEE.std_logic_unsigned.all;
    entity counter is
      Port (
        CLK,RESET,CE : in std_logic;
        T : out std_logic;
        COUNT : inout integer range 0 to 7 := 0);
    end counter;
    architecture behavioral of counter is
    begin
      process (CLK, RESET, CE, COUNT)
      begin
        if RESET='1' then
          COUNT <= 0;
        elsif CLK='1' and CLK'event then
          if CE='1' then COUNT <= COUNT + 1;
          else COUNT <= COUNT;
          end if;
        end if;
        if COUNT=1 then T<='1'; else T<='0'; end if;
      end process;
    end behavioral;
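The type remapping described under Verify Timing can be pictured with a hand-written wrapper. The sketch below is an illustration only and is not what HDL Bencher generates: counter_netlist is a hypothetical stand-in for a synthesized netlist whose COUNT port has become a 3-bit std_logic_vector output, and counter_wrapper (also hypothetical) converts that port back to the integer type a behavioral testbench expects.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical stand-in for a post-synthesis netlist: COUNT is now a
-- 3-bit vector and its mode has changed from inout to out.
entity counter_netlist is
  port (
    CLK, RESET, CE : in  std_logic;
    T              : out std_logic;
    COUNT          : out std_logic_vector(2 downto 0)
  );
end counter_netlist;

architecture structure of counter_netlist is
  signal q : unsigned(2 downto 0) := (others => '0');
begin
  process (CLK, RESET)
  begin
    if RESET = '1' then
      q <= (others => '0');
    elsif rising_edge(CLK) then
      if CE = '1' then
        q <= q + 1;          -- a 3-bit unsigned wraps from 7 back to 0
      end if;
    end if;
  end process;
  COUNT <= std_logic_vector(q);
  T     <= '1' when q = 1 else '0';
end structure;

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical wrapper that restores the behavioral port type so a
-- testbench written against an integer COUNT can still be connected.
entity counter_wrapper is
  port (
    CLK, RESET, CE : in  std_logic;
    T              : out std_logic;
    COUNT          : out integer range 0 to 7
  );
end counter_wrapper;

architecture remap of counter_wrapper is
  signal count_slv : std_logic_vector(2 downto 0) := (others => '0');
begin
  UUT : entity work.counter_netlist
    port map (CLK => CLK, RESET => RESET, CE => CE,
              T => T, COUNT => count_slv);

  -- Remap the vector back to the integer type used by the behavioral model.
  COUNT <= to_integer(unsigned(count_slv));
end remap;
```

The sketch uses the standard numeric_std package for the conversion; the article's own listing uses std_logic_unsigned, so this is a substitution made for the example.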
Within the Xilinx ISE software, you start by selecting the HDL file from the source window, then you choose HDL Bencher from the process window. The design is automatically imported, and you are given the opportunity to select worst-case global timing parameters:

Figure 1 - Initial timing dialog

Draw Expected Behavior

A waveform is created next (Figure 1), which includes all the signals for the unit under test (UUT). Individual waveforms are then modified directly on the screen by clicking on the signals to show the expected behavior, or by using the built-in pattern generator.

Figure 2 - Stimulus for the design example

Next, HDL Bencher automatically exports a self-checking testbench. The testbench includes all stimulus (Figure 2), output assertions, timing constraints, and check routines needed to verify the operation of the design. The testbench is added to the ISE project, then auto-simulated through the Xilinx ISE software and ModelSim (Figure 3).

Figure 3 - ModelSim running the testbench and design

An advanced version of HDL Bencher is now available which automatically back-annotates the expected response into the waveform. If no expected response was specified, HDL Bencher back annotates the response obtained by ModelSim. Otherwise, expected and actual responses are compared, and discrepancies are highlighted.

Once your design is synthesized, its behavioral testbench may be incompatible with the resulting VHDL netlist generated during the post-route process. In this case, the resulting netlist uses std_logic_vector instead of integers. To make the synthesized netlist simulate, you would switch back to HDL Bencher, re-associate the waveform with the synthesized netlist, and re-export the testbench. Finally you would switch back to the ISE software and re-simulate.

The Resulting Testbench

The exported testbench in this example is 183 lines of code, and took under 1 minute to create and simulate. The following portions of the testbench highlight some of the aspects of automatic testbench generation:

Automatically Commented
    -- VHDL TestBench created by
    -- Visual Software Solution's HDL Bencher 2.00
Libraries Extracted
    LIBRARY IEEE;
    USE IEEE.std_logic_1164.all;
Log File Created
    FILE RESULTS: TEXT IS OUT "results.txt";
Components Instantiated
    COMPONENT counter
    PORT (
    CLK : in std_logic;
    :
    COUNT : inout integer RANGE 0 TO 7
Test Signals Defined
    SIGNAL CLK : std_logic;
    :
    SIGNAL COUNT : integer RANGE 0 TO 7;
Instantiates Unit Under Test
    UUT : counter PORT MAP (
    CLK => CLK,
    :
    COUNT => COUNT
Clock Process Created
    BEGIN
    CLOCK_LOOP : LOOP
    CLK <= transport '0';
    WAIT FOR 10 ns;
    CLK <= transport '1';
Creates Check Procedures
    PROCEDURE CHECK_COUNT(
    NEXT_COUNT : INTEGER
Reports Errors In Expected Values
    IF (COUNT /= NEXT_COUNT) THEN
    write(TX_LOC,string'("Error at time="));
Applies Input Stimulus
    RESET <= transport '1';
    CE <= transport '0';
Validates Timing
    WAIT FOR 100 ns; -- Time=820 ns
Verifies Outputs
    CHECK_COUNT(7,820); -- 7
Reports Success/Failure
    ASSERT (FALSE) REPORT
    "Simulation successful. No problems detected. "

Conclusion

With HDL Bencher you can verify the operation of VHDL and Verilog designs in minutes; no HDL scripting is needed. The resulting testbenches are self-checking, and are compatible with the Xilinx ISE software. HDL Bencher XE is available at no charge to all Xilinx customers, and is included with the ISE software or can be downloaded from www.xilinx.com (download the HDL Bencher “BackPack” from the WebPack section).
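For readers who want to see the pieces of a self-checking testbench in one place, here is a small hand-written sketch in the spirit of the excerpts above. It is not the 183-line file exported by HDL Bencher. The unit under test, counter_sketch, is an assumed stand-in that wraps from 7 to 0, and the names counter_tb, check_count, and done are illustrative.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Assumed stand-in for the unit under test (not the article's listing):
-- a 0-to-7 counter that wraps instead of overflowing.
entity counter_sketch is
  port (
    CLK, RESET, CE : in  std_logic;
    COUNT          : out integer range 0 to 7
  );
end counter_sketch;

architecture behavioral of counter_sketch is
  signal cnt : integer range 0 to 7 := 0;
begin
  process (CLK, RESET)
  begin
    if RESET = '1' then
      cnt <= 0;
    elsif rising_edge(CLK) then
      if CE = '1' then
        cnt <= (cnt + 1) mod 8;
      end if;
    end if;
  end process;
  COUNT <= cnt;
end behavioral;

library IEEE;
use IEEE.std_logic_1164.all;

entity counter_tb is
end counter_tb;

architecture sketch of counter_tb is
  signal CLK   : std_logic := '0';
  signal RESET : std_logic := '1';
  signal CE    : std_logic := '0';
  signal COUNT : integer range 0 to 7;
  signal done  : boolean := false;
begin
  UUT : entity work.counter_sketch
    port map (CLK => CLK, RESET => RESET, CE => CE, COUNT => COUNT);

  -- 20 ns clock that stops once the stimulus process has finished.
  clock : process
  begin
    while not done loop
      CLK <= '0'; wait for 10 ns;
      CLK <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process;

  stimulus : process
    -- Compare the actual output with the expected value; report mismatches.
    procedure check_count (expected : in integer) is
    begin
      assert COUNT = expected
        report "COUNT mismatch: expected " & integer'image(expected) &
               ", got " & integer'image(COUNT)
        severity error;
    end procedure;
  begin
    wait for 25 ns;            -- hold RESET across the first rising edge
    check_count(0);
    RESET <= '0';
    CE    <= '1';
    for i in 1 to 9 loop       -- nine enabled clocks, including the wrap to 0
      wait until rising_edge(CLK);
      wait for 1 ns;           -- let the registered output settle
      check_count(i mod 8);
    end loop;
    report "Simulation finished." severity note;
    done <= true;
    wait;
  end process;
end sketch;
```

The check procedure plays the same role as CHECK_COUNT in the exported testbench: expected values are stated next to the stimulus, and any disagreement with the simulated output is reported as an error.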
New Products Software

Guided
Design
Using
BLIS
With Block Level Incremental
Synthesis (BLIS), your design
implementation times will
improve dramatically.

by Karen Fidelak
Technical Marketing Engineer, Xilinx Xilinx High-Level Floorplanning, BLIS Block Level Incremental Synthesis
[email protected] provides the most robust incremental
Incremental design changes (due to design capability ever offered. As you make design changes, BLIS recog-
ECOs, specification changes, and nizes “blocks” of the design which have
BLIS, a part of the Synopsys FPGA
repeated design iterations) can cause sig- been changed at the source, and intelligent-
Express/FPGA Compiler II v3.4 software
nificant delays if you have to synthesize ly synthesizes only those portions of the
(FE/FCII), is now available in the Xilinx
and place and route your entire design design. In this flow, a block is defined as a
ISE 3.2i development tools.
after each change. Ideally, your synthesis module/entity and any
and place-and-route software tools hierarchy tree beneath it.
should recognize where changes have To enable BLIS, you
been made in your overall design and choose blocks in your
recompile just those portions that have design that you want to
changed. That’s what you get with BLIS, denote as “Block Roots”
a unique synthesis and place-and-route through the FE/FCII
capability, developed by Synopsys for Constraint Editor GUI or
Xilinx, that provides a guided synthesis scripting language, as
methodology. Used in conjunction with Figure 1 - Constraint Editor, specifying Block Roots shown in Figure 1.

28
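To make the definition of a block concrete, the sketch below shows a small, hypothetical VHDL hierarchy. All entity names (and_stage, filter_block, top) are invented for illustration. In this hierarchy, filter_block together with the and_stage instance beneath it would form one block that could be chosen as a Block Root; the choice itself is made in the FE/FCII Constraint Editor or scripting language, not in the HDL.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Leaf entity: part of the hierarchy tree beneath filter_block.
entity and_stage is
  port (A, B : in std_logic; Y : out std_logic);
end and_stage;

architecture rtl of and_stage is
begin
  Y <= A and B;
end rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- Candidate block: this entity plus and_stage below it form one "block".
entity filter_block is
  port (CLK, D0, D1 : in std_logic; Q : out std_logic);
end filter_block;

architecture rtl of filter_block is
  signal y_int : std_logic;
begin
  U_AND : entity work.and_stage
    port map (A => D0, B => D1, Y => y_int);

  process (CLK)
  begin
    if rising_edge(CLK) then
      Q <= y_int;
    end if;
  end process;
end rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- Top level: an edit made only here leaves the source of filter_block
-- and and_stage unchanged.
entity top is
  port (CLK, D0, D1, EN : in std_logic; Q : out std_logic);
end top;

architecture rtl of top is
  signal q_int : std_logic;
begin
  U_FILTER : entity work.filter_block
    port map (CLK => CLK, D0 => D0, D1 => D1, Q => q_int);

  Q <= q_int and EN;
end rtl;
```

If filter_block were denoted a Block Root, a later change confined to top would, per the description above, leave filter_block and its subtree out of the incremental synthesis run.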
A Block Root is a block which is intelligently updated by FE/FCII in incremental synthesis runs, and has the following characteristics:

• A separate netlist is created by FE/FCII for each Block Root.
• Only those Block Roots whose corresponding source has been modified are re-synthesized.
• The Block Root has hard boundaries around it–no optimization occurs with neighboring modules.

The Advantages of BLIS

There are two main advantages to using this type of incremental flow.

• Runtime for both synthesis and place-and-route will be improved because only the modified portion of your design will be re-synthesized and re-netlisted. The remainder of the design will remain unchanged and the netlists for the unchanged portions of the design will not be rewritten. Because the netlists of the unchanged portions of the design remain untouched, you are assured that all net and instance names in that part of your design are identical to earlier runs.
• Timing predictability will be improved because the “Guide” function of the place-and-route tools, which relies on matched component names from run to run, will have a higher success rate.

Benchmarks

We compared the results of incremental design flows using BLIS against the more traditional methodology of re-synthesizing and re-routing the entire design. With the BLIS flow, incremental changes are made to a small number of design blocks (Block Roots). With the traditional flow incremental changes are made to the same design blocks, however they are not specified as Block Roots.

After our example design synthesis was completed, the design was placed and routed using the Guide feature of the Xilinx implementation tools, which allow you to specify an existing placed-and-routed design to be used as a “Guide” when implementing a design. The existing placed-and-routed design was used as a template when re-implementing the design. Any portions of the design which existed in both the “Guide” design and the new modified design (determined by matching net and component names) were placed in the same location in the new implementation as they were in the “Guide” design. New or changed logic was implemented around existing, “Guided” logic.

Runtime Improvements

Runtime improvements of up to 50% (with an average of 47%) were observed when using BLIS with Xilinx Guided Place-and-Route in an incremental design flow; Figure 2 shows averaged design results. Because FE/FCII does not re-elaborate or re-optimize unchanged blocks of the design, synthesis runtime was reduced. Implementation runtime was improved due to increased design component matching during guided placement and increased signal matching during routing. Additionally, the synthesis tool does not rewrite the EDIF netlists for the unchanged blocks, further reducing runtime, because no file re-translation is needed.

Figure 2 - BLIS runtime reductions (% runtime reduction: 17% without BLIS, 47% with BLIS)

Guide Improvements

When a design is placed-and-routed using the Guide feature, the success of the Guide can be determined by the “Design Components Matched” statistics available in the Place-and-Route report. The higher the percentage of matched components, the closer the incremental design is to the original results, leading to better predictability of timing and placement results.

When using the BLIS incremental design flow, Guide success rates reached levels of at least 95%, and averaged 97%. When BLIS was not used to guide the design, component and route matching was as low as 52%, as shown in Figure 3.

Figure 3 - BLIS design efficiency (% unchanged design preserved: 66% without BLIS, 97% with BLIS)

The improvements when using BLIS can be attributed to the increase in net and component name matches between the original placed-and-routed design and the incrementally modified version of the design. Because unchanged blocks of the design are not re-synthesized, the netlists are untouched and thus remain identical to the original version. (Even if there are no logic changes in the source, re-synthesizing a block can lead to net and component names being changed in the final netlist.)

Conclusion

When utilizing FE/FCII Block Level Incremental Synthesis in a Xilinx guided design, runtimes as well as timing and placement consistency exhibit significant improvements over a more traditional design flow. These enhancements help you achieve a higher level of productivity by allowing you to synthesize and implement incremental design changes, with a significantly reduced runtime, while preserving the unchanged portions of your design. This new design flexibility allows you to realize the productivity necessary to complete large or small FPGA designs faster.
New Products PROMs

New High-Density Virtex PROMs and Cost-Effective Spartan-II PROMs

Xilinx announces the addition of the XC17V00 and XC17S00A families to its existing line of one-time programmable (OTP) PROMs.

by Theresa Vu
Product Marketing Engineer, Xilinx Inc.
[email protected]

All Xilinx PROM families are designed specifically for use with Xilinx FPGAs, therefore we offer a complete, pre-engineered, drop-in configuration solution that works perfectly the first time; and you are spared the time-consuming task of designing your own. We recently introduced two new families, one for our Virtex FPGAs and one for our Spartan FPGAs


Virtex Configuration PROMs

Our low-cost XC17V00 PROMs support Virtex and Virtex-E FPGAs, up to 3.2 million system gates, and are offered in 1-Mb to 16-Mb densities. The available packages are shown in Table 1.

Table 1 - Virtex PROM packages
Device     Density   8-pin VOIC   20-pin SOIC   20-pin PLCC   44-pin PLCC   44-pin VQFP
XC17V01    1.6Mb     ✔ ✔ ✔
XC17V02    2Mb       ✔ ✔ ✔
XC17V04    4Mb       ✔ ✔ ✔
XC17V08    8Mb       ✔ ✔
XC17V16    16Mb      ✔ ✔

The 16-Mb 17V16 PROM, a four-fold increase in maximum bit density, extends the Xilinx leadership in configuration memories and provides a one-chip configuration solution for our entire line of Virtex FPGAs.

Key Features

The XC17V00 serial/parallel PROM family is based on our proven, OTP architecture that provides a stable, low-cost, highly-reliable one-chip configuration solution with the following features:

• 1-Mb to 16-Mb densities.
• Simple, fast, serial FPGA interface that requires only one user I/O pin.
• Parallel configuration up to 264 Mbps (17V16 and 17V08 only).
• Available in SOIC, VOIC, VQFP, and PLCC packages.
• Low-power CMOS floating gate process.
• Programming support by leading programmer manufacturers.
• Cascadable for storing longer or multiple bitstreams.

The Most Cost-effective Solution

The new XC17V00 family also offers significant savings in board space, design time, and cost. Using one 17V16 to configure the new 3.2 million system-gate Virtex XCV3200E FPGA requires less than one fourth the board space of any previous Xilinx configuration PROM solution. To get the equivalent functionality from our nearest competitor would require 14 chips and more than 2x the board space, as illustrated in Figure 1.

Figure 1 - The Xilinx solution beats the competition

Xilinx process expertise has also allowed us to use smaller packages, further reducing the need for board space.

Configuration of Multiple FPGAs

The XC17V16 can also be used to configure multiple, daisy-chained FPGAs. This allows you to store configuration data for up to eight FPGAs in a single PROM, as illustrated in Table 2.

Table 2 - Number of Virtex-E FPGAs configurable by one 16Mb PROM
(devices listed: XCV300E, XCV400E, XCV405E, XCV600E, XCV812E, XCV1000E, XCV1600E, XCV2000E, XCV2600E, XCV3200E)

Spartan-II Configuration PROMs

Our XC17S00A PROM Family provides a high-performance, low-cost configuration solution, optimized for use with Spartan-II FPGAs. This family offers a dedicated PROM for each gate density in the Spartan-II family for ease-of-selection and guaranteed compatibility, as shown in Table 3. This family also offers extended availability of the smallest package offered by Xilinx, the 8-pin VOIC.

Table 3 - Spartan-II PROM packages and device compatibility
FPGA       PROM Solution   8-pin PDIP   8-pin VOIC   20-pin SOIC   44-pin VQFP
XC2S15     XC17S15A        ✔ ✔ ✔
XC2S30     XC17S30A        ✔ ✔ ✔
XC2S50     XC17S50A        ✔ ✔ ✔
XC2S100    XC17S100A       ✔ ✔ ✔
XC2S150    XC17S150A       ✔ ✔ ✔
XC2S200    XC17S200A       ✔ ✔ ✔

Key Features

• Simple, fast, serial Spartan FPGA interface that requires only one user I/O pin.
• Available in DIP, VOIC, SOIC, and VQFP packages.
• Advanced, low-cost CMOS process.
• Programming support by leading programmer manufacturers.

Conclusion

With the new XC17V00 and XC17S00A PROMs, there is no easier, faster, or less expensive way to configure Xilinx FPGAs. For more information see: www.xilinx.com.

Applications FPGAs

Create Efficient FIR Filters Using Virtex and Spartan FPGAs

The Virtex and Spartan-II LUTs, configured as shift registers combined with Xilinx True Dual-Port™ RAM, give you a very compact, flexible, and area-efficient FIR filter design platform.

by Rotem Gazit
Design Engineer, MystiCom LTD.
[email protected]

A Finite Impulse Response (FIR) filter works by multiplying a vector of the most recent N data samples by a vector of coefficients and summing the elements of the resulting vector. In every cycle the filter receives a new sample of data and shifts out the oldest sample. FIR filters are very common in FPGA-based Digital Signal Processing applications.

The design concept described here is suitable for systems with relatively low input rates (0.5 to 8 MHz), which require a FIR filter implementation with hundreds of taps; this is common in modem and demodulation applications.

FIR Filter Design Concepts

By examining the FIR block diagram in Figure 1, you can see that if the filter is implemented in a straightforward manner, a multiplier will be required for every filter tap (N multipliers for an N-tap filter). In addition, an adder with N inputs will be needed to sum all multiplier outputs. However, if the data input rate is slower than the performance capability of the FPGA, the filter can be implemented much more efficiently.

Figure 1 - FIR filter block diagram

Serial FIR Filters

Assuming that the performance capability of the FPGA is M times faster than the data input rate, we will examine the case where M ≥ N (where N is the required number of filter taps).

A serial N-tap filter uses only one multiplier, a 2-input adder, and storage for the partial results and the filter input samples. The input sample storage holds the last N input samples. For every new sample entering the filter, N multiply operations will be performed, each multiplying the filter coefficient by the respective input sample.

The result of each multiply operation is added to the partial result storage to produce a new partial result. This newly calculated partial result is then saved in the partial result storage by replacing the previous partial result. After N such multiply and add operations, the partial result storage content is driven out of the filter. The partial result storage content is then cleared to begin processing a new data sample. A block diagram of the serial FIR filter structure is shown in Figure 2.

Figure 2 - Serial FIR filter structure

The hardware responsible for the combination of multiplying, adding, and storing is called a MAC (Multiply Accumulate) unit. Due to the serial nature of the filter, the MAC will operate on M taps of the filter. In the case where N is greater than M, several serial filters can be chained together. The oldest data sample leaving the first filter in the chain is used as the new data sample in the next filter, and so on. The results of all the chain filters must be added together.
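The serial structure described above can be checked with a small behavioral model. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, the sample storage is assumed to start at zero, and no bit-width truncation or pipelining is modeled. It shows that one multiplier reused N times per sample gives the same result as the parallel form, and that chaining M-tap stages (the N > M case) gives the same result again.

```python
def direct_fir(coeffs, window):
    """Parallel form: one multiplier per tap and an N-input adder."""
    return sum(c * x for c, x in zip(coeffs, window))


def serial_fir(coeffs, samples):
    """Serial form: one multiplier and one 2-input adder, reused N times per sample."""
    n_taps = len(coeffs)
    store = [0] * n_taps                       # input sample storage, newest first
    outputs = []
    for new_sample in samples:
        store = [new_sample] + store[:-1]      # shift in the new sample, drop the oldest
        partial = 0                            # partial result storage, cleared per sample
        for tap in range(n_taps):              # N multiply-and-add operations
            partial = partial + coeffs[tap] * store[tap]
        outputs.append(partial)                # driven out after N operations
    return outputs


def chained_serial_fir(coeffs, samples, m):
    """N > M case: chain serial filters of at most m taps each and add their results."""
    stages = [coeffs[i:i + m] for i in range(0, len(coeffs), m)]
    stores = [[0] * len(stage) for stage in stages]
    outputs = []
    for new_sample in samples:
        carry = new_sample
        total = 0
        for stage, store in zip(stages, stores):
            oldest = store[-1]                 # sample leaving this stage
            store[:] = [carry] + store[:-1]
            total += direct_fir(stage, store)  # this stage's MAC result
            carry = oldest                     # new sample for the next stage in the chain
        outputs.append(total)
    return outputs
```

For instance, `serial_fir([3, 1, 4], [1, 0, 0, 0])` reproduces the coefficients as the impulse response, and `chained_serial_fir(coeffs, samples, m)` matches `serial_fir(coeffs, samples)` for any stage size `m`.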


Implementing a Serial FIR Filter

You can implement Serial FIR filters very efficiently in Virtex and Spartan-II devices. The design can be divided into three separate units: the coefficients bank, the MAC unit, and the input sample storage.

Coefficients Bank

The Virtex block RAM can be used to hold the filter coefficients. No multiplexer is needed; all you need is a simple cyclic counter used as an address generator. In systems where a host DSP or an adaptation mechanism is present, the block RAM can be configured as a dual port RAM, enabling the coefficients to be dynamically changed during the normal filter operation.

MAC Unit

The MAC unit consists of an adder, a multiplier, and result storage. Careful design of the adder and multiplier is very important for area efficiency.

Theoretically, the result of a 2^x-tap filter with y-bit input data and z-bit coefficients can be up to x+y+z bits wide (for example, 64 taps of 5-bit data and 16-bit coefficients can produce up to 6+5+16 = 27 bits). In real world applications however, the number of bits in the result is usually much smaller because the least significant bits of the result are usually ignored in the final result, after processing. It is very important to throw away those unnecessary bits as early as possible in the data processing (in the MAC multiplier and adder).

Throwing away bits in the MAC can sometimes lead to different results than you get from throwing away the bits from the final result; a thorough discussion of the effect of such an operation on the filter performance is beyond the scope of this article.

An example MAC implementation is shown in Figure 3.

///////////////////////////////////////////////////////////
// Name: mac
//-------------------------------------------
// Target device:
//-------------------------------------------
// Module description:
//-------------------------------------------
// MAC of 16 bit coefficient by 5 bit input data_sample.
// the result is 22 bits wide
//
// Parent:
//-------------------------------------------
// filter_top
//
// Children:
//-------------------------------------------
// mac_adder.v, mac_multiplier.
///////////////////////////////////////////////////////////

module mac (coefficient, data_sample, rst, clk, enable, new_data, out);

input [15:0] coefficient;   // filter coefficient coming from coefficient storage
input [4:0]  data_sample;   // filter data samples coming from samples storage
input clk, enable, rst;
input new_data;             // indicates a new data sample. new_data goes high for one cycle
                            // every 64 clocks, 3 clocks after the new data arrives,
                            // because of the MAC pipeline.

output [21:0] out;          // MAC output.
reg    [21:0] out;          // MAC output changes whenever a new data is being processed.

wire [16:0] mul_out;        // mac_multiplier output.
wire [21:0] add_out;        // mac_adder output.

reg [21:0] add_out_d;       // sampled mac_adder output.
reg [16:0] mul_out_d;       // sampled mac_multiplier output.

mac_multiplier mac_multiplier(.coefficient(coefficient), .data_sample(data_sample), .mul_out(mul_out));

always @(posedge clk or negedge rst)   // sample the multiplier output
begin                                   // to improve timing
  if (!rst)
    mul_out_d <= #2 17'b0;
  else
    mul_out_d <= #2 mul_out;
end

mac_adder mac_adder(.adder_out(add_out), .adder_in_0(mul_out_d), .adder_in_1(add_out_d));

always @(posedge clk or negedge rst)   // sample the adder output
begin                                   // this is the "result storage"
  if (!rst)
    add_out_d <= #2 22'b0;
  else
    if (new_data)                       // clear accumulator for new data processing
      add_out_d <= #2 22'b0;
    else
      add_out_d <= #2 add_out;
end

always @(posedge clk or negedge rst)   // MAC output changes only when a new data arrives
begin
  if (!rst)
    out <= #2 22'b0;
  else if (enable & new_data)
    out <= #2 add_out;
end

endmodule

Figure 3 - An example MAC implementation

Input Samples Storage Unit

The input data storage unit can be implemented very efficiently in Virtex devices using the LUTs as shift registers. Each MAC, operating on M taps of the filter, requires an input data storage of M-1 stage delay line. During the first M-1 cycles, the delay line output is driven both to the MAC and back to the delay line input.

In the Mth cycle, the delay line output is driven only to the MAC, and the new input data sample enters the delay line. If several filters are chained together, then the


delay line output needs to be held for M cycles before it is driven as an input to the next filter in the chain. Sometimes (depending on the available resources inside the device) it is better to implement the delay line using block RAM configured as a simple FIFO.

An example of a LUT SRL16-based delay line implementation is shown in Figure 4. A diagram of the complete serial FIR filter is shown in Figure 5.

///////////////////////////////////////////////////////////
// Name: delay_line
//-------------------------------------------
// Target device:
//-------------------------------------------
// Module description:
//-------------------------------------------
// delay line of 63 delays x 5 bit.
// the oldest sample is delayed for 64 clock cycles before being driven to the next
// delay line in the chain
//
// Parent:
//-------------------------------------------
// filter_top
//
// Children:
//-------------------------------------------
// shift5x63.v shift63.v
///////////////////////////////////////////////////////////

module delay_line (new_data_sample, clk, rst, enable, new_data, mac_data, next_mac_data);

input [4:0] new_data_sample;   // new_data sample
input clk, enable, rst;
input new_data;                // new_data is active every 64 cycles for one cycle ->
                               // SR mux control (input from its output OR new_data_sample)
output [4:0] mac_data;         // data for MAC
output [4:0] next_mac_data;    // data for next MAC in chain
reg    [4:0] next_mac_data;    // Hold next_mac_data back for one MAC cycle
                               // (64 clock cycles)
wire   [4:0] mac_data;

shift5x63 shift5x63(.din(new_data ? new_data_sample : mac_data),
                    .clk(clk), .enable(enable),
                    .dout(mac_data)
                   );

// Hold next_mac_data back for one MAC cycle (64 clock cycles)
always @(posedge clk or negedge rst)
begin
  if (!rst)
    next_mac_data <= 5'b0;
  else if (new_data & enable)
    next_mac_data <= mac_data;
end

endmodule

module shift5x63 (din, clk, enable, dout);
input  [4:0] din;
input  clk, enable;
output [4:0] dout;

shift63 bit0(.din(din[0]), .clk(clk), .enable(enable), .dout(dout[0]));
shift63 bit1(.din(din[1]), .clk(clk), .enable(enable), .dout(dout[1]));
shift63 bit2(.din(din[2]), .clk(clk), .enable(enable), .dout(dout[2]));
shift63 bit3(.din(din[3]), .clk(clk), .enable(enable), .dout(dout[3]));
shift63 bit4(.din(din[4]), .clk(clk), .enable(enable), .dout(dout[4]));

endmodule

module shift63 (din, clk, enable, dout);
input  din, clk, enable;
output dout;                   // Synplify automatically infers
                               // SRL16 for shift register with no reset
reg [62:0] shifter;

always @(posedge clk)
begin
  if (enable)
  begin
    shifter[62:0] <= {shifter[61:0], din};
  end
end

assign dout = shifter[62];

endmodule

Figure 4 - An example of a LUT SRL16-based delay line implementation

Figure 5 - Serial FIR filter implementation

Conclusion

FIR filters with many hundreds of taps can be implemented easily even in the smallest members of the Virtex and Spartan-II FPGA families. By taking advantage of the Virtex and Spartan-II architecture, you can implement FIR filters very efficiently.

About MystiCom

Founded in 1997, MystiCom is dedicated to providing DSP and mixed-signal VLSI cores for high-speed communications. The company’s first product line implements the physical layer (PHY) for Local Area Networks (LANs) using Fast Ethernet and Gigabit Ethernet protocols. MystiCom is headquartered in Netanya, Israel, and has marketing and customer support offices in Mountain View, Calif. Additional information can be found at www.mysticom.com.
Success Story IP Telephony

LogiCORE PCI Module Is a Key Element in Voice over IP Applications

Silicon & Software Systems provides an elegant solution to Nortel Networks, using a Xilinx LogiCORE PCI module implemented in a Spartan-XL FPGA.

by Dara Hurley
Director, Hardware Systems Division
Silicon & Software Systems
[email protected]

Voice over Internet Protocol (VoIP) offers companies and consumers enormous potential cost savings compared to traditional switched telephone networks. This emerging technology is enabling low-cost international telephony and remote teleworking (also known as telecommuting).

Nortel Networks, a pioneer in VoIP, has employed a LogiCORE™ PCI module in a Xilinx Spartan™-XL FPGA to improve service. The system architecture is shown in Figure 1. The system is based on the popular PC architecture. The network adapter receives data (containing compressed speech) from the Internet and passes the data to the DSP compression/decompression engine via the PCI bus. The digital speech is then routed back through a time-switched FPGA to the telecommunications network.

Figure 1 - Architecture of Voice over IP network

In the other direction, digital speech is routed from the telecommunications network via two Xilinx FPGAs to the DSP engine. An x86 CPU controls the system. The DSP engine uses a system memory, which is connected to the CPU local bus. The CPU provides IP packet processing, and data is transferred from the system memory to the network adapter using DMA on the PCI bus.

“Critical to the system performance is the PCI implementation,” said Eugene Garvin, development manager of Nortel Networks. “The DSP bus operates at a much slower speed than the PCI bus, so the realization of the bus must be optimized.”

Initially, Nortel used a simple approach: Data transfer was routed through the CPU and wait states were inserted in the PCI transactions to compensate for the different data rates. This approach, however, consumed too much of the PCI throughput, Garvin stated.

This is where Silicon & Software Systems (S3) stepped in and provided a more elegant solution. Silicon & Software Systems designed a DMA controller and FIFO data buffer, and integrated these along with a Xilinx LogiCORE PCI module into a Spartan-XL FPGA device (XCS40XL). The device also contained an interface between the time-division multiplexed (TDM) speech data from the telecommunications network and the data presented to the DSP compression/decompression engine. The designers used Spartan SelectRAM™ memory to create the dual-ported RAM-based FIFO buffer.

“Use of DMA and data buffering over the PCI bus has freed up the processor to do other necessary tasks,” Garvin reported. “It also makes better use of the available PCI throughput, because it uses zero-wait-state burst transfers instead of one-word-delayed transactions.”

About Silicon & Software Systems

Based in Dublin, Ireland, Silicon & Software Systems (S3) is a world-class electronic design service company. Since its inception in 1986, Silicon & Software Systems has experienced rapid growth, penetrating markets throughout Europe, the US, the Middle East, and Eastern Europe. In the 1999 Data Quest survey, Silicon & Software Systems emerged as one of Europe’s leading electronic design facilities. The company employs over 300 design engineers specializing in ICs, software, and hardware systems for the communication infrastructure, digital consumer, and wireless markets. To learn more about Silicon & Software Systems, go to: www.s3group.com.

Applications FPGAs

Creating Finite State Machines Using True Dual-Port Fully Synchronous SelectRAM Blocks

Create very dense, high-performance, highly efficient designs that require no logic resources.

by Edgard Garcia
Senior Engineer, Multi Video Designs
[email protected]

The latest Virtex, Virtex-E, and Spartan-II FPGA families offer a broad range of unique features, including block RAM, that give you dramatic speed and density improvements. The dedicated RAM blocks allow you to build fast and dense bidirectional data buffers and FIFOs, with built-in data width conversion. This RAM can also be used to implement very fast and efficient sequencers and Finite State Machines (FSMs), which frees your logic gates for other tasks.

A well known approach to building sequencers consists of a ROM-based design with output registers. The same method can apply to the Virtex 4K-bit block SelectRAM™ which incorporates output registers. You can use a single 4K-bit RAM block as a 512 x 8 clocked ROM, to implement a very fast FSM working at more than 150 MHz, and it uses no CLBs. You can implement the following, for example:

• 16 states + 4 additional outputs and 5 inputs + Enable and Synchronous Reset.
• 32 states + 3 additional outputs and 4 inputs + Enable and Synchronous Reset.

Design Example

The following example shows how to implement FSMs or sequencers with a single 4K-bit block SelectRAM. The same method can easily be expanded to more complex designs that could require two or more blocks.

Synchronous FSMs and sequencers have some important characteristics in common:

• They are clocked by a single clock.
• A feedback path allows you to (partially) define what the next step will be.
• They may need a clock enable to suspend the operations.
• They must have a reset to go back to a predefined state.

Figure 1 shows a typical FSM or sequencer logic diagram.


Figure 1 - Synchronous FSM simplified diagram

Simulation

Converting the FSM behavior to a truth table can be a very tedious and time consuming task; debugging and modifying the design can turn into a nightmare.

The method used here takes advantage of the modern design entry, simulation, synthesis, and implementation tools, combining their complementary respective power. The goal is to use a VHDL simulator to automatically generate the constraint file for initialization of the block SelectRAM used as ROM.

One of the problems encountered when simulating a design with VHDL, is that an output can’t be forced to any logic level. An easy way to avoid this inconvenience consists of breaking the feedback loop, which allows you to enter patterns into the inputs, including current and illegal states. Figure 2 illustrates one way of designing the FSM VHDL code so it can be simulated more easily. A top-level file can provide the feedback for a classical simulation of the FSM.

Figure 2 - Modified FSM simplified block diagram

Optimization

If the number of inputs (including binary encoding states feedback) is nine or less, and the number of outputs (including binary encoding states) is eight or less, a single 4K-bit block SelectRAM can be applied by using structural VHDL with the scheme shown in Figure 3.

Figure 3 - Using fully synchronous RAM blocks for FSM implementation (RAMB4_S8, 512 x 8 primitive: ADDR[8:5] is driven by the binary encoded state from DO[7:4]; ADDR[4:0] is driven by the inputs; DO[3:0] drives the outputs; WE and DI[7:0] are tied to GND; CK, EN, and RST are driven by Clock, Enable, and Reset)

RAM Initialization

By initializing the contents of the memory with the appropriate values, the behavior of any synchronous FSM can be reproduced. The initialization of the RAM block is done by an NCF (constraint) file that will be used by synthesis and Xilinx implementation tools. A very easy way to initialize the memory with the correct values is to make an automatic generation of the NCF file, by using a VHDL simulator and another testbench.

Consider the behavioral VHDL code of the FSM without feedback. A simple 9-bit pseudo counter (generated by a testbench) can provide all the 512 possible states of the inputs, including illegal states. Each associated result (state output and FSM outputs) can


thus be converted to a text file obeying the NCF file format. See Figure 4.

Figure 4 - Testbench for automatic NCF file generation

Resources Required

Table 1 summarizes some examples of typical FSMs or sequencers in terms of performance and logic resources. All these designs can be implemented with a single RAM block, but the same method can easily be expanded to more complex functions, providing similar improvement. As you can see, a RAM-based FSM is much faster and uses no logic resources.

Table 1 - Single FSM implementation comparisons
Implementation mode                  Number of FFs   Number of Slices   RAM Blocks   Speed*
One Hot Encoding                     20/35           30...60            –            80...110 MHz
Binary Encoding                      8/8             25...50            –            70...100 MHz
SelectRAM block (binary encoding)    0               0                  1            150 MHz
* Results for a single 16 (or 32) state synchronous FSM with Enable, synchronous Reset, 5 (4) inputs and 4 (3) outputs.

True Dual-Port RAM Advantages

Block SelectRAM provides true dual port capability; a single location can be read at the same time by the two ports. Therefore, a single block can be used to implement two identical synchronous FSMs, with separate inputs, synchronous reset, and clock enable; you can also implement separate clocks, if needed. Figure 5 shows the architecture of a dual FSM using a single dual-port block SelectRAM. Table 2 shows a dual FSM implementation comparison.

Figure 5 - Dual FSM implementation (RAMB4_S8_S8, 512 x 8 DPR primitive: on port A, ADDRA[8:5] is driven by the binary encoded states of FSM_A from DOA[7:4], ADDRA[4:0] by Inputs_A, DOA[3:0] drives Outputs_A, WEA and DIA[7:0] are tied to GND, and CKA, ENA, and RSTA are driven by Clock_A, Enable_A, and Reset_A; port B is connected the same way for FSM_B)

Table 2 - Dual FSM implementation comparisons
Implementation mode                  Number of FFs   Number of Slices   RAM Blocks   Speed*
One Hot Encoding                     40/70           60...120           –            80...110 MHz
Binary Encoding                      16/16           50...100           –            70...100 MHz
SelectRAM block (binary encoding)    0               0                  1            140 MHz
* Dual 16 (or 32) states synchronous FSM with Enable, synchronous Reset, 5 (4) inputs and 4 (3) outputs.

Conclusion

The Virtex architecture provides powerful features and flexibility, allowing you to create very dense and high performance designs by taking advantage of the innovative features such as block SelectRAM. By combining the power of the Xilinx architecture and implementation tools, with the associated VHDL synthesizers and simulators, your design productivity is greatly improved, and complex designs can be easily implemented or modified.

For more information, please e-mail: [email protected]

About Multi Video Designs

MVD is a design and training center, specializing in Xilinx FPGA/CPLD designs and Hardware Description Languages. Consulting services and on site classes are offered in France, neighboring countries, and South America, in French, Spanish, and Portuguese. More information on our activity as well as the source code of some examples are available on our website, at: www.mvdfpga.com/training/VHDL_examples.htm
Applications CoolRunner CPLDs

Design a Low-Power SMBus System Using CoolRunner™ CPLDs

The ultra low-power CoolRunner CPLD is the ideal choice for SMBus systems, because you can easily configure it to suit your specific needs.

by John Hubbard
Applications Engineer, Xilinx, Inc.
[email protected]

Low-power devices use the System Management Bus (SMBus) protocol to communicate with components and peripherals. SMBus is a compatible derivative of the I2C two-wire serial bus protocol and can therefore reside in the same device. In addition to the I2C features, SMBus enhances systems designed for power management tasks. Because SMBus is often used in handheld devices containing “smart” batteries, CoolRunner CPLDs are the perfect solution for implementing the low-power SMBus components designed into these batteries.

A system that uses the SMBus protocol can pass information between components without the need for individual control lines. This passed information can contain manufacturer information, model numbers, part numbers, system status, control parameters, errors, and so on. SMBus is so flexible you can even add or remove components during system operation. It can also arbitrate between masters in multi-master systems, and determine whether a newly inserted device is able to check packets of data for errors.

Functionality

The SMBus implementation for the CoolRunner CPLD consists of a microcontroller or microprocessor interface and an SMBus master/slave controller as shown in Figure 1. It implements the following features:

• Microcontroller interface.
• Master or slave operation.
• Multi-master operation.
• Host operation.
• Automatic mode switching between master and slave.
• Calling address identification.
• START and STOP signal generation and detection.
• Repeated START signal generation.
• Acknowledge bit generation and detection.
• Bus busy detection.
• From 10 to 100 kHz operation.
• Optional signal SMBSUS# for system suspend mode.
• Optional signal SMBALERT# for slave interrupt request.
• Packet Error Code (PEC) using an 8th-order polynomial Cyclic Redundancy Check (CRC-8).
• Automatic determination of PEC capable devices.
• Compliant with System Management Bus Specification Rev. 1.1. (Note: the new SMBus 2.0 Specification was posted Aug. 3. See http://www.smbus.org/index.html.)
• Software selectable SMBus acknowledge bit.
• Arbitration.

Figure 1 - Basic SMBus controller functions

A more detailed description of the SMBus controller is shown in Figure 2. You can easily modify the microcontroller interface to adapt the design to any microcontroller of your choice.

Figure 2 - SMBus controller detailed block diagram

This design was created in VHDL using Xilinx WebPACK™ ISE (Integrated Synthesis Environment). It was verified using ModelSim™ XE (Xilinx Edition) simulation software.

Conclusion

You can get started with your own CoolRunner CPLD SMBus design by visiting the Xilinx website at www.xilinx.com/xapp/xapp353.pdf. Download (for free) the following:

• Complete detailed application notes.
• Complete VHDL source code.
• VHDL test benches.
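The Packet Error Code listed among the controller features can be illustrated in software. The article does not give the polynomial; the sketch below assumes the CRC-8 that the SMBus specification defines for PEC (polynomial x^8 + x^2 + x + 1, initial value zero, most significant bit first). The function name and the example message bytes are hypothetical, and this is a reference model for checking values, not the VHDL from the application note.

```python
def smbus_pec(data: bytes) -> int:
    """Bitwise CRC-8 over a message, polynomial x^8 + x^2 + x + 1 (0x07), MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


# Hypothetical write transaction: address byte (with R/W bit), command, one data byte.
message = bytes([0xB4, 0x06, 0x2A])
pec = smbus_pec(message)

# A receiver that runs the same CRC over the message plus the PEC byte gets zero.
assert smbus_pec(message + bytes([pec])) == 0
```

The zero-remainder property in the last line is what lets a PEC-capable receiver check a packet without comparing bytes explicitly.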

Applications CoolRunner CPLDs

CoolRunner Power-Saving Tips and Tricks

These techniques can lower your CoolRunner power consumption by 40%.

by Frank Wirtz
Staff Applications Engineer, Xilinx Inc.
[email protected]

With the advent of Fast Zero Power technology and CoolRunner CPLDs, you can now create portable, high-performance, low-power, programmable devices, effortlessly. And, with some additional effort, you can reduce your power consumption by as much as 40%. However, to accomplish ultra low power reductions, you must first understand the mechanics of CPLD logic generation.

Xilinx has published a new application note, “Low Power Tips for CPLD Design” (XAPP346), that describes design techniques you can use to further reduce power consumption in CoolRunner CPLDs, which are already the lowest power-consuming CPLDs in the world. Here are some highlights from that application note.

Tips and Tricks for Reducing Power

The CoolRunner XPLA architecture gives you a flexible logic allocation tool that allows you to decrease power consumption by placing your logic in optimum locations. To use this tool, you need to understand the basic architecture of the device, and you need to know how the fixed geometry of the device determines both the speed-sensitive paths and the power-sensitive paths.

The following design implementation techniques are just a sampling of the information you can use to slash power consumption to a minimum:

Terminate!

You must properly terminate all inputs to a CMOS buffer. A single floating pin can result in an increase of quiescent current by 13mA. Slow input transitions will also cause unnecessary power use. Test data shows that input buffer power consumption doubles if input rise time increases from 800ps to 5ns per input.

Congregate!

You can see how your design is implemented by reviewing your fitter report, and then adjust the fit to constrain your high frequency signals to a single logic block. This will decrease the distribution of high-speed nets and further decrease power consumption.

Modulate!

The application note details special clock considerations and explains how asynchronous clocking can provide low power benefits. Typically, asynchronous clocking increases power consumption. Modulation in this instance refers to only applying a clock signal to a register when it is required. Many designs have registers that infrequently change state, yet the clock signal is continually present and applied to the register. While asynchronous design techniques are usually discouraged, they do provide designers with additional flexibility when low power (or sometimes high speed) characteristics are required.

As an example of this technique, consider a counter circuit. In the case of a binary counter, not all of the registers change state on each significant clock edge. Designers can use a high speed clock for the LSBs of the counter, and then use a prescaled clock for the higher order bits, so the total amount of power required by the clock buffers is decreased.

Mixed Voltage Interfacing

When interfacing devices that have different VCC levels, consider the impact caused by under driving a CMOS input. Because a CMOS input buffer is comprised of at least two primary transistors, a P-channel pull up and an N-channel pull down, there exists a region of input voltage where both transistors are slightly on, and current flows

other internal devices depend upon the output voltage of the buffer for driving their inputs.

In some cases, mixed voltage interfacing is necessary. Slight modifications to differing VCCs can drastically reduce power consumption in these instances. For example, the XPLA3 devices may be powered at 3.3V +10%, and 5V devices may be powered at 5V -10%. This changes the differential between Voh and Vih by 800mV per input, and will significantly reduce wasted power. However, examine the data sheet to ensure safe and reliable operating conditions.

Default System Conditions

Attention to default system operating conditions may provide an insight into ways you can further decrease power consumption. As an example, a CPLD may be interfaced to a CMOS microcontroller with programmable (polarity sensitive) interrupts. If it is necessary to interface a 3.3V CPLD to a 5V interrupt, system power can be saved by programming the microcontroller interrupt such that the system operates with the interrupt level normally low. This decreases the amount of time that the interrupt is active (high) which will reduce the overall amount of power consumed when under driving a CMOS input.

The Effects of Implementation Style

Implementation style affects power consumption. For example, consider how different types of counters are implemented; binary, Grey, and LFSR counters are created in different ways and require differ-

Binary Counters

Typical binary counters will have their outputs changing state at a rate of:

$$\text{Percent of bits toggling} = \frac{2^{n+1} - 2}{n \cdot 2^{n}} \times 100$$

Where the number of bits of the counter = n

So, a typical 8-bit binary counter would have approximately 25% of its bits changing state for any single clock edge.

LFSR Counters

LFSR (Linear Feedback Shift Register) counters are wonderful solutions for FPGA users who need to keep look-up table fan-in to a minimum. However, because of the internal hardwired feedback of CPLDs, this type of counter consumes much more power than the other counter examples described here.

For example, an 8-bit LFSR counter has approximately 50% of its bits changing on average for any single clock edge. In comparison, an 8-bit binary counter changes at a 25% bit rate.

Grey Code Counters

Because of their characteristic step pattern of a single changing bit, Grey code counters offer designers the lowest power consumption of these three counter methods. The average bit change rate for an 8-bit Grey code counter is approximately 13% as defined by the equation:

$$\text{Percent of bits toggling} = \frac{1}{n} \times 100$$

The Grey code design implementation is the most difficult, however, because next-state information must be coded for each count value.

The Bottom Line

Take full advantage of the CoolRunner low
from VCC to GND through these buffers. ent amounts of resources. Keep in mind power benefits by downloading the free “Low
This causes power to be wasted, and since that a minimal number of changing sig- Power Tips for CPLD Design” application
the output of this buffer may also be in the nals will always deliver the lowest dynam- note (in PDF format) from the Xilinx website
linear region, it can cause problems because ic power solution. at: www.xilinx.com/xapp/xapp346.pdf.
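The toggle percentages quoted for the three counter styles can be checked with a short simulation. The sketch below is only illustrative: it counts bit changes over one full cycle of each 8-bit counter, including the wrap back to the first state. The helper names and the LFSR tap positions (8, 6, 5, 4) are assumptions chosen for the example; they are not taken from the application note.

```python
def avg_toggle_percent(states, n_bits):
    """Average percentage of bits that change per clock over one full cycle."""
    total = 0
    for i, cur in enumerate(states):
        nxt = states[(i + 1) % len(states)]  # wrap around to the first state
        total += bin(cur ^ nxt).count("1")   # bits that differ between states
    return 100.0 * total / (len(states) * n_bits)


def binary_states(n):
    """States of a free-running n-bit binary counter."""
    return list(range(1 << n))


def gray_states(n):
    """States of an n-bit reflected Gray code counter."""
    return [i ^ (i >> 1) for i in range(1 << n)]


def lfsr_states(n=8, taps=(8, 6, 5, 4)):
    """States of a shift-left LFSR; the tap positions are an assumed example."""
    mask = (1 << n) - 1
    state, states = 1, []
    while True:
        states.append(state)
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        state = ((state << 1) | feedback) & mask
        if state == 1:  # back at the seed: one full cycle has been recorded
            break
    return states


if __name__ == "__main__":
    for name, states in (("binary", binary_states(8)),
                         ("LFSR", lfsr_states()),
                         ("Gray", gray_states(8))):
        print(f"{name:6s} {avg_toggle_percent(states, 8):5.1f}% of bits toggle per clock")
```

Under these assumptions the binary counter comes out just under 25%, the Gray counter at 12.5%, and the LFSR close to 50%, which agrees with the figures in the text.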

Applications Software

Creating a Low Power Serial Peripheral Interface

How to implement communications between microprocessors and peripherals, using the Xilinx CoolRunner XPLA3 CPLDs.

by Anita Schreiber
Staff Applications Engineer, Xilinx
[email protected]

The CoolRunner implementation of a Serial Peripheral Interface (SPI) Master described here can be used to add an SPI controller to microprocessors or microcontrollers that do not provide this interface. It will permit direct inter-processor communication and communication with numerous commercially available peripherals.

Serial Peripheral Interface Protocol

SPI is a full-duplex, synchronous, serial data link. A single SPI device is configured as a master; all other SPI devices on the SPI bus are configured as slaves.

The SPI bus consists of four wires:

• Serial Clock (SCK) - Driven by the SPI master and regulates the flow of data bits. The SPI specification allows a selection of clock polarity and a choice of two fundamentally different clocking protocols on an 8-bit oriented data transfer.

• Master Out Slave In (MOSI) - Data output from the SPI Master and input to the SPI Slaves.

• Master In Slave Out (MISO) - Data input to the SPI Master and output from the selected SPI Slave. Only one selected slave device can drive data out from its MISO pin.

• Slave Select (SS) - Selects a particular slave via hardware control. Slave devices that are not selected do not interfere with SPI bus activities. The SS control line can be used as an input to the SPI master indicating a multiple-master bus contention (SS_IN). If the SS signal to the master is asserted, it indicates that some other device on the bus is attempting to be a master and address this device as a slave. Assertion of SS automatically disables SPI output drivers in the master device if more than one device attempts to become master.

The SCK, MOSI, and MISO pins of all SPI devices on the SPI bus are connected together in parallel.

CoolRunner SPI Master Implementation

This SPI master design supports the following features:

• Microcontroller interface.
• Multi-master bus contention detection and interrupt.
• Eight external slave selects.
• Four transfer protocols available with selectable clock polarity and clock phase.
• SPI transfer complete interrupt.
• Four different bit rates available for SCK.

A high-level block diagram is shown in Figure 1. The microcontroller (µC) interface is a VHDL module that you can easily modify to support other microcontrollers.

Figure 1 - CoolRunner SPI master

The Address Decode/Bus Interface logic interprets the bus cycles of the microcontroller and performs the read/write operations to the Register File. The Register File is the interface between the µC and the SPI master logic, and allows the µC to configure and control the operation of the SPI master. Status of the current transfer is provided to the µC via a status register in the Register File. Registers are also included to contain the µC data to be transmitted on the SPI bus and data received from the SPI bus. The SPI Control State Machine controls the shifting and loading of SPI data in the SPI shift registers, and the generation of the slave select signals. The SCK clock logic generates an internal SCK based on the settings in the control register for clock phase, division, and polarity.

Conclusion

CoolRunner CPLDs operate at the lowest standby power (<100µA) of any CPLD available today, and they are an ideal programmable logic solution for providing interface controllers in portable or power sensitive applications. See www.xilinx.com/apps/epld.htm#CoolRunner for an SPI reference design which contains a detailed application note (XAPP348), VHDL source code, and VHDL testbenches.
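The four transfer protocols come from the two clock polarity settings combined with the two clock phase settings. A minimal behavioral sketch of one 8-bit full-duplex exchange is given below. It is not the XAPP348 VHDL; the function name, the `cpol`/`cpha` parameter names, and the MSB-first bit order are assumptions made for the illustration.

```python
def spi_transfer(master_out, slave_out, cpol=0, cpha=0):
    """Model one 8-bit full-duplex SPI exchange, MSB first (assumed order).

    cpol sets the idle level of SCK; cpha selects whether data is sampled
    on the leading (0) or trailing (1) edge of each clock pulse.
    Returns (byte seen by master, byte seen by slave, list of edge events).
    """
    sck = cpol                      # SCK idles at the polarity level
    master_in = slave_in = 0
    events = []
    for bit in range(7, -1, -1):
        mosi = (master_out >> bit) & 1   # bit the master drives on MOSI
        miso = (slave_out >> bit) & 1    # bit the selected slave drives on MISO
        for edge in ("leading", "trailing"):
            sck ^= 1                     # every edge toggles SCK
            samples_here = (edge == "leading") == (cpha == 0)
            if samples_here:
                master_in = (master_in << 1) | miso
                slave_in = (slave_in << 1) | mosi
            events.append((sck, edge, "sample" if samples_here else "shift"))
    return master_in, slave_in, events


if __name__ == "__main__":
    for cpol in (0, 1):
        for cpha in (0, 1):
            m_rx, s_rx, ev = spi_transfer(0xA5, 0x3C, cpol, cpha)
            print(f"CPOL={cpol} CPHA={cpha}: master got {m_rx:#04x}, "
                  f"slave got {s_rx:#04x}, first edge drives SCK to {ev[0][0]}")
```

In this model all four settings move the same data; only the SCK idle level and the edge on which each side samples change, which is why the master and the selected slave must be configured for the same protocol.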

Perspective Power Consumption

CoolRunner CPLDs Beat the Heat

The disadvantages of high power components...

by Steve Prokosch
CoolRunner Product Marketing, Xilinx Inc.
[email protected]

Conventional thinking assumes that high performance requires high power consumption, but the Xilinx XPLA3 CoolRunner CPLD family defies convention. In both ultra-low standby and total current consumption modes, Xilinx XPLA3 complex programmable logic devices consume less power than any other CPLDs in the world.

By consuming the least amount of power, CoolRunner CPLDs radiate the least amount of heat. Devices that emit excessive heat can cause serious problems, such as:

• Higher FIT (Failures In Time) rates.
• Intermittent field failures.
• Higher cabinet or enclosure design and manufacturing costs.
• Increased risk of EMI/RFI leakage (caused by extra cooling vents).
• Mechanical stress to package parts.
• Printed circuit board layout concessions.
• Design compromises that affect the overall size and appeal of the end product.

The High Costs of High-Power Components

In addition to radiating destructive heat, devices that consume high amounts of power also generate extra costs. For instance, when you buy power supplies, the more power output you need to run the system, the higher the cost. Many designers accept the high costs of power supplies without ever questioning the efficacy and efficiency of the devices demanding all that power.

Furthermore, a device that consumes a lot of power may require the addition of another physical component, such as a larger cooling fan or a heat sink. Adding to the BOM (bill of materials) can be expensive. Besides the cost of the component itself, additional expenditures of time and money can be incurred:

• Locating and ordering the component.
• Shipping and delivery costs and delays.
• Controlling inventory.
• Dealing with the availability of the physical device.

Availability can be a real budget-buster; if that one single component cannot be obtained or delivered in a timely fashion, the entire system cannot be shipped to the customer. Assembly costs continue to rise until all components are delivered, installed, and shipped. Such a delay can turn a company’s bottom line upside down.

The CoolRunner Reliability Advantage

Heat plays an enormous role in determining the reliability of your designs in the field. Because most semiconductor devices are tested for high temperature operating life (HTOL), it is easy to compare how long a product would last under high temperature conditions.

Consider how well a CoolRunner XPLA3 256-macrocell device performs in a Thin Quad Flat Package. The TQFP has the worst thermal characteristics of any available CPLD package. A worst-case analysis shows a CoolRunner XPLA3 256-macrocell device would run for 107 years at a constant 145 degrees Celsius!

As shown in Table 1, all other CPLD products, even in their low-power modes, radiate more heat than CoolRunner CPLDs. (For more information on how these measurements were obtained, see the Xilinx Thermal Emissions Web page at: www.xilinx.com/products/cpldsolutions/techtopic/thermalimg.htm.)

Table 1 - CPLD thermal emissions (bar chart; horizontal axis in degrees Centigrade, 25 to 60). Entries, in chart order: Ambient; Xilinx XCR3256XL-7TC44; Cypress CY37256VP160-100AC; Lattice M4LV-128/64-10YC; Altera EPM7256AETC1477; Altera EPM3256ATC1447; Lattice ispLS12192VE-100LT128.

Conclusion

Xilinx CoolRunner XPLA3 CPLDs are designed for high-performance, low-power products such as portable PCs, PDAs, and handheld wireless devices where heat can be a critical factor in form, fit, and function. If heat is the problem, CoolRunner CPLDs are the solution.

Perspective Reliability

FPGAs – The Solution to Ultra-Deep Sub-Micron Design

With Xilinx FPGAs you can focus on your design function without being concerned with the tricky physical design issues caused by today’s ultra-deep sub-micron device geometries.

by Austin Lesea
Principal Engineer, Xilinx
[email protected]
Sudip Nag
Manager Implementation Tools Engineering, Xilinx
[email protected]
Hitesh Patel
Manager Alliance EDA Marketing, Xilinx
[email protected]

As device geometries decrease from 0.25µ to 0.13µ, the problems of substrate coupling, ground bounce, and interconnect crosstalk increase dramatically. ASIC designers who want to take advantage of these new device technologies are faced with a difficult task; they must create designs that are both logically correct and reliable within the specified environmental extremes. As device geometries decrease, it becomes much more difficult to produce reliable designs.

The FPGA Solution

The FPGA solution (consisting of both devices and software) assures you of a reliable design, because FPGAs are composed of a consistent architecture that has been tested over a long period of time, under real world conditions. Thus, FPGAs are guaranteed to operate exactly as specified, with very predictable interconnect delays, so you can spend more time on design optimization.

Substrate Coupling

Substrate coupling is a serious problem for either an ASIC or an FPGA. To guarantee performance, both ASIC and FPGA IC designers must use the best tools and models possible to accurately simulate the device, and have the means to automatically verify the design once it is ready for mask making.

Xilinx has developed proprietary automated techniques similar to those used by EDA vendor CadMOS (used in their SeismIC™ and PacifIC™ products for traditional DSM ASIC designs). We use these tools to automatically model and verify all of our FPGAs, which isolates you from this issue.


Substrate Bounce

Substrate bounce is caused by the switching of fast, high-current I/O transistors. It can cause double-clocking, as well as indeterminate and invalid logic states. The substrate bounce effects on Xilinx FPGAs are modeled precisely, and our designs are guaranteed to provide adequate isolation. For all of our FPGAs, Xilinx models and minimizes substrate noise in the design prior to making the device masks for fabrication.

Interconnect Crosstalk

Interconnect crosstalk between the tiny wires, on multiple metal layers, now requires a 3D field solver for extraction; anything less is not accurate enough to completely model the interconnections on the chip. Models must take into account the potential for crosstalk induced delay so that any possible user circuit will behave predictably and reliably, regardless of process and silicon variation. Xilinx IC designers perform extensive interconnect modeling using field modeling to ensure that our FPGAs do not have crosstalk problems.

Device Fabrication

IC designers must also model devices carefully to avoid yield problems, speed grading issues, and design failures. Xilinx manufactures millions of devices for each FPGA family, and the manufacturing process utilizes device monitors, test structures, and other process related structures that are measured on every wafer. The models are incrementally refined so that all process corners are modeled. The use of highly accurate device models enables Xilinx IC designers to rigorously verify and characterize all of our devices.

The combination of extensive and accurate modeling, simulations, and verification requires hundreds of engineer-years. Xilinx has made the investment and pre-engineered all of our devices so you don’t have to worry about whether the silicon will work. ASIC IC designers must also expend the same effort, but often the expense is too high and resources are inadequate, resulting in designs that are not reliable.

The Software Solution

Using the current ultra-deep sub-micron design rules, interconnect delays account for approximately 75% of the path delays. Therefore, design modeling and timing closure become significant factors, and development tools are critical.

For FPGAs, interconnect delays have always been a significant portion of path delays because of the existence of switches in the routing paths. However, FPGAs have specific, fully characterized routing resources and, therefore, accurate delay models can be achieved using exhaustive empirical methods and functional analysis.

FPGA software tools are extremely mature in the area of handling interconnect delays. Therefore the ultra-DSM technology does not pose a new problem for FPGA place and route tools. Specifically, the delay estimation methods are mature and sophisticated, resulting in accurate delay prediction. The placement and routing tools are smart enough to determine when to update the timing/slack information dynamically, based on changing interconnect delays during layout.

The newer Virtex and Spartan-II generation FPGAs are co-developed by the Xilinx software and hardware teams. This process naturally results in a highly predictable architecture, and predictable interconnect delays. This allows the software to make correct placement and routing decisions early in the design process, even during the synthesis phase where there is maximum flexibility to influence the design performance.

The quality of the synthesis wireload models plays a key role in the timing predictability after synthesis. Advancements in FPGA synthesis technology (such as improved wire delay estimation by synthesis-driven placement tools) enable highly accurate timing predictability, which is on average 20% to 25% more accurate than ASIC technology. In addition, re-synthesis capabilities for critical path optimization reduce the number of design iterations for faster time to timing closure.

Conclusion

With FPGAs, you can now implement multi-million gate designs without being plagued by the DSM problems inherent in ASICs. Our advanced FPGA architectures shelter you from physical design issues such as crosstalk and ground bounce, and the latest synthesis and implementation software deliver timing predictability early in the design flow, giving you a significant reduction in design closure time, and a shorter time to market.

Applications FPGAs

Implementing a Histogram for Image Processing Applications

Use the Virtex™ or Spartan™-II True Dual-Port™ RAM and DLLs to create a real-time histogram.

by Edgard Garcia
Engineer, Multi Video Designs
[email protected]

Image processing is key for many automated industrial inspection applications. However, even the most sophisticated algorithms can’t extract the right information if the image contents are not available in a convenient format. By using a histogram, you can ensure that the image content can be easily processed.

What is a Histogram?

For each possible pixel value, the histogram algorithm counts the number of times the value was encountered in the current image. For example, the histogram of an 8-bit-per-pixel image will contain 256 values (2^8), each one representing the number of pixels found at this value. This allows a microprocessor or DSP to quickly get the profile of the image, and take the appropriate decisions, by analyzing just those 256 precomputed values. You can do this easily, in real time and at low cost, in a Virtex or Spartan-II FPGA.

A Basic Hardware Implementation

For an 8-bit-per-pixel image, 256 different values are possible for each pixel, so 256 16-bit counters would be necessary to complete the real time histogram. However, only one of the 256 counters will be active at each valid pixel clock (only one value will be updated). Therefore, the registers of the 256 x 16 bit counters can be replaced by a memory array, such as a 4K-bit block SelectRAM™ organized as 256 x 16 (RAMB4_S16).

A 16-bit incrementer will allow you to update the RAM contents during a Read-Modify-Write operation, where the video data inputs are used as the address of the memory block. Figure 1 shows the block diagram of a basic hardware implementation.

Figure 1 - Basic hardware implementation


Optimized Implementation

Each memory cycle can be either a Read or a Write, so we need to divide each pixel clock cycle in two sub-cycles: a Read cycle for getting the current value, and a Write cycle for updating (+1) the memory content. You can do this easily, using a CLKDLL to recover a clock at twice the frequency of the video clock (CLK2X), and to create an image of this clock shifted by 90° to validate Read and Write cycles. Figure 2 shows the detailed diagram of an optimized implementation.

Figure 2 - An optimized implementation

During horizontal and vertical retrace, pixel values must be discarded. This is done with no additional logic, by connecting the BLANKING# signal to the ENA input of the memory block. Figure 3 illustrates the timing of the operations.

The DSP or microprocessor can directly read the result of the histogram by using the B port of the same block SelectRAM (configured as a RAMB4_S16_S16). A multiplexer is not needed because the two ports (A and B) each have dedicated inputs.

Resources and Performance

Here are the logic resources required for implementing the histogram algorithm:

• 1 x CLKDLL + BUFG
• 1 x RAMB4_S16_S16
• 1 x 16-bit INCREMENTER (8 slices)

For a Virtex -6 or Spartan-II -6 device, Fpix = 50 MHz.

Conclusion

By taking advantage of the high level features of the Virtex and Spartan-II FPGA architectures, you can greatly increase the speed and reduce the cost of your designs. For more information about how to implement the histogram algorithm, e-mail: [email protected]

Waveforms shown: VIDEO_CLOCK, CLK2X, CLK90, BLANKING#, VIDEO_IN (after register), DOA, DIA. Each PIXEL CYCLE is divided into a READ sub-cycle followed by a WRITE sub-cycle.

VIDEO_IN (after register)   10   42   33   10   73   D(n-2)   D(n-1)
DOA                          0    0    0    1    0   x        y
DIA                          1    1    1    2    1   x+1      y+1

Figure 3 - Timing
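The read-modify-write sequence can be modeled in a few lines of software. The sketch below is a behavioral illustration only, not the FPGA design: the function name and arguments are hypothetical, and the 16-bit wrap-around of the incrementer is an assumption, since the article does not say what happens when a count overflows.

```python
def histogram_rmw(pixels, blanking_n, depth=256, width=16):
    """Behavioral model of the block-RAM histogram (hypothetical helper).

    pixels     : pixel values, each used as a RAM address
    blanking_n : BLANKING# levels, one per pixel (0 = retrace, pixel ignored)
    Returns (ram contents, list of (DOA, DIA) pairs for each valid pixel).
    """
    ram = [0] * depth                  # models the 256 x 16 block RAM
    max_count = (1 << width) - 1
    trace = []
    for pix, active in zip(pixels, blanking_n):
        if not active:                 # BLANKING# low: ENA off, RAM untouched
            continue
        doa = ram[pix]                 # READ sub-cycle: fetch current count
        dia = (doa + 1) & max_count    # incrementer (wrap-around is assumed)
        ram[pix] = dia                 # WRITE sub-cycle: store updated count
        trace.append((doa, dia))
    return ram, trace


if __name__ == "__main__":
    video_in = [10, 42, 33, 10, 73]    # pixel values from the timing figure
    ram, trace = histogram_rmw(video_in, [1] * len(video_in))
    print("DOA:", [d for d, _ in trace])
    print("DIA:", [d for _, d in trace])
    print("count for pixel value 10:", ram[10])
```

With the pixel sequence from the timing figure, the model reads back 0, 0, 0, 1, 0 on DOA and writes 1, 1, 1, 2, 1 on DIA, matching the values shown there.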

Applications Digital Radio

High Performance Digital Down-Converters for FPGAs

Virtex FPGAs surpass off-the-shelf ASSPs in design flexibility and system integration.


by Ray Andraka
President, Andraka Consulting Group, Inc
[email protected]

Digital down-converters (DDC) are a key component for digital radio. The DDC performs the critical frequency translation needed to recover the information from a digitized modulated signal.

Thanks to the high level of interest in digital radio, the market for DDC devices is soaring. Typically, a designer will select an off-the-shelf application-specific-standard-part (ASSP) for this task. Although the costs of these parts have fallen precipitously in the face of market demand, ASSPs don’t offer the design flexibility or integration attainable in an FPGA.

ASSP vendors are stuck with the challenge of creating a one-size-fits-all design, and end users are stuck with fitting the device to their needs, often paying for features or performance they don’t need or want. DDCs implemented in FPGAs, however, can compete with ASSPs by offering the additional benefits of customizability and higher integration.

A down-converter consists of a numerically controlled digital oscillator, a mixer (shown as a pair of multipliers), and a low pass filter, as shown in Figure 1. The band-limited output from the filter allows us to reduce the sample rate by decimating. The design is fairly straightforward, although we must pay attention to the fidelity of the digital sinusoid (the sine and cosine waveforms produced by the numerically controlled oscillator). We must also consider the quality of the filters, if we are to have acceptable noise performance. (We must keep the design from adding so much noise to the incoming modulated signal that we can’t reliably detect it. How much noise is acceptable depends on the application.)

Figure 1 - Digital down-converter

Some digital radio applications have fairly high sample rates, which can make the design more challenging. With careful design, however, modern FPGAs can handle data as fast as any commercially available analog-to-digital converter can supply it. The advantage of using an FPGA is that it allows us to customize the DDC to exactly match our application. Furthermore, with an FPGA implementation, we can put the DDC and any post-processing in the same chip. Post-processing is usually some form of demodulator.

The Oscillator

In terms of system performance, the critical component in digital down-conversion is the numerically controlled oscillator (NCO). This component generates a sampled digital sinusoid, which when mixed with the incoming signal, shifts the signal’s spectrum. In other words, if we multiply (mix) a signal with a sine wave, we get a frequency translation or “shift” of the spectral image. The amount of translation is equal to the frequency of the “carrier” sine wave.

Insufficient precision or accuracy in the sinusoid leads to degraded signal-to-noise ratios and to spurious spectral artifacts, either of which can swamp the incoming signal. Attention to the quantization that leads to these noise terms is essential for the proper design of an NCO. In our implementation model, our NCO consists of a phase accumulator frequency synthesizer and a phase angle-to-wave shape conversion. The phase angle-to-wave shape conversion circuit may be any one of several possible designs.

Frequency Synthesizer

The frequency synthesizer is simply an accumulator used to integrate a phase increment value. If we interpret the MSB (most significant bit) of the accumulator as having a weight of π, then the accumulator represents the fractional portion of the accumulated phase angle. Phase accumulator frequency synthesis is discussed in detail in Xcell Journal #31 in an article by Austin Lesea (www.xilinx.com/xcell/xl31/xl31_32.pdf).

Using a phase accumulator offers several advantages over other methods:

• The synthesized frequency need not have an integer relationship to the sample clock, because modulo arithmetic preserves the fractional part of the accumulated phase on an overflow. This lets us set the local oscillator to an arbitrary frequency without changing the sample rate.

• The phase increment value does not have to be a constant. By dynamically changing the increment value, we can easily modulate the phase or frequency of the generated signal.

• Because 2^N represents a full phase revolution, this generator interfaces nicely with look-up tables for wave shape conversion.

Nothing in the phase accumulator design will impair the noise performance of the NCO; reducing word width only restricts the frequencies that can be synthesized.
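The phase accumulator described above can be sketched in software as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 32-bit width is an arbitrary choice, and the output frequency relation (increment × sample clock / 2^N) is the standard one for an N-bit accumulator rather than something stated in the article.

```python
import math


def tuning_word(f_out, f_clk, n_bits):
    """Phase increment that synthesizes f_out from sample clock f_clk."""
    return round(f_out / f_clk * (1 << n_bits))


def phase_accumulator(increment, n_bits, n_samples, start=0):
    """Return successive accumulator values; 2**n_bits is one full revolution."""
    mask = (1 << n_bits) - 1
    acc = start & mask
    phases = []
    for _ in range(n_samples):
        phases.append(acc)
        # Modulo-2^N addition: an overflow discards whole revolutions and
        # keeps the fractional part of the accumulated phase.
        acc = (acc + increment) & mask
    return phases


def to_radians(phase, n_bits):
    """Interpret the accumulator so that its MSB has a weight of pi."""
    return 2.0 * math.pi * phase / (1 << n_bits)


if __name__ == "__main__":
    N = 32                                   # accumulator width (assumed)
    inc = tuning_word(16e6, 64e6, N)         # 16 MHz from a 64 MHz sample clock
    for p in phase_accumulator(inc, N, 5):
        print(f"{p:#010x} -> {to_radians(p, N):.4f} rad")
```

Changing `increment` between samples modulates the frequency, and adding an offset to the accumulator value modulates the phase, which corresponds to the second advantage in the list above.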


Noise is generated by an imperfect rendition of the sinusoid at the output of the NCO. That noise can be phase errors (angular distortions) or amplitude errors. The phase accumulator generates only a phase angle, so there is no amplitude error. Errors caused by quantization of the phase increment can cause a frequency error, but not a changing phase error.

Waveform Synthesis

The phase accumulator produces a “wrapped” phase angle that must be converted to a sampled complex sinusoid. The accuracy of the conversion directly affects the noise performance of the DDC. The noise introduced by the NCO is caused by amplitude and phase errors, which manifest themselves as reduced signal-to-noise ratio (SNR) and degraded spurious free dynamic range (SFDR) respectively. Each additional bit of phase improves the SFDR by about 6 dB and extra amplitude resolution adds to the SNR by about 6 dB.

The most obvious conversion circuit is a simple lookup table of sine values by phase angle, which is addressed directly by the phase accumulator. The phase resolution determines the depth of the table, while the amplitude precision determines the width. To keep the size of the table reasonable without sacrificing frequency resolution, we must truncate the phase accumulator output, using only the MSBs at the cost of degrading the SFDR. The size of a table grows exponentially with phase resolution, so for even moderate SFDR requirements, the table becomes larger than what we would like to use in an FPGA.

Simple amplitude and phase symmetry allows us to reduce the table size by a factor of 4 by reusing the first quadrant data for the other quadrants. The same table is used for both the sine and cosine values, so if clock cycles per sample permit, the same ROM can be read twice per sample. In Virtex devices, you can use the dual-port feature of the block RAM to simultaneously obtain both the sine and cosine values from a shared ROM. Large ROMs in FPGAs are expensive in terms of resources used so, for phase resolutions of more than 8 to 10 bits, other methods should be used.

The large ROMs can be avoided by algorithmically generating the sine and cosine on the fly. While that sounds difficult, there is a simple shift-add algorithm based on vector rotation called CORDIC (COordinate Rotation DIgital Computer) that makes this task fairly easy in hardware. (See www.andraka.com/cordic.htm for details on CORDIC.) The algorithm simultaneously generates a sine and cosine value by rotating a unit vector from the “I” axis to the desired phase angle using a series of successively smaller elemental rotations. The angles of those elemental rotations are specifically selected for a shift-and-add implementation. The “I” (real or in-phase) and “Q” (imaginary or quadrature) components of the rotated vector are proportional to the cosine and sine of the phase angle respectively.

Figure 2 - FPGA implementation of a digital down converter

The Mixer

The function of the mixer is to multiply the incoming signal by the locally generated sinusoid to shift the spectrum of the signal. A straightforward implementation uses two multipliers, one each for the sine and the cosine. The multipliers produced by the CORE Generator tool can easily be used for this application.

If we use CORDIC for the wave shape conversion, however, we can obtain the mixer function for free. The combination of the NCO and the mixer multiplies the incoming signal by $\cos(\omega t) - j\sin(\omega t) = e^{-j\omega t}$. Because the NCO and mixer generate a complex phasor, the net effect is to rotate


the incoming signal by a constantly changing phase angle. Rather than rotating a unit vector to get I and Q scale values, we can use the CORDIC to directly rotate the input signal. This eliminates the two multipliers and avoids the potential for additional quantization noise.

A more subtle advantage to using CORDIC is that it actually rotates the vector rather than multiplying the components separately. This means it does not add noise to the signal other than the spectral spurs caused by the phase quantization. The CORDIC hardware occupies about the same area as a pair of multipliers with the same input width in the Virtex architecture. Thus, in effect, we have a net area savings about equal to what we would have used for the sine and cosine wave shape conversion. The CORDIC rotator also accepts a complex input, so no additional hardware is needed for applications requiring a complex signal input.

The Filter and Decimator

The mixed signal has to be filtered to isolate the portion of the spectrum containing the signal of interest. The filter typically has to be a narrow-band filter with a fairly high rejection of unwanted spectrum. This translates to an expensive filter if it is done at the input sample rate. Instead, we can use a multi-rate approach in which the signal is first decimated to a much lower sample rate using a less computationally intensive filter. Then the signal is cleaned up with a second more complex filter working at the decimated sample rate.

High Ratio Decimator

A high-ratio decimation can be performed very efficiently using a cascaded integrator-comb (CIC) filter. The CIC filter is a recursive implementation of the “boxcar” or moving average filter. The spectral response of such a filter is the sinc (sin x/x) function. In a CIC filter, the number of effective taps is an integer multiple of the decimation ratio, so the filter nulls alias

sufficiently narrow, the rejection of the aliased image is quite good, much better than might be expected otherwise. We can also cascade several sections to lower the amplitude of the side lobes. The passband of this filter does exhibit a pronounced roll-off that usually must be corrected by the clean-up filter. Keeping the passband of the final filter narrow not only improves the alias rejection, but also makes the roll-off compensation easier.

The advantages of using a CIC filter in this implementation are:

• It is a computationally easy filter to realize.

• The same filter structure works for a very wide range of decimation ratios by simply changing the timing of the clock enables on the comb section.

• The filter response referred to the output sample rate is nearly independent of the decimation ratio, so one clean-up filter can be used for all decimation ratios.

The gain of the CIC filter is a function of the decimation ratio. Therefore, a barrel shifter is required after the CIC filter in applications where the decimation ratio has to be changeable without changing the circuit. This is an issue in an ASSP DDC, as it is a one-size-fits-all solution. Most of the time in FPGAs, we can hardwire the shift, or at worst, use a limited barrel shift, because we can customize the DDC for our application.

“Clean-Up” Filter

The output of the CIC filter has a sinc shape, which is not suitable for most applications. A “clean-up” filter can be applied at the CIC output to correct for the passband droop, as well as to achieve the desired cut-off frequency and filter shape. This filter typically decimates by a factor of 2 or 4 to minimize the output sample rate after the passband has been limited and shaped. An application-specific filter response, such as a raised cosine Nyquist filter, can either be combined into the correction filter or be applied at a subsequent

arithmetic (see www.andraka.com/distribu.htm for a tutorial on distributed arithmetic).

Identical filters must be applied to both the I and Q channels. Even using the slowest speed grade Virtex FPGAs, the DDC design described here can be clocked at more than 130 MHz if the design is carefully executed and floor planned. This high potential clock rate permits us to time multiplex the I and Q data through the same filters by interleaving the I and Q samples on a clock-to-clock basis. Thus for very little additional overhead, we can handle both the I and Q data in the same filter. We can also use the same technique to handle several independently tuned channels with a single instance of the DDC design.

An advantage of using an FPGA for the DDC is that we can customize the filter chain to exactly meet our requirements. With an off-the-shelf chip, we would have to either fit our requirements to the chip’s features or add additional post-processing to modify the output to our needs.

Conclusion

We’ve briefly discussed implementation of a high performance DDC in an FPGA. If we apply these techniques to a 16-bit DDC with a 64 MS/sec input and a 100 dB SFDR requirement, we come up with a design that occupies about 550 Virtex CLBs (configurable logic blocks). The occupied area is heavily influenced by specific requirements of the application. The cited design, shown in Figure 2, consists of an NCO and mixer implemented as a CORDIC rotator and a programmable decimating filter. The filter is a 4th order CIC filter followed by a 63-tap symmetric Finite Impulse Response (FIR) filter. Backing off on any of the requirements can substantially reduce the area occupied by the DDC. Because we are using an FPGA, we have the luxury of picking the features and performance to match our application. If we were to use an ASSP component, we would have to mold our
onto the passband when the spectrum is filter stage. The clean-up filter is compact- requirements and design around the capa-
folded by decimation. If the passband is ly implemented using serial distributed bilities of the selected device.
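
The CIC decimator and its decimation-dependent gain, as described in this article, can be illustrated with a small software model. The sketch below is only a behavioral illustration in Python, not the FPGA implementation: the function names, the differential delay parameter, and the textbook gain formula (ratio × delay) raised to the filter order are assumptions brought in for the example and do not come from the article.

```python
from math import ceil, log2


def cic_gain(order, ratio, diff_delay=1):
    """DC gain of a CIC decimator (textbook formula, assumed here)."""
    return (ratio * diff_delay) ** order


def cic_bit_growth(order, ratio, diff_delay=1):
    """Extra bits needed in the accumulators to hold the CIC gain."""
    return ceil(order * log2(ratio * diff_delay))


def cic_decimate(samples, order, ratio, diff_delay=1):
    """Behavioral model of an integer CIC decimator.

    'order' integrators run at the input rate; 'order' combs run at the
    decimated rate, which mirrors clock-enabling the comb section.
    """
    integrators = [0] * order
    comb_delays = [[0] * diff_delay for _ in range(order)]
    output = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(order):            # integrator cascade
            integrators[i] += acc
            acc = integrators[i]
        if (n + 1) % ratio == 0:          # keep one sample in 'ratio'
            y = acc
            for i in range(order):        # comb cascade: y[m] - y[m - delay]
                delayed = comb_delays[i].pop(0)
                comb_delays[i].append(y)
                y -= delayed
            output.append(y)
    return output


# A constant input of 1 settles to the DC gain, 8**4 = 4096 for a
# 4th-order filter decimating by 8.
settled = cic_decimate([1] * 64, order=4, ratio=8)[-1]
# With a fixed decimation ratio the gain correction is a fixed shift.
normalized = settled >> cic_bit_growth(4, 8)
```

Because the gain here is a power of two, a hardwired right shift by the bit growth restores unity gain. A changeable decimation ratio changes the shift amount, which is why the article calls for a barrel shifter in that case.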

New Technology Cores

Digital Image Processing with LogiCOREs

New Color Space Converter LogiCOREs and a Combined Forward/Inverse DCT LogiCORE give you pre-designed blocks that solve difficult design problems.

by David Mann
Multimedia ASVC Marketing, Integrated Silicon Systems
[email protected]

The real-time manipulation of high-resolution images (either moving-picture video or still-frame image streams) usually demands custom digital video processing in hardware. But why use a bunch of different ASSPs for common video/image processing tasks, such as Color Space Conversion (RGB to YCrCb or vice versa) or Discrete Cosine Transform, when you can do it all in a Virtex, Virtex-E, or Spartan device? And the performance of these FPGA-optimized “Application-Specific Virtual Components” is very attractive.

There are several standard video processing functions that are common to many vision systems; these systems include video broadcast, machine vision, and image filtering applications. Now, thanks to Integrated Silicon Systems’ ASVC technology, the IP cores powering these applications can be implemented in FPGAs, and Xilinx can supply all of the vital links in your customized video compression/decompression chain (as shown in Figure 1).

Color Space Conversion

Color Space Conversion (CSC) is one of the standard image processing techniques; it’s a trick that allows you to use the digital image data associated with a color pixel more efficiently by switching color domains. Processing an image in the Red-Green-Blue color space with a set of (R, G, B) values for each and every pixel really isn’t very efficient. The RGB representation has a significant downside: although it’s the natural paradigm for rendering full-color pictures using display technologies that emit mixtures of the three primary colors (such as CRTs, LCDs, LEDs, etc.), it is not as efficient as special alternative schemes.

The standard alternative representations use de-correlated components–luminance and chrominance. Thus CSC only comes into play whenever it’s time to present a picture to the human visual cortex, or after a real-world image is captured using a scanner or camera followed by processing in the digital domain.

Color Space Converter LogiCOREs

Xilinx presently offers a family of four different Color Space Converter LogiCOREs, as shown in Table 1.

Other useful pre-processing functions such as Gamma Correction are also incorporated in these particular ISS-designed LogiCOREs, so you spend less time developing your CSC design using these solutions.

Here are just a few application areas for the Color Space Conversion family of LogiCOREs:
• Video output conversion to digital RGB.
• Image filtering.
• Machine vision.
• Video and still-image processing.

DCT Engine LogiCORE

Figure 1 shows where one of the RGB2YUV or RGB2YCrCb Color Space Converter LogiCOREs fits into a typical video/image processing flow. This example data path incorporates a Discrete Cosine Transform block–a necessary element in image compression algorithms.

Figure 1 - LogiCOREs in a typical digital video/image processing chain

Now, there’s a new LogiCORE which combines both Forward DCT and Inverse DCT functions in one, and it’s ISO/IEC 10918-1 JPEG compliant. This high-performance DCT/iDCT engine offers 1-symbol/cycle processing power thanks to its fully pipelined architecture. The design is highly tuned for optimal performance across the various Xilinx FPGA technologies. It requires only 1756 slices in Virtex, 1759 in Virtex-E, or 1728 in Spartan devices; and only 48 IOBs are needed for interfacing.

This design is very efficient in the Xilinx architecture because the 2-D architecture uses row-column decomposition to separate the transform into two distinct 1-D operations. Each operation generates a set of intermediate results that are written into transpose memory. Data is “burst” into the DCT/iDCT core as blocks of 64 values, and the results of the transform are presented in the same format.

When in Forward DCT mode, this LogiCORE takes 8-bit input data words and produces an 11-bit output. In the Inverse mode, the converse is true. You’ve got 14-bit cosine coefficients, and a 15-bit representation in transpose memory, so there’s no need to worry about precision.

Using the Combined Forward/Inverse DCT LogiCORE makes it very easy to create your own design, even if you don’t have the engineering bandwidth or DCT expertise. And, the Xilinx software tools make it easy.

Designing with LogiCOREs

If you’re familiar with HDL-based design and simulation, component instantiation, script-based logic synthesis, and the use of testbenches, then you’re all set to design using LogiCOREs. All the LogiCORE modules described here are available under a standard license agreement from Xilinx. You get the code and test vectors, together with installation and instantiation instructions as part of the LogiCORE deliverables.
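
The row-column decomposition mentioned for the DCT/iDCT core can be illustrated in software. The following Python sketch is a floating-point model written for this explanation only: it assumes the orthonormal 8-point DCT-II and its inverse, and its function names are hypothetical. It does not reproduce the core’s fixed-point word widths, pipelining, or interface.

```python
import math

N = 8  # 8 x 8 blocks, i.e. bursts of 64 values


def _scale(k):
    # Orthonormal DCT-II scaling (an assumption of this sketch).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(v):
    """Forward 8-point DCT of one row."""
    return [
        _scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n in range(N))
        for k in range(N)
    ]


def idct_1d(c):
    """Inverse 8-point DCT of one row."""
    return [
        sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(N))
        for n in range(N)
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def _row_column(block, transform_1d):
    # First 1-D pass over the rows; the transposed intermediate results
    # play the role of the transpose memory.
    intermediate = transpose([transform_1d(row) for row in block])
    # Second 1-D pass over what were the columns, then restore orientation.
    return transpose([transform_1d(row) for row in intermediate])


def dct_2d(block):
    return _row_column(block, dct_1d)


def idct_2d(coeffs):
    return _row_column(coeffs, idct_1d)
```

Each 2-D transform is just two passes of the same 1-D routine with a transpose in between, which is why a single 1-D datapath plus a transpose buffer is enough for both the forward and the inverse direction.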
Conclusion

Digital video/image processing applications can be very difficult to develop. However, the new Xilinx LogiCOREs offer feature-enhanced Color Space Conversion and Forward/Inverse Discrete Cosine Transforms that give you a time-to-market advantage.

There are more LogiCOREs in development for digital video applications, including standalone M-JPEG Codec solutions for Virtex and Virtex-E. Talk to an ISS representative or your local Xilinx FAE about your particular application.

To find out more, or to access the datasheets, visit the Xilinx dedicated IP Center at: www.xilinx.com/ipcenter.

LogiCORE        RGB to YUV    YUV to RGB    RGB to YCrCb   YCrCb to RGB
Slices used     230           147           211            186
IOBs used       50            50            49             49
System clock
  Virtex        >75 MHz       >75 MHz       >60 MHz        >75 MHz
  Spartan-II    >80 MHz       >65 MHz       >65 MHz        >70 MHz
  Virtex-E      >100 MHz      >100 MHz      >90 MHz        >90 MHz
Features used   Carry logic   Carry logic   Carry logic    Carry logic
Precision       10-bit        10-bit        10-bit         10-bit
Datasheet       Yes           Yes           Yes            Yes
Availability    Now           Now           Now            Now

Table 1 - LogiCORE specifications
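
As a rough illustration of what an RGB to YCrCb converter and its inverse compute, here is a minimal Python sketch. The article does not state which coefficients or ranges the LogiCOREs use, so the full-range ITU-R BT.601 (JFIF-style) equations below are an assumption, as are the function names; the sketch uses floating point rather than the 10-bit precision listed in Table 1.

```python
def _clamp8(x):
    """Round to the nearest integer and limit to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycrcb(r, g, b):
    """Convert 8-bit RGB to (Y, Cr, Cb) using full-range BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b                 # luminance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b      # red chrominance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b      # blue chrominance
    return _clamp8(y), _clamp8(cr), _clamp8(cb)


def ycrcb_to_rgb(y, cr, cb):
    """Inverse conversion back to 8-bit RGB."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return _clamp8(r), _clamp8(g), _clamp8(b)
```

Any gray pixel maps to chrominance values of 128, which shows the de-correlation the article describes: all of the picture detail of a gray image ends up in the luminance component.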
New Products Development Boards

Stackable Development Boards for Spartan-II, Virtex, and Virtex-E FPGAs

A new series of prototyping boards to help you quickly test and implement your FPGA designs.

by Dr. Stefan Schafroth
Hardware and software development engineer, ErSt Electronic GmbH
[email protected]

To help you increase your productivity and decrease your time to market, we recently designed a new set of development boards, using Spartan-II, Virtex, and Virtex-E FPGAs. Like our original Virtex-based board, the new boards provide all the basic components needed in most FPGA-based designs. In addition, we incorporated an optional large ZBT RAM to satisfy the needs of modern telecommunication and imaging applications. All I/Os are routed to header connectors where you connect your special purpose interfaces. By stacking several boards you can easily cope with complex designs that exceed the scope of a single FPGA. The boards are fully compatible with their predecessor such that you can stack them together and reuse our power module (PWR3) as the supply for the various required supply and reference voltages.

Figure 1 - Functional diagram of the development board modules

Key Features

Each development board uses either a Spartan-II, Virtex, or Virtex-E FPGA in a PQ-208 package (Spartan-II) or an HQ-240 package (Virtex, Virtex-E). Vital components for a basic system are placed around the FPGA, including two crystal oscillators, three push buttons, eight DIP switches, and nine status LEDs. An optional ZBT RAM helps to support any memory-demanding applications.

All configuration modes of the FPGA are supported, and you can provide configuration data by using serial configuration PROMs (SCPs) sitting in onboard sockets, by using in-system programmable (ISP) PROMs, or by connecting a Xilinx MultiLINX, XChecker, or JTAG cable. The ISP PROMs and the FPGA form a single JTAG chain. A functional diagram detailing the building blocks of the prototyping boards is shown in Figure 1.

The crystal oscillators are housed in standard DIL-8 or DIL-14 size metal cans plugged into sockets, so you can easily change the frequency. To facilitate the distribution of very fast clocks, we mounted four SMB coaxial connectors next to the clock pins of the FPGA, which may be terminated with optional resistors to ground. The synchronous clock input of the ZBT RAM is also connected to one of these connectors. Alternatively you can use an FPGA-generated clock driven on an I/O pin.

Push buttons, DIP switches, and LEDs form a user interface that allows you to provide configuration data and monitor status information from the running system.

You can configure all eight I/O banks independently of each other, and you can select their VCCO and reference voltages individually with jumpers. Two different reference voltages (derived from the FPGA core voltage) can be generated onboard by means of trim potentiometers. Up to eight reference voltages can be connected from an external source, such as our PWR3 power module.

Figures 2 and 3 show the top and bottom view of a development board module equipped with an XCV1000E FPGA.

Figure 2 - Top view of the development board module

Figure 3 - Bottom view of the development board module

Applications

The board is very well suited to:
• Evaluate the larger members of the Spartan-II, Virtex, and Virtex-E FPGA families in the PQ-208 or HQ-240 packages, respectively.
• Experiment with different low voltage I/O standards.
• Implement custom designs using the full power of the Virtex architecture.
• Test algorithms under real-time conditions and watch the signals with a logic analyzer.
• Quickly and easily expand the complexity of the system by stacking several boards.

Conclusion

The EVALXC2S, EVALXCV, and EVALXCVE development board series gives you an ideal platform for evaluating, implementing, testing, and extending custom designs using Spartan-II, Virtex, or Virtex-E devices. Using the optional ZBT RAM you can even implement applications calling for large amounts of memory. You can also easily integrate the board into a larger system. Like their predecessor, the boards can be combined with the PWR3 power module to form a compact unit that runs from a single power supply. This makes it ideal for teaching, seminars, and courses.

For additional information on EVALXC2S/XCV/XCVE see: www.erst.ch, or contact us at [email protected].
Perspective Services

Xilinx Global Services

In the race to market, make the most of your time and resources.

by Jannis McReynolds
Business Development Manager
Xilinx, Global Services Division
[email protected]

It’s a fact of life–the market waits for no one. Your new product idea won’t be nearly as successful if your competitors get to market first. But success is not just about moving quickly; design methodologies are getting more and more complex with each new advance in technology, so you must also move intelligently.

Xilinx Global Services is a portfolio of services and tools designed to keep you on the fast track. From technical support to education to design consulting, Xilinx Global Services compress your learning curve and accelerate your design time. The Xilinx Global Services portfolio consists of Product Services, Education Services, Design Services, and the highly acclaimed support.xilinx.com.

“Designers face a multitude of challenges in their efforts to stay ahead of the market,” said Dave DeMarinis, Business Development Manager, Global Services Division. “We developed Xilinx Global Services to help them work faster and smarter.”

It’s Okay To Cut In Line

Our Platinum Technical Service, which costs $1,295.00 per seat per year, is a remote technical support service developed to help you reduce the time you spend troubleshooting and accelerate your design schedule–it gives you rapid access to our Senior Application Engineers. When you call or log on to the website, your call automatically moves to the head of the queue.

In addition, you receive eight Xilinx education credits you can use to further reduce design time.

Of course, our free services already give you access to a wealth of technical support, service packs, and software updates, but our Platinum Technical Service gives you top priority and is always available when you are: in North America between 7 a.m. and 5 p.m. Pacific Standard Time (PST), and in Europe between 9 a.m. and 5:30 p.m. Greenwich Mean Time (GMT). Xilinx Platinum Technical Service is a powerful tool to help you get your product to market before your competitors.

Move to the Head of the Class

Our Education Services are a flexible, comprehensive collection of courses and venues targeted at keeping you and your designers’ technical skills sharp, so you get increased design productivity and therefore reduced time to market.

We offer a broad range of classes at training centers worldwide, or we’ll bring the classroom to your place of business. We even have portable classrooms complete with computers, software training materials, and instructor set-up, so all you have to do is show up ready to learn.

If you’re looking to expand your skills with the ultimate in convenience and cost savings, check out Xilinx e-Learning. “Live” e-Learning is a virtual, real-time classroom which allows you to interact with our instructors and chat with other engineers over the Internet. Live e-Learning is also an excellent way to collaborate, particularly if your design team is spread out around the world, and we offer over 70 live e-Learning modules.

Recorded e-Learning allows you to log in from anywhere, listen to recorded education sessions, and learn at your own pace. The e-Learning modules can be accessed anytime, day or night, and are less expensive than the live e-Learning classes.

All of our Education Services provide excellent, hands-on training, suitable for all skill levels, from the novice to the expert. Classes are led by instructors who are themselves experienced designers. With our flexible and always available courses, you can choose the instructional method that keeps you on the technical cutting edge, without sacrificing your productivity.

Stack the Deck in Your Favor

Our Design Services give you the advantage of experienced engineers who are dedicated to your design. In the rush to market, it makes good business sense to take advantage of every resource you can, whether in-house or outside your company. Now you can outsource your design to an experienced Xilinx team, made up of best-in-class designers, silicon experts, and software specialists who become your virtual in-house design team.

You can immediately extend your project bandwidth and eliminate ramp-up time by using Xilinx Design Services. The benefit is that you’re free to focus on tasks of the highest priority to you, and you can feel confident that your programmable logic design is in the most capable hands.

“The Xilinx Design Services teams bring a lot to the table – unique expertise, experience with Xilinx tools and development, and the fruits of investments Xilinx continuously makes in R&D,” said Dave DeMarinis, the Business Development Manager for Global Services. “With our team, you pay for results, not for a consultant’s time.”

Go Ahead... Ask Us Anything

Our free support.xilinx.com website is recognized as the industry leader for online solutions to programmable logic design issues; it has received rave reviews from industry insiders.

At support.xilinx.com you’ll find:
• The Answers Database, with over 4,000 proven design solutions.
• Problem Solvers, to troubleshoot device configuration, software installation, and JTAG issues.
• A Web support interface, which allows you to open a Platinum priority case.
• Discussion forums through which you can interact with other designers.

As you might expect, support.xilinx.com is available seven days a week, 24 hours a day, 365 days a year, so you can troubleshoot your design when it’s convenient for you–anytime, anywhere, support.xilinx.com has your answers.

“When compared against the competition in customer surveys, Xilinx (support.xilinx.com) outscored every other PLD vendor’s website in content, accessibility, and ease of use,” wrote Gartner Group in their 1999 Corporate Customer Satisfaction Study.

Conclusion

Xilinx Global Services can increase your productivity, reduce your design time, and improve your return on investment when designing programmable logic solutions. The Xilinx Global Services portfolio of education, product, and design services, along with the support.xilinx.com online resources, will extend your technical capabilities and accelerate your time to market.

For more information, contact your Xilinx sales representative. You can find the Xilinx sales office nearest you at: http://www.xilinx.com/company/sales/offices.htm.
Services Education

e-Learning Makes the Grade

Get up to speed on the latest technologies quickly, conveniently, and cost effectively.

by Renne Ricciardi
Business Development Manager, Xilinx Educational Services
[email protected]

Despite enormous technical advances since the days of chalkboards and spiral notebooks, traditional instructor-led classroom training is still the best way to learn when you need an in-depth understanding of a specialized topic. In the day-to-day race to market, however, time is a luxury few designers can afford.

Online Learning, On the Spot

With Xilinx e-Learning, you can choose from more than 70 online classes or modules covering a broad range of topics and skills involving Xilinx products and services. For example:
• Introduction to FPGA Design.
• Timing Constraints.
• Spartan-II Architecture.
• ModelSim XE.
• Virtex-EM Architecture.

Each module is an hour in length, and enrollment is quick and easy. Modules are taught weekly and presented at different times throughout the day to support worldwide access. Moreover, Xilinx e-Learning won’t interfere with your project timeline, because there’s no lost productivity due to travel time.

We can help you determine which e-learning modules are right for you through our brief self-assessment pre-tests. These tests help you gauge your knowledge and determine what you need to know so you won’t be spending money on training you’ve already had.

Live e-Learning Environment

Live instructors present classes and modules in real time. During each session, you will have the opportunity to interact with the instructor, as well as collaborate with online subject experts. You may pose questions to the instructor, view slides, share whiteboards, and discuss issues with other students in chat rooms. Pop quizzes appear periodically throughout the session–and you get instant feedback.

Participating in an e-learning session is simple. You access the e-learning module using your Web browser and a phone connection. No additional software is required. Ten minutes before the class, you call into the conference, log on to the URL, download training documents, and you’re ready to go.

Xilinx e-Learning classes are open to everyone, and each e-learning module costs $100 per session.

Customized e-Learning

If you have multiple designers who want to take the same course at the same time, or if you prefer a date or time that is more convenient for you, simply call us and we will schedule an instructor to deliver a module of your choice for your group only. The only requirement is that you have a minimum of six people who want to attend the session. The maximum number of people is 100.

A private session gives you more control over the pace of delivery. Just ask the instructor to speed up or slow down. In addition, the questions and discussion can be focused on issues and ideas important to you.

If you have designers located in remote locations, or if you have designers who are interested in different modules, call us and ask about a Bundle Package Program. With a bundle purchase, your people can complete modules when it’s most convenient for them. Any of the designers can sign in to attend a session at any time throughout the year, or until you have used up all the modules you purchased.

Conclusion

Xilinx e-Learning is the most cost-effective solution to help you keep your technical skills sharp and up-to-date. To learn more about Xilinx e-Learning, visit the Xilinx e-Learning website at www.support.xilinx.com/support/education-home.htm or call the registrar at 877-959-2527.

Perspective Partnerships

Xilinx in the Community–Champions for Change

by Autumn Conrad
Public Relations, Xilinx, Inc.
[email protected]

As leaders in the global semiconductor market, we take pride in sharing our success with the communities in which we do business. Since the founding of our company in 1984, Xilinx has maintained close relationships with several community partners by establishing and supporting innovative programs in the areas of education, health, and welfare.

2000 Outstanding Corporate Grantmaker

This tradition of community involvement was recently recognized by the National Society of Fundraising Executives (NSFRE), which honored Xilinx as the “2000 Outstanding Corporate Grantmaker” of the year. With this award, Xilinx joins past recipients like Hewlett Packard, Aspect Communications, Applied Materials, Sun Microsystems, AMD, Therma, and Apple Computer as philanthropic leaders dedicated to building strong communities.

This prestigious award is given to corporations that demonstrate a commitment to the community through financial support and exceptional employee involvement. Xilinx was nominated by five community organizations and educational institutions that have maintained a close relationship with us over the past 16 years. It is through such partnerships that we can truly make a difference in our communities and the lives of many.

Walk for AIDS

The Santa Clara County Walk For AIDS nominated Xilinx for our active participation and support of the event over the past nine years. Once again, we hosted the annual kick-off rally and urged other corporations in Silicon Valley to get involved through increased corporate sponsorship and participation. In Santa Clara County, the Walk For AIDS serves as the largest fundraising event for nine organizations that provide services and prevention education to individuals living with HIV and AIDS. Agencies like Health-Connections, a community service of The Health Trust which also nominated us for the award, provide critical health and social services to individuals and their families affected by HIV/AIDS.

Children’s Shelter of Santa Clara County

Xilinx employees remain active year round by volunteering at organizations such as the Children’s Shelter of Santa Clara County. Our relationship with the Children’s Shelter started when we helped contribute to the $13.7 million building fund for the shelter, which is located adjacent to Xilinx headquarters in San Jose. In November of 1995, the Children’s Shelter opened its doors to provide shelter, 24-hour care, and services to abused and neglected children. Since that time, the Children’s Shelter has provided care to over 15,000 children.

Stock for Students

As part of our strong commitment to education, we support innovative, results-oriented programs that provide students with the skills necessary to succeed in tomorrow’s technological world. We recently started a stock-sharing program called “Stock for Students” which encourages employees to donate one share of their personal stock to Xilinx-adopted schools. Through funds raised by the program, Oster Elementary School will build the “Xilinx Science Laboratory” to provide students with hands-on learning in the life, earth, and physical sciences that would otherwise be unavailable.

Faculty Endowment Fund

In higher education, the Department of Electrical Engineering at San Jose State University nominated us for establishing a faculty endowment fund for the university. Due to the high cost of living in Silicon Valley, the university was having difficulty attracting and retaining faculty. The faculty endowment fund enables the university to augment the salaries for three qualified professors annually. In addition, we recently funded the establishment of the “Xilinx Digital Laboratory” which will help prepare students to meet the challenges they will face once they graduate.

Conclusion

Through our partnerships, we have come together to explore ways to build communities that reflect the success and generosity of our times. We are proud of our partnerships with these organizations and take honor in supporting their efforts to make our community a better place for all people to live, work, and prosper. It is only by working together that we can truly make a difference.

Reference Software

Software Solutions
Version 3 Development Systems
Quick Reference Guide

Xilinx development systems give you the speed you need. With the initial release of our version 3 solutions, Xilinx place and route times are as fast as two minutes for our 200,000 gate, XC2S200 Spartan™-II device, and 30 minutes for our one million gate, system-level XCV1000E Virtex™-E device. That makes Xilinx development systems the fastest in the industry.

And with the push of a button, our timing-driven tools are creating designs that support I/O speeds in excess of 800 Mbps, and internal clock frequencies in excess of 300 MHz. With each quarterly release, we are further accelerating your design process.

Xilinx desktop design solutions combine powerful technology with an easy to use interface to help you achieve the best possible designs within your project schedule, regardless of your experience level. For more information on any Xilinx products, visit www.xilinx.com

Alliance Series Solutions:
The Alliance Series Solutions contain powerful open systems implementation tools that are engineered to plug and play with your existing design flow. This combination of advanced features delivers high performance results on the toughest designs.

Foundation Series ISE Solutions:
Foundation Integrated Synthesis Environment (ISE) is the Xilinx next-generation design environment, optimized to deliver the benefits of an HDL methodology. Foundation ISE is packed with technologies that help you bring your product to market faster.

Foundation Series Solutions:
The Foundation Series solutions are complete, ready-to-use design environments for programmable logic design based on industry-standard schematic, HDL, and pushbutton design flows.

Base and Base Express Configurations
The Base and Base Express configurations provide push button design flows and support a broad array of FPGA and CPLD devices targeted for low density and high volume applications.

Standard and Express Configurations
The Standard and Express configurations combine push button flows with powerful auto-interactive tools. These tools give designers more influence and control over implementation while maintaining the benefits of design automation.

Elite Configurations
The Elite configurations are designed to support powerful design flows that deliver high-performance designs for even the highest density, multi-million gate FPGA devices from Xilinx.

Xilinx Web-based Design Solutions provide designers the ability to engage in digital design activities, on-line, using Xilinx application servers, or download design and implementation software modules for use in their own design environment. These applications include:

WebFITTER:
The WebFITTER is a free Web-based design tool that allows system designers to evaluate their designs using Xilinx XC9500 Series CPLDs.
WebFITTER URL: Go to the Xilinx website http://www.xilinx.com and jump to "WebFITTER"

WebPACK:
The WebPACK is a collection of four free downloadable software modules including ABEL v7.1, VHDL and Verilog synthesis, design implementation tools, and device programming software. WebPACK now includes support of the entire Spartan-II FPGA family as well as the 300,000 system gate Virtex XCV300E FPGA.
WebPACK URL: Go to the Xilinx website http://www.xilinx.com and jump to "WebPACK"


Version 3 Development Systems


Feature Comparison Guide
Design Entry Alliance Foundation Foundation ISE WebPACK
Schematic • • •
VHDL, Verilog HDL, ABEL HDL • • •
State Diagram Editor • •(1) •(1)
Floorplanner • • • •
CORE Generator • • • •
Timing Constraint • • • •
Modular Design (Optional) (Optional)
Design Synthesis Alliance Foundation Foundation ISE WebPACK
Xilinx Synthesis Technology (XST) • •
FPGA Express / Incremental Synthesis •(5) •
Design Verification Alliance Foundation Foundation ISE WebPACK
Timing Simulation • • • •
Gate Level Simulator • •(2) •
HDL Simulator •(1) •(1) •(1) •(1)
HDL Testbench Generator •(1) •(1)
Integrated Logic Analysis (ChipScope ILA) (Optional) (Optional) (Optional)
Static Timing Analysis • • • •
Design Implementation Alliance Foundation Foundation ISE WebPACK
Constraints Editor • • • •
CPLD ChipViewer • • • •
FPGA Editor • • • •
Error Navigation to Xilinx Web • •
Command Line Operation • • •
HTML Timing Reports • • • •
Data Book I/O Timing • • • •
Timing-Driven Place and Route • • • •
Multi-pass Place and Route • • •
Project Archiving • • • •
System Interfaces Alliance Foundation Foundation ISE WebPACK
EDIF In • • CPLD Only
PROM File Generator • • • •
JTAG Download Software • • • •
IBIS • • • •
STAMP • • • •
VHDL, Verilog Out • • • •
HDL Simulation Libraries • • • •
Environment Alliance Foundation Foundation ISE WebPACK
Operating System PC / UNIX PC PC PC

Device Comparison Guide


Elite | Standard/Express | Base/Base Express | WebPACK ISE
All Virtex-II Family | Virtex-II Family up to XC2V1000 | Virtex-II Family up to XC2V80 | Virtex XCV300E only
All Virtex-E Family | Virtex-E Family up to XCV1000E | Virtex-E XCV50E only | All Spartan-II Family
All Virtex Family | All Virtex Family | Virtex XCV50 only | All CoolRunner Series(4)
All Spartan Series | All Spartan Series | All Spartan Series | All XC9500 Series
All XC9500 Series | All XC9500 Series | All XC9500 Series |
All XC4000E/L/EX | All XC4000E/L | All XC4000E/L |
All XC4000XL/XLA | All XC4000XL/XLA/EX/XV(3) | XC4000XL/XLA up to XC4020 |
All XC3000(3) | All XC3000(3) | All XC3000(3) |
All XC5200(3) | All XC5200(3) | All XC5200(3) |
1. Evaluation functionality available through the Xilinx ALLSTAR program. For more information on the ALLSTAR program, go to www.xilinx.com.
2. Functional and timing simulation is performed using a HDL simulator in the ISE product.
3. XC3000, XC5200, and XC4000XV devices are not supported in the Foundation Series ISE configurations.
4. CoolRunner series is only available in WebFITTER and WebPACK at this time.
5. Foundation Base does not include a license for FPGA Express.

Reference Virtex

Virtex and XC4000X Series FPGAs


Each Virtex family has its own unique features to meet different application requirements. All devices have both distributed RAM and block RAM, and between four and eight DLLs for efficient clock management.

• The Virtex family, consisting of devices that range from 50K up to 1 million logic gates, supports 17 I/O standards, and offers 5V PCI compliance.
• The Virtex-E family offers the highest logic gate count available for any FPGA, ranging from 50K up to 3.2 million system gates, and supports 20 I/O standards including LVPECL, LVDS, and Bus LVDS differential signaling.
• The Virtex-EM Extended Memory family consists of two devices that have a high RAM-to-logic gate ratio that is targeted for specific applications such as gigabit per second network switches and high definition graphics.

The XC4000X Series is part of the broad spectrum of Xilinx “XL” products unveiled September, 1998. As a result, Xilinx offers the broadest choice of 3.3V and 2.5V devices available from a single supplier, with densities ranging from 800 to 500,000 system gates. With 12 family members ranging from 30,000 to 500,000 system gates, the devices feature patented SelectRAM memory, with a highly flexible arrangement of logic, single-port, or dual-port memory. Designed in an advanced 0.25 micron process, the XC4000X series delivers industry-leading performance while significantly reducing power consumption.

See www.xilinx.com for more information

FPGA Product Selection Matrix

Device | Logic Cells | Max. Logic Gates | Typical System Gate Range | Max. RAM Bits | CLB Matrix | CLBs | Flip-Flops | Max. I/O | Output Drive (mA) | PCI Compliant | 1.8 Volt | 2.5 Volt | 3.3 Volt | 5.0 Volt

XC4000 Series key features: Density Leadership / High Performance / SelectRAM Memory
XC4013XLA | 1368 | 13K | 10K-30K | 18K | 24x24 | 576 | 1536 | 192 | 12/24 | Y | – | – | – | X
XC4020XLA | 1862 | 20K | 13K-40K | 25K | 28x28 | 784 | 2016 | 224 | 12/24 | Y | – | – | – | X
XC4028XLA | 2432 | 28K | 18K-50K | 33K | 32x32 | 1024 | 2560 | 256 | 12/24 | Y | – | – | – | X
XC4036XLA | 3078 | 36K | 22K-65K | 42K | 36x36 | 1296 | 3168 | 288 | 12/24 | Y | – | – | – | X
XC4044XLA | 3800 | 44K | 27K-80K | 51K | 40x40 | 1600 | 3840 | 320 | 12/24 | Y | – | – | – | X
XC4052XLA | 4598 | 52K | 33K-100K | 62K | 44x44 | 1936 | 4576 | 352 | 12/24 | Y | – | – | X | *
XC4062XLA | 5472 | 62K | 40K-130K | 74K | 48x48 | 2304 | 5376 | 384 | 12/24 | Y | – | – | X | *
XC4085XLA | 7448 | 85K | 55K-180K | 100K | 56x56 | 3136 | 7168 | 448 | 12/24 | Y | – | – | X | *

Virtex Family key features: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs
XCV50 | 1728 | 21K | 34K-58K | 56K | 16x24 | 384 | 1536 | 180 | 2/24 | Y | – | – | X | *
XCV100 | 2700 | 32K | 72K-109K | 78K | 20x30 | 600 | 2400 | 180 | 2/24 | Y | – | – | X | *
XCV150 | 3888 | 47K | 93K-165K | 102K | 24x36 | 864 | 3456 | 260 | 2/24 | Y | – | X | I/O | *
XCV200 | 5292 | 64K | 146K-237K | 130K | 28x42 | 1176 | 4704 | 284 | 2/24 | Y | – | X | I/O | *
XCV300 | 6912 | 83K | 176K-323K | 160K | 32x48 | 1536 | 6144 | 316 | 2/24 | Y | – | X | I/O | *
XCV400 | 10800 | 130K | 282K-468K | 230K | 40x60 | 2400 | 9600 | 404 | 2/24 | Y | – | X | I/O | *
XCV600 | 15552 | 187K | 365K-661K | 312K | 48x72 | 3456 | 13824 | 512 | 2/24 | Y | – | X | I/O | *
XCV800 | 21168 | 254K | 511K-888K | 406K | 56x84 | 4704 | 18816 | 512 | 2/24 | Y | – | – | X | *
XCV1000 | 27648 | 332K | 622K-1,124K | 512K | 64x96 | 6144 | 24576 | 512 | 2/24 | Y | – | – | X | *

Virtex-E Family key features: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O+, 8 DLLs, LVDS, BLVDS, LVPECL
XCV50E | 1728 | 21K | 47K-72K | 88K | 16x24 | 384 | 1536 | 176 | 2/24 | Y | X | I/O | I/O | **
XCV100E | 2700 | 32K | 105K-128K | 118K | 20x30 | 600 | 2400 | 196 | 2/24 | Y | X | I/O | I/O | **
XCV200E | 5292 | 64K | 215K-306K | 186K | 28x42 | 1176 | 4704 | 284 | 2/24 | Y | X | I/O | I/O | **
XCV300E | 6912 | 83K | 254K-412K | 224K | 32x48 | 1536 | 6144 | 316 | 2/24 | Y | X | I/O | I/O | **
XCV400E | 10800 | 130K | 413K-570K | 310K | 40x60 | 2400 | 9600 | 404 | 2/24 | Y | X | I/O | I/O | **
XCV600E | 15552 | 187K | 679K-986K | 504K | 48x72 | 3456 | 13824 | 512 | 2/24 | Y | X | I/O | I/O | **
XCV1000E | 27648 | 332K | 1,146K-1,569K | 768K | 64x96 | 6144 | 24576 | 660 | 2/24 | Y | X | I/O | I/O | **
XCV1600E | 34992 | 420K | 1,628K-2,189K | 1062K | 72x108 | 7776 | 31104 | 724 | 2/24 | Y | X | I/O | I/O | **
XCV2000E | 43200 | 518K | 1,857K-2,542K | 1240K | 80x120 | 9600 | 38400 | 804 | 2/24 | Y | X | I/O | I/O | **
XCV2600E | 57132 | 686K | 2,221K-3,264K | 1530K | 92x138 | 12696 | 50784 | 804 | 2/24 | Y | X | I/O | I/O | **
XCV3200E | 73008 | 876K | 2,608K-4,074K | 1846K | 104x156 | 16224 | 64896 | 804 | 2/24 | Y | X | I/O | I/O | **

Virtex Extended Memory Capabilities
XCV405E | 10800 | 130K | 1,068K-1,307K | 710K | 40x60 | 2400 | 9600 | 404 | 2/24 | Y | X | I/O | I/O | **
XCV812E | 21168 | 254K | 2,569K-3,062K | 1414K | 56x84 | 4704 | 18816 | 556 | 2/24 | Y | X | I/O | I/O | **

* I/Os are 5V tolerant
** 5 Volt tolerant I/Os with external resistor
X = Core and I/O voltage
I/O = I/O voltage supported
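
The CLB Matrix, CLBs, Logic Cells, and Flip-Flops columns of the Virtex and Virtex-E rows are tied together by simple arithmetic, which gives a quick sanity check on any row. The Python sketch below is illustrative only: the 4.5 logic cells and 4 flip-flops per CLB are ratios read off the rows of the matrix above, not a formula quoted from Xilinx, and the names `VIRTEX_ROWS`, `clb_count`, and `check_row` are hypothetical.

```python
# Rows copied from the selection matrix:
# device -> (CLB matrix, CLBs, logic cells, flip-flops)
VIRTEX_ROWS = {
    "XCV50": ("16x24", 384, 1728, 1536),
    "XCV300": ("32x48", 1536, 6912, 6144),
    "XCV1000": ("64x96", 6144, 27648, 24576),
    "XCV3200E": ("104x156", 16224, 73008, 64896),
}


def clb_count(matrix):
    """Number of CLBs in a 'rows x cols' CLB matrix such as '16x24'."""
    rows, cols = (int(n) for n in matrix.split("x"))
    return rows * cols


def check_row(matrix, clbs, logic_cells, flip_flops):
    """True when a row agrees with the ratios observed in the table.

    Assumption: 4.5 logic cells and 4 flip-flops per CLB, as inferred
    from the table rows themselves.
    """
    n = clb_count(matrix)
    return n == clbs and logic_cells == n * 9 // 2 and flip_flops == n * 4


for device, row in VIRTEX_ROWS.items():
    print(device, clb_count(row[0]), check_row(*row))
```

The XC4000XLA rows follow different ratios, so this check applies only to the Virtex-family rows.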
Reference Spartan

Say hello to a new level of performance; the Spartan-II family now includes devices with over 200,000 system gates. You get 100,000 system gates for under $10, at speeds of 200 MHz and beyond, giving you design flexibility that’s hard to beat. These low-powered, 2.5V devices feature I/Os that operate at up to 3.3V with full 5V tolerance. Spartan-II devices also feature multiple Delay Locked Loops, on-chip RAM (block and distributed), and versatile I/O technology that supports over 16 high-performance interface standards. You get all this in an FPGA that offers unlimited programmability, and can even be upgraded in the field, remotely, over any network.

Robust Feature Set

• Flexible on-chip distributed and block memory.
• Four digital Delay Locked Loops for efficient chip-level/board-level clock management.
• Select I/O Technology for interfacing with all major bus standards such as HSTL, GTL, SSTL, and so on.
• Full PCI compliance.
• System speeds over 200 MHz.
• Power management.

Extensive Design Support

• Complete suite of design tools.
• Extensive core support.
• Compile designs in minutes.

Advantages Over ASICs

• No costly NRE charges.
• No time consuming vector generation needed.
• All devices are 100% tested by Xilinx.
• Field upgradeable (remotely upgradeable, using Xilinx Online technology).
• No lengthy prototype or production lead times.
• Priced aggressively against comparable ASICs.

For more information see www.xilinx.com/products/spartan2

FPGA Product Selection Matrix

Device | Logic Cells | Max. Logic Gates | Typical System Gate Range | Max. RAM Bits | CLB Matrix | CLBs | Flip-Flops | Max. I/O | Output Drive (mA) | PCI Compliant | 1.8 Volt | 2.5 Volt | 3.3 Volt | 5.0 Volt

Spartan Family key features: High Volume ASIC Replacement / High Performance / SelectRAM Memory
XCS05 | 238 | 3K | 2K-5K | 3K | 10x10 | 100 | 360 | 77 | 12 | Y | – | – | – | X
XCS10 | 466 | 5K | 3K-10K | 6K | 14x14 | 196 | 616 | 112 | 12 | Y | – | – | – | X
XCS20 | 950 | 10K | 7K-20K | 13K | 20x20 | 400 | 1120 | 160 | 12 | Y | – | – | – | X
XCS30 | 1368 | 13K | 10K-30K | 18K | 24x24 | 576 | 1536 | 192 | 12 | Y | – | – | – | X
XCS40 | 1862 | 20K | 13K-40K | 25K | 28x28 | 784 | 2016 | 205 | 12 | Y | – | – | – | X

Spartan-XL Family key features: High Volume ASIC Replacement / High Performance / SelectRAM Memory
XCS05XL | 238 | 3K | 2K-5K | 3K | 10x10 | 100 | 360 | 77 | 12/24 | Y | – | – | X | *
XCS10XL | 466 | 5K | 3K-10K | 6K | 14x14 | 196 | 616 | 112 | 12/24 | Y | – | – | X | *
XCS20XL | 950 | 10K | 7K-20K | 13K | 20x20 | 400 | 1120 | 160 | 12/24 | Y | – | – | X | *
XCS30XL | 1368 | 13K | 10K-30K | 18K | 24x24 | 576 | 1536 | 192 | 12/24 | Y | – | – | X | *
XCS40XL | 1862 | 20K | 13K-40K | 25K | 28x28 | 784 | 2016 | 224 | 12/24 | Y | – | – | X | *

Spartan-II Family key features: High Volume, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs
XC2S15 | 432 | 8K | 6K-15K | 22K | 8x12 | 96 | 384 | 86 | 2/24 | Y | – | X | I/O | *
XC2S30 | 972 | 17K | 13K-30K | 36K | 12x18 | 216 | 864 | 132 | 2/24 | Y | – | X | I/O | *
XC2S50 | 1728 | 30K | 23K-50K | 56K | 16x24 | 384 | 1536 | 176 | 2/24 | Y | – | X | I/O | *
XC2S100 | 2700 | 53K | 37K-100K | 78K | 20x30 | 600 | 2400 | 196 | 2/24 | Y | – | X | I/O | *
XC2S150 | 3888 | 77K | 52K-150K | 102K | 24x36 | 864 | 3456 | 260 | 2/24 | Y | – | X | I/O | *
XC2S200 | 5292 | 103K | 71K-200K | 130K | 28x42 | 1,176 | 4704 | 284 | 2/24 | Y | – | X | I/O | *

* I/Os are 5V tolerant
X = Core and I/O voltage
I/O = I/O voltage supported
Reference CPLDs

XC9500 and CoolRunner CPLDs


Whether performing high-speed networking or power-conscious portable designs, Xilinx CPLDs provide you with a complete range of value oriented products.

XC9500 – Offers industry-leading speeds, while giving you the flexibility of an enhanced customer-proven pin-locking architecture along with extensive IEEE Std. 1149.1 JTAG Boundary-Scan support.

CoolRunner – Offers the patented Fast Zero Power (FZP™) design technology, combining low power and high speed. These devices offer standby currents of less than 100 microamps, operating currents 50-67% lower than traditional CPLDs, and pin-to-pin speeds of 5.0 ns.

WebPOWERED Software Solutions – Offer you the flexibility to target Xilinx CPLD and FPGA products on-line or on the desktop, including:

• WebFITTER – an on-line device fitting and evaluation tool that accepts HDL, ABEL, or netlist files and provides all reports, simulation models, and programming files, along with price quotes. Available to support all Xilinx CPLD products.
• WebPACK ISE – downloadable desktop solutions that offer free CPLD and FPGA software modules for ABEL/HDL synthesis and simulation, device fitting, and JTAG programming.

Through leading performance, free internet-based WebPOWERED software, and the industry’s lowest power consumption, Xilinx has the right CPLD and FPGA for every designer’s need.

For more information about Xilinx CPLD products, see:
www.xilinx.com/xlnx/xil_product_landingpage.jsp

CPLD Product Selection Matrix

Key features by family:
XC9500XV: Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XC9500XL: Best Pin-Locking, JTAG w/Clamp, High Performance, High Endurance
XPLA3: Ultra Low Power, JTAG, Increased Logic Flexibility
XC9500: Best Pin-Locking, JTAG, High Endurance

Core Voltage | CPLD Family | Device | Macrocells | Max. I/O | Pin-to-Pin Delay (ns) | System Frequency (MHz) | Individual OE Ctrl | JTAG | Ultra Low-Power
2.5 Volt ISP | XC9500XV | XC9536XV | 36 | 36 | 3.5 | 278 | √ | √ | –
2.5 Volt ISP | XC9500XV | XC9572XV | 72 | 72 | 4 | 250 | √ | √ | –
2.5 Volt ISP | XC9500XV | XC95144XV | 144 | 117 | 4 | 250 | √ | √ | –
2.5 Volt ISP | XC9500XV | XC95288XV | 288 | 192 | 5 | 222 | √ | √ | –
3.3 Volt ISP | XC9500XL | XC9536XL | 36 | 36 | 5 | 222 | √ | √ | –
3.3 Volt ISP | XC9500XL | XC9572XL | 72 | 72 | 5 | 222 | √ | √ | –
3.3 Volt ISP | XC9500XL | XC95144XL | 144 | 117 | 5 | 222 | √ | √ | –
3.3 Volt ISP | XC9500XL | XC95288XL | 288 | 192 | 6 | 208 | √ | √ | –
3.3 Volt ISP | XPLA3 | XCR3032XL | 32 | 36 | 5 | 175 | – | √ | √
3.3 Volt ISP | XPLA3 | XCR3064XL | 64 | 68 | 6 | 145 | – | √ | √
3.3 Volt ISP | XPLA3 | XCR3072XL | 512 | TBD | TBD | TBD | – | √ | √
3.3 Volt ISP | XPLA3 | XCR3128XL | 128 | 108 | 6 | 145 | – | √ | √
3.3 Volt ISP | XPLA3 | XCR3256XL | 256 | 164 | 7.5 | 140 | – | √ | √
3.3 Volt ISP | XPLA3 | XCR3384XL | 384 | 220 | 7.5 | 127 | – | √ | √
5 Volt ISP | XC9500 | XC9536 | 36 | 34 | 5 | 100 | √ | √ | –
5 Volt ISP | XC9500 | XC9572 | 72 | 72 | 7.5 | 83.3 | √ | √ | –
5 Volt ISP | XC9500 | XC95108 | 108 | 108 | 7.5 | 83.3 | √ | √ | –
5 Volt ISP | XC9500 | XC95144 | 144 | 133 | 7.5 | 83.3 | √ | √ | –
5 Volt ISP | XC9500 | XC95216 | 216 | 166 | 10 | 66.7 | √ | √ | –
5 Volt ISP | XC9500 | XC95288 | 288 | 192 | 10 | 66.7 | √ | √ | –

Reference PROMs

XC18V FPGA Configurations

XC17V

XC17S

Xilinx offers a full range of configuration memory devices optimized for use with Xilinx FPGAs. Our PROM product lines are designed to meet the same stringent demands as our high-performance FPGAs, taking full advantage of the same advanced processing technologies. In addition, they were developed in close cooperation with Xilinx FPGA designers for optimal performance and reliability.

XC18V00 – Our in-system reprogrammable family provides a feature-rich, fast configuration solution available today, and provides a cost-effective method for reprogramming and storing large Xilinx FPGA bitstreams. This family is JTAG ready and Boundary-Scan enabled for exceptional ease-of-use, system integration, and flexibility.

XC17V00/XC17S00 – Our low-cost XC17V and XC17S families are an ideal configuration solution for cost-sensitive applications. XC17V PROMs are pin-compatible with our XC18V family to allow for a cost-reduction migration path as your production volumes increase. The XC17S family is specially designed to provide a low-cost, integrated solution for our Spartan families of FPGAs.

Configuration PROMs for Virtex-E/Virtex-EM

Device | Configuration Bits | XC17xx Solution | XC18Vxx Solution | 8-pin TSOP | 20-pin PLCC | 20-pin SOIC | 44-pin PLCC | 44-pin VQFP
XCV50E | 630,048 | 17V01 | 18V01 | X* | X | X | – | X**
XCV100E | 863,840 | 17V01 | 18V01 | X* | X | X | – | X**
XCV200E | 1,442,106 | 17V02 | 18V02 | X* | X* | X* | X** | X**
XCV300E | 1,875,648 | 17V02 | 18V02 | – | X* | – | X | X
XCV400E | 2,693,440 | 17V04 | 18V04 | – | X* | – | X | X
XCV405E | 3,430,400 | 17V04 | 18V04 | – | X* | – | X | X
XCV600E | 3,961,632 | 17V04 | 18V04 | – | X* | – | X | X
XCV812E | 6,519,648 | 17V08 | 2 of 18V04 | – | – | – | X | X
XCV1000E | 6,587,520 | 17V08 | 2 of 18V04 | – | – | – | X | X
XCV1600E | 8,308,992 | 17V08 | 2 of 18V04 | – | – | – | X | X
XCV2000E | 10,159,648 | 17V16 | 2 of 18V04 | – | – | – | X | X
XCV2600E | 12,922,336 | 17V16 | 3 of 18V04 + 18V512 | – | X*** | – | X** | X
XCV3200E | 16,283,712 | 17V16 | 4 of 18V04 | – | – | – | X | X
* Available in XC17Vxx only.
** Available in XC18Vxx only.
*** Available in XC18V512 only.

Configuration PROMs for Virtex

Device | Configuration Bits | XC17xx Solution | XC18Vxx Solution | 8-pin TSOP | 20-pin PLCC | 20-pin SOIC | 44-pin PLCC | 44-pin VQFP
XCV50 | 559,200 | 17V01 | 18V01 | X* | X | X | – | X**
XCV100 | 781,216 | 17V01 | 18V01 | X* | X | X | – | X**
XCV150 | 1,041,096 | 17V01 | 18V01 | X* | X | X | – | X**
XCV200 | 1,335,840 | 17V01 | 18V02 | X* | X* | X* | X** | X**
XCV300 | 1,751,808 | 17V02 | 18V02 | – | X* | – | X | X
XCV400 | 2,546,048 | 17V04 | 18V04 | – | X* | – | X | X
XCV600 | 3,607,968 | 17V04 | 18V04 | – | X* | – | X | X
XCV800 | 4,715,616 | 17V08 | 18V04 + 18V512 | – | X*** | – | X** | X
XCV1000 | 6,127,744 | 17V08 | 18V04 + 18V02 | – | – | – | X | X
* Available in XC17Vxx only.
** Available in XC18Vxx only.
*** Available in XC18V512 only.

Configuration PROMs for Spartan-XL/Spartan-II

Device | PROM Solution | 8-pin PDIP | 8-pin VOIC | 20-pin SOIC | 44-pin VQFP
XCS05XL | XC17S05XL | X | X | – | –
XCS10XL | XC17S10XL | X | X | – | –
XCS20XL | XC17S20XL | X | X | – | –
XCS30XL | XC17S30XL | X | X | – | –
XCS40XL | XC17S40XL | X | X | X | –
XC2S15 | XC17S15A | X | X | X | –
XC2S30 | XC17S30A | X | X | X | –
XC2S50 | XC17S50A | X | X | X | –
XC2S100 | XC17S100A | X | X | X | –
XC2S150 | XC17S150A | X | X | X | –
XC2S200 | XC17S200A | X | X | – | X
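
The XC18Vxx Solution column in the Virtex and Virtex-E tables combines parts when a bitstream exceeds one 4 Mbit PROM. The Python sketch below shows one way such a combination could be worked out from the Configuration Bits column. It is a rough illustration under stated assumptions: the capacities are nominal values inferred from the part-number suffixes (the QPro listing gives the 18V04 as 4Mb), and the name `plan_xc18v` and its fill-largest-first rule are hypothetical, not a Xilinx selection procedure.

```python
# Nominal XC18V capacities in bits, assumed from the part-number suffixes
# (512 Kbit, 1 Mbit, 2 Mbit, 4 Mbit). Check the data sheet before relying on them.
XC18V_BITS = {
    "18V512": 512 * 1024,
    "18V01": 1 * 1024 * 1024,
    "18V02": 2 * 1024 * 1024,
    "18V04": 4 * 1024 * 1024,
}


def plan_xc18v(config_bits):
    """Return a list of XC18V parts whose combined capacity holds config_bits.

    Strategy (an assumption for illustration): fill as many 18V04 parts as
    possible, then add the smallest single part that covers the remainder.
    """
    if config_bits <= 0:
        raise ValueError("config_bits must be positive")
    full, remainder = divmod(config_bits, XC18V_BITS["18V04"])
    parts = ["18V04"] * full
    if remainder:
        for name, bits in sorted(XC18V_BITS.items(), key=lambda kv: kv[1]):
            if bits >= remainder:
                parts.append(name)
                break
    return parts


# Configuration bit counts taken from the tables above.
print(plan_xc18v(4_715_616))  # XCV800
print(plan_xc18v(6_127_744))  # XCV1000
```

For the XCV800 and XCV1000 bit counts this reproduces the combinations listed in the Virtex table (18V04 + 18V512 and 18V04 + 18V02).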

Reference QPro

QML-Certified FPGAs and PROMs


The Xilinx QPro family of Radiation Hardened FPGAs and PROMs is finding homes in many new satellite and space applications. Both the XQR4000XL and XQVR Virtex products are being designed into space systems that will utilize reconfigurable technology. Numerous communications and GPS satellites, space probe, and shuttle missions are included on the growing list of programs that will be flying these devices.

The Virtex QPro family of High Reliability products is experiencing a high degree of success in the defense market. As designers find it more and more difficult to find components suitable for the harsh environments seen by defense systems, they are discovering that they can incorporate the functions of obsolete parts into Virtex QPro products. This has the added long term advantage of significantly reducing the costs of future re-qualifications, because their systems can retain consistent form, fit, and function through the use of Virtex QPro FPGAs. This cannot be achieved with costly and inflexible ASICs or custom logic.

Please visit http://www.xilinx.com/products/hirel_qml.htm for all the latest information about these products, including some new application notes.

FPGA Product Selection Matrix

Device | Logic Cells | Max. Logic Gates | Typical System Gate Range | Max. RAM Bits | CLB Matrix | CLBs | Flip-Flops | Max. I/O | Output Drive (mA) | PCI Compliant | 1.8 Volt | 2.5 Volt | 3 Volt | 5 Volt

XC4000 Series key features: Density Leadership / High Performance / SelectRAM Memory
**XQR/XQ4013XL | 1,368 | 13K | 10K-30K | 18K | 24x24 | 576 | 1,536 | 192 | 12/24 | Y | – | – | X | *
**XQR/XQ4036XL | 3,078 | 36K | 22K-65K | 42K | 36x36 | 1,296 | 3,168 | 288 | 12/24 | Y | – | – | X | *
**XQR/XQ4062XL | 5,472 | 62K | 40K-130K | 74K | 48x48 | 2,304 | 5,376 | 384 | 12/24 | Y | – | – | X | *
XQ4085XL | 7,448 | 85K | 55K-180K | 100K | 56x56 | 3,136 | 7,168 | 448 | 12/24 | Y | – | – | X | *

Virtex Family key features: Density/Performance Leadership, BlockRAM, Distributed RAM, SelectI/O, 4 DLLs
XQV100 | 2,700 | 32K | 72K-109K | 78K | 20x30 | 600 | 2,400 | 180 | 2/24 | Y | – | X | I/O | *
**XQVR/XQV300 | 6,912 | 83K | 176K-323K | 160K | 32x48 | 1,536 | 6,144 | 316 | 2/24 | Y | – | X | I/O | *
**XQVR/XQV600 | 15,552 | 187K | 365K-661K | 312K | 48x72 | 3,456 | 13,824 | 512 | 2/24 | Y | – | X | I/O | *
**XQVR/XQV1000 | 27,648 | 332K | 622K-1,124K | 512K | 64x96 | 6,144 | 24,576 | 512 | 2/24 | Y | – | X | I/O | *

* I/Os are 5V tolerant
** XQR and XQVR devices are Radiation Hardened
X = Core and I/O voltage
I/O = I/O voltage supported

(1) Selected XQ4000E/EX devices also available

QPRO QML-certified PROMs


Package
Device Density DD8 SO20 CC44 VQ44
XC1736D 36Kb X
XC1765D 64Kb X
XC17128D 128Kb X
XC17256D 256Kb X
XQR/XQ1701L* 1Mb X X
XQR/XQ18V04* 4Mb X X**
* XQR devices are Radiation Hardened.
** XQ devices only.

Reference IP

Xilinx Intellectual Property Solutions


The Most Comprehensive and Highest Quality Solution in the PLD Industry

The Xilinx Intellectual Property Solutions Division offers the best selection of Intellectual Property solutions for a wide variety of industries and applications. Xilinx Smart-IP Technology delivers high performance, flexibility, and predictability, with optimized cores that give you both reduced cost and faster time to market.

LogiCORE™ Products – Licensed and supported by Xilinx, LogiCORE products such as parameterizable DSP building blocks and memory cores are included with the Xilinx CORE Generator software, which is a component of your Xilinx Foundation Series or Alliance Series software.

AllianceCORE™ Products – A cooperative program with third-party IP suppliers who sell and support their cores directly with Xilinx customers. AllianceCORE products must meet criteria that ensure they deliver value and performance in a Xilinx device.

Reference Design Alliance Program – Xilinx proactively supports development of third-party system-level Reference Designs to provide fully functional, modular designs that offer considerable development time savings.

XPERTS Program – The worldwide XPERTS Program provides over 70 consultants certified in delivering turnkey system designs for the Xilinx architecture, including PCI designs, new design methodologies, system-level design, along with IP customization and integration.

IP Delivery Tools – The Xilinx CORE Generator™ enables cataloging and generation of parameterized cores that are high performance, predictable, and integrated with our system-level design reuse tools; the cores are provided in VHDL and Verilog behavioral description languages.

The IP Center Internet portal provides access to the latest LogiCORE and AllianceCORE products and reference designs via Smart Search; you can easily find the IP that you need at www.xilinx.com/ipcenter. Advanced function cores are available for IP evaluation and can be purchased via the IP Center.

Design Reuse – Download the "Reuse Field Guide Methodology for FPGA and ASIC Designs." Then use the Xilinx IP Capture Tool to package your IP with simulation models, testbenches, and PDF or HTML files. Then, you can catalog and share your IP using the CORE Generator.

The REAL PCI 64/66 – Parameterizable PCI cores, reference designs, prototyping boards, education, and Xilinx PCI XPERTS combined with a proven design and guaranteed timing make Xilinx PCI the lowest risk solution in the market.

The Xilinx DSP Solution – Our exclusive FPGA partnership with MathWorks enables you to create complex, high performance DSP designs in a familiar environment with huge time to market advantages. Xilinx and its partners offer a complete set of cores for high-performance low-cost DSP implementation that provide:

• Xtreme Flexibility – Distributed DSP resources (such as look up tables, registers, multipliers, memory) and segmented routing allow optimized implementation of algorithms. Plus you get all the traditional FPGA benefits:
- RAM-based FPGA technology, for fast and easy design changes
- Fast time to market, to give you a competitive advantage
- Field upgradeable systems (using IRL™), for extended product lifecycle
• Xtreme Productivity – The industry's first System Generator for Simulink® bridges the gap between FPGA and conventional DSP design flows, and features:
- Unique constraint-driven Filter Generator, for performance/cost optimization
- Power estimator tool (Xpower™), for very low-power DSP implementations
- Eleven optimized DSP algorithms (cores) that cut development time by weeks
- New DSP features added to the ChipScope ILA tool, which rapidly reduce hardware debugging time
• Xtreme Performance – Table 1 illustrates the amazing performance you can achieve with Xilinx DSP.

Table 1 - Extreme Performance

Function | Industry's Fastest DSP Processor Core | Xilinx Virtex-E -08 | Xilinx
MACs per second (multiply and accumulate, 8 x 8-bit) | 4.4 Billion | 128 Billion | 600 Billion
FIR Filter (256-tap, linear phase, 16-bit data/coefficients) | 17 MSPS @ 1.1 GHz | 160 MSPS @ 160 MHz | 180 MSPS @ 180 MHz
FFT (1024 point, complex data, 16-bit real and imaginary comp.) | 7.7 µs @ 800 MHz | 41 µs @ 100 MHz | <1 µs @ 140 MHz
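
The FIR filter figures in Table 1 can be related to a multiply-accumulate rate with one line of arithmetic: a direct-form FIR performs one MAC per tap for every output sample. The Python sketch below works that out for the 17 MSPS and 160 MSPS entries. It is a back-of-envelope illustration only; the function name `fir_mac_rate` is hypothetical, and it ignores the saving a linear-phase (symmetric) filter can get by sharing one multiplier between each pair of equal coefficients.

```python
def fir_mac_rate(taps, sample_rate_hz):
    """MACs per second for a direct-form FIR: one MAC per tap per output sample."""
    return taps * sample_rate_hz


TAPS = 256  # filter length used in Table 1

# Sample rates (in MSPS) taken from the FIR Filter row of Table 1.
for label, msps in [("17 MSPS entry", 17), ("160 MSPS entry", 160)]:
    rate = fir_mac_rate(TAPS, msps * 1e6)
    print(f"{label}: {rate / 1e9:.2f} billion MACs per second")
```

At 17 MSPS this comes to about 4.35 billion MACs per second, in line with the 4.4 Billion figure quoted for the DSP processor core. At 160 MSPS the same filter needs about 41 billion MACs per second.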

Introducing the Xilinx 3.2i software release . . . the fastest in the industry
Watch your designs go supersonic. With the new Xilinx 3.2i release, you can
place and route your next 100,000-gate design using a Spartan® FPGA in just
one minute, or your next one-million-gate design using
a Virtex™-E FPGA in only thirty minutes.

Faster tools improve your time-to-market advantage


With Xilinx’s ultra-fast software you can beat your
competition to market every time. When comparing
the time it takes to complete your design, Xilinx place and route finishes up
to 8 times faster for small designs, and up to 12 times faster for the most
complex, high density designs.

See for yourself


Visit www.xilinx.com/3_2i.htm today and see how fast we can make your
design. At Xilinx, we give you all the speed you need.

The Programmable Logic CompanySM


www.xilinx.com

PN: 0010526-Q400 © 2000, Xilinx, Inc. All rights reserved. The Xilinx name, Xilinx logo and Spartan are registered trademarks, Virtex, and all XC and XS designated products are trademarks,
and The Programmable Logic Company is a service mark of Xilinx, Inc. All other trademarks and registered trademarks are the property of their respective owners.
