
Challenges of FPGA-based DSP design
By Eric Cigan

In this article, Eric gives an overview of the benefits of using FPGAs in DSP design and concludes with a list of recommended design rules.

Not long ago, designers of high-performance digital signal processing (DSP) systems had two alternatives for implementation: general-purpose DSPs or ASICs. General-purpose DSPs, such as those from TI, Agere, Motorola, and Analog Devices, are special-purpose microprocessors optimized for common DSP operations. The benefit of general-purpose DSPs is that they are the fastest way to get an algorithm running because they offer a comprehensive development environment, with tools for code analysis, debugging, and rapid prototyping. The disadvantage of DSPs is that ultimately they execute instructions serially, setting an upper limit on the chip’s throughput.

ASICs offer the ability to break through these performance limitations. Custom ASIC design lets the designer employ the optimal mix of resources on a chip and place them in close physical proximity to minimize delays. Moreover, ASICs are ideal for use in portable electronics since the flexibility of ASIC design allows the use of processes and architectures optimized for lower power consumption. The drawbacks of ASICs are considerable:

■ They need to be fabricated.
■ They require more time to design.
■ They require more complex and expensive development tools.
■ A single design flaw could lead to a design respin, causing additional cost and delay.

In the past, given these two choices, most designers avoided ASICs unless absolutely necessary.

Two trends have recently changed the landscape. First, the demand for high-performance DSPs has increased dramatically, due in part to the growth in multimedia and communications systems. Products as diverse as 3G wireless base stations, medical diagnostic imaging equipment – even driver-assist systems that will automatically park a car – would be inconceivable without the use of advanced DSP algorithms. The throughput requirements of these systems have strained the abilities of general-purpose DSPs. For example, one leading manufacturer of advanced echo cancellation systems incorporated more than 25 general-purpose DSPs on a single board to meet its performance goals.

Second, a new generation of programmable chips has emerged as an alternative to standard DSPs. Platform FPGAs such as Altera’s Stratix II and Xilinx’s Virtex II incorporate arrays of dedicated multipliers, embedded memory, and high-speed I/O that make them ideal for DSP applications. The silicon resources of an FPGA lead to staggering performance gains: while the fastest general-purpose DSPs can deliver up to 5 billion MAC/s (multiply-accumulate operations per second), leading FPGA devices can deliver more than 500 billion MAC/s, which is more than 100x faster. What’s more, channelized applications such as those common to wireless communications naturally lend themselves to parallel implementations in FPGAs. Growth rates in processing speed requirements versus capabilities are shown in Figure 1 [1]. A comparison of general-purpose DSPs versus FPGAs is shown in Table 1 [2].

Figure 1

Tips for DSP design
Let’s now take a close look at how designers navigate the challenges of using FPGAs in design to avoid prolonged design cycles or reduce the component cost for end products. It comes down to the following basic rules.
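As a rough check on the throughput figures quoted in the introduction, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the function names, the 300 MHz fabric clock, and the 256-tap filter are assumptions chosen for illustration, and the bounds ignore memory bandwidth, I/O, and control overhead.

```python
# Peak figures quoted in the article (multiply-accumulates per second).
DSP_PEAK_MACS = 5e9
FPGA_PEAK_MACS = 500e9


def mac_rate(clock_hz, parallel_macs):
    """Peak MAC/s when `parallel_macs` units each finish one MAC per clock."""
    return clock_hz * parallel_macs


def fir_sample_rate(macs_per_second, taps):
    """Ideal upper bound on FIR output rate: one MAC per tap per sample."""
    return macs_per_second / taps


# Ratio behind the "more than 100x faster" claim.
speedup = FPGA_PEAK_MACS / DSP_PEAK_MACS

# Assumption: a 300 MHz fabric clock. Number of MAC units that would have
# to run in parallel to reach the quoted FPGA peak at that clock.
parallel_units_needed = FPGA_PEAK_MACS / 300e6

# Hypothetical 256-tap filter on the serial processor: ideal sample rate.
dsp_fir_bound = fir_sample_rate(DSP_PEAK_MACS, taps=256)

print(f"speedup: {speedup:.0f}x")
print(f"parallel MAC units at 300 MHz: {parallel_units_needed:.0f}")
print(f"ideal 256-tap FIR rate on the DSP: {dsp_fir_bound / 1e6:.1f} Msps")
```

The point of the model is that the FPGA advantage comes from the number of multipliers working at once rather than from clock speed, which is why channelized workloads map so well onto it.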

Reprinted from Embedded Computing Design / Spring 2004


Table 1

Function                                  Industry-leading general-purpose DSP processor core    Industry-leading platform FPGA
8x8 multiply-accumulate (MAC)             4.8 billion MACps (fclk = 600 MHz)                      1 trillion MACps (fclk = 300 MHz)
FIR filter (256 taps, linear phase,       9.3 Msps (fclk = 600 MHz)                               300 Msps (fclk = 300 MHz)
  16-bit data/coefficients)
Complex FFT (1024 point, 16-bit data)     10 µs (fclk = 600 MHz)                                  1 µs (fclk = 150 MHz)
Viterbi decoding throughput               500 channels at 7.95 Kbps, for a total of 3.9 Mbps      155 Mbps (OC-3 rates)
Reed-Solomon decoding throughput          4.1 Mbps (fclk = 600 MHz)                               10 Gbps (OC-192 rates, fclk = 85 MHz)
Turbo convolutional decoder throughput    Six 2 Mbps data streams (6 iterations)                  5.4 Mbps (6 iterations)

Rule #1
Start at the beginning.
Complex DSP designs start with an algorithm developer who creates the initial design based on existing designs and experience. According to the DSP market research firm Forward Concepts, the leading tool for algorithm design is MATLAB from MathWorks. Using the MATLAB language, algorithm developers can create designs in a natural and productive form and may tap into an immense wealth of designs, scripts, and engineering know-how available only in the MATLAB language. Though designers can choose from other options, including block-level environments such as Simulink or SPW, or languages based on C/C++, these environments are less widely used and there may not be as many designs available for them. Moreover, many constructs used in DSP designs – such as looping, repeated structures, and 2- or 3-dimensional data arrays – are much easier to represent in MATLAB than in block-level environments. Once the algorithm is created in MATLAB, it can be readily shared or partitioned across a design team and reused over time.

Rule #2
Avoid recopying your work (or alternatively, “Don’t get lost in translation”).
Once the algorithm is available, the rest of the design team, including hardware designers, software developers, and system designers who integrate the design components, swings into motion. In the past, the completed algorithm in MATLAB became the executable specification, which meant that the hardware designers and software developers needed to recreate the design.

Many embedded systems developers are accustomed to implementing DSP algorithms on general-purpose DSPs in C or assembly language. This puts hardware and software engineers into the role of translating designs from one language to another, creating many opportunities for inserting errors, with the attendant debugging process. To avoid this process altogether, companies are looking to architectural synthesis tools that use the MATLAB M-file as the golden source for downstream design, automatically synthesizing the design at the Register Transfer Level (RTL). Coupled with traditional RTL synthesis tools that can synthesize RTL to gate-level implementation, this establishes an unbroken design flow from algorithmic creation to hardware implementation. The top-down design process is shown in Figure 2.

Figure 2

Rule #3
Always check your work – use a verification flow that is complete.
In any design process, it’s essential to be able to verify that the design meets the higher-level specifications. If the design starts as a floating-point algorithm in MATLAB – also called an M-file – the fixed-point M-file should behave within an acceptable range of the floating-point M-file. The RTL implementation should then conform precisely to the fixed-point M-file – in other words, the RTL should be bit-true to the fixed-point M-file.

But how does the engineer determine that the design matches all the way through? The proven method for verifying designs in top-down flows is a testbench that feeds the design under test with its input stimulus and monitors whether its outputs match the expected results. Designers should create testbenches that allow this methodology to be employed using HDL simulators, but better yet, they should seek out tools that automatically generate an RTL testbench from the original design and use an HDL simulator to simulate and functionally verify the design, including bit-true operation.

Rule #4
Don’t reinvent the wheel.
DSP systems incorporate a number of building blocks that are common to most designs – FIR and IIR filters, fast Fourier transforms and discrete cosine transforms, channel coding, etc. Developing these functions from scratch comes at great expense; in fact, according to Berkeley Design Technology, Inc., developing a new FFT in silicon can consume up to six months. Designers need to adopt tools and techniques that provide access to a large and growing variety of intellectual property (IP) geared towards DSP design. While there are many sources for IP in hard form (laid out on target silicon) or in soft form (delivered in synthesizable RTL), typically there is no corresponding simulation model available in MATLAB. This breaks the verification flow, making it difficult for the algorithm developer and the hardware designer to validate that the algorithm is faithfully represented in silicon.
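The three-level check described in Rule #3 (floating-point reference, fixed-point model within tolerance, hardware bit-true to the fixed-point model) can be sketched as follows. This is a minimal illustration in Python rather than MATLAB and HDL; every function name, the 3-tap filter, and the 12 fractional bits are assumptions made for the example, and `fir_rtl_like` is only a software stand-in for a register-level implementation.

```python
import math


def fir_float(x, h):
    """Floating-point reference: direct-form FIR, y[n] = sum_k h[k] * x[n-k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def to_fixed(value, frac_bits):
    """Quantize a real value to an integer with `frac_bits` fractional bits."""
    return int(round(value * (1 << frac_bits)))


def fir_fixed(xq, hq):
    """Fixed-point model: same direct form, integer arithmetic only."""
    return [sum(hq[k] * xq[n - k] for k in range(len(hq)) if n - k >= 0)
            for n in range(len(xq))]


def fir_rtl_like(xq, hq):
    """Hardware stand-in: transposed-form FIR with an explicit register chain."""
    regs = [0] * len(hq)
    out = []
    for sample in xq:
        # Each register loads its tap product plus the next register's old value.
        regs = [hq[k] * sample + (regs[k + 1] if k + 1 < len(hq) else 0)
                for k in range(len(hq))]
        out.append(regs[0])
    return out


FRAC_BITS = 12                                  # assumed word-length choice
x = [math.sin(0.3 * n) for n in range(32)]      # test stimulus
h = [0.25, 0.5, 0.25]                           # hypothetical 3-tap filter

xq = [to_fixed(v, FRAC_BITS) for v in x]
hq = [to_fixed(v, FRAC_BITS) for v in h]

y_ref = fir_float(x, h)
y_fix = fir_fixed(xq, hq)
y_hw = fir_rtl_like(xq, hq)

# Products carry 2 * FRAC_BITS fractional bits.
scale = 1 << (2 * FRAC_BITS)
max_err = max(abs(a - b / scale) for a, b in zip(y_ref, y_fix))

within_tolerance = max_err < 2.0 ** -FRAC_BITS  # fixed vs. floating point
bit_true = (y_hw == y_fix)                      # hardware vs. fixed point

print(f"max fixed-point error: {max_err:.2e}, within tolerance: {within_tolerance}")
print(f"hardware model bit-true to fixed-point model: {bit_true}")
```

The two comparisons are deliberately different in kind: the fixed-point model is judged against a tolerance, while the hardware model must match the fixed-point model exactly. An IP block without a matching simulation model, as described in Rule #4, leaves nothing to run the exact comparison against.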



Rule #5
Use your budget wisely.
DSP algorithms are almost always developed using floating-point arithmetic, giving the algorithm developer the ability to evaluate a design in its best-case scenario. While general-purpose DSPs are typically designed to perform 16- or 32-bit arithmetic, implementing algorithms in FPGAs or ASICs gives the designer the ability to independently control the number of bits used to represent each number in the algorithm. Using too many bits can be costly – a 40% increase in the number of bits in a multiplier can double its area in silicon – but using too few can lead to overflows or instabilities. When choosing tools for implementing DSP algorithms in silicon, designers should evaluate tools that help automate this floating-point to fixed-point conversion process.

Figure 3
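Both sides of the bit budget in Rule #5 can be illustrated with a short Python sketch. It rests on two assumptions that are not from the article: multiplier area is modeled as proportional to the product of the operand widths, and out-of-range values saturate rather than wrap. The function names and the 8-bit word are hypothetical.

```python
def multiplier_area_ratio(bit_growth):
    """Relative area of an N x N multiplier when both operands grow by
    the fraction `bit_growth`, assuming area proportional to width * width."""
    return (1.0 + bit_growth) ** 2


def quantize(value, word_bits, frac_bits):
    """Round `value` to signed fixed point with saturation.

    Returns (quantized_value, overflowed)."""
    scale = 1 << frac_bits
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    raw = round(value * scale)
    clipped = max(lowest, min(highest, raw))
    return clipped / scale, clipped != raw


# Too many bits: 40% wider operands cost close to twice the area.
area = multiplier_area_ratio(0.40)
print(f"area ratio for 40% more bits: {area:.2f}")

# Too few bits: the same 8-bit word, split two different ways.
for frac_bits in (6, 4):
    q, overflowed = quantize(3.2, word_bits=8, frac_bits=frac_bits)
    print(f"frac_bits={frac_bits}: 3.2 -> {q}, overflow={overflowed}")
```

Under the quadratic area assumption, 1.4 squared is about 1.96, which is consistent with the article's "can double its area". The second half shows why the split between integer and fractional bits matters as much as the total: with 6 fractional bits the value 3.2 overflows an 8-bit word, while with 4 fractional bits it fits at the cost of coarser resolution.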

Rule #6
Keep your options open with a vendor-independent, technology-independent flow.
As designers, we are all increasingly under pressure to cut costs, which often leads to having to select the supplier who can provide the lowest price, the best availability, etc. The tools available in the market fall into two categories.

Vendor-supplied tools are available from companies offering FPGA devices for DSP and provide integrated environments spanning graphical design entry, IP block libraries, and RTL simulation and synthesis tools. But these tools offer libraries of DSP functions that can only target a single vendor’s devices. Converting a design from one vendor’s tools to another is at best a time-consuming and error-prone process, leaving the designer at the mercy of the vendor in terms of cost and availability.

Vendor-independent tools provide a more flexible alternative. Once the design is captured, it can easily be retargeted from one device family to another from the same vendor, and can even be retargeted to an entirely different family of FPGAs. Yet another advantage of vendor-independent flows involves the need to retarget the same design to different silicon technologies. Companies find that they can meet their need for first silicon using FPGAs, and then incorporate structured ASICs or ASICs as they become available from product lines. While vendor-supplied tools provide IP that is only available for FPGA devices, vendor-independent tools allow designers to retarget the designs without changing the golden design source.

Rule #7
Given the time, you can always make a design better – use design exploration.
Almost invariably, getting the functionality of the design correct is just the beginning – then begins the pursuit of improving performance to make specs and trying to shrink to a smaller device or go to a slower speed grade to cut costs. Hardware engineers have an arsenal of tools and tricks at their disposal, but working at gate level – even at RTL level – has its limits. Inserting intermediate registers can be difficult. Optimizing quantization throughout a design is particularly tedious and error-prone when changing at the RTL level. And, if the algorithm developer comes up with a brilliant new idea two weeks into the hardware design, chances are that the project manager will decide to stick with the old design rather than risk the entire project schedule.

The greatest benefits can be realized by keeping the original floating-point MATLAB source file as the golden source for all design and using algorithmic synthesis tools that synthesize from MATLAB to RTL. Using architectural synthesis tools such as AccelChip DSP Synthesis, the algorithm designer can make changes to the design well into the flow, resynthesize to RTL, and work with the hardware designer to determine whether the design performance and device utilization have improved. Possible trade-offs given different levels of abstraction are shown in Figure 3.

Wrapping up
Implementing DSP algorithms in silicon used to be a task reserved for only the most highly skilled and best equipped design teams. The demand for a more efficient path from algorithm to an ASIC or FPGA has given rise to a new breed of EDA companies, such as AccelChip, that bridge the gap between DSP algorithm development and silicon. Architectural synthesis tools such as these accelerate design and implementation by automatically synthesizing algorithms written as floating-point MATLAB models into synthesizable VHDL or Verilog models suitable for standard ASIC and FPGA design flows. AccelChip’s toolset also enables rapid design exploration targeting fidelity, performance, area, and cost tradeoffs for optimal results while using MATLAB as a golden source.

Eric Cigan is the Product Marketing Manager for AccelChip Inc., and is responsible for product planning and promotion for the AccelChip product family. He has more than fourteen years’ experience in the EDA industry.

Reference:
1. Xilinx, June 2003.
2. Xilinx website.

For further information, contact Eric at:

AccelChip Inc.
1900 McCarthy Blvd., Suite 204
Milpitas, CA 95035
Tel: 408-943-0700 • Fax: 408-943-0661
E-mail: [email protected]
Website: www.accelchip.com
