DSP48 Macro v3.0: LogiCORE IP Product Guide
Chapter 1: Overview
Feature Summary
Applications
Licensing and Ordering Information
Appendix B: Debugging
Finding Help on Xilinx.com
Debug Tools
Features: ... can be selected through a single port on the generated core.
Resources: Performance and Resource Utilization web page
Provided with Core
Design Files: Encrypted RTL
Example Design: Not Provided
Notes:
1. For a complete listing of supported devices, see the Vivado IP
catalog.
2. For the supported versions of the tools, see the
Xilinx Design Tools: Release Notes Guide.
Overview
The DSP48 Macro core allows straightforward configuration of the DSP Slice by specifying
user-defined instructions. Multiple instructions can be specified, and the instruction being
performed can be changed dynamically at run time.
Feature Summary
The DSP48 Macro core supports up to 64 separate instructions, which can be selected
dynamically through a single control port.
The pipeline stages in the core can be allocated automatically in the user interface in the
Vivado® Integrated Design Environment (IDE), specified in tiers, or specified
individually.
The option to enable and specify the use of DSP Slice cascade ports allows multiple DSP48
Macro cores to be efficiently connected to construct a larger circuit.
Applications
• Basic DSP Slice operations such as accumulator, multiplier, adder
• Time-shared math operations using a single DSP Slice such as complex multiplication
• DSP engine / co-processor
• MAC FIR
For more information, visit the DSP48 Macro product web page.
Product Specification
Resource Utilization
For full details about performance and resource utilization, visit the Performance and
Resource Utilization web page.
Port Descriptions
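Figure 2-1: DSP48 Macro schematic symbol (image not reproduced). Port labels recovered
from the figure: inputs CLK, SCLR, CE, D, A, B, CONCAT, C, CARRYIN, ACIN, BCIN, PCIN,
CARRYCASCIN, SEL, CED, CESEL, SCLRD, SCLRSEL; outputs P, CARRYOUT, ACOUT, BCOUT, PCOUT,
CARRYCASCOUT.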
Figure 2-1 and Table 2-1 illustrate and define the schematic symbol signal names. All
control inputs are active-High.
Clocking
The DSP48 Macro requires a single clock, CLK, which is active-High (rising-edge) triggered.
Resets
The DSP48 Macro has the option of selecting a global synchronous reset, or a reset per
path.
Asserting an SCLR pin for a single cycle resets the relevant registers.
Instruction Format
The instructions are case insensitive and ignore spaces between operands. The left side of
the arithmetic expression, P=, is implicitly declared and should not be specified.
- The >>17 operator targets the DSP Slice 17-bit wire shift; it is valid only for the P and
PCIN operands.
- Rounding functions require that the P output width is less than full precision.
- See Supported Functions for further details.
Examples
Accumulator
1. C : Load
2. C+P : Accumulate (add)
3. P-C : Accumulate (subtract)
1. C+CONCAT+P
Multiply Accumulate
1. A*B : Load
2. P+A*B: Multiply accumulate
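As a rough behavioral sketch of the Accumulator and Multiply Accumulate instruction sets above, the following Python treats P as a signed 48-bit register and lets a SEL value pick the instruction on each step. The names `wrap48`, `ACCUMULATOR`, `MACC`, and `run` are illustrative assumptions, not part of the core, and pipeline latency is deliberately ignored.

```python
P_WIDTH = 48  # full DSP Slice output width stated in this guide


def wrap48(value):
    """Wrap an integer to a signed 48-bit two's-complement value."""
    mask = (1 << P_WIDTH) - 1
    value &= mask
    return value - (1 << P_WIDTH) if value >> (P_WIDTH - 1) else value


# Hypothetical instruction tables: list index = SEL value, entry computes the next P.
ACCUMULATOR = [
    lambda p, c: c,        # 0: C    (load)
    lambda p, c: c + p,    # 1: C+P  (accumulate, add)
    lambda p, c: p - c,    # 2: P-C  (accumulate, subtract)
]

MACC = [
    lambda p, a, b: a * b,      # 0: A*B    (load)
    lambda p, a, b: p + a * b,  # 1: P+A*B  (multiply accumulate)
]


def run(table, steps):
    """Apply (sel, operand...) steps in order and return the final P value."""
    p = 0
    for sel, *operands in steps:
        p = wrap48(table[sel](p, *operands))
    return p


if __name__ == "__main__":
    print(run(ACCUMULATOR, [(0, 10), (1, 5), (2, 3)]))  # load 10, add 5, subtract 3
    print(run(MACC, [(0, 2, 3), (1, 4, 5)]))            # 2*3, then + 4*5
```

Each step here completes immediately; in the generated core the result appears only after the configured pipeline latency.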
15-tap symmetric filter, where data is provided sequentially on the A, D, and B inputs. A and
D are used for the data values and B is used for the filter coefficients.
Notation
varname = [ l1,l2,....]
This indicates varname supports the list of operator/operand combinations l1, l2, ....
7 Series Devices
preadder = [ (A+D) , (D+A) , (D-A), D, -A , A, (ACIN+D) , (D+ACIN) , (D-ACIN), -ACIN , ACIN ]
mult_ip1 = [ ACIN , A , preadder ]
mult_ip2 = [ BCIN , B , 1 ]
mult = [ mult_ip1 * mult_ip2 , mult_ip2 * mult_ip1 ]
The 1 is not explicitly required when defining an instruction; the 1 and corresponding *
operator are ignored.
xmux = [ CONCAT , P , 0 ]
ymux = [ C , 0 ]
zmux = [ C , PCIN , P , P>>17 , PCIN>>17, 0 ]
cinmux = [ CARRYIN , CARRYCASCIN , 0 ]
Similarly, 0 is not explicitly required when defining an instruction. The 0 and corresponding
operator are ignored.
xycomb = [ xmux + ymux , ymux + xmux , mult ]
valid instructions = [
Note that for the rndsimple functions, operand C is not supported for zmux.
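To make the list notation concrete, the multiplier part of the 7 series definitions above can be expanded mechanically. The sketch below is only an illustration of how the notation reads: `expand_mult` is a hypothetical helper, and it does not apply the static restrictions (for example A versus ACIN) described later.

```python
from itertools import product

# Operand lists copied from the 7 series notation above.
preadder = ["(A+D)", "(D+A)", "(D-A)", "D", "-A", "A",
            "(ACIN+D)", "(D+ACIN)", "(D-ACIN)", "-ACIN", "ACIN"]
mult_ip1 = ["ACIN", "A"] + preadder
mult_ip2 = ["BCIN", "B", "1"]


def expand_mult(ip1, ip2):
    """Enumerate mult = [ ip1 * ip2 , ip2 * ip1 ], dropping '1' and its '*'."""
    forms = set()
    for x, y in product(ip1, ip2):
        if y == "1":
            forms.add(x)            # the 1 and the * operator are ignored
        else:
            forms.add(f"{x}*{y}")   # mult_ip1 * mult_ip2
            forms.add(f"{y}*{x}")   # mult_ip2 * mult_ip1
    return sorted(forms)


if __name__ == "__main__":
    forms = expand_mult(mult_ip1, mult_ip2)
    print(len(forms), "distinct multiplier expressions")
    print(forms[:5])
```

A set is used because A and ACIN appear both directly in mult_ip1 and inside preadder, so the raw product contains duplicates.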
UltraScale™ Devices
preadder = [ (A+D) , (D+A) , (D-A), D, -A , A, (ACIN+D) , (D+ACIN) , (D-ACIN), -ACIN , ACIN ]
mult_ip1 = [ ACIN , A , preadder ]
mult_ip2 = [ BCIN , B , 1 ]
mult = [ mult_ip1 * mult_ip2 , mult_ip2 * mult_ip1 ]
The 1 is not explicitly required when defining an instruction; the 1 and corresponding *
operator are ignored.
wmux = [ C , P , 0 ]
xmux = [ CONCAT , P , 0 ]
ymux = [ C , 0 ]
zmux = [ C , PCIN , P , P>>17 , PCIN>>17, 0 ]
cinmux = [ CARRYIN , CARRYCASCIN , 0 ]
Similarly, 0 is not explicitly required when defining an instruction. The 0 and corresponding
operator are ignored.
xycomb = [ wmux + xmux + ymux , wmux + ymux + xmux , mult ]
valid instructions = [
Restrictions
• The CONCAT operand is mutually exclusive with the mult operands. When any input to the
multiplier is specified, the CONCAT operand is restricted for all instructions, and vice
versa. The inclusion of 1 in mult_ip2 enables all mult_ip1 combinations to be used as
direct inputs to the second stage add/sub.
• The choice between A and ACIN is static; after one is specified the other is restricted.
The same applies to B and BCIN.
• The use of CARRYCASCIN is restricted to a subset of instructions.
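The first two restrictions apply across the whole instruction set rather than per instruction, which a small checker can illustrate. This is a sketch under stated assumptions: `check_restrictions` is a hypothetical helper, it treats A, ACIN, B, BCIN, and D as the multiplier inputs (an interpretation of the notation above), and it does not model the CARRYCASCIN subset rule because that subset is not listed here.

```python
import re

# Assumed set of operands that feed the multiplier (mult_ip1 / mult_ip2 above).
MULT_OPERANDS = {"A", "ACIN", "B", "BCIN", "D"}


def check_restrictions(instructions):
    """Return descriptions of violated static restrictions (empty list if none)."""
    operands = set()
    for instr in instructions:
        # Instructions are case insensitive and ignore spaces.
        text = instr.upper().replace(" ", "")
        operands.update(re.findall(r"[A-Z]+", text))

    problems = []
    if "CONCAT" in operands and operands & MULT_OPERANDS:
        problems.append("CONCAT used together with multiplier operands")
    if {"A", "ACIN"} <= operands:
        problems.append("both A and ACIN used")
    if {"B", "BCIN"} <= operands:
        problems.append("both B and BCIN used")
    return problems


if __name__ == "__main__":
    print(check_restrictions(["A*B", "P+A*B"]))
    print(check_restrictions(["A*B", "C+CONCAT"]))
```

Operands are collected over the full list because the restrictions are static: one instruction using ACIN rules out A in every other instruction.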
Supported Functions
RNDSIMPLE(arg)
RNDSIMPLE implements a non-symmetric round in which ties round toward negative infinity;
this is equivalent to the MATLAB® function ceil(arg-0.5).
The binary point of arg is defined by the full precision output width and the specified core
output width; the core output is taken from the upper MSBs of the DSP Slice output, with the
remaining LSBs treated as the fractional portion. The binary point position is therefore the
full precision P width minus the specified P output width.
A rounding constant with a binary value of 0.0111...(or 0.499...) is added to arg and the LSBs
removed by the process of reinterpreting the full precision output to the specified core
output width. The LSBs remain on the accumulator.
Arg can include CARRYIN. This enables CARRYIN to determine the rounding direction. The
assertion of CARRYIN is equivalent to adding 0.00...01. This modifies the rounding constant
to 0.100.. (or 0.5) and therefore the rounding direction. This can be used to implement
random rounding.
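The arithmetic described for RNDSIMPLE can be checked with a short integer model. In this sketch `full` is the full precision value as an integer and `frac_bits` is the number of LSBs dropped (at least 1); both names, and the helper `ceil_reference`, are assumptions made for illustration.

```python
import math


def rndsimple(full, frac_bits, carryin=0):
    """Add the 0.0111...b rounding constant (plus CARRYIN) and drop frac_bits LSBs."""
    constant = (1 << (frac_bits - 1)) - 1          # binary 0.0111...1
    return (full + constant + carryin) >> frac_bits  # arithmetic shift floors


def ceil_reference(full, frac_bits):
    """ceil(arg - 0.5) evaluated on the real value of the fixed-point input."""
    return math.ceil(full / (1 << frac_bits) - 0.5)


if __name__ == "__main__":
    # 2.5 and -2.5 with four fractional bits are 40 and -40.
    print(rndsimple(40, 4), rndsimple(-40, 4))
    # Asserting CARRYIN turns the constant into 0.5, so ties round up instead.
    print(rndsimple(40, 4, carryin=1), rndsimple(-40, 4, carryin=1))
```

Python's `>>` on negative integers is an arithmetic shift, which matches dropping LSBs from a two's-complement value.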
RNDSYM(arg)
RNDSYM implements a symmetric round to highest magnitude; this is equivalent to the
MATLAB function round(arg).
The binary point of arg is defined in the same manner as for the RNDSIMPLE(arg) function.
A rounding constant with a binary value of 0.0111.... (or 0.499...) is added to arg along with
a carryin, defined as the inverse sign bit of arg. The LSBs are removed by the process of
reinterpreting the full precision output to the specified core output width. The LSBs remain
on the accumulator. For multiply operations, the XNOR of the multiplier inputs is used
instead of the inverse sign bit of arg, performing multiply rounding toward infinity.
When carryin is 1, this is equivalent to adding 0.00....01. This modifies the rounding
constant to 0.100.. (or 0.5) and therefore the rounding direction.
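A minimal integer model of the RNDSYM arithmetic follows, assuming `full` is the full precision value and `frac_bits` (at least 1) is the number of LSBs dropped. The names are illustrative, and the multiplier variant that uses the XNOR of the multiplier input signs is not modeled.

```python
import math


def rndsym(full, frac_bits):
    """Add 0.0111...b plus a carry-in equal to the inverse sign bit, then drop LSBs."""
    constant = (1 << (frac_bits - 1)) - 1   # binary 0.0111...1
    carryin = 0 if full < 0 else 1          # inverse of the sign bit of arg
    return (full + constant + carryin) >> frac_bits


def round_half_away(full, frac_bits):
    """Reference: round to nearest, ties away from zero."""
    magnitude = math.floor(abs(full) / (1 << frac_bits) + 0.5)
    return -magnitude if full < 0 else magnitude


if __name__ == "__main__":
    # 2.5 and -2.5 with four fractional bits are 40 and -40.
    print(rndsym(40, 4), rndsym(-40, 4))
```

For non-negative inputs the effective constant is 0.5, and for negative inputs it stays just below 0.5, which is what makes ties move away from zero in both directions.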
RNDMACC(arg)
When performing symmetric rounding towards infinity of MACC and add accumulate
operations, it is difficult to determine the sign of the output ahead of time, so the round
might cost an extra clock cycle. This extra cycle can be eliminated by adding the C input
rounding constant (typically a binary value of 0.0111…) on the very first cycle. The sign bit
of the last but one cycle of the accumulator can be used for the final rounding operation
done in the final accumulate cycle. This implementation is a practical way to save a clock
cycle. There is a rare chance that the final accumulate operation can flip the sign of the
output from the previous accumulated value.
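The idea can be sketched arithmetically as follows. This models only the numeric behaviour described above, not the core's instruction encoding: the rounding constant is loaded on the first cycle, and the carry-in for the final accumulate is taken from the sign of the accumulator before that final add. The function name `rndmacc` and its arguments are assumptions.

```python
def rndmacc(pairs, frac_bits):
    """Accumulate a*b over pairs with the rounding folded into the final cycle."""
    constant = (1 << (frac_bits - 1)) - 1   # binary 0.0111...1, supplied once via C
    acc = constant
    for index, (a, b) in enumerate(pairs):
        is_last = index == len(pairs) - 1
        # Sign is sampled before the final add, so it can be stale.
        carryin = (0 if acc < 0 else 1) if is_last else 0
        acc += a * b + carryin
    return acc >> frac_bits                 # drop the fractional LSBs


if __name__ == "__main__":
    print(rndmacc([(3, 4), (5, 6)], 4))     # 42/16 = 2.625
    print(rndmacc([(-3, 4), (-5, 6)], 4))   # -42/16 = -2.625
    print(rndmacc([(10, 1), (-50, 1)], 4))  # final add flips the sign of the sum
```

The last call shows the rare case mentioned above: the sum is -2.5, but because the sign was sampled while the accumulator was still positive, the result differs from a true symmetric round.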
To perform this for a MACC operation, the following DSP48 Macro instructions are used:
Figure 3-1: Generalized DSP48 Macro implementation (block diagram not reproduced). Labels
recovered from the figure: CONCAT, A, ACIN, B, BCIN, M, P, CARRYIN, CARRYCASCIN, PCIN, CIN,
CINSEL, SEL, SEL ROM, INMODE, OPCODE, SUBTRACT, Constant, Rounding constant, CARRYOUT.
Legend: Fabric Register, DSP Slice Register, Dynamic Mux, Static Mux.
Figure 3-1 illustrates the generalized DSP48 Macro implementation for 7 series devices.
Note:
• The second stage add/sub input mux implementations vary depending on the selected
device.
• A static mux is resolved at core generation, whereas a dynamic mux is implemented in
the generated core.
Users requiring a specific mapping to registers in the DSP48 primitive can use Table 3-3 to
determine the corresponding DSP48 Macro registers. See the UltraScale Architecture DSP
Slice User Guide (UG579) [Ref 2] or the 7 Series FPGA DSP48E1 Slice User Guide (UG479)
[Ref 3].
Notes:
1. The mapping of A4 changes depending on whether the pre-adder is used so the tiered latency model is
maintained.
• Vivado Design Suite User Guide: Designing IP Subsystems using IP Integrator (UG994)
[Ref 4]
• Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 1]
• Vivado Design Suite User Guide: Getting Started (UG910) [Ref 5]
• Vivado Design Suite User Guide: Logic Simulation (UG900) [Ref 7]
If you are customizing and generating the core in the Vivado IP Integrator, see the Vivado
Design Suite User Guide: Designing IP Subsystems using IP Integrator (UG994) [Ref 4] for
detailed information. IP Integrator might auto-compute certain configuration values when
validating or generating the design. To check whether the values do change, see the
description of the parameter in this chapter. To view the parameter value, run the
validate_bd_design command in the Tcl console.
You can customize the IP for use in your design by specifying values for the various
parameters associated with the IP core using the following steps:
For details, see the Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 1] and
the Vivado Design Suite User Guide: Getting Started (UG910) [Ref 5].
The DSP48 Macro core has three pages used to configure the core plus two informational
tabs.
Tab 1: IP Symbol
The IP Symbol tab illustrates the core pinout.
Instructions Page
This page is used to specify the instructions that the core is to implement.
• Component Name: The name of the core component to be instantiated. The name
must begin with a letter and be composed of the following characters: a to z, 0 to 9,
and “_”.
• Available instructions: Informational parameter. When the ‘Show Filtered’ check box is
selected, the available instruction list updates dynamically to show the remaining valid
instructions given the instruction currently being entered in the user interface.
Instructions can be selected in the Available instructions panel and dragged and dropped
into an Instruction parameter.
• Instructions 0 to 7: Specifies the operations the core is to implement. Text entry of the
desired arithmetic operation to be generated on the P output port. The left side of the
expression, P=, is implicitly declared and should not be specified. Instructions are case
insensitive.
See Instruction Format for further details on the instruction format and supported
operations.
• Pipeline Options: Specifies the pipeline method to be used: Automatic, By Tier and
Expert.
• Custom Pipeline options: Specifies the pipeline depth of the various input paths.
° Tier 1 to 6: When “By Tier” has been selected for Pipeline Options, these
parameters are used to enable/disable the registers across all the input paths for a
given pipeline stage. Some restrictions are enforced.
° Individual registers: When “Expert” has been selected for the Pipeline Options,
these parameters are used to enable/disable individual register stages. Some
restrictions are enforced.
- The P register is forced when P has been specified in an expression.
Asynchronous feedback is not supported.
See Detailed Pipeline Implementation for further details on how the various pipeline stages
relate to the core implementation.
• Control Ports:
° CE
- Global check box: enables a single CE pin for all registers in the core.
- D, A, B, CONCAT, C, M, P, SEL/CARRYIN check boxes: enable individual CE pins
for all enabled registers in the core.
° SCLR
- Global check box: enables a single SCLR pin for all registers in the core.
- D, A, B, CONCAT, C, M, P, SEL/CARRYIN checkboxes: enable an individual SCLR
pin for each datapath.
Implementation Page
• Input Port Properties: Specifies the bit-width of the D, A, B, CONCAT and C input
ports. See Table 2-1 for maximum port widths.
• Output Port Properties: Specifies the precision of the P output port: Full Precision or
User Defined.
• The Vivado IDE automatically calculates the full precision output width given the
width of the specified input ports. When P has been used as an operand, the full
precision output width is set to the full DSP Slice width of 48 bits. When Full Precision is
selected, the output width is set to the full precision value. When User Defined is
selected, the output width can be set to any value up to 48 bits. When the specified
value is less than the full precision width, the output is truncated, that is, the LSBs are
removed. This option should be used when a rounding function has been specified.
° Width: Specifies the actual output width of the P output port. When specified to be
less than Full Precision, the DSP Slice output is truncated.
• Additional Ports: Specifies if the core has a CARRYOUT output port or the ACOUT,
BCOUT, PCOUT or CARRYCASCOUT cascaded output ports.
• Use DSP Slice: Specifies whether the core implementation uses a DSP Slice or an FPGA
logic equivalent. When an FPGA logic implementation is specified, the core is unlikely to
achieve the same Fmax as the DSP Slice implementation.
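The User Defined output width behaviour described under Output Port Properties amounts to keeping the upper bits of the full precision result. A minimal sketch, assuming signed integer values and a hypothetical helper name `truncate_p`:

```python
MAX_P_WIDTH = 48  # full DSP Slice output width stated in this guide


def truncate_p(full_value, full_width, out_width):
    """Keep the out_width MSBs of a full_width-bit result by removing LSBs."""
    if not 1 <= out_width <= full_width <= MAX_P_WIDTH:
        raise ValueError("widths must satisfy 1 <= out_width <= full_width <= 48")
    return full_value >> (full_width - out_width)


if __name__ == "__main__":
    # A 6-bit full precision result reduced to a 4-bit P output.
    print(bin(truncate_p(0b101101, 6, 4)))
```

Plain truncation like this always rounds toward negative infinity, which is why the guide recommends pairing a reduced output width with one of the rounding functions.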
User Parameters
Table 4-1 shows the relationship between the fields in the Vivado IDE and the User
Parameters (which can be viewed in the Tcl Console).
Output Generation
For details, see the Vivado Design Suite User Guide: Designing with IP (UG896) [Ref 1].
Tab 1: Instructions
The Instructions tab is used to define the operations that the core is to implement.
Instructions can be entered one per line or as a comma-delimited list, and are enumerated
from the top down. A maximum of 64 instructions can be specified. See Instructions Page
and Instruction Format of Using the DSP48 Macro IP Core for details on supported
instructions and their format.
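The entry rules above (one instruction per line or a comma-delimited list, top-down numbering, at most 64 entries) can be sketched as a small parser. `parse_instructions` is a hypothetical helper; it only normalizes and numbers the text and does not validate that each expression is a supported operation.

```python
MAX_INSTRUCTIONS = 64


def parse_instructions(text):
    """Split instruction text into a {index: expression} mapping, top-down."""
    entries = []
    for line in text.splitlines():
        for item in line.split(","):
            # Instructions are case insensitive and ignore spaces.
            item = "".join(item.split()).upper()
            if not item:
                continue
            if item.startswith("P="):
                raise ValueError("the left side P= is implicit and should not be specified")
            entries.append(item)
    if len(entries) > MAX_INSTRUCTIONS:
        raise ValueError(f"at most {MAX_INSTRUCTIONS} instructions can be specified")
    return dict(enumerate(entries))


if __name__ == "__main__":
    print(parse_instructions("a*b, p + a*b\nC"))
```

The resulting index is the value that would be presented on the select port to choose that instruction at run time.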
Tab 3: Implementation
The Implementation tab is used to define implementation options. See the Implementation
Page for details of all the core parameters on this tab.
• Output Port Properties: Specifies the precision of the P output port: Full Precision or
User Defined.
• The core automatically calculates the full precision output width and binary point
position given the width and binary point of the specified input ports. When P has
been used as an operand, the full precision output width is set to the full DSP Slice
width of 48 bits.
• When Full Precision is selected, the output is set to full precision width and binary
point.
• When User Defined is selected, the output width can be set to any value up to 48 bits.
The output formatting has two modes of operation:
- When the output width is specified to be less than the full precision width, the
output is truncated, that is, the LSBs are removed.
- The binary point is anchored. The output of the core behaves in the same
manner as a System Generator Convert block with the following settings:
Quantization > Truncate and Overflow > Wrap. Some restrictions on the
binary point values are enforced; it cannot be greater than the full precision
binary point value and its permitted minimum value will be modified to ensure
that when the binary point value and output width are combined, the resulting
MSB value does not exceed 48 bits.
° Binary Point: Specifies the user-defined binary point of the P output port.
• ce: Selects either a global clock enable pin, or separate clock enable pins for each
register. When separate clock enable pins are selected, these are managed within
System Generator to correctly handle multirate constraints.
• rst: Selects either a global reset pin, or separate reset pins for each datapath. In a
similar way to the separate clock enable pins, System Generator manages the situation
where datapaths with separate reset controls have different rates.
FPGA Area Estimation: See the System Generator for DSP documentation for detailed
information about this section.
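The anchored binary point behaviour described for the User Defined output (Quantization > Truncate and Overflow > Wrap) can be modeled on integers as shown below. This is a sketch that assumes signed values; `convert_output` and its parameter names are illustrative and are not System Generator APIs.

```python
def convert_output(full_value, full_bp, out_width, out_bp):
    """Truncate fractional LSBs, then wrap to a signed out_width-bit value."""
    if out_bp > full_bp:
        raise ValueError("binary point cannot exceed the full precision binary point")
    shifted = full_value >> (full_bp - out_bp)   # Quantization > Truncate
    wrapped = shifted & ((1 << out_width) - 1)   # Overflow > Wrap
    if wrapped >> (out_width - 1):               # reinterpret the MSB as the sign bit
        wrapped -= 1 << out_width
    return wrapped


if __name__ == "__main__":
    # 0b0110.1011 reduced to 4 bits with 2 fractional bits wraps to a negative value.
    print(convert_output(0b01101011, 4, 4, 2))
```

Wrapping discards MSBs silently, so an output width that is too narrow for the data produces a sign change rather than saturation.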
Required Constraints
This section is not applicable for this IP core.
Clock Frequencies
This section is not applicable for this IP core.
Clock Management
This section is not applicable for this IP core.
Clock Placement
This section is not applicable for this IP core.
Banking
This section is not applicable for this IP core.
Transceiver Placement
This section is not applicable for this IP core.
Simulation
For comprehensive information about Vivado simulation components, as well as
information about using supported third party tools, see the Vivado Design Suite User
Guide: Logic Simulation (UG900) [Ref 7].
The Vivado IP upgrade functionality can be used to upgrade an existing XCO/XCI file from
v2.1 to DSP48 Macro v3.0. There are no changes of functionality, port or configuration from
v2.1 to v3.0.
Parameter Changes
There are no parameter differences between DSP48 Macro v2.0 or v2.1 and DSP48 Macro
v3.0.
Port Changes
There are no port changes between DSP48 Macro v2.0 or v2.1 and DSP48 Macro v3.0.
Functionality Changes
There are no functionality changes between DSP48 Macro v2.0 or v2.1 and DSP48 Macro
v3.0.
Parameter Changes
No change.
Port Changes
No change.
Other Changes
No change.
Simulation
Starting with DSP48 Macro v3.0 (2013.3 version), behavioral simulation models have been
replaced with IEEE P1735 Encrypted VHDL. The resulting model is bit and cycle accurate with
the final netlist. For more information on simulation, see the Vivado Design Suite User Guide:
Logic Simulation (UG900) [Ref 7].
Debugging
This appendix includes details about resources available on the Xilinx Support website and
debugging tools.
Documentation
This product guide is the main document associated with the DSP48 Macro. This guide,
along with documentation related to all products that aid in the design process, can be
found on the Xilinx Support web page or by using the Xilinx® Documentation Navigator.
Download the Xilinx Documentation Navigator from the Downloads page. For more
information about this tool and the features available, open the online help after
installation.
Answer Records
Answer Records include information about commonly encountered problems, helpful
information on how to resolve these problems, and any known issues with a Xilinx product.
Answer Records are created and maintained daily, ensuring that users have access to the
most accurate information available.
Answer Records for this core can be located by using the Search Support box on the main
Xilinx support web page. To maximize your search results, use keywords such as:
• Product name
• Tool message(s)
• Summary of the issue encountered
A filter search is available after results are returned to further target the results.
DSP48 Macro AR: 54500
Technical Support
Xilinx provides technical support in the Xilinx Support web page for this LogiCORE™ IP
product when used as described in the product documentation. Xilinx cannot guarantee
timing, functionality, or support if you do any of the following:
• Implement the solution in devices that are not defined in the documentation.
• Customize the solution beyond that allowed in the product documentation.
• Change any section of the design labeled DO NOT MODIFY.
To contact Xilinx Technical Support, navigate to the Xilinx Support web page.
Debug Tools
There are many tools available to address DSP48 Macro design issues. It is important to
know which tools are useful for debugging various situations.
The Vivado logic analyzer is used with the logic debug IP cores, including:
See the Vivado Design Suite User Guide: Programming and Debugging (UG908) [Ref 9].
Xilinx Resources
For support resources such as Answers, Documentation, Downloads, and Forums, see Xilinx
Support.
References
These documents provide supplemental material useful with this product guide:
Revision History
The following table shows the revision history for this document.