FPGA IPUG 02037 2 2 CNN Accelerator IP Core
User Guide
FPGA-IPUG-02037-2.2
December 2020
CNN Accelerator IP Core
User Guide
Disclaimers
Lattice makes no warranty, representation, or guarantee regarding the accuracy of information contained in this document or the suitability of its
products for any particular purpose. All information herein is provided AS IS and with all faults, and all risk associated with such information is entirely
with Buyer. Buyer shall not rely on any data and performance specifications or parameters provided herein. Products sold by Lattice have been
subject to limited testing and it is the Buyer's responsibility to independently determine the suitability of any products and to test and verify the
same. No Lattice products should be used in conjunction with mission- or safety-critical or any other application in which the failure of Lattice’s
product could create a situation where personal injury, death, severe property or environmental damage may occur. The information provided in this
document is proprietary to Lattice Semiconductor, and Lattice reserves the right to make any changes to the information in this document or to any
products at any time without notice.
© 2018-2020 Lattice Semiconductor Corp. All Lattice trademarks, registered trademarks, patents, and disclaimers are as listed at www.latticesemi.com/legal.
All other brand or product names are trademarks or registered trademarks of their respective holders. The specifications and information herein are subject to change without notice.
Contents
Acronyms in This Document
1. Introduction
1.1. Quick Facts
1.2. Features
2. Functional Descriptions
2.1. Overview
2.2. Interface Descriptions
2.2.1. Control and Status Interface
2.2.2. Input Data Interface
2.2.3. Result Interface
2.2.4. DRAM Interface
2.3. Clock Domain
2.4. Reset Behavior
2.5. Register Description
2.6. Operation Sequence
2.6.1. Command Format
2.6.2. Input Data Format
2.6.3. Output Data Format
2.7. Supported Commands
3. Parameter Settings
4. IP Generation and Evaluation
4.1. Licensing the IP
4.2. Generation and Synthesis
4.2.1. Getting Started
4.2.2. Configuring the IP Core in Clarity
4.2.3. Instantiating the IP Core
4.3. Running Functional Simulation
4.4. Hardware Evaluation
5. Ordering Part Number
References
Technical Support Assistance
Appendix A. Resource Utilization
Revision History
Figures
Figure 2.1. Functional Block Diagram
Figure 2.2. CNN Accelerator IP Core Interface Diagram
Figure 2.3. Control and Status Interface Timing Diagram
Figure 2.4. General Purpose Output Sample Application
Figure 2.5. Result Interface Timing Diagram
Figure 2.6. Result Interface Timing Diagram
Figure 2.7. Clock Domain Diagram
Figure 2.8. Reset Timing Diagram
Figure 2.9. Command format
Figure 3.1. CNN Accelerator IP Core Configuration User Interface
Figure 4.1. CNN Accelerator IP Core in Clarity Designer Catalog Tab
Tables
Table 1.1. Quick Facts
Table 2.1. CNN Accelerator IP Core Signal Descriptions
Table 3.1. Attributes Table
Table 3.2. Attributes Descriptions
Table 3.3. Combinations of Use Paired Convolution Engine and NO Convolution Selection Mux Settings
Table 4.1. File List
Table A.1. Performance and Resource Utilization¹
1. Introduction
The Lattice Semiconductor CNN Accelerator IP Core is a calculation engine for deep neural networks with fixed-point
or binarized weights. It calculates full layers of a neural network, including convolution, pooling, batch
normalization, and fully connected layers, by executing sequence code with weight values generated by the Lattice
SensAI Neural Network Compiler. The engine is optimized for convolutional neural networks, so it can be used for
vision-based applications such as classification or object detection and tracking. The IP Core does not require an
extra processor; it can perform all required calculations by itself.
The design is implemented in Verilog HDL. It can be targeted to ECP5 and ECP5-5G FPGA devices, and implemented
using the Lattice Diamond® software Place and Route tool integrated with the Synplify Pro® synthesis tool.
1.2. Features
The key features of the CNN Accelerator IP Core include:
Support for convolution, max/average pooling, batch normalization, and fully connected layers
Configurable bit width of weight (16-bit, 1-bit)
Configurable bit width of activation (16/8-bit, 1-bit)
Dynamic support for 16-bit and 8-bit width of activation
Configurable number of memory blocks for tradeoff between resource and performance
Configurable number of convolution engines for tradeoff between resource and performance
Optimization for 3x3 2D convolution calculation
Dynamic support for various 1D convolution from 1 to 72 taps
Supports max pooling with overlap (for example, kernel 3, stride 2)
Supports average pooling for 2x2 convolution
Supports global average pooling by full connect engine
Supports paired convolution engines to improve performance
Configurable input byte mode (signed, unsigned, disable)
Partial DRAM access
Configurable maximum burst length (32, 256)
Supports MobileNet
Supports general purpose output signal for controlling external logic through command code
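To make the "max pooling with overlap" feature concrete, the following Python sketch applies a kernel-3, stride-2 max pool to a small 2D array. It is a software illustration only and not the IP core's implementation. The function name max_pool_2d and the no-padding output-size rule are assumptions, because this guide does not state how the core handles borders.

```python
def max_pool_2d(image, kernel=3, stride=2):
    """Max pooling without padding.

    Assumption: no padding is applied; the core's border handling
    is not described in this guide.
    """
    rows, cols = len(image), len(image[0])
    out_rows = (rows - kernel) // stride + 1
    out_cols = (cols - kernel) // stride + 1
    output = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            top, left = r * stride, c * stride
            # Gather the kernel x kernel window anchored at (top, left).
            window = [image[top + i][left + j]
                      for i in range(kernel) for j in range(kernel)]
            out_row.append(max(window))
        output.append(out_row)
    return output


# Hypothetical 5x5 input used only for illustration.
image = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25],
]
print(max_pool_2d(image))  # [[13, 15], [23, 25]]
```

Because the stride (2) is smaller than the kernel (3), adjacent windows share one row or column of input, which is the overlap the feature refers to.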
2. Functional Descriptions
2.1. Overview
The CNN Accelerator IP Core performs a series of calculations according to a command sequence generated by the
Lattice Neural Network Compiler tool. Commands must be written to DRAM, which is accessible through the AXI bus, at
the address specified by the i_code_base_addr signal. Input data may be read from DRAM at a pre-defined address or
written directly through the input data write port. After the command code and input data are available, the CNN
Accelerator IP Core starts calculation at the rising edge of the start signal. During calculation, intermediate data and
the final result may be transferred to DRAM or output through the result write port. All operations are fully
programmable by command code.
Figure 2.1. Functional Block Diagram
(Blocks shown: Control; Control Unit; Memory Pool with MEM blocks; Engine Pool with CONV, FC, and Pooling EUs; AXI master; Seq Gen; Save/Load. Data paths shown: Input Data, Result, and input/output/intermediate data.)
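Figure 2.2. CNN Accelerator IP Core Interface Diagram
(Labels recovered from the diagram: Clk/Reset with clk and resetn; DRAM I/F with AXI4 Bus; Control; Result; Input Data with LMMI; Status with o_status. Signals shown: i_start, o_rd_rdy, o_gpo, o_we, o_dout, i_code_base_addr.)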
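Figure 2.4. General Purpose Output Sample Application
(Labels recovered from the diagram: Compact CNN Accelerator with outputs o_gpo, o_dout, and o_we; comparisons of o_gpo against A and B; Post Process A and Post Process B blocks.)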
CNN Accelerator IP Core v2.0 and earlier versions use a simple SRAM interface for input data. Starting from IP Core
v2.1, this interface is LMMI. For compatibility with existing designs, the input data interface connections should be
matched as follows:
IP Core v2.0 connections:
.i_we (w_we ),
.i_mem_sel (w_wmemsel ),
.i_waddr (w_waddr ),
.i_din (w_dout ),
The command code can also simply feed result data to external logic through this Result interface when the Store
Output option is disabled. The interface consists of o_we as the valid indicator and o_dout as 16-bit data, as shown
in Figure 2.6. Usually, the output is a single burst of 16-bit data. It is also fully programmable by command code.
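The valid/data behavior of the Result interface can be sketched in software as follows. This is a minimal Python model, assuming one (o_we, o_dout) sample per clock cycle. The function name capture_results and the trace values are hypothetical, and the sketch does not model timing details from the Result interface timing diagram.

```python
def capture_results(samples):
    """Collect o_dout words from per-clock (o_we, o_dout) samples.

    A word is kept only on cycles where o_we is asserted;
    o_dout is masked to 16 bits to match the interface width.
    """
    words = []
    for o_we, o_dout in samples:
        if o_we:
            words.append(o_dout & 0xFFFF)
    return words


# Hypothetical trace: idle cycles around a three-word burst.
trace = [(0, 0x0000), (1, 0x0012), (1, 0x7FFF), (1, 0x8001), (0, 0x0000)]
print([hex(word) for word in capture_results(trace)])
```

External logic that consumes the result would follow the same rule: sample o_dout only while o_we is high.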
Figure 2.7. Clock Domain Diagram
(Blocks shown: Memory Pool with MEM blocks; Control Unit; Engine Pool with CONV, FC, and Pooling EUs; AXI master; Seq Gen; Save/Load; with Input Data and Result paths. The diagram carries the label "aclk Domain".)
Some AXI4 output signals are constant outputs; these are not affected by reset. Please refer to Table 2.1 for the AXI4
output signals that are constant.
3. Parameter Settings
The IP Catalog is used to create IP and architectural modules in the Diamond software. Refer to the IP
Generation and Evaluation section for details on how to generate the IP.
Table 3.1 lists the user-configurable attributes of the CNN Accelerator IP Core. The attribute values are
specified using the IP core Configuration user interface in Clarity Designer, as shown in Figure 3.1.
Table 3.1. Attributes Table
Attribute | Selectable Values | Default | Dependency on Other Attributes
Machine Learning Type | CNN, BNN | CNN | If MobileNet Enable is checked, selected value becomes CNN.
No. of Convolution Engines | 1-8 | 8 | If NO Convolution Selection Mux is Checked, Selectable Values are reduced to {1, 2, 4, 8}. If MobileNet Enable is checked, selected value becomes 8.
No. of Internal Storage of Blob | 2-16 | 16 | If MobileNet Enable is checked, selected value becomes 16.
BNN Blob Type | +1/-1, +1/0 | +1/-1 | Valid only when Machine Learning Type = BNN
Byte Mode | SIGNED, UNSIGNED, DISABLE | SIGNED | None
Use Paired Convolution Engine | Unchecked, Checked | Unchecked | If MobileNet Enable is checked, selected value becomes Unchecked.
NO Convolution Selection Mux | Unchecked, Checked | Unchecked | If MobileNet Enable is checked, selected value becomes Checked.
Maximum Burst Length | 32, 256 | 32 | None
MobileNet Enable | Unchecked, Checked | Unchecked | None
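The dependencies in Table 3.1 can be checked before opening the configuration user interface. The Python sketch below is one way to model them. The dictionary keys and the function name resolve_attributes are hypothetical names chosen for this illustration and are not parameters of the generated IP. Setting BNN Blob Type to None for CNN is a modeling choice, not documented tool behavior.

```python
# Defaults taken from Table 3.1; key names are hypothetical.
DEFAULTS = {
    "machine_learning_type": "CNN",
    "num_conv_engines": 8,
    "num_blob_storage": 16,
    "bnn_blob_type": "+1/-1",
    "byte_mode": "SIGNED",
    "use_paired_conv_engine": False,
    "no_conv_selection_mux": False,
    "max_burst_length": 32,
    "mobilenet_enable": False,
}


def resolve_attributes(**overrides):
    """Apply the Table 3.1 dependency rules and validate the result."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    cfg = {**DEFAULTS, **overrides}

    # MobileNet Enable forces several other attributes.
    if cfg["mobilenet_enable"]:
        cfg.update(
            machine_learning_type="CNN",
            num_conv_engines=8,
            num_blob_storage=16,
            use_paired_conv_engine=False,
            no_conv_selection_mux=True,
        )

    if cfg["machine_learning_type"] not in ("CNN", "BNN"):
        raise ValueError("Machine Learning Type must be CNN or BNN")
    # Without the selection mux, only power-of-two engine counts are selectable.
    allowed = (1, 2, 4, 8) if cfg["no_conv_selection_mux"] else tuple(range(1, 9))
    if cfg["num_conv_engines"] not in allowed:
        raise ValueError(f"No. of Convolution Engines must be one of {allowed}")
    if not 2 <= cfg["num_blob_storage"] <= 16:
        raise ValueError("No. of Internal Storage of Blob must be 2-16")
    if cfg["bnn_blob_type"] not in ("+1/-1", "+1/0"):
        raise ValueError("BNN Blob Type must be +1/-1 or +1/0")
    if cfg["byte_mode"] not in ("SIGNED", "UNSIGNED", "DISABLE"):
        raise ValueError("Byte Mode must be SIGNED, UNSIGNED, or DISABLE")
    if cfg["max_burst_length"] not in (32, 256):
        raise ValueError("Maximum Burst Length must be 32 or 256")

    # Modeling choice: BNN Blob Type is valid only for BNN, so clear it otherwise.
    if cfg["machine_learning_type"] != "BNN":
        cfg["bnn_blob_type"] = None
    return cfg


print(resolve_attributes(mobilenet_enable=True, num_conv_engines=3))
```

In this model, a request for three engines is accepted when the selection mux is present, rejected when NO Convolution Selection Mux is Checked, and overridden to 8 when MobileNet Enable is Checked.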
Attribute | Description
BNN Blob Type | Selects the type of binary blob data, either +1/-1 or +1/0. This setting should match the setting used in the Lattice Neural Network Compiler.
Byte Mode | Specifies the byte mode of input data. SIGNED – input data is signed 16-bit/8-bit data, similar to v1.1. UNSIGNED – input data is signed 16-bit data or unsigned 8-bit data. DISABLE – input data is unsigned 16-bit data only. This option saves LUTs because the byte mode support is not implemented.
Use Paired Convolution Engine | Enables the use of paired convolution engines. Unchecked – Does not use paired convolution engines, similar to v1.1. Checked – Uses paired convolution engines. The total number of convolution engines is double the No. of Convolution Engines value.
NO Convolution Selection Mux | Disables the use of the convolution selection multiplexer. Unchecked – Uses the convolution selection multiplexer, similar to v1.1. Checked – Uses a dedicated connection instead of the multiplexer.
Maximum Burst Length | Specifies the maximum burst length of the AXI4 bus. Set this to a value less than or equal to the maximum burst length supported by the connected slave device/memory.
MobileNet Enable | Enables MobileNet mode. Unchecked – Does not use MobileNet mode, similar to v2.0. Checked – Uses MobileNet mode. MobileNet mode improves 1x1 convolution and depthwise convolution by up to 8x at the cost of more LUTs and a slightly reduced Fmax. This option should not be Checked when the neural network does not have 1x1 convolution and depthwise convolution.
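The Byte Mode descriptions above can be read as a rule for interpreting a raw input word. The Python sketch below follows the wording of the table literally. The function name decode_input is hypothetical, and the sketch describes how a host might prepare or check input values, not how the core is implemented.

```python
def decode_input(raw, bits, byte_mode):
    """Interpret a raw input word per the Byte Mode descriptions.

    SIGNED:   signed 16-bit or signed 8-bit data.
    UNSIGNED: signed 16-bit data or unsigned 8-bit data.
    DISABLE:  16-bit data only, unsigned as the description states.
    """
    if bits not in (8, 16):
        raise ValueError("bits must be 8 or 16")
    if byte_mode not in ("SIGNED", "UNSIGNED", "DISABLE"):
        raise ValueError("byte_mode must be SIGNED, UNSIGNED, or DISABLE")
    if byte_mode == "DISABLE" and bits == 8:
        raise ValueError("DISABLE supports 16-bit input only")

    raw &= (1 << bits) - 1  # keep only the low `bits` bits
    if byte_mode == "SIGNED":
        signed = True
    elif byte_mode == "UNSIGNED":
        signed = bits == 16
    else:
        signed = False

    # Two's-complement conversion when the value is treated as signed.
    if signed and raw >= 1 << (bits - 1):
        return raw - (1 << bits)
    return raw


print(decode_input(0xFF, 8, "SIGNED"))    # -1
print(decode_input(0xFF, 8, "UNSIGNED"))  # 255
```

The same 8-bit pattern therefore represents different values depending on the Byte Mode, which is why the setting has to agree with how the input data was produced.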
Any combination of the Use Paired Convolution Engine and NO Convolution Selection Mux settings is supported.
NO Convolution Selection Mux=Checked is recommended because it uses fewer LUTs with no negative side effect.
However, this setting is only supported in SensAI v2.0 or later. NO Convolution Selection Mux=Unchecked is provided
for backward compatibility.
The Use Paired Convolution Engine setting has its pros and cons:
Pro – Enhances performance of convolution calculation
Con – Increases DSP and LUT consumption and may reduce the operating frequency of the core clock
Choose the setting carefully based on resource and calculation requirements. For example, if the neural network
does not have much convolution calculation, overall performance may be reduced when using Use Paired Convolution
Engine=Checked because of the slower clock. The combinations of the Use Paired Convolution Engine and NO Convolution
Selection Mux settings are summarized in Table 3.3.
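The trade-off described above can be illustrated with a rough estimate. The Python sketch below assumes that pairing halves the convolution cycle count (an idealized assumption) and uses hypothetical cycle counts and clock frequencies. None of the numbers are measured values for this IP core.

```python
def estimated_time_ms(conv_cycles, other_cycles, fmax_mhz, paired):
    """Estimate run time in milliseconds.

    Assumption: paired engines halve convolution cycles (ideal case);
    non-convolution cycles are unaffected.
    """
    effective_conv = conv_cycles / 2 if paired else conv_cycles
    total_cycles = effective_conv + other_cycles
    return total_cycles / (fmax_mhz * 1000.0)


# Hypothetical clock frequencies; not measured values.
FMAX_UNPAIRED_MHZ = 100.0
FMAX_PAIRED_MHZ = 90.0

# Hypothetical workloads: (convolution cycles, other cycles).
networks = {
    "convolution-heavy": (900_000, 100_000),
    "convolution-light": (100_000, 900_000),
}

results = {}
for name, (conv, other) in networks.items():
    unpaired = estimated_time_ms(conv, other, FMAX_UNPAIRED_MHZ, paired=False)
    paired = estimated_time_ms(conv, other, FMAX_PAIRED_MHZ, paired=True)
    results[name] = (unpaired, paired)
    print(f"{name}: unpaired {unpaired:.2f} ms, paired {paired:.2f} ms")
```

Under these assumptions the convolution-heavy workload finishes sooner with pairing, while the convolution-light workload becomes slower because the lower clock frequency outweighs the saved convolution cycles.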
Table 3.3. Combinations of Use Paired Convolution Engine and NO Convolution Selection Mux Settings
Use Paired Convolution Engine | NO Convolution Selection Mux | Description
Unchecked | Unchecked | Backward compatible mode. Use for existing firmware and SensAI 1.x.
Unchecked | Checked | Recommended for SensAI 2.0 or later.
Checked | Unchecked | Not recommended.
Checked | Checked | Enhances performance of convolution calculation at the cost of increase in DSP and LUT utilization. Note that this setting may also reduce clock frequency.
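Table 3.3 and the engine-count rule for paired engines can be captured in a small lookup, for example in a build script that reports the chosen configuration. The Python sketch below is illustrative only. The names COMBINATIONS, describe_combination, and total_convolution_engines are hypothetical, and the description strings are abbreviated from the table.

```python
# Keys are (use_paired_conv_engine, no_conv_selection_mux), True = Checked.
COMBINATIONS = {
    (False, False): "Backward compatible mode (existing firmware, SensAI 1.x)",
    (False, True): "Recommended for SensAI 2.0 or later",
    (True, False): "Not recommended",
    (True, True): "Higher convolution performance, more DSP and LUT, "
                  "possibly lower clock frequency",
}


def describe_combination(use_paired, no_mux):
    """Return the Table 3.3 summary for a pair of settings."""
    return COMBINATIONS[(bool(use_paired), bool(no_mux))]


def total_convolution_engines(num_conv_engines, use_paired):
    """Paired mode doubles the No. of Convolution Engines value."""
    return num_conv_engines * 2 if use_paired else num_conv_engines


print(describe_combination(False, True))
print(total_convolution_engines(4, use_paired=True))  # 8
```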
References
ECP5 FPGA Web Page in latticesemi.com
Revision History
Revision 2.2, December 2020
Section | Change Summary
Acronyms in This Document | Added this section.
Introduction | Added General Purpose Output feature in Features section. Updated Table 1.1.
Functional Descriptions | Added o_gpo signal in Figure 2.2 and Table 2.1. Added General Purpose Output section.
Parameter Settings | Updated Figure 3.1.
References | Updated this section.