Vivado HLS Update

The document provides an overview of Vivado High-Level Synthesis (HLS) and its capabilities for accelerated IP generation and integration, particularly focusing on video processing libraries and AXI4 interface support. It details new functions, improved software driver support, enhanced analysis perspectives, and the integration of HLS IP into the Vivado IP Catalog and IP Integrator. Additionally, it highlights the availability of tutorials and examples for users to effectively utilize the Vivado HLS tools.

Uploaded by

yehia.mahmoud02
© Copyright 2013 Xilinx


Vivado High-Level Synthesis:
Accelerated IP Generation and Integration

C-based IP Creation
– Input: C, C++ or SystemC
– C libraries
• Floating point (math.h)
• Fixed point
• OpenCV
– Output: VHDL or Verilog

User-Preferred System Integration Environment
– System Generator for DSP
– Vivado IP Integrator
– Vivado IP Catalog
– Vivado RTL integration

Vivado HLS Video Libraries

C Video Libraries
– Available within Vivado HLS header files
• hls_video.h library
• hls_opencv.h library

Enable Migration of OpenCV Designs into Xilinx FPGA


– Libraries target real-time Full HD video processing
– Libraries support standard AXI4 Interfaces for easy system integration

Video Library: 12 New Functions

Video Data Modeling
Linebuffer class Window class

AXI4-Stream IO Functions
AXIvideo2Mat Mat2AXIvideo

OpenCV Interface Functions


cvMat2AXIvideo AXIvideo2cvMat cvMat2hlsMat hlsMat2cvMat
IplImage2AXIvideo AXIvideo2IplImage IplImage2hlsMat hlsMat2IplImage
CvMat2AXIvideo AXIvideo2CvMat CvMat2hlsMat hlsMat2CvMat

Video Functions
AbsDiff Duplicate MaxS Remap
AddS EqualizeHist Mean Resize
AddWeighted Erode Merge Scale
And FASTX Min Set
Avg Filter2D MinMaxLoc Sobel
AvgSdv GaussianBlur MinS Split
Cmp Harris Mul SubRS
CmpS HoughLines2 Not SubS
CornerHarris Integral PaintMask Sum
CvtColor InitUndistortRectifyMap Range Threshold
Dilate Max Reduce Zero

C Test Bench: Interface Library

Interface libraries convert between OpenCV image types and HLS types
– HLS Mat format: synthesizable, with AXI4-Stream support

#include "hls_opencv.h"   // HLS video libraries

// Top-level C function (test bench)
int main (int argc, char** argv) {
    // Standard OpenCV files, formats & types
    IplImage* src = cvLoadImage(INPUT_IMAGE);
    IplImage* dst = cvCreateImage(cvGetSize(src), src->depth, src->nChannels);

    AXI_STREAM src_axi, dst_axi;

    // Convert to Xilinx AXI4 video stream
    IplImage2AXIvideo(src, src_axi);

    // Function to synthesize
    image_filter(src_axi, dst_axi, src->height, src->width);

    // Convert Xilinx AXI4 video stream back to OpenCV types
    AXIvideo2IplImage(dst_axi, dst);

    cvSaveImage(OUTPUT_IMAGE, dst);
    return 0;
}

C Function to Synthesize

HLS Video Library Functions
– Drop-in replacements for OpenCV functions that provide high QoR

// HLS video & AXI struct libraries
#include "hls_video.h"
#include "ap_axi_sdata.h"

// Top-level C function for synthesis
void image_filter(AXI_STREAM& inter_pix, AXI_STREAM& out_pix, int rows, int cols) {
    // Create AXI streaming interfaces for the core

    RGB_IMAGE img_0(rows, cols);
    // ..etc..
    RGB_IMAGE img_5(rows, cols);
    RGB_PIXEL pix(50, 50, 50);
#pragma HLS dataflow
    // Convert Xilinx AXI4 video stream to HLS Mat data type
    hls::AXIvideo2Mat(inter_pix, img_0);

    // HLS video functions are drop-in replacements for OpenCV functions
    hls::Sobel(img_0, img_1, 1, 0);
    hls::SubS(img_1, pix, img_2);
    hls::Scale(img_2, img_3, 2, 0);
    hls::Erode(img_3, img_4);
    hls::Dilate(img_4, img_5);

    // Convert HLS Mat type to Xilinx AXI4 video stream
    hls::Mat2AXIvideo(img_5, out_pix);
}

Application Note XAPP1167
Accelerating OpenCV Applications with Zynq using
Vivado HLS Video Libraries

• Video Processing data types

• Compares Video Architectures

• Advantages of Video Streaming

• Review Video Interfaces

• Reference Design with source files and project directories

Download XAPP1167 from Xilinx.com

QuickTake: Leveraging OpenCV and High-Level Synthesis with Vivado
Accelerator AXI Interconnect

IP Control from ARM
– AXI4-Lite & GP Port
• Zynq PS GP port connects to the AXI4-Lite interface of the HLS accelerator

High Throughput Access to Memory
– AXI4-Stream using AXI-DMA
• Zynq PS HP port or ACP port connects through the AXI DMA to the AXI4-Stream interface of the HLS accelerator
– AXI4-Master
• The accelerator is the master
• External memory access: HP port
• L2 cache access: ACP port

Data transfer between HLS IP blocks
– AXI4-Stream

IP Integrator Supported

Note: IP Integrator requires an Early Access License in 2013.1

Vivado HLS IP can be exported to IP Integrator
– Export to the Vivado IP Catalog (was previously called IP-XACT format)
– Data types supported: IPI can propagate

Flow
– Vivado HLS IP: export to the Vivado IP Catalog
– Add to the IP Catalog
– Vivado IP Integrator (IPI): add the IP block & connect up

Supported with Two New Tutorials


HLS IP Integration

IP Integrator (IPI) Public Release 2013.2


– HLS Output Fully Supported in IPI
– Three Tutorials on using HLS IP inside IPI
• Two connect HLS IP to the Zynq PS; One connects HLS IP with Xilinx IP

HLS IP Blocks are identified in IPI
– HLS and System Generator IP are shown inside IPI

Improved Software Driver Support

Software Drivers are Created for AXI4-Lite interfaces


– Now includes support for Linux Systems
– Drivers are also now created for Vivado IP Catalog format

Using the drivers
– Files are in the Drivers sub-directory
– Add all files to the software project: ifdef statements ensure automatic configuration

Enhanced Report File

Easier to find hot-spots


– The term "throughput" has been changed to "Interval" or "Initiation Interval"
• In all reports and documentation

Report contents
– Latency and Interval for the top-level function
– Latency and Interval for all instances at this level of hierarchy
– All loops and sub-loops at this level of hierarchy

Analysis Perspective

A New Perspective for Design Analysis


– Allows Interactive Analysis

Module Hierarchy
– Hierarchical summary and navigation

Performance View
– Scheduled operations
– Loops: shown in yellow, expandable and collapsible
– Modules: shown in green, open the view on sub-blocks

Performance Profile
– Latency and Interval summary for this block

Performance View

– Hierarchical navigation
– Loop hierarchy
– Operations, loops and functions
– Scheduled states
– Select operations and right-click to cross-reference with the C source and HDL

Resource Analysis
Resource View
– Scheduled operations associated with each resource: anything on the same row shares the same resource

Resource Profile
– Resource summary for this block

Analysis Perspective Tutorials

Fully Supported by Two New Tutorials


– Design Analysis
– Design Optimization

Assertion Support

Assertions are supported for Synthesis


– Can be used to define bit-widths for synthesis
– Replaces the need for a Tripcount directive

Without Assertions

SUM_X:for (i=0;i<=xlimit; i++) {
    X_accum += A[i];
    X[i] = X_accum;
}
SUM_Y:for (i=0;i<=ylimit; i++) {
    Y_accum += B[i];
    Y[i] = Y_accum;
}

* Loop Latency:
+----------+-----------+----------+
|Target II |Trip Count |Pipelined |
+----------+-----------+----------+
|- SUM_X   |1 ~ 256    |no        |
|- SUM_Y   |1 ~ 256    |no        |
+----------+-----------+----------+

With Assertions

assert(xlimit<32);
SUM_X:for (i=0;i<=xlimit; i++) {
    X_accum += A[i];
    X[i] = X_accum;
}
assert(ylimit<16);
SUM_Y:for (i=0;i<=ylimit; i++) {
    Y_accum += B[i];
    Y[i] = Y_accum;
}

* Loop Latency:
+----------+-----------+----------+
|Target II |Trip Count |Pipelined |
+----------+-----------+----------+
|- SUM_X   |1 ~ 32     |no        |
|- SUM_Y   |1 ~ 16     |no        |
+----------+-----------+----------+

With assertions, the index counter hardware is accurately sized.
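
A self-contained sketch of the same idea, assuming the loop limit arrives as an 8-bit argument (which would be consistent with the 1 ~ 256 trip count reported without assertions). The function name running_sum, the array size N and the argument types are illustrative assumptions and do not come from the slide.

```cpp
#include <cassert>
#include <cstdint>

const int N = 32;  // assumed array size, chosen to match the assert bound

// Running (prefix) sum of A[0..xlimit], written into X.
// The function name and types are hypothetical; only the loop body follows the slide.
void running_sum(const int A[N], int X[N], uint8_t xlimit) {
    // Checked during C simulation and, per the slide, used by synthesis to
    // bound the loop: at most 32 iterations instead of up to 256.
    assert(xlimit < 32);

    int X_accum = 0;
    SUM_X: for (int i = 0; i <= xlimit; i++) {
        X_accum += A[i];
        X[i] = X_accum;
    }
}
```

Because the assertion also executes in the C test bench, a call that violates the bound fails during C simulation rather than silently producing undersized hardware.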

Improved Tutorials

Vivado HLS is now provided with 10 Tutorials


– 22 Labs which cover all aspects of Vivado HLS

Tutorial | Summary | Design
Introduction | Basic walkthrough of GUI operations (Csim, Synth, RTL Sim, IP package) | FIR
C Validation | C simulation and using the debugger | Filter Window
Interface Synthesis | Explain design, port and AXI interface synthesis (simple HLS design to allow analysis of IO) | Sorter Design
Arbitrary Precision | Review of a floating point and fixed windowing algorithm | Hamming Window
Design Analysis | Using the Analysis Perspective to optimize performance of multi-hierarchy, multi-loop design | DCT
Design Optimization with Pipelining | Improving performance using pipelining at loop and function level and impact of IO | Matrix Multiplier
RTL Verification | Verify and view trace files using Vivado Xsim and ModelSim (incl. floating point simulation) | DUC
Creating IP for an IP Integrator Design | Connecting to an IP core using IPI | Windower, FFT IP Core, Sorter
Creating IP for a Zynq Design | Connecting to Zynq with IPI and integrating driver files into SDK design (interrupt handling etc.) | Accelerator
Creating IP for a System Generator Design | Packaging a design for Sys Gen and verifying IO in Sys Gen (connecting interfaces etc.) | YUV

Improved AXI4 & SystemC Support

SystemC
– AXI4 Master, Stream and Lite protocols are now supported
• Lite: Use the RESOURCE directive to assign ports (as in C/C++)
• Stream: Use the RESOURCE directive on sc_fifo_in and sc_fifo_out ports
• Master: Use the AXI4M_bus_port class

AXI4M_bus_port<sc_fixed<32, 8> > bus_if;

– Differences between SystemC and Vivado AP types are fully documented
– SystemC designs no longer need to be explicitly specified
• The add_files -type option is retired (as is the C/C++ or SystemC check-box in the GUI)

AXI4 Master Interface


– Now supported on Array ports
– Array ports can be synthesized with ap_bus IO protocol

RTL cosimulation of Floating Point Designs
Floating Point Designs
– The IEEE operators are now in the RTL simulation model
– This requires that the Xilinx IEEE library is used when RTL co-simulation is performed

Auto Support provided: No Action Required


– SystemC RTL
– Verilog and VHDL using the Xilinx Vivado (Xsim) simulator
– Verilog and VHDL using the Mentor Graphics ModelSim simulator
– Verilog and VHDL using the Xilinx Isim simulator.

All other 3rd party HDL simulators


– The libraries must be pre-compiled before simulating floating point designs
– Open Vivado and refer to: compile_simlib -help
• Note: this is Vivado, not Vivado HLS

DSP48 Adder Resource

Adders supported for implementation in DSP48


– Adders in the C code can be targeted to an AddSub_DSP RESOURCE
– Ensures the adder or subtractor is implemented in a DSP48

Resource Specification
– Targets the adder or subtractor to a DSP48 Resource

// Adder core marked for DSP48 implementation
(* USE_DSP48 = "YES" *)
module adders_add_32ns_32ns_32_1_AddSub_DSP_0 (a, b, s);
endmodule

// Wrapper that instantiates the DSP48 adder core
module adders_add_32ns_32ns_32_1( …);
    adders_add_32ns_32ns_32_1_AddSub_DSP_0 U1 (
        .a( din0 ),
        .b( din1 ),
        .s( dout ));
endmodule
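
On the C side, the resource specification could look like the sketch below. It assumes the RESOURCE pragma form with variable= and core= options; the function name add_dsp and the variable name sum are hypothetical. A standard C++ compiler ignores the pragma, so the function behaves as a plain addition in C simulation.

```cpp
// Hypothetical function: a single 32-bit addition whose result variable
// is tied to the AddSub_DSP core named on the slide.
int add_dsp(int a, int b) {
    int sum;
    // Assumed pragma syntax; confirm against the user guide for your release.
#pragma HLS RESOURCE variable=sum core=AddSub_DSP
    sum = a + b;
    return sum;
}
```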

DSP48 Adder Implementation

Adders /Subtractors Targeted to a DSP48


Solution 1 Solution 2

FFT and FIR IP in HLS

The Xilinx FFT and FIR IP are available in Vivado HLS


– C simulates with a bit-accurate model
– Fully configurable within the C++ source code
• Pre-defined C++ structs allow the IP to be configured & accessed

Supported only for C++


– Implemented with templates

High-Quality Implementation
– Same hardware as implemented by RTL versions of this IP
– Functionality fully described in Xilinx Documentation
• LogiCORE IP Fast Fourier Transform v9.0 (document PG109)
• LogiCORE IP FIR Compiler v7.1 (document PG149)

IP Examples

Examples Included in Vivado HLS Release


– Access from the Welcome Screen
– Or from C:\Xilinx\Vivado_HLS\2013.3\examples\design
• Assuming the standard PC install path

Example IP Designs

FFT
– 1024-point FFT and Inverse FFT (fixed point)
– Single FFT 1024-point (fixed point)

FIR
– FIR with 2 interleaved channels
– 3 FIRs connected in series (HB, HB, SRRC)
– Updating coefficients using FIR CONFIG channel
– SRRC (Square Root Raised Cosine) FIR filter

FFT Function

Using the FFT


#include "hls_fft.h"

hls::fft<STATIC_PARAM> (              // Static parameterization struct
    INPUT_DATA_ARRAY,                 // Input data, fixed or float
    OUTPUT_DATA_ARRAY,                // Output data, fixed or float
    OUTPUT_STATUS,                    // Output status
    INPUT_RUN_TIME_CONFIGURATION);    // Input run-time configuration

– Include the hls_fft.h library in the code


• This defines the FFT and supporting structs and types
• Allows hls::fft to be instantiated in your code

– Use the STATIC_PARAM template parameter to parameterize the FFT


• The STATIC_PARAM template parameter defines all static configuration values
• The Library provides a pre-defined struct hls::ip_fft::params_t to perform this

– Optionally modify the default parameters by creating a new user-defined STATIC_PARAM struct based on the default

FIR Function

Using the FIR


#include "hls_fir.h"

// Create an instance of the FIR
static hls::FIR<STATIC_PARAM> fir1;   // Static parameterization

// Execute the FIR instance fir1
fir1.run(INPUT_DATA_ARRAY,            // Input data
         OUTPUT_DATA_ARRAY);          // Output data

– Include the hls_fir.h library in the code


• This defines the FIR and supporting structs and types
• Allows hls::FIR to be instantiated in your code
• Unlike the FFT, the FIR is instantiated as a class and executed with the run method

– Create the STATIC_PARAM template parameter to configure the FIR


• The STATIC_PARAM template parameter defines all static configuration values
• The library provides a pre-defined struct hls::ip_fir::params_t to perform this
– There are no default values for the Coefficients
• You must always create a user-defined struct based on hls::ip_fir::params_t

Using the FFT and FIR IP

FFT and FIR support pipelined implementations
– The functions themselves cannot be pipelined
– They should be parameterized for pipelined operation

The data arguments are always arrays
– These will be implemented as AXI4 Streams in the RTL
• By default, arrays are implemented as BRAM interfaces

Recommendation
– Use these IP in regions where dataflow optimization is used
– This will auto-convert the input and output arrays into streaming arrays

Alternatively, a Requirement:
– The input and output arrays must be marked as streaming using the command set_directive_stream (pragma STREAM)

Fixed Point Math Functions

Further support for math functions

The hls_math.h library


– Now includes fixed-point functions for sin, cos and sqrt

Function | Type | Accuracy (ULP) | Implementation Style
cos | ap_fixed<32,I> | 16 | Synthesized
sin | ap_fixed<32,I> | 16 | Synthesized
sqrt | ap_fixed<W,I>, ap_ufixed<W,I> | 1 | Synthesized

– The sin and cos functions are all 32-bit: ap_fixed<32,Int_Bit>
• Where Int_Bit specifies the number of integer bits
– The sqrt function is any width but must have a decimal point
• Cannot be all integers or all bits
– The accuracy above is quoted with respect to the equivalent floating point version
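
To put the ULP figures in absolute terms, the sketch below assumes that one ULP means the weight of the least significant bit of the fixed-point type, that is 2^-(W-I) for W total bits and I integer bits. That reading is an assumption, and the helper names fixed_ulp and abs_error_bound are hypothetical.

```cpp
#include <cmath>

// Weight of the least significant bit of a W-bit fixed-point value
// with I integer bits: 2^-(W - I). Assumed meaning of "1 ULP" here.
double fixed_ulp(int W, int I) {
    return std::ldexp(1.0, -(W - I));
}

// Absolute error bound for an accuracy quoted as a number of ULPs.
double abs_error_bound(int W, int I, int ulps) {
    return ulps * fixed_ulp(W, I);
}
```

Under that assumption, a 16 ULP result in ap_fixed<32,2> corresponds to an absolute error of at most 16 x 2^-30 = 2^-26.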

AXI4 Stream Interface: Ease of Use

Native Support for AXI4 Stream Interfaces


– Native = An AXI4 Stream can be specified with set_directive_interface
• No longer required to set the interface then add a resource
• This AXI4 Stream interface is part of the HDL after synthesis
• This AXI4 Stream interface is simulated by RTL co-simulation

Interface type "axis" is AXI4 Stream

set_directive_interface -mode axis "foo" portA

Or

#pragma HLS interface axis port=portA
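
A minimal sketch of where the pragma sits in the source, assuming a function foo with array arguments that are accessed strictly in order. The second port portB, the length N and the loop body are hypothetical additions for illustration. A standard C++ compiler ignores the pragmas.

```cpp
const int N = 8;  // hypothetical stream length

// Reads N samples from portA in order and writes N results to portB.
void foo(const int portA[N], int portB[N]) {
#pragma HLS interface axis port=portA
#pragma HLS interface axis port=portB
    for (int i = 0; i < N; i++) {
        // Each element is read once and written once, in sequence,
        // which is the access pattern a stream interface expects.
        portB[i] = portA[i] + 1;
    }
}
```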

Pre-2013.3 Approach to AXI Streams

Existing Functionality Deprecated
– BUT NOT REMOVED!!
– We don't want to break existing designs

#if 1
// Use new method
#pragma HLS interface axis port=portA
#else
// Or use old method
#pragma HLS interface ap_fifo port=portA
#pragma HLS resource core=AXI4Stream variable=portA \
    metadata="-bus_bundle Agroup"
#endif

Warning:
– If you use the pre-2013.3 method for adding AXI4 Streams
• This is where you set the interface as a FIFO then add an AXI resource
– You will get a FIFO interface in the RTL
– And the AXI4 Stream adapter is added during export_design

Recommendation
– Change existing AXI4 Stream directives to use the INTERFACE
directive

AXI4 Master Interface: Pipeline Support
Transaction involving an AXI4 Master Interface is now Pipelined
– Prior to 2013.3 this interface would not pipeline
– Each transfer was an “atomic” process
• The for-loop/memcpy waits until a transfer completes before starting next transfer
• This was the limiting factor in the pipeline interval

Improved performance in 2013.3


– Accesses to an AXI master interface can now be pipelined
• The performance will be much better than before

Further improvements in 2014.1


– Existing limitations: cannot configure the base address, cannot infer bursts, and reads and writes cannot be performed simultaneously (sequential only)
– We expect to get more performance in 2014.1
– At that time we’ll publish statistics and make more noise about this feature
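
The access pattern discussed above can be sketched as follows. This is an illustration only: the function name scale_buffer and the length N are hypothetical, and the interface and resource pragmas are assumed 2013-era forms for an AXI4 master port that should be checked against the user guide for your release. Reads and writes are kept in separate phases to match the sequential-only limitation.

```cpp
#include <string.h>

const int N = 16;  // hypothetical transfer length

// Hypothetical accelerator: doubles N words held in external memory.
void scale_buffer(int *mem) {
    // Assumed directives for an AXI4 master port (not taken from the slide).
#pragma HLS INTERFACE ap_bus port=mem
#pragma HLS RESOURCE variable=mem core=AXI4M
    int buf[N];

    // Read phase: copy from the master port into local storage.
    memcpy(buf, mem, N * sizeof(int));

    SCALE: for (int i = 0; i < N; i++) {
#pragma HLS PIPELINE
        buf[i] *= 2;
    }

    // Write phase: copy the results back through the same port.
    memcpy(mem, buf, N * sizeof(int));
}
```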

Enhanced Support for Exporting IP

Sys Gen and AXI Stream Interfaces
– Designs with AXI Stream interfaces can now be exported to System Generator
– The AXI interfaces will be present and can be connected
– Previously, AXI interfaces were not supported in Sys Gen

AXI Lite Drivers
– Software drivers are now included in the IP package
– When creating a local repository in SDK, simply point to the IP package
• No need to manually copy files
• Further EoU enhancements coming
New Clang Front-end
Vivado HLS has upgraded its front-end parser
– Now using clang instead of gcc
– Provides 64-bit support on Windows
– In addition, this enables continued growth of features and functionality
• More optimizations possible, messages can reference line and column etc.

Clang Side-Effect: Different command options
– The new front-end does not support all gcc flags
• For example, -fpermissive is now ignored as this is not supported by clang
• If an option is not supported but provided, it will be ignored
• Clang options: http://clang.llvm.org/docs/UsersManual.html#command-line-options

Clang Side-Effect: Stricter syntax checking
– Some existing working designs may fail
• Not expected to occur often, but is possible
• Example -fpermissive workaround: for memcpy(dest, src), if src is a volatile pointer, cast it to a constant pointer to pass syntax checking
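
A minimal sketch of that workaround, with a hypothetical helper name copy_from_volatile. Passing a volatile pointer straight to memcpy is rejected by a strict front end because the volatile qualifier would be dropped implicitly; the explicit cast makes the conversion intentional. Note that the cast discards volatile semantics for the copy, so it is only appropriate where that is acceptable.

```cpp
#include <string.h>

// Copies count ints from a volatile-qualified source into dest.
void copy_from_volatile(int *dest, volatile int *src, size_t count) {
    // memcpy(dest, src, ...) does not compile without -fpermissive:
    // volatile int* cannot convert implicitly to const void*.
    memcpy(dest, (const int *)src, count * sizeof(int));
}
```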

Design Hubs: Easier Access to Documentation

DocNav Design Hubs
– Improved ease-of-use
– Find things faster
– Open docs at the exact page

Hub layout
– Standard introduction docs and videos
– App notes and videos all grouped

High Level Synthesis
– Getting Started Videos
– Tutorials
– Key Concepts
– FAQs
• These and the solution center will be updated in the coming weeks
• Others such as "Designing with Video" etc. will be added
• Ideas for topics are welcome

Thank You
