Lab 06
Objectives
After completing this lab, you will be able to:
Profile a software application
Understand the steps and directives involved in creating an IP-XACT adapter in Vivado HLS
Create a processor system using IP Integrator in Vivado
Integrate the generated IP-XACT adapter into the created processor system
Profile an application that has a hardware accelerator
The Design
The design consists of a FIR filter that removes a 4 kHz tone added to CD quality (48 kHz) music. The
filter characteristics are as follows:
FS=48000 Hz
FPASS1=2000 Hz
FSTOP1=3800 Hz
FSTOP2=4200 Hz
FPASS2=6000 Hz
APASS1=APASS2=1 dB
ASTOP=60 dB
This lab requires you to develop a peripheral core of the designed filter that can be instantiated in a
processor system.
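The specification above can be read as a band-stop response centered on the 4 kHz tone. As a minimal
sketch (the function and enum names are hypothetical and are not part of the lab sources), the band
edges can be encoded in C to check which region a given frequency falls into:

```c
#include <assert.h>

/* Band edges copied from the filter specification (Hz). */
#define FS      48000.0
#define FPASS1   2000.0
#define FSTOP1   3800.0   /* lower stopband edge, 3800 Hz */
#define FSTOP2   4200.0
#define FPASS2   6000.0

typedef enum { BAND_PASS, BAND_TRANSITION, BAND_STOP } band_t;

/* Classify a frequency (Hz) against the band-stop specification. */
band_t classify_band(double f_hz)
{
    assert(f_hz >= 0.0 && f_hz <= FS / 2.0);   /* valid up to Nyquist only */
    if (f_hz <= FPASS1 || f_hz >= FPASS2)
        return BAND_PASS;
    if (f_hz >= FSTOP1 && f_hz <= FSTOP2)
        return BAND_STOP;
    return BAND_TRANSITION;
}

/* Normalize a frequency to the Nyquist rate (1.0 corresponds to FS/2). */
double normalized(double f_hz)
{
    return f_hz / (FS / 2.0);
}
```

Calling classify_band(4000.0) returns BAND_STOP, which is why the added tone is attenuated while
content below 2 kHz and above 6 kHz passes with at most 1 dB of ripple.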
Procedure
This lab is separated into steps that consist of general overview statements that provide information on
the detailed instructions that follow. Follow these detailed instructions to progress through the lab.
1-1-1. Open Vivado Tcl Shell by selecting Start > All Programs > Xilinx Design Tools > Vivado
2015.2 > Vivado 2015.2 Tcl Shell
1-1-2. In the shell window, change the directory to either <2015_2_zynq_sources>/lab6_zed for the
ZedBoard or <2015_2_zynq_sources>/lab6_zybo for the Zybo using the cd command.
1-1-3. Run the provided script file to create an initial system having either zed_audio_ctrl for the
ZedBoard or zybo_audio_ctrl for the Zybo, and GPIO peripherals by typing the following command:
source audio_project_create.tcl
The script will be run and the initial system, shown below, will be created.
Figure 1. Block design after I2C based zed_audio_ctrl core added and connections made
for the ZedBoard
Figure 1. Block design after I2C based zybo_audio_ctrl core added and connections made
for the Zybo
1-2. Verify addresses and validate the design. Generate the system_wrapper
file, and add the provided Xilinx Design Constraints (XDC) file either from
the <2015_2_zynq_sources>\lab6_zed for the ZedBoard or
<2015_2_zynq_sources>\lab6_zybo for the Zybo directory.
1-2-1. Click on the Address Editor, and expand the processing_system7_0 > Data if necessary.
1-2-2. Run Design Validation (Tools > Validate Design) and verify there are no errors
1-2-3. In the sources view, right-click on the block diagram file, system.bd, and select Create HDL
Wrapper to update the HDL wrapper file. When prompted, click OK with the Let Vivado manage
wrapper and auto-update option.
1-2-4. Click Add Sources in the Flow Navigator pane, select Add or Create Constraints, and click
Next.
1-2-5. Click the Green Plus button then Add Files…, browse either to the
<2015_2_zynq_sources>\lab6_zed folder and select zed_audio_constraints.xdc or to the
<2015_2_zynq_sources>\lab6_zybo folder and select zybo_audio_constraints.xdc
1-2-7. Click on the Generate Bitstream in the Flow Navigator to run the synthesis, implementation, and
bitstream generation processes.
1-2-9. When the bit generation is completed, a selection box will be displayed. Click Cancel.
2-1-2. Make sure that Include Bitstream option is selected and click OK, leaving the target directory set
to local project directory.
2-1-5. In SDK, select File > New > Board Support Package.
2-1-6. Click Finish with the default settings (with standalone operating system).
This will open the Software Platform Settings form showing the OS and libraries selections.
2-1-7. Select the Overview > standalone entry in the left pane, click on the drop-down arrow of the
enable_sw_intrusive_profiling Value field and select true.
2-1-8. Select the Overview > drivers > ps7_cortexa9_0 and add -pg in addition to the -g in
the extra_compiler_flags Value field.
2-2-2. Enter lab6 as the Project Name, and for Board Support Package, choose Use Existing
(standalone_bsp_0 should be the only option).
2-2-3. Click Next, and select Empty Application and click Finish.
2-2-4. Select lab6 in the project view, right-click the src folder, and select Import.
2-2-7. Select lab6.c, audio.h, and fir_coef.dat, and click Finish to add the file to the project. (Ignore
any errors/warnings for now).
3-1-1. Zybo only: Make sure that the JP7 is set to select USB power.
3-1-2. Connect a micro-USB cable between a PC and the JTAG port of the board.
3-1-5. Click on the Program button to download the bitstream and program the PL section.
3-1-6. Select the lab6 application, right-click, and select C/C++ Build Settings.
3-1-7. Under the ARM gcc compiler group, select the Symbols sub-group, click on the + button to
open the value entry form, enter SW_PROFILE, and click OK.
This will allow us to profile the software loop of the FIR application.
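A symbol entered in the Symbols sub-group is passed to the compiler as a -D definition, so the
application can select code paths with the preprocessor. The sketch below only illustrates that
mechanism; how lab6.c actually uses SW_PROFILE and HW_PROFILE is an assumption, and the enum and
function names are hypothetical.

```c
/* Stand-in for the SW_PROFILE symbol entered in the build settings;
 * in the real project it comes from the compiler's -D option. */
#define SW_PROFILE

typedef enum { PATH_NONE, PATH_SOFTWARE, PATH_HARDWARE } fir_path_t;

/* Report which filter path the preprocessor selected at build time. */
fir_path_t selected_fir_path(void)
{
#if defined(SW_PROFILE)
    return PATH_SOFTWARE;   /* the software loop, e.g. fir_software() */
#elif defined(HW_PROFILE)
    return PATH_HARDWARE;   /* the accelerator, e.g. filter_hw_accel_input() */
#else
    return PATH_NONE;       /* neither symbol defined */
#endif
}
```

Because the choice is made at compile time, changing the symbol requires a rebuild of the application
before the other path can be profiled.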
3-1-8. Under the ARM gcc compiler group, select the Profiling sub-group, then check the Enable
Profiling box, and click OK.
3-2. Set SW[1] to the up position. Create a Run Configuration, enabling the
Profile option and setting up the profiling parameters. Run the profiler and
gprof and analyze the results.
3-2-2. Select lab6, right-click and select Run As > Run Configurations… and double-click
Xilinx C/C++ Application (GDB) to create a new configuration.
3-2-3. Select the Profile Options tab. Click on the Enable Profiling check box, enter 1000000 (1 MHz)
in the Sampling Frequency field, enter 0x10000000 in the scratch memory address field, and click
Apply.
3-2-4. Click the Run button to download the application and execute it.
When program execution has completed, a message will be displayed indicating that the profiling
results are being saved in gmon.out file at the lab6\Debug directory.
3-2-6. Expand the Debug folder under the lab6 project in the Project Explorer view, and double click on
the gmon.out entry.
3-2-7. The Gmon File Viewer dialog box will appear showing lab6.elf as the corresponding binary file.
Click OK.
3-2-9. Click in the %Time column to sort in descending order. You can also click on the sort button;
if you do, set the sorting items, orders, and methods as shown below:
Note that the fir_software routine is called 68 times, 424 samples were taken during the profiling,
and an average of 6.235 microseconds for the ZedBoard or 6.294 microseconds for the Zybo was
spent per call. The sample count and the time per call may vary slightly.
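The three profiling figures are related by simple arithmetic: at the 1 MHz sampling frequency set
earlier, each sample stands for 1 microsecond, so the time per call is the sample count divided by the
number of calls. A minimal sketch with hypothetical helper names:

```c
#include <assert.h>

/* Seconds represented by `samples` profiler hits taken at `sample_hz`. */
double sampled_seconds(unsigned long samples, double sample_hz)
{
    assert(sample_hz > 0.0);
    return (double)samples / sample_hz;
}

/* Average time per call in microseconds. */
double usec_per_call(unsigned long samples, double sample_hz,
                     unsigned long calls)
{
    assert(calls > 0);
    return sampled_seconds(samples, sample_hz) * 1e6 / (double)calls;
}
```

For the ZedBoard figures, usec_per_call(424, 1000000.0, 68) evaluates to about 6.235, matching the
value reported by gprof.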
4-1-1. Launch Vivado HLS: Select Start > All Programs > Xilinx Design Tools > Vivado 2015.2 >
Vivado HLS > Vivado HLS 2015.2
4-1-2. In the Quick Start section, click on Create New Project. The New Vivado HLS Project wizard
opens.
4-1-3. Click Browse… button of the Location field, browse to <2015_2_zynq_labs>\lab6, and then
click OK.
4-1-6. In the Add/Remove Files for the source files, type fir as the function name (the provided source
file contains the function, to be synthesized, called fir).
4-1-7. Click the Add Files… button, select fir.c and fir_coef.dat files either from the
<2015_2_zynq_sources>\lab6_zed for the ZedBoard or <2015_2_zynq_sources>\lab6_zybo
for the Zybo, and then click Open.
4-1-9. In the Add/Remove Files for the testbench, click the Add Files… button, select fir_test.c file
either from the <2015_2_zynq_sources>\lab6_zed for the ZedBoard or
<2015_2_zynq_sources>\lab6_zybo for the Zybo and click Open.
4-1-11. In the Solution Configuration page, leave Solution Name field as solution1 and either leave the
default period of 10 for the ZedBoard or change the clock period to 8 for the Zybo. Leave
Uncertainty field blank as it will use 12.5% as the default value.
4-1-12. Click on Part’s Browse button, and select the following filters to select either the xc7z020clg484-
1 for the ZedBoard or the xc7z010clg400-1 for the Zybo, and click OK:
Family: Zynq
Sub-Family: Zynq
Package: clg484 (for ZedBoard) or clg400 (for Zybo)
Speed Grade: -1
You will see the created project in the Explorer view. Expand various sub-folders to see the
entries under each sub-folder.
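The clock settings from the Solution Configuration page can be checked with a little arithmetic. The
tool subtracts the uncertainty from the requested period to obtain the budget available to logic, so a
10 ns period with the default 12.5% uncertainty leaves 8.75 ns. The helper names below are
hypothetical.

```c
#include <assert.h>

/* Period (ns) left for logic after subtracting the uncertainty margin. */
double effective_period_ns(double period_ns, double uncertainty_fraction)
{
    assert(period_ns > 0.0);
    assert(uncertainty_fraction >= 0.0 && uncertainty_fraction < 1.0);
    return period_ns * (1.0 - uncertainty_fraction);
}

/* Clock frequency in MHz for a period given in ns. */
double period_to_mhz(double period_ns)
{
    assert(period_ns > 0.0);
    return 1000.0 / period_ns;
}
```

With these helpers, the ZedBoard setting (10 ns) corresponds to 100 MHz with an 8.75 ns budget, and
the Zybo setting (8 ns) corresponds to 125 MHz with a 7 ns budget.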
4-1-14. Double-click on the fir.c under the source folder to open its content in the information pane.
The FIR filter expects x as a sample input and y as a pointer to the computed output sample. Both
are of data type data_t. The coefficients are loaded into array c of type coef_t from the file
fir_coef.dat located in the current directory. The sequential algorithm is applied and the
accumulated value (sample out) is computed in variable acc of type acc_t.
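The structure described above can be sketched in portable C. This is not the provided fir.c: the tap
count and coefficients are placeholders, int64_t stands in for the arbitrary-precision accumulator
type, the function name fir_sketch is hypothetical, and any output scaling the lab source may apply is
omitted.

```c
#include <stdint.h>

#define NUM_TAPS 4              /* illustrative; the lab's filter uses 59 taps */

typedef int16_t coef_t;         /* 16-bit coefficient */
typedef int16_t data_t;         /* 16-bit sample */
typedef int64_t acc_t;          /* wide stand-in for the lab's int38 */

/* Placeholder coefficients, not the contents of fir_coef.dat. */
static const coef_t c[NUM_TAPS] = { 1, 2, 3, 4 };

/* Sequential FIR: shift the delay line and multiply-accumulate in one loop. */
void fir_sketch(data_t *y, data_t x)
{
    static data_t shift_reg[NUM_TAPS];  /* delay line, zero-initialized */
    acc_t acc = 0;
    int i;

    for (i = NUM_TAPS - 1; i >= 0; i--) {
        data_t sample;
        if (i == 0) {
            shift_reg[0] = x;                  /* newest sample enters */
            sample = x;
        } else {
            shift_reg[i] = shift_reg[i - 1];   /* age the older samples */
            sample = shift_reg[i];
        }
        acc += (acc_t)sample * (acc_t)c[i];    /* casts widen before multiply */
    }
    *y = (data_t)acc;
}
```

Feeding a unit impulse (1 followed by zeros) into fir_sketch returns the coefficients one per call,
which is the same idea the testbench relies on when it records the impulse response.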
4-1-15. Double-click on the fir.h in the outline tab to open its content in the information pane.
The header file includes ap_cint.h so that user-defined (arbitrary precision) data widths can be used.
It also defines the number of taps (N), the number of samples to be generated (in the testbench), and
the data types coef_t, data_t, and acc_t. The coef_t and data_t types are short (16 bits). Each
16-bit by 16-bit product needs 32 bits, and since the algorithm iterates (multiply and accumulate)
over 59 taps, there is a possible bit growth of 6 more bits; hence acc_t is defined as int38. Since
acc_t is wider than the sample and coefficient widths, they have to be cast before being used (as in
lines 16, 18, and 21 of fir.c).
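The accumulator width follows from the operand widths and the tap count: a 16-bit by 16-bit product
needs 32 bits, and summing 59 such products can add up to ceil(log2(59)) = 6 further bits. A small
sketch of that calculation, with hypothetical function names:

```c
#include <assert.h>

/* Smallest b such that 2^b >= n, i.e. ceil(log2(n)) for 1 <= n <= 2^31. */
unsigned ceil_log2(unsigned n)
{
    unsigned bits = 0;
    assert(n >= 1);
    while ((1u << bits) < n)
        bits++;
    return bits;
}

/* Bits needed to accumulate `taps` products of data_bits x coef_bits. */
unsigned accumulator_bits(unsigned data_bits, unsigned coef_bits,
                          unsigned taps)
{
    return data_bits + coef_bits + ceil_log2(taps);
}
```

accumulator_bits(16, 16, 59) evaluates to 38, which is the width chosen for acc_t.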
4-1-16. Double-click on the fir_test.c under the testbench folder to open its content in the information
pane.
Notice that the testbench opens fir_impulse.dat in write mode, and sends an impulse (the first sample
being 0x8000).
5-1-1. Select Project > Run C Simulation or click the corresponding toolbar button, and click OK in
the C Simulation Dialog window.
The testbench will be compiled using apcc compiler and csim.exe file will be generated. The
csim.exe will then be executed and the output will be displayed in the console view.
Figure 12. Initial part of the generated output in the Console view
6-1-1. Select Solution > Run C Synthesis > Active Solution to start the synthesis process.
6-1-2. When synthesis is completed, several report files will become accessible and the Synthesis
Results will be displayed in the information pane.
6-1-3. The Synthesis Report shows the performance and resource estimates as well as estimated
latency in the design.
6-1-4. Using scroll bar on the right, scroll down into the report and answer the following question.
Question 1
Estimated clock period:
Worst case latency:
Number of DSP48E used:
Number of BRAMs used:
Number of FFs used:
Number of LUTs used:
6-1-5. The report also shows the top-level interface signals generated by the tools.
You can see that the design expects the x input as a 16-bit scalar and outputs y via a pointer to
16-bit data. It also has an ap_vld signal to indicate when the result is valid.
6-2. Add PIPELINE directive to loop and re-synthesize the design. View the
synthesis results.
6-2-1. Make sure that the fir.c is open in the information view.
6-2-2. Select the Directive tab, right click on loop and select Insert Directive. Select PIPELINE under
Directive and click OK to apply it.
6-2-3. Select Solution > Run C Synthesis > Active Solution to start the synthesis process.
6-2-4. When synthesis is completed, the Synthesis Results will be displayed in the information pane.
6-2-5. Note that the latency has been reduced to 62 (63 for Zybo) clock cycles. The DSP48 and BRAM
consumption remains the same; however, the LUT and FF consumption has changed.
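The effect of the PIPELINE directive on loop latency can be approximated with the usual pipelining
formula: a loop of N iterations with initiation interval II and per-iteration latency D takes about
(N - 1) * II + D cycles instead of N * D. The sketch below is generic C rather than tool output, the
function names are hypothetical, and the 3-cycle iteration latency used in the example is an assumed
value; the real schedule should be read from the synthesis report.

```c
#include <assert.h>

/* Cycles for a loop whose iterations run strictly one after another. */
unsigned long sequential_loop_cycles(unsigned long trip_count,
                                     unsigned long iteration_latency)
{
    return trip_count * iteration_latency;
}

/* Cycles for a pipelined loop: the first iteration takes the full
 * iteration latency, and each later one starts `ii` cycles after
 * the previous one. */
unsigned long pipelined_loop_cycles(unsigned long trip_count,
                                    unsigned long ii,
                                    unsigned long iteration_latency)
{
    assert(trip_count >= 1 && ii >= 1);
    return (trip_count - 1) * ii + iteration_latency;
}
```

With 59 iterations and the assumed 3-cycle iteration latency, this gives 177 cycles unpipelined and
61 cycles pipelined at II = 1, the same order of reduction as the reported drop to 62 cycles.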
7-1-1. Select Solution > Run C/RTL Cosimulation or click the corresponding toolbar button to open the
dialog box so the desired simulations can be run.
The Co-simulation will run, generating and compiling several files, and then simulating the design.
In the console window you can see the progress. When done the RTL Simulation Report shows
that it was successful and the latency reported was 62.
8-1-1. Make sure that fir.c file is open and in focus in the information view.
8-1-4. In the Vivado HLS Directive Editor dialog box, select INTERFACE directive using the drop-
down button.
8-1-5. Click on the button beside mode (optional). Select s_axilite and click OK.
8-1-6. In the bundle (optional) field, enter fir_io and click OK.
8-1-7. Similarly, apply the INTERFACE directive (including bundle) to the y output.
8-1-8. Apply the INTERFACE directive to the top-level module fir to include the ap_start, ap_done, and
ap_idle signals as part of the bus adapter (the variable name shown will be return). Include the
bundle information too.
Figure 17. Applying bundle to assign function control signals to s_axilite interface
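The INTERFACE directives applied in the steps above can also be written as pragmas in the source
file, which is the form the Directive Editor inserts when the Source File destination is chosen. The
sketch below shows only the pragma placement: the function name fir_iface_sketch and its pass-through
body are placeholders, not the lab's fir function. An ordinary C compiler ignores the HLS pragmas,
possibly with a warning.

```c
#include <stdint.h>

typedef int16_t data_t;   /* 16-bit sample, as in the lab */

/* Top-level function with its ports and control signals mapped to one
 * AXI4-Lite bundle named fir_io. */
void fir_iface_sketch(data_t *y, data_t x)
{
#pragma HLS INTERFACE s_axilite port=x bundle=fir_io
#pragma HLS INTERFACE s_axilite port=y bundle=fir_io
#pragma HLS INTERFACE s_axilite port=return bundle=fir_io
    *y = x;   /* placeholder body; the real function computes the FIR output */
}
```

The port=return line is what moves the block-level control signals into the same register map as x
and y.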
Note that the above steps 8-1-3 through 8-1-8 will create address maps for x, y, ap_start,
ap_valid, ap_done, and ap_idle, which can be accessed via software. Alternatively, the ap_start,
ap_valid, ap_done, and ap_idle signals can be generated as separate ports on the core by not
applying the INTERFACE directive to the top-level module fir. These ports will then have to be
connected in a processor system using available GPIO IP.
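Once the control signals are address-mapped, software drives the core by reading and writing a
control register. The sketch below assumes the usual Vivado HLS s_axilite layout, with the control
register at offset 0x00 and ap_start, ap_done, and ap_idle in bits 0, 1, and 2; the authoritative
offsets are in the header file generated with the driver. The helper names are hypothetical and are
not the generated driver API.

```c
#include <stdint.h>

#define CTRL_WORD     0u          /* control register, byte offset 0x00 (assumed) */
#define AP_START_BIT  (1u << 0)
#define AP_DONE_BIT   (1u << 1)
#define AP_IDLE_BIT   (1u << 2)

/* Request one execution of the core by setting ap_start. */
void accel_start(volatile uint32_t *regs)
{
    regs[CTRL_WORD] |= AP_START_BIT;
}

/* Non-zero when the core reports that the last execution finished. */
int accel_is_done(const volatile uint32_t *regs)
{
    return (regs[CTRL_WORD] & AP_DONE_BIT) != 0;
}

/* Non-zero when the core is idle and can accept a new start. */
int accel_is_idle(const volatile uint32_t *regs)
{
    return (regs[CTRL_WORD] & AP_IDLE_BIT) != 0;
}
```

On the board, regs would point at the base address assigned to the core in the Address Editor; on a
host, the helpers can be exercised against a plain array that stands in for the register file.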
9-1-1. Since the directives have been added, it is safe to re-synthesize the design. Select Solution >
Run C Synthesis > Active Solution.
9-1-2. Once the design is synthesized, select Solution > Export RTL to open the dialog box so the
desired IP can be generated.
9-1-4. When the run is completed, expand the impl folder in the Explorer view and observe various
generated directories; ip, verilog and vhdl.
Expand the ip directory and observe several files and sub-directories. One of the sub-directories of
interest is the drivers directory, which consists of header, c, tcl, mdd, and makefile files. Another
file of interest is the zip file, which is the IP repository file that can be imported into an IP
Integrator design.
10-1-1. In Vivado’s Flow Navigator pane, click Project Settings under Project Manager.
10-1-3. Click the Green Plus button. Browse to <2015_2_zynq_labs>\fir.prj\solution1\impl\ip and click
Select.
The directory will be scanned and added in the IP Repositories window, and Fir IP entry will be
displayed in the IP in Selected Repository window.
10-1-4. Click OK to accept the settings, and No, if prompted to create a new synthesis run.
10-2. Open the block design. Instantiate fir_top core twice, one for each side
channel, into the processing system naming the instances as fir_left and
fir_right.
10-2-1. Select Open Block Design > system.bd from the Flow Navigator pane.
10-2-2. Click the Add IP icon and search for FIR in the catalog by typing FIR and double-click on the
FIR entry to add an instance.
Notice that the added IP has HLS logo in it indicating that this was created by Vivado HLS.
10-2-4. Select the added instance in the diagram, and change its instance name to fir_left by typing it in
the Name field of the Block Properties form in the left.
10-2-5. Similarly, add another instance of the HLS IP, and name it fir_right.
10-2-6. Click on Run Connection Automation, and select /fir_left/S_AXI_FIR_IO and click OK.
10-2-7. Similarly, click on Run Connection Automation again, and select /fir_right/S_AXI_FIR_IO and
click OK.
At this stage the design should look like the one shown below (click the regenerate button). Note that
the interrupt signals for the fir IP blocks are not required and will remain unconnected.
10-3. Verify addresses and validate the design. Generate the system_wrapper
file.
10-3-1. Click on the Address Editor, and expand the processing_system7_0 > Data if necessary.
10-3-2. Run Design Validation (Tools > Validate Design) and verify there are no errors
10-3-3. In the sources view, right-click on the block diagram file, system.bd, and select Create HDL
Wrapper to update the HDL wrapper file. When prompted, click OK with the Let Vivado manage
wrapper and auto-update option.
10-3-4. Click on the Generate Bitstream in the Flow Navigator to run the synthesis, implementation, and
bitstream generation processes.
11-1-1. If SDK is already open then skip the next two steps.
11-1-5. Make sure that Include Bitstream option is selected and click OK, leaving the target directory set
to local project directory.
The SDK will detect the change in the hardware and bitstream and may pop-up a warning box
asking you if it is OK to update.
The BSP will be re-compiled and the system.mss tab will be regenerated showing fir_top driver
assigned to fir_left and fir_right instances.
Figure 24. Updated mss file with driver assigned to the added IP
11-1-7. Right-click on the standalone_bsp_0 and select Re-generate BSP Sources. Click Yes to re-
generate the BSP.
11-2. Remove the user defined SW_PROFILE symbol and add the HW_PROFILE
symbol.
11-2-1. Select the lab6 application, right-click, and select C/C++ Build Settings.
11-2-2. Under the ARM gcc compiler group, select the Symbols sub-group, select SW_PROFILE, and
delete it by clicking on the delete button.
This will allow us to profile the hardware IP of the FIR application. The program should compile
without error.
11-3. Make sure that SW[1] is in the ON position. Power ON the board. Program
the FPGA. Profile the application using the hardware FIR filter IP.
11-3-5. From the menu bar, select Run > Run Configurations and click the Run button to profile the
application.
11-3-7. Invoke gprof by double-clicking gmon.out entry under lab6 > Debug folder in the Project Explorer
view of SDK, select the Sorts samples per function output, and sort the %Time column.
Notice that the output now shows filter_hw_accel_input function call instead of the fir_software
function call. Note that the number of calls to the filter function has not changed but the average
time spent per call is 1.852 us for ZedBoard or 1.823 us for Zybo as the filtering is done in the
hardware instead of the software.
Also notice that the amount of time spent in the filtering function reduced from about 9.45% to
4.86% for ZedBoard or 11.15% to 3.54% for Zybo.
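The per-call times from the two profiling runs give the speedup of the filtering call directly. A
minimal sketch with a hypothetical helper name:

```c
#include <assert.h>

/* Ratio of the time per call before and after acceleration. */
double speedup(double before_us, double after_us)
{
    assert(after_us > 0.0);
    return before_us / after_us;
}
```

speedup(6.235, 1.852) is about 3.37 for the ZedBoard and speedup(6.294, 1.823) is about 3.45 for the
Zybo. These figures cover only the filter call as seen by the profiler, not the whole application.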
Figure 26. Profiling the application with the hardware IP for ZedBoard
Figure 26. Profiling the application with the hardware IP for Zybo
12-1-1. Right-click on the standalone_bsp_0 project in the Project Explorer and select Board Support
Package Settings.
12-1-2. Select the Overview > standalone entry in the left pane, click on the drop-down arrow of the
enable_sw_intrusive_profiling Value field and select false.
12-1-3. Select the Overview > drivers > cpu_cortexa9 and remove -pg from the extra_compiler_flags
Value field.
12-1-5. Right-click on the lab6 project and select C/C++ Build Settings. Select the Profiling settings and
uncheck the Enable Profiling check box.
12-1-6. Select Run > Run Configurations. Select the Profiling tab, and uncheck the Enable Profiling
option.
12-2. Connect an audio patch cable between the Line In jack and the Speaker
(header) out jack of a PC. Connect a headphone to the Line Out jack of the
ZedBoard or HPH Out of the Zybo. Set the SW[1] in the OFF position. Play
the provided corrupted_music_4kHz.wav file.
12-2-1. Connect an audio patch cable between the Line In jack and the Speaker (header) out jack of a
PC
12-2-2. Connect a headphone to the Line Out jack on the ZedBoard or HPH Out on the Zybo.
12-2-4. Double-click corrupted_music_4KHz.wav or some other wave file of interest to play it using the
installed media player. Place it in the continuous play mode.
12-2-5. Right-click on the lab6 in the Project Explorer pane and select Run As > Launch On Hardware
(GDB).
The program will be downloaded and run. If you want to listen to the corrupted signal, set
SW[0] OFF. To listen to the filtered signal, set SW[0] ON.
12-2-6. When done, power OFF the board, and exit SDK and Vivado using File > Exit.
Conclusion
In this lab, you profiled a software application after creating a processor system using IP Integrator. Then
you created a Vivado HLS project and added the INTERFACE directive to create an IP-XACT adapter. You
generated the IP-XACT adapter during the export phase. You then updated the processor system using
the generated IP-XACT adapter, and profiled the system with the provided application.
Answers
1. Answer the following questions:
ZedBoard:
Estimated clock period: 7.95 ns
Worst case latency: 175 clock cycles
Number of DSP48E used: 3
Number of BRAMs used: 0
Number of FFs used: 138
Number of LUTs used: 393
Zybo:
Estimated clock period: 6.38 ns
Worst case latency: 175 clock cycles
Number of DSP48E used: 3
Number of BRAMs used: 0
Number of FFs used: 167
Number of LUTs used: 106
Appendix
13-1-1. Open Vivado by selecting Start > All Programs > Xilinx Design Tools > Vivado 2015.2 >
Vivado 2015.2
13-1-2. Click Create New Project to start the wizard. You will see the Create a New Vivado Project
dialog box. Click Next.
13-1-3. Click the Browse button of the Project Location field of the New Project form, browse to
<2015_2_zynq_labs>\lab6, and click Select.
13-1-4. Enter audio in the Project Name field. Make sure that the Create Project Subdirectory box is
checked. Click Next.
13-1-5. Select RTL Project in the Project Type form, and click Next.
13-1-6. Select Verilog as the Target language and Simulator Language in the Add Sources form, and
click Next.
13-1-7. Click Next two times to skip Adding Existing IP and Add Constraints dialog boxes
13-1-8. In the Default Part form, select Boards, and select ZedBoard or Zybo. Click Next.
13-1-9. Check the Project Summary and click Finish to create an empty Vivado project.
14-1-1. In the Flow Navigator, click Create Block Design under IP Integrator
14-1-3. IP from the catalog can be added in different ways. Click on Add IP in the message at the top of
the Diagram panel, or click the Add IP icon in the block diagram side bar, press Ctrl + I, or
right-click anywhere in the Diagram workspace and select Add IP.
14-1-4. Once the IP Catalog is open, type “zy” into the Search bar, find and double click on ZYNQ7
Processing System entry, or click on the entry and hit the Enter key to add it to the design.
14-1-5. Notice the message at the top of the Diagram window that Designer Assistance is available. Click
on Run Block Automation and select /processing_system7_0
14-1-6. Click OK when prompted to run automation with the default settings.
Notice that external ports have been automatically added for the DDR and Fixed IO once Block
Automation has completed. Some of the other default ports are also added to the block.
14-1-7. In the block diagram, double click on the Zynq block to open the Customization window for the
Zynq processing system.
A block diagram of the Zynq should now be open, showing various configurable blocks of the
Processing System.
At this stage, the designer can click on various configurable blocks (highlighted in green) and
change the system configuration.
14-2. Configure I/O Peripherals block to use UART 1 and I2C 1 peripherals,
disabling other unwanted peripherals. Enable FCLK_CLK1, the PL fabric
clock and set its frequency either to 10.000 MHz for the ZedBoard or to
12.288 MHz for the Zybo.
14-2-1. Select the MIO Configuration tab on the left to open the configuration form and expand I/O
Peripheral in the right pane.
14-2-2. Click on the check box of the I2C 1 peripheral. Uncheck USB0, SD 0, ENET 0, GPIO > GPIO MIO
as we don’t need them.
14-2-3. Select the Clock Configuration in the left pane, expand the PL Fabric Clocks entry in the right,
and click the check-box of FCLK_CLK1.
14-2-4. Change the Requested Frequency value of FCLK_CLK1 to 10.000 MHz for the ZedBoard or
12.288 MHz for the Zybo.
Figure A-4. Enabling and setting the frequency of FCLK_CLK1 for the ZedBoard
Figure A-4. Enabling and setting the frequency of FCLK_CLK1 for the Zybo
Notice that the Zynq block only shows the necessary ports.
14-3. Add the provided I2C-based IP, either zed_audio_ctrl for the ZedBoard or
zybo_audio_ctrl for the Zybo, to the IP Catalog
14-3-1. In the Flow Navigator pane, click IP Catalog under Project Manager.
14-3-3. Click on the Green Plus button. Browse to <2015_2_zynq_sources>\lab6_zed for the
ZedBoard or <2015_2_zynq_sources>\lab6_zybo for the Zybo directory, and click Select.
Notice that either the zed_audio_ctrl or the zybo_audio_ctrl entry is displayed in the IP in
Selected Repository field.
14-4-1. Click the Add IP button if the IP Catalog is not open and search for AXI GPIO in the catalog by
typing gpi and double-click on the AXI GPIO entry to add an instance.
14-4-3. Double-click on the added instance and the Re-Customize IP GUI will be displayed.
14-4-4. Change the Channel 1 width to 2 for the ZedBoard or width of 1 output only for the Zybo.
14-4-5. Check the Enable Dual Channel box, set the width to 2 for the ZedBoard or width of 2 input only
for the Zybo, and click OK.
14-4-6. Similarly, add an instance of either the zed_audio_ctrl IP for the ZedBoard or the
zybo_audio_ctrl IP for the Zybo.
14-4-7. Notice that Design assistance is available. Click on Run Connection Automation, and
select /axi_gpio_0/S_AXI
Notice two additional blocks, Proc Sys Reset, and AXI Interconnect have automatically been
added to the design.
14-4-9. Similarly, click on Run Connection Automation, and select either /zed_audio_ctrl_0/S_AXI for
the ZedBoard or the /zybo_audio_ctrl_0/S_AXI for the Zybo and click OK.
14-5-1. Select the GPIO interface of the axi_gpio_0 instance, right-click on it and select Make External to
create an external port. This will create the external port named GPIO and connect it to the
peripheral.
14-5-2. Select the GPIO2 interface of the axi_gpio_0 instance, right-click on it and select Make External
to create the external port.
14-5-3. Similarly, select the ports of either the zed_audio_ctrl_0 instance or the
zybo_audio_ctrl_0 instance one at a time, and make them external.
14-5-4. Similarly, make the IIC_1 interface and FCLK_CLK1 port of the processing_system7_0 instance
external.
At this stage the design should look like the one shown below (you may have to click the regenerate
button).
Figure A-7. Block design after I2C based zed_audio_ctrl core added and connections made
for the ZedBoard
Figure A-7. Block design after I2C based zybo_audio_ctrl core added and connections
made for the Zybo