23 Profiling and Performance Improvement
Zynq
Vivado 2015.2 Version
This material exempt per Department of Commerce license exception TSU © Copyright 2015 Xilinx
Objectives
Introduction
Software Profiling in XSDK
Performance Improvement
Summary
Profiling results are available in two useful formats
– Samples per function (flat profile): how much time is spent in each routine
– Call graph: which functions call which, and how often
Software Profiling in XSDK
Profiling is hardware and software intrusive
– Requires a hardware timer
– Requires a dedicated profile RAM area
– Executable is modified with profiler routines
A dedicated hardware timer interrupts the processor at a fixed interval
– The interrupt routine keeps track of the program counter at each interrupt
– A histogram of PC locations is kept in profile RAM
– Interrupt interval time is programmable
Every function call in the software application is annotated by the compiler to track
which functions are called, and by which callers
Set board support package option to include the profiler in the BSP
Enable profiling of the application in the compiler with the -pg option in the board support
package
Compile, link, and generate the ELF executable
Create run configuration of the executable
– Configure the profiler memory
– Set the timer interrupt interval
Download the executable into a hardware or software simulator
Run the software application until completion or for a chosen amount of time
Execute the GNU gprof tool to generate report output
Profile scratch memory is populated with statistics while the program is executing
– Intrusive profiling routines and the fixed interval timer interrupt use this memory
– The collected statistics are stored in gmon.out upon completion or when execution is halted
The gprof tool reads gmon.out and assembles the information into a user-configurable
report
The user launches gprof after execution completes
Effective profiling is based on how much time is spent in functions, and how often they
are called
– If your code is just a fall-through main, profiling is not useful because 100 percent of execution time
will be in main with no calls to other functions
– Architect the application carefully, structuring it with functions
– The compiler does not treat macros as functions; a macro is expanded and treated as in-line
code
– Separate algorithms logically into functions; this helps you analyze the flat profile view
– Think ahead when architecting code: is this algorithm a candidate for implementation in programmable
logic?
Summary
Profiling allows you to analyze the software and determine where the CPU’s time is
spent
Profiling can help you rearrange or rewrite the code, or decide whether a
function should be moved to hardware
The gprof tool is used to generate a profiling report from collected statistics
A hardware timer and memory are required to use the profiling tool
– Sampling frequency will have direct impact on the amount of memory used to collect samples
Profiling in XSDK is provided by the Standalone BSP as a GNU service
Enabling the caches can improve performance
Porting software into hardware can improve system performance
– Vivado HLS
– System Generator