Vaidya 2017 Hardware Acceleration of Image Proc
ICICCS 2017
Abstract: In this paper, image processing algorithms are designed using high level synthesis and implemented on different hardware platforms. The implementations are compared in terms of speed, latency and resource utilization across these platforms. The use of pipelining to reduce latency is illustrated with an example. Using high level synthesis, the designer can employ libraries similar to OpenCV and take advantage of the productivity benefits of working at a higher level of abstraction while still creating high-performance hardware. Basic image processing algorithms such as histogram calculation, histogram equalization, the averaging filter and the Laplacian filter are chosen to explain hardware acceleration using high level synthesis. The workflow for implementing high level C / C++ / SystemC code in hardware using the Vivado high level synthesis tool is explained, along with implementation results for various image processing algorithms.

Keywords: image processing; Vivado HLS; Zynq SoC; Virtex 7; Virtex 6; Spartan 6; Histogram; Laplacian filter.

I. INTRODUCTION

Reconfigurable computing has gained increasing attention from researchers and industry over the last few years, as it offers a very interesting combination of hardware performance and software flexibility. The complexity of hardware design is growing day by day, and with it the number of lines of Hardware Description Language (HDL) code. Most engineers have to spend a significant amount of time learning to program FPGAs using hardware description languages such as Verilog and VHDL, because modeling hardware is vastly different from designing software and requires a good knowledge of hardware. To overcome this, the concepts of hardware / software co-design and high level synthesis were introduced.

High level synthesis tools like Vivado HLS from Xilinx convert high level C / C++ / SystemC code into a Register Transfer Level (RTL) implementation that synthesizes into Xilinx FPGAs [1]. Vivado HLS makes it possible for a software designer to accelerate computationally complex applications on hardware that provides a massively parallel architecture, with benefits in performance, cost and power over traditional processors [1]. It allows hardware designers who implement designs in an FPGA to take advantage of the productivity benefits of working at a higher level of abstraction while creating high-performance hardware. The process of HDL coding and behavioral simulation in the conventional FPGA design flow can be replaced by the workflow of Vivado HLS [2].

The primary benefits of an HLS design methodology are improved productivity for hardware designers and improved system performance for software designers. Development time is reduced because the designer works at a level abstracted from implementation details. Verification at the C level validates the functional correctness of the design orders of magnitude faster than traditional hardware description languages allow. Controlling the C synthesis process through optimization directives allows the creation of specific high-performance hardware implementations, and the quick creation of many different implementations from the same C source code enables easy design space exploration and improves the likelihood of finding an optimal implementation [2].

Image processing algorithm design takes advantage of a high-level synthesis tool because it allows the designer to employ libraries similar to OpenCV, a library that is well known and widely used by software designers for computer vision applications [1, 9]. High level synthesis helps reduce time to market because designers are already familiar with these libraries. However, high-level synthesis tools are far from perfect. Developers still need embedded hardware knowledge and experience to accomplish a successful design [3].

The paper is organized as follows. In Section II, other similar implementations are explored. In Section III, the design flow for Vivado HLS is discussed along with details of the basic image processing algorithms. The implementation results and comparison results are also shown.

II. RELATED WORK

Various studies have been presented in the literature regarding the use of reconfigurable architectures for acceleration of image processing applications. Developments in HLS attract many software and hardware designers to enhance the implementation of different solutions [2]. To improve the design productivity of FPGA-based image processing, several researchers have demonstrated edge detector applications using high-level synthesis tools.

Hanaa M Abdelgawad, Mona Safar, and Ayman M Wahba proposed high level synthesis of the Canny edge detection algorithm on the Zynq platform [2]. K V Ramana Reddy implemented the Canny Edge Detector (CED) algorithm on the Spartan 3E FPGA platform with a Video Graphics Array (VGA) interface for displaying the images on the monitor [4]. The maximum image size implemented was 128 x 128, using BRAM to store the
and Table 4 shows the results with and without pipelining in terms of hardware resources.

Table 3 Comparison of Histogram Equalization without pipelining HLS for different hardware platforms

Board             Clock Period (ns)  Freq. (MHz)  Latency  LUT   FF    DSP48E
Zynq ZC 602       8.13               123          1769479  2351  1528  3
Virtex7 VC 709    8.73               114.5        983044   1738  886   3
Virtex 6 ML 605   8.50               117.6        1245189  1724  1038  3
Spartan-6 SP 605  8.53               117.2        1507333  1768  1110  5

B. Smoothing Filter

A useful filter in video and image processing is the smoothing filter, also known as the averaging or blurring filter. This filter is widely used to reduce noise in the image, particularly white noise. It is also used as a pre-processing stage in computer vision algorithms in order to enhance images for use in later stages of image processing and computer vision. It is considered a low pass filter; it gets rid of sharp edges and quick changes in pixel values, i.e. the high frequency part of the image [13]. The averaging filter runs through the image pixel by pixel and replaces each pixel with a new value that is the average of the window of pixels around it. For a window size of 3x3, the averaging filter applies the following window to all pixels:

$$\frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \qquad (2)$$

Fig 5 shows the flowchart for implementing the averaging filter in C++. The implementation uses row buffers to store past pixel values, so that the windowing operation, which needs the 3 x 3 neighborhood, does not have to fetch all nine values every time. The filtered center pixel is returned to the test bench, and the output image is created from these values using the OpenCV library.

Fig 6 Result of application of smoothing filter on image

The following table compares high level synthesis of the averaging filter code for various hardware platforms.

Table 5 Comparison of Averaging filter HLS for different hardware platforms

Board             Clock Period (ns)  Freq. (MHz)  Latency  LUT  FF   DSP48E  BRAM_18K
Zynq ZC 602       7.79               128.3        327680   401  219  2       2
Virtex 7 VC 709   6.76               148.4        196608   407  217  2       2
Virtex 6 ML 605   8.72               114.5        196608   411  217  2       2
Spartan-6 SP 605  8.71               115.0        458752   459  221  2       2
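
The row-buffer scheme described in this section can be sketched in plain C++ as follows. This is a minimal illustration and not the authors' code: the image dimensions ROWS and COLS, the function name average3x3 and the policy of copying border pixels through unchanged are all assumptions made for the example. The comment marks where a Vivado HLS pipelining directive would typically be placed; it is left as a comment so that the sketch compiles with an ordinary C++ compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image size, chosen only for illustration.
constexpr int ROWS = 6;
constexpr int COLS = 8;

// Streaming 3x3 averaging filter: every input pixel is read exactly once.
// Assumption: border pixels, where the window does not fit, keep their input value.
std::vector<std::uint8_t> average3x3(const std::vector<std::uint8_t>& in) {
    assert(in.size() == static_cast<std::size_t>(ROWS * COLS));

    std::vector<std::uint8_t> out(in);      // borders are copied through
    std::uint8_t row_buf[2][COLS] = {};     // the two previously read rows
    std::uint8_t window[3][3] = {};         // sliding 3x3 neighbourhood

    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            // In Vivado HLS, a directive such as
            // "#pragma HLS PIPELINE II=1" would typically go here.
            const std::uint8_t px = in[r * COLS + c];

            // Shift the window one column to the left.
            for (int i = 0; i < 3; ++i) {
                window[i][0] = window[i][1];
                window[i][1] = window[i][2];
            }
            // Fill the new right-hand column from the row buffers
            // (rows r-2 and r-1) plus the freshly read pixel (row r).
            window[0][2] = row_buf[0][c];
            window[1][2] = row_buf[1][c];
            window[2][2] = px;

            // Age the row buffers for this column.
            row_buf[0][c] = row_buf[1][c];
            row_buf[1][c] = px;

            // The window is complete once three rows and three columns
            // of the current row have been seen; its centre is (r-1, c-1).
            if (r >= 2 && c >= 2) {
                int sum = 0;
                for (int i = 0; i < 3; ++i)
                    for (int j = 0; j < 3; ++j)
                        sum += window[i][j];
                out[(r - 1) * COLS + (c - 1)] =
                    static_cast<std::uint8_t>(sum / 9);
            }
        }
    }
    return out;
}
```

Because the two row buffers already hold the rows above the current one, each iteration reads only one new pixel from the input instead of nine, which is what lets the inner loop be pipelined to accept one pixel per clock cycle in a hardware implementation.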
C. Laplacian Filter