Design of A 2D Median Filter With A High Throughput FPGA Implementation
Anish Goel, Student Member, IEEE, M. Omair Ahmad, Fellow, IEEE, and M.N.S. Swamy, Fellow, IEEE
Department of Electrical and Computer Engineering
Concordia University, Montreal, QC, Canada H3G 1M8
email: {an_goe, omair, swamy}@ece.concordia.ca
Abstract: In this paper, a hybrid technique for median filtering of images affected by impulse noise is proposed. Our technique combines impulse noise detection, histogram-based median calculation and bit-plane processing to obtain an approximate median, with the aim of optimizing the throughput at minimum cost to image quality. The proposed median filter is implemented on FPGA with pipelining and is significantly faster than existing FPGA-based pipelined median filter architectures. Implementation of the proposed median filter hardware provides a throughput of 282 Full High Definition (FHD) frames per second on Zynq-7 FPGA, 48% higher than the throughput of the low-latency median filter. Compared to the FPGA implementation of a low complexity noise removal technique, the proposed median filter utilizes only 45% of the FPGA slices and provides a speed-up of 2.2 on Zynq-7 FPGA.

Keywords: Noise Detection, Bit-Planes, Histogram-Based, Pipelining, Approximate Median.

I. INTRODUCTION
Median filtering is one of the most commonly used techniques for removal of impulse noise in images [1]. Defects in the sensing or capturing device, memory corruption and shot noise affect an image in such a way that some pixel values are set to the minimum while some are set to the maximum, a phenomenon that characterizes salt and pepper noise. For recovery of the original image from a corrupted image, several techniques have been proposed where the best match for each corrupted pixel is calculated. Finding a pixel from the neighborhood, or a close approximation based on the neighborhood pixels, to replace the corrupted pixel is the typical approach in image filtering. This approach is characterized by a 2-Dimensional (2D) window, where pixels surrounding the noisy pixel are processed. Several algorithms and techniques have tried to solve the problem of retrieving the corrupted pixels to achieve high quality of filtered images. However, it may be interesting to note that the actual value of all the corrupted pixels may never be known. This is supported by the fact that none of these algorithms has resulted in an infinite Peak Signal to Noise Ratio (PSNR) between the original and filtered image, in spite of low noise density.

For implementation of median filtering algorithms on hardware platforms such as FPGAs, performance parameters like area, frequency and power are as important as the quality of the filtered image. Of these, the frequency of the design plays a vital role in many real-time applications. The minimum frequency requirement, also known as the 'pixel clock', for a real-time Full High Definition (FHD) vision system at a rate of 60 frames per second (fps) is 148.5 MHz [2], which must be satisfied by every block of the vision system. For offline processing of massive data, even higher frequencies are desired.

The throughput of algorithms that use sorting-based median filters [3] [4] is limited, as these architectures use cascaded stages of sorting blocks (compare and swap) for median calculation. A Low Energy Adaptive Median Filter (LEAMF) architecture presented in [5] processes only the higher 4 bits of pixel values for median calculation. Although LEAMF processes only 4 bits of pixel values, its FPGA implementation results in lower throughput compared to the adaptive median filter presented in [4], which utilizes all 8 bits. On the other hand, FPGA implementation of a histogram-based technique [6] results in large area and low operating frequency due to its design complexity. The quality of images filtered by techniques [3-6] is equivalent to the filtered image quality of the Conventional Median Filter (CMF) [7]. In contrast, a median filter based on a low complexity noise removal technique [8] provides high quality of filtered images by filtering only noisy pixels (decision-based filtering). The work presented in [9] is a bit-plane processing architecture that processes N W-bit integers in smaller blocks of B bits to achieve area and speed optimization. In the proposed work, we design only one block for processing the higher B bits and ignore the remaining (W-B) lower bits. Although ignoring the lower bits compromises the filtered image quality by a small amount, the resulting improvement in hardware performance is large enough to justify this trade-off.

The quality of a filtered image can be judged qualitatively by visualizing the image and quantitatively by calculating its PSNR. In general, a higher PSNR indicates a better approximation to the original image. Qualitatively, there is no significant loss of information even if some of the pixels of a noisy image are not retrieved to near-original values. Following this concept, techniques based on approximate arithmetic [10] have been implemented to achieve optimization in area and power at a cost of less than 1 dB in PSNR.

This work was supported in part by the Natural Science and Engineering
Research Council (NSERC) of Canada and in part by the Regroupement
Stratégique en Microélectronique du Québec (ReSMiQ).
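
The PSNR measure used in the introduction to judge filtered-image quality can be illustrated with a minimal sketch. The function name `psnr`, the flat-list image representation and the 8-bit peak value of 255 are assumptions made for this example and are not taken from the paper.

```python
import math

def psnr(original, filtered, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values.

    `peak` is the maximum pixel value (255 assumes 8-bit pixels).
    """
    if len(original) != len(filtered) or not original:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels
    mse = sum((a - b) ** 2 for a, b in zip(original, filtered)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)
```

An infinite PSNR is returned only when every pixel is restored exactly, which is why the introduction treats it as unattainable in practice.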
based median filter that reads and processes all input pixels of
a window simultaneously, is feasible for hardware
implementation only if limited bits of input pixel values are
considered. For more bits, the hardware grows exponentially.
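
The dependence of histogram size on the number of bits can be seen in a minimal software sketch of a histogram-based median that keeps only the upper bits of each pixel. This is an illustration under stated assumptions, not the paper's hardware design: the function name, the default of 4 kept bits and the list-based window are all hypothetical, and noise detection is omitted.

```python
def approx_median_upper_bits(window, total_bits=8, kept_bits=4):
    """Approximate median of `window` using only the upper `kept_bits` of each pixel."""
    shift = total_bits - kept_bits
    bins = [0] * (1 << kept_bits)   # number of bins doubles with every extra bit kept
    for pixel in window:
        bins[pixel >> shift] += 1   # count pixels by their truncated value
    target = len(window) // 2 + 1   # rank of the median in an odd-sized window
    running = 0
    for level, count in enumerate(bins):
        running += count
        if running >= target:
            return level << shift   # dropped lower bits read back as zero
    raise ValueError("window must not be empty")
```

With `kept_bits=4` the histogram has 16 bins, whereas keeping all 8 bits requires 256, which is the growth the text refers to.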
evaluation board equipped with Xilinx XC7Z010 SoC. Real-time input to the complete system was provided using an APEMAN 1080P FHD camera capturing video at 30 fps with a resolution of 1080×1920, and the frames were fed to the median filter block sequentially, with the processed output visualized on a monitor via an HDMI interface as seen in Fig. 7.

TABLE I. IMPLEMENTATION RESULTS WITH COMPARISON
FPGA       Architecture   # Slices   Frequency (MHz)   Throughput (fps)
Virtex-2   [4]            1506       305               140
           [5]            366        140               56
           [6]            2300       72                35
           [8]            2718       101               48
           Proposed       1688       384               185
Virtex-6   [5]            136        263               105
           [8]            1713       169               91
           Proposed       527        531               256
Zynq-7     [3]            1550       394               190
           [8]            1990       261               125
           Proposed       900        586               282
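
The throughput figures for the proposed design in Table I can be reproduced from the operating frequency using (1). The sketch below is a minimal illustration that assumes one pixel is produced per clock cycle and that the pipeline latency is ignored; the function and constant names are hypothetical.

```python
FHD_FRAME_PIXELS = 1080 * 1920  # Fs in (1): pixels per FHD frame

def throughput_fps(fmax_mhz, frame_pixels=FHD_FRAME_PIXELS):
    """Estimated frames per second at one pixel per clock, ignoring pipeline latency."""
    return (fmax_mhz * 1e6) / frame_pixels

# Operating frequencies of the proposed filter, taken from Table I
for device, fmax in [("Virtex-2", 384), ("Virtex-6", 531), ("Zynq-7", 586)]:
    print(device, int(throughput_fps(fmax)))
```

The same function gives the frame rate supported by the 148.5 MHz pixel clock if blanking intervals are ignored, which is an additional simplification.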
Fig. 7. Evaluation of Proposed Hybrid Median Filter on Zybo Z7 Board

C. Experimental Results
The hardware implementations are verified using RTL simulations, and the VHDL code for the proposed hybrid median filter is mapped to Virtex-2, Virtex-6 and Zynq-7 FPGAs. All experiments were performed on Xilinx Vivado Design Suite. Implementation results of the proposed filter are compared with FPGA implementations of other relevant pipelined median filters for the same window size (5×5). These results are presented in Table I. The throughput Tp in fps is calculated using (1) from the operating frequency fmax (MHz) obtained from the timing summary report after inserting the timing constraints. The pipeline latency δ is very small and may be ignored. Fs is the frame size for FHD video frames, 1080×1920. The throughput obtained from (1) is an estimate. The reference implementations also estimate the throughput of their architectures using similar calculations.

Tp = [(fmax × 10^6) + δ] / Fs    (1)

It is observed from Table I that the proposed hybrid median filter architecture provides the highest throughput compared to the reference architectures. Although the proposed median filter hardware utilizes more area than some architectures, the throughput achieved in such cases is significantly higher. In comparison with LCMF [8], the proposed median filter requires substantially less area while providing a significantly higher operating frequency, leading to a higher throughput on all three FPGAs. Thus, there is a trade-off between image quality and hardware performance when the proposed median filter is compared with LCMF.

For comparing performance with LEAMF [5], the energy required by the proposed architecture is calculated for noisy FHD video frames by importing the simulation activity file into the Xilinx XPower analyzer for the Virtex-6 implementation. It is observed that the proposed median filter consumes only 30% more energy than LEAMF while providing better image quality and a speed-up of 2.4.

V. CONCLUSION
This paper presented a hybrid technique for high speed filtering of images affected by impulse noise. The proposed hybrid median filter performs significantly better than many relevant architectures in terms of image quality as well as hardware performance. We validated the proposed hybrid median filter by designing and testing the complete system on Xilinx Zynq-7 SoC. Experimental results show that the proposed median filter hardware can work at a frequency of 586 MHz when implemented on Zynq-7 FPGA, with an estimated throughput of 282 FHD frames per second.

REFERENCES
[1] Tai, Y. M., Tu, W.C. and Chien, S. Y., “VLSI Architecture Design of Layer-Based Bilateral and Median Filtering for 4k2k Videos at 30fps”, 2017 International Symposium on Circuits and Systems.
[2] Kowalczyk, A., Przewlocka, D. and Kryjak, T., “Real-time Implementation of Contextual Image Processing Operations for 4K Video Stream in Zynq UltraScale+ MPSoC”, 2018 Conference on Design and Architectures for Signal and Image Processing (DASIP).
[3] Kumar, V., Asati, A. and Gupta, A., “Low-latency Median Filter Core for Hardware Implementation of 5 × 5 Median Filtering”, IET Image Processing, 2017, Vol. 11, Issue 10, pp. 927-934.
[4] Vasicek, Z. and Sekanina, L., “Novel Hardware Implementation of Adaptive Median Filters”, 11th IEEE Workshop on Design and Diagnostics of Electronics Circuits and Systems, Apr. 2008.
[5] Kalali, E. and Hamzaoglu, I., “A Low Energy 2D Adaptive Median Filter Hardware”, Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2015.
[6] Fahmy, S.A., Cheung, P.Y.K. and Luk, W., “High-throughput One Dimensional Median and Weighted Median Filters on FPGA”, IET Computers & Digital Techniques, Vol. 3, Issue 4, July 2009.
[7] T. Nodes and N. Gallager, “Median filters: Some Modifications and their Properties,” IEEE Trans. Acoustics, Speech and Signal Processing, vol. ASSP-30, no. 5, pp. 739-746, Oct. 1982.
[8] T. Matsubara, V. Moshnyaga, and K. Hashimoto, “A FPGA Implementation of Low-Complexity Noise Removal,” in Proc. of 17th International Conference on Electronics, Circuits, and Systems (ICECS), pp. 255-258, 2010.
[9] Cadenas, J.O., Megson, G.M. and Sherratt, R.S., “Median Filter Architecture by Accumulative Parallel Counters”, IEEE Transactions on Circuits and Systems-II: Express Briefs, Vol. 62, No. 7, pp. 661-665, July 2015.
[10] Sekanina, L., Vasicek, Z. and Mrazek, V., “Approximate Circuits in Low-Power Image and Video Processing: The Approximate Median Filter”, Radio Engineering, Vol. 26, No. 3, September 2017.
[11] R. C. Gonzalez and R. E. Woods, Eds., Digital Image Processing, 3rd ed. Englewood Cliffs, NJ, USA: Prentice-Hall, 2008.
[12] Image Database: http://www.eecs.qmul.ac.uk/~phao/IP/Images/
[13] Video Database: https://media.xiph.org/video/derf/
[14] Batcher, K.E., “Sorting Networks and their Applications”, Spring Joint Computer Conference, 1968.