Aydin 2019
Abstract—The latest generation of FPGAs is shaping the future use of FIR filters. Their DSP blocks can implement fixed-point data types for efficient computation. A systolic multiply-accumulate architecture is used for filters of various orders and for parallel multichannel operation in order to handle resource and timing constraints efficiently. The resource usage and latency of this architecture are observed for various numbers of filter taps on a Xilinx Artix-7 (XC7A100T-1CSG324C) device with a clock frequency of 100 MHz, a 12-bit input and a 12-bit output. The results also show that the design is suitable for parallel multichannel implementations such as power electronics applications in smart grids.

Keywords—FPGA, FIR filter, parallel multichannel, ADC, DAC, smart grids

I. INTRODUCTION

Digital filtering is one of the important aspects of digital signal processing. Digital filters are used to remove unwanted portions of a signal in applications such as power electronics and control systems. A digital filter is built from adders, multipliers and shift register blocks. The filter architecture arranges these blocks and determines the speed, complexity and power consumption [1].

A digital filter receives digital inputs and produces digital outputs. Typically, a digital signal processor reads a sample from an analog-to-digital converter, performs the mathematical operations required by the filter type and writes the result to a digital-to-analog converter [2]. Fig. 1 shows this conventional flow with an FPGA (field-programmable gate array).

[...] CLBs (configurable logic blocks). This structure provides speed, flexibility, cost and performance advantages compared to conventional DSP (digital signal processing) processors [4].

Noisy signals that enter the control cycle without filtering cause problems such as instability and misdetection. Another important issue is that systems that rely only on microprocessors lose performance and response time because filtering the signals takes a long time. In power electronics applications, a large number of signals, most of them containing switching noise, need to be read and evaluated for protection or control. Filtering the signals with minimum latency is therefore essential to enhance system performance.

In this paper, the implementation of parallel multichannel FIR filters with various numbers of taps on a Xilinx Artix-7 (XC7A100T-1CSG324C) FPGA device is demonstrated with regard to resource usage, latency and multichannel aspects. The rest of the paper is organized as follows. Section II places the work in the context of smart grids. Section III gives the theory of an FIR filter. Section IV describes how to design an FIR filter. Section V explains the architecture used in this implementation in detail. Section VI describes how to quantize the coefficients. Section VII gives details about output rounding. Section VIII describes the FPGA implementation. Section IX presents the test setup, tables and results. Finally, the last section gives the conclusion and future work.

II. SMART GRIDS APPROACH

A smart grid must cope with a large number of variables, which may include distributed generation, demand response and the electric grid itself. As a result, the communication infrastructure needs to handle data obtained from sensors and monitoring equipment, which becomes more complicated with the use of enhanced power electronics equipment [5].

In today's smart grids, power electronics equipment needs high data density and very fast communication units. Processing the data from sensors and monitoring equipment requires FPGAs to overcome the slow response of classic microprocessors. Fig. 2 illustrates this approach and the context of using FPGAs in smart grids.
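
The adder, multiplier and shift-register structure described in the Introduction can be illustrated with a short software model. The following Python sketch is not the hardware implementation used in this paper; it is a minimal direct-form FIR filter, y[n] = sum over k of b[k] x[n-k], operating on integer samples. The function name and the example coefficients are assumptions chosen for illustration only.

```python
from collections import deque


def fir_direct_form(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * x[n - k]."""
    # Shift register (delay line) holding the latest samples, newest first.
    delay_line = deque([0] * len(coeffs), maxlen=len(coeffs))
    outputs = []
    for x in samples:
        # Shift in the new sample; the oldest one falls off the far end.
        delay_line.appendleft(x)
        # One multiplier per tap, followed by the adder chain.
        acc = 0
        for c, d in zip(coeffs, delay_line):
            acc += c * d
        outputs.append(acc)
    return outputs


# Illustrative coefficients (hypothetical, not taken from the paper).
impulse_response = fir_direct_form([1, 0, 0, 0], [1, 2, 3])
step_response = fir_direct_form([1, 1, 1, 1], [1, 2, 3])
```

Feeding a unit impulse returns the coefficients themselves, which is a quick sanity check for any FIR model before moving to hardware.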
Authorized licensed use limited to: Auckland University of Technology. Downloaded on June 04,2020 at 11:57:51 UTC from IEEE Xplore. Restrictions apply.
Fig. 3. FIR filter basic direct form

IV. DESIGN OF FIR FILTERS

The characteristics of digital filters are usually specified in the frequency domain, and hence the design is based on magnitude-response characteristics. In practice, a filter cannot achieve the infinitely sharp cutoff of the ideal low-pass filter shown in Fig. 4.

Fig. 4. Ideal low pass filter

\[
H_p(e^{j\omega}) =
\begin{cases}
e^{-j\omega M/2}, & |\omega| < \omega_c \\
0, & \omega_c < |\omega| \le \pi
\end{cases}
\tag{4}
\]

The corresponding ideal impulse response is:

\[
h_p[n] = \frac{1}{2\pi} \int_{-\omega_c}^{\omega_c} e^{-j\omega M/2} \, e^{j\omega n} \, d\omega
= \frac{\sin\left[\omega_c \left(n - \frac{M}{2}\right)\right]}{\pi \left(n - \frac{M}{2}\right)}
\tag{5}
\]

V. ARCHITECTURE

The systolic multiply-accumulate architecture is directly supported by the DSP slices, which results in area-efficient, high-performance filter implementations. In addition, this architecture can be extended to exploit coefficient symmetry, which saves further resources [7].
Fig.6 displays the Systolic Multiply-Accumulate architecture
implementation.
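
The link between the ideal impulse response in (5) and the coefficient symmetry exploited by the architecture can be sketched in software. The Python example below is an illustration under stated assumptions, not the hardware design of this paper: it samples (5) for n = 0..M without any window (the window actually used is not specified here), and then computes the filter output in a folded form that adds the two samples sharing a coefficient before multiplying. The function names are hypothetical.

```python
import math


def ideal_lowpass_taps(num_taps, wc):
    """Sample eq. (5) for n = 0..M, where M = num_taps - 1 (no window applied)."""
    M = num_taps - 1
    taps = []
    for n in range(num_taps):
        t = n - M / 2
        if t == 0:
            # Limit of sin(wc * t) / (pi * t) as t approaches 0.
            taps.append(wc / math.pi)
        else:
            taps.append(math.sin(wc * t) / (math.pi * t))
    return taps


def fir_symmetric(samples, taps):
    """FIR output in folded form; assumes taps[k] == taps[len(taps) - 1 - k]."""
    n_taps = len(taps)
    half = n_taps // 2
    history = [0.0] * n_taps  # history[k] holds x[n - k]
    outputs = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0.0
        for k in range(half):
            # Pre-add the two samples that share a coefficient, then multiply once.
            acc += taps[k] * (history[k] + history[n_taps - 1 - k])
        if n_taps % 2:
            # Odd length: the centre tap has no partner.
            acc += taps[half] * history[half]
        outputs.append(acc)
    return outputs


taps = ideal_lowpass_taps(8, 0.25 * math.pi)
response = fir_symmetric([1.0] + [0.0] * 7, taps)
```

Because (5) depends only on the distance from M/2, the sampled coefficients are symmetric, so the folded form needs roughly half as many multiplications per output sample as the direct form.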
Fig. 14. 8 taps filtered signal in hardware ILA
Fig. 15. Output of the 3 taps filtered signal and noisy signal
Fig. 17. Output of the 5 taps filtered signal and noisy signal
Fig. 18. Output of the 8 taps filtered signal and noisy signal

TABLE I. FILTER ANALYSIS (PASS BAND)

Taps    Min (dB)       Max (dB)       Ripple (dB)
3       -38.401220     -21.330237     17.070983
4       -38.416314     -13.901644     24.514670
8       -39.653911     5.255691       44.909602
16      -44.288399     3.451442       47.739841
32      -54.391398     1.186509       55.577907
64      -84.288399     0.086351       84.374750
128     -76.329599     0.005684       76.335282
256     -72.247199     0.010586       72.257785

TABLE II. FILTER ANALYSIS (STOP BAND)

Taps    Min    Max (dB)       Ripple
3       -      -38.249385     -
4       -      -38.416314     -
8       -      -39.489903     -
16      -      -43.989592     -
32      -      -53.345216     -
64      -      -61.919041     -
128     -      -59.358466     -
256     -      -60.048297     -

TABLE III. DSP SLICE USAGE (SINGLE CHANNEL)

Taps    Utilization    Available    Utilization %
3       2              240          0.83
4       3              240          1.25
8       5              240          2.08
16      9              240          3.75
32      17             240          7.08
64      33             240          13.75
128     65             240          27.08
256     129            240          53.75

TABLE IV. DSP SLICE USAGE (MAX 8-PARALLEL CHANNEL)

Taps                   Utilization    Available    Utilization %
3                      16             240          6.67
4                      24             240          10
8                      40             240          16.67
16                     72             240          30
32                     136            240          56.67
64 (max 7-Channel)     231            240          96.25
128 (max 3-Channel)    195            240          81.25
256 (max 1-Channel)    129            240          53.75

TABLE V. LUT USAGE (SINGLE CHANNEL)

Taps    Utilization    Available    Utilization %
3       13812          63400        21.79
4       13844          63400        21.84
8       13908          63400        21.94
16      14036          63400        22.14
32      14292          63400        22.54
64      14805          63400        23.35
128     16326          63400        25.75
256     21080          63400        33.25

TABLE VI. LUT USAGE (MAX 8-PARALLEL CHANNEL)

Taps                   Utilization    Available    Utilization %
3                      14036          63400        22.14
4                      14292          63400        22.54
8                      14804          63400        23.35
16                     15828          63400        24.97
32                     17876          63400        28.20
64 (max 7-Channel)     22629          63400        35.69
128 (max 3-Channel)    21414          63400        33.78
256 (max 1-Channel)    21080          63400        33.25

TABLE VII. FLIP-FLOP USAGE (SINGLE CHANNEL)

Taps    Utilization    Available    Utilization %
3       14769          126800       11.65
4       14802          126800       11.67
8       14900          126800       11.75
16      15096          126800       11.91
32      15488          126800       12.21
64      16272          126800       12.83
128     17840          126800       14.07
256     21044          126800       16.60

TABLE VIII. FLIP-FLOP USAGE (MAX 8-PARALLEL CHANNEL)

Taps                   Utilization    Available    Utilization %
3                      15665          126800       12.35
4                      15922          126800       12.56
8                      16692          126800       13.16
16                     18232          126800       14.38
32                     21312          126800       16.81
64 (max 7-Channel)     26776          126800       21.12
128 (max 3-Channel)    24112          126800       19.02
256 (max 1-Channel)    21044          126800       16.60

TABLE IX. LATENCY (SINGLE VS MAX 8-PARALLEL CHANNEL)

                       Latency (cycles)
Taps                   Single Channel    Multichannel    Difference
3                      9                 9               -
4                      10                10              -
8                      12                12              -
16                     16                16              -
32                     24                24              -
64 (max 7-Channel)     40                40              -
128 (max 3-Channel)    72                72              -
256 (max 1-Channel)    140               -               -
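
The DSP slice figures in Tables III and IV follow a simple pattern that can be checked with a few lines of Python. The sketch below is an empirical fit to the tabulated values, not a formula taken from the vendor documentation: it assumes that a filter listed with N taps occupies ceil((N + 1) / 2) DSP slices per channel, which is what coefficient symmetry would suggest. The function names are hypothetical; the 240 available slices and the 100 MHz clock are the values reported in this paper.

```python
import math

DSP_AVAILABLE = 240   # DSP slices available, as listed in Tables III and IV
CLOCK_HZ = 100e6      # 100 MHz clock frequency stated in the abstract


def dsp_slices_per_channel(taps):
    """Empirical fit to Table III (an assumption, not a vendor formula)."""
    return math.ceil((taps + 1) / 2)


def max_parallel_channels(taps, limit=8):
    """Channels that fit in the available DSP slices, capped at the 8 tested."""
    return min(limit, DSP_AVAILABLE // dsp_slices_per_channel(taps))


def latency_ns(cycles):
    """Convert a latency in clock cycles (Table IX) to nanoseconds."""
    return cycles * 1e9 / CLOCK_HZ


for taps in (3, 4, 8, 16, 32, 64, 128, 256):
    per_channel = dsp_slices_per_channel(taps)
    channels = max_parallel_channels(taps)
    print(taps, per_channel, channels, per_channel * channels)
```

Under this assumption the model reproduces the channel limits of Table IV (7 channels at 64 taps, 3 at 128 and 1 at 256), and the 9-cycle latency of the 3-tap filter corresponds to 90 ns at 100 MHz.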
[Chart "LATENCY-TAPS": latency (cycles) versus taps. Taps: 3, 4, 8, 16, 32, 64, 128, 256. Latency (cycles): 9, 10, 12, 16, 24, 40, 72, 140.]

Fig. 19. Latency taps graph

X. CONCLUSION AND FUTURE SCOPE

Low-pass FIR filters of various orders are implemented on an Artix-7 series FPGA. The systolic multiply-accumulate architecture is used to save resources. The results show that parallel multichannel implementation is possible as long as DSP slices are available. For a single channel, 478 taps can be implemented, which uses all of the DSP slices. Eight channels are implemented to demonstrate parallel operation; however, the results show that up to 64 taps, resources remain available if the number of channels is to be increased. Flip-flop and LUT (look-up table) usage changes only slightly with the number of taps and parallel channels. As long as DSP slices are available for the multichannel implementation, the latency does not change. Filtered signals are captured through the ILA and the DAC. The phase difference between the original noisy signal and the filtered signal is measured as 12.5 µs, 17.3 µs, 22.1 µs and 26.2 µs for 3, 4, 5 and 8 taps, respectively. This implementation shows that hardware based on the parallel data processing capabilities of FPGAs can remove the unwanted portions of signals with FIR filters in minimum time. The low-cost Artix-7 series FPGA with low tap counts filters the noisy signal as intended. Considering the cost of conventional DSP processors, an FPGA with DSP handling capabilities displays better results.

Minimum latency and parallel multichannel processing enable real-time operation in critical systems such as the power electronic devices used in smart grids, which is the future scope of this work. The unwanted portions of the signals (noise) produced by switching devices and similar sources can also be eliminated. Thanks to the parallel processing capability of FPGAs, these systems can be used in applications that require high speed.

REFERENCES

[1] S. M. Qasim, M. S. BenSaleh and A. M. Obeid, "Efficient FPGA implementation of microprogram control unit based FIR filter using Xilinx and Synopsys tools", Proc. of Synopsys User Group Conference (SNUG), Silicon Valley, USA, vol. 3, pp. 1-14, 2012.
[2] A. A. AlJuffri, A. S. Badawi, M. S. BenSaleh, A. M. Obeid and S. M. Qasim, "FPGA implementation of scalable microprogrammed FIR filter architectures using Wallace tree and Vedic multipliers", 3rd IEEE International Conference on Technological Advances in Electrical, Electronics and Computer Engineering, Beirut, Lebanon, vol. 29, pp. 159-162, 2015.
[3] B. Mamatha and V. V. S. V. S. Ramachandram, "Design and implementation 120 order FIR filter based on FPGA", International Journal of Engineering Sciences & Emerging Technologies, vol. 3, pp. 90-97, 2012.
[4] S. Mirzaei, A. Hosangadi and R. Kastner, "FPGA implementation of high speed FIR filters using add and shift methods", 2006 International Conference on Computer Design, San Jose, CA, USA, vol. 10, pp. 308-313, 2007.
[5] P. Faria and Z. Vale, "Digital signal processing issues in the context of the future smart grids", Advanced Science and Technology Letters, (SUComS 2015), vol. 97, pp. 30-35, 2015.
[6] S. M. Kuo, B. H. Lee and W. Tian, "Design and implementation of FIR filters," in Real-Time Digital Signal Processing Implementations and Applications, 2nd ed., England: Wiley, 2006, ch. 4, pp. 185-245.
[7] Xilinx, "FIR Compiler v7.2 LogiCORE IP Product Guide", Vivado Design Suite, PG149, November 18, 2015.
[8] Xilinx, "DSP: Designing for Optimal Results High Performance DSP Using Virtex-4 FPGAs", DSP Solutions-Advanced Design Guide, Edition 1.0, March 2005.