
Fir Filters

Uploaded by

Dhoni 007
2015 International Conference on Pervasive Computing (ICPC)

Investigation in FIR Filter to Improve Power Efficiency and Delay Reduction

Rameshwari Sathe
Department of Electronics and Telecommunication
D. N. Patel College of Engineering
Shahada, Dist – Nandurbar, Maharashtra, India
[email protected]

Vijay Patil
Department of Electronics and Telecommunication
D. N. Patel College of Engineering
Shahada, Dist – Nandurbar, Maharashtra, India
[email protected]

Nitin Patil
Department of Instrumentation
D. N. Patel College of Engineering
Shahada, Dist – Nandurbar, Maharashtra, India
[email protected]

Abstract: The design of a Finite Impulse Response (FIR) filter uses adders, coefficients and multipliers. Multiple Constant Multiplication (MCM) is the algorithm used in FIR design to reduce circuit complexity, delay and the large area taken by multiplication. These problems can be optimized further with a newer technique known as digit-serial multiple constant multiplication, which reduces complexity, delay and area utilization. Along with this existing method, a modified carry select adder is implemented in the current paper. The results indicate a 10-20% increase in power efficiency and a 50% reduction in delay compared with existing techniques.

Index Terms: CSA; Delay; FIR filter; GB; MCM; VLSI.

I. INTRODUCTION

Multiplication is very common in the digital systems used in engineering applications. It is a combination of addition and shifting operations, and it uses many machine cycles even for a small multiplication because terms are repeated. To overcome this repetition, the MCM method is used extensively, but always in combination with different adders. The MCM structure with these various adders does not give an ideal solution for power efficiency and delay.

The first digital FIR filter with frequencies chosen to the user's requirements was designed by James et al [1]. It was designed using the Hilbert transformation, which falls under the general linear filter category. Such general filters lead to issues of latency, delay, time and power consumption.

After this first computational implementation, researchers started to improve FIR filters. Dempster et al [2] implemented the FIR filter in VLSI using the Bull and Horrocks method and succeeded in reducing time, at the cost of added complexity and delay. To reduce the delay of the FIR filter, Gallagher et al [3] produced the High Radix Booth Multiplier, which gave better delay results but did not avoid the complexity problem for larger filter areas. The complexity of the control logic design limited this approach to small areas and restricted clock periods. To avoid the constraints explained above, common expressions needed to be identified and combined, which was addressed by Hartley's [4] investigation. A technique for identifying common expressions based on Canonical Signed Digit (CSD) was introduced that reduced the program count by 50%, and combining common expressions saved a further 33%. Area utilization and timing were not considered in this analysis.

The above study shows that all of these techniques have shortcomings relating to time, delay and power efficiency. Until 1996 there was no effective work in FIR design. Hartley's concept was taken further by Nguyen et al [5], who implemented the shift-add concept in Digital Signal Processing/Processors (DSP) with a number-splitting concept modified from [4]. This implementation fails to resolve the crosstalk problem. After this, work on FIR filters in DSP came into focus, using the Multiple Constant Multiplication (MCM) technique introduced by Potkonjak [6] in February 1996. Using the MCM technique, Park et al [7] developed Computational Sharing Multipliers (CSHM), in which the multiplication is scaled in vector form to simplify the add and shift operations. Their implementation is limited to the circuit level, and it requires 9.93 mm² of area for each transistor, while the circuit has 130K transistors in total. Because of this disadvantage, the design was

978-1-4799-6272-3/15/$31.00(c)2015 IEEE
not able to achieve low power consumption and low delay. Voronenko et al [8] also presented work on computational implementation, but it too was not successful. All of these problems arise from long expressions, as Hartley [4] only succeeded in identifying and combining expressions. Aksoy [9] went one step further and formed a technique to eliminate subexpressions in MCM using a Boolean network, but it could not control the delay with binary and CSD data.

According to the above discussion, all of the papers face the problems of energy consumption and delay. From the literature survey, attention has been given to the MCM technique with the Graph Based (GB) method. It needs improvement, and this paper contributes by including those techniques. The combination of the two, along with the solution of a 0-1 Integer Linear Programming (ILP) problem, was used. These methods are used with a Carry Select Adder (CSA), which gives better results. The proposed work is discussed in Section II, which also explains the components required to implement the FIR filter with CSA. Section III explores the results and discussion. The conclusion of the research, with future directions, is given in Section IV.

II. PROPOSED WORK
A. Background
The examples of FIR implementation in [10][11] show that partial products in the GB algorithm lead to the best area reduction at gate level. In digit-serial MCM the shifting operation uses D flip-flops, whereas in a bit-parallel design shifts cost no hardware. Hence, high-level algorithms were devised for digit-serial MCM design that share the shift, addition and subtraction operations. Since digit-serial operators occupy less area and are independent of the data word length, digit-serial architectures offer low-complexity alternatives to bit-parallel architectures. We used the GB method to implement this concept.

CSD has 33% fewer nonzero elements than binary, so the Signed Digit (SD) representation was used to make the code more efficient. In DSP it helps to obtain low-power, high-speed arithmetic structures with reduced area consumption [12]. Numbers are represented with the digit set {1, 0, -1} and must satisfy the following:
1. No two adjacent digits are both non-zero.
2. The number of non-zero digits is at most (n+1)/2, where n is the number of digits.

The Boolean function is an important part of implementing any circuit, and it is represented by a propositional formula in Conjunctive Normal Form (CNF) [13]. The arithmetic arrangement of the bits is then done using Digit-Serial Arithmetic (DSA). In DSA the bits are divided into d-bit digits that are processed serially, with the bits of each digit processed in parallel. This overcomes the disadvantages of purely serial and purely parallel implementations. This paper targets delay and area reduction, so the DSA method is used. To achieve this target, the following existing methods [14] are utilized.

1) Exact Common Subexpression Elimination (CSE) algorithm

This algorithm is formed using the following steps:
a) Detection of partial terms
b) Construction of the Boolean network
c) Formalization of the 0-1 ILP problem
d) Determination of the minimum area solution

2) GB Algorithm

For the implementation of FIR with CSA, only the main part of the MINAS-DS algorithm is considered, given below.

MINAS-DS(T)
  R ← {1}
  (R, T) = Synthesize(R, T)
  while T ≠ ∅ do
    for j = 1 to 2^(BW+1) - 1 step 2 do
      if j ∉ R and j ∉ T then
        Impcost_j = ComputeCost({j}, R)
        if Impcost_j ≠ 0 then
          A ← R ∪ {j}
          Impcost_T = ComputeTCost(T, R)
          Iccost_j = Impcost_j + Impcost_T
        end if
      end if
    end for
    find the intermediate constant, ic, with the minimum Iccost_j cost among all possible constants j
    R ← R ∪ {ic}
    (R, T) = Synthesize(R, T)
  end while
  D = SynthesizeMinArea(R)
  return D
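The two CSD properties listed in the Background (no adjacent non-zero digits, and at most (n+1)/2 non-zero digits) can be illustrated with a small sketch. The function names `to_csd`, `from_csd`, `is_canonical` and `csd_multiply` are hypothetical and are not taken from the paper or from [12]; the conversion shown is a common textbook recoding, given here only as an assumption about how such digits might be produced.

```python
def to_csd(value: int) -> list[int]:
    """Recode a positive integer into CSD digits in {1, 0, -1}, least significant first."""
    digits = []
    while value != 0:
        if value & 1:
            # value mod 4 == 1 -> digit +1, value mod 4 == 3 -> digit -1,
            # so the next digit is guaranteed to be 0 (no adjacent non-zeros).
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def from_csd(digits: list[int]) -> int:
    """Rebuild the integer from its CSD digits (least significant first)."""
    return sum(d << i for i, d in enumerate(digits))


def is_canonical(digits: list[int]) -> bool:
    """Property 1: no two adjacent digits are both non-zero."""
    return all(not (a and b) for a, b in zip(digits, digits[1:]))


def csd_multiply(x: int, constant: int) -> int:
    """Multiply x by a constant using only shifts, additions and subtractions."""
    return sum(d * (x << i) for i, d in enumerate(to_csd(constant)))


if __name__ == "__main__":
    print(to_csd(7))           # 7 = 8 - 1, i.e. one subtraction instead of two additions
    print(csd_multiply(5, 7))  # (5 << 3) - 5
```

Each non-zero CSD digit costs one adder or subtractor in a shift-add multiplier, which is why fewer non-zero digits translate into less hardware.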
B. Implementation
1) FIR filter

The transfer function of the FIR filter is given in (1),

$$H(z) = \frac{Y(z)}{X(z)} = \sum_{n=0}^{N-1} h[n]\, z^{-n} \tag{1}$$

For FIR filter realization in hardware and software, the response must be in time-domain form. The time-domain output sample is obtained from the z-transform and computed using (2),

$$y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k] \tag{2}$$

In (2), x[n-k] denotes the input sample delayed by k samples, h[k] is a coefficient of the FIR filter and y[n] is the filter output sample.

FIR filters are less sensitive than Infinite Impulse Response (IIR) filters to coefficient accuracy for the same order. FIR filters are implemented in various realization forms [15][16][17]. Only the direct and optimized forms were used for the software implementation, owing to their convenience over other realizations.

In digital signal processing the FIR filter plays an important role because of its linear-phase characteristic and feed-forward implementation, which result in a stable, high-performance digital filter. The FIR filter implementation is shown in Fig. 1.

Fig. 1. FIR filter implementation

The above architecture has similar complexity in hardware. Hence, we implement the transposed-form FIR filter with generic multipliers. The multiplier block of the digital FIR filter in its transposed form, where the filter coefficients are multiplied by the filter input, has a significant impact on the complexity and performance of the design because a large number of constant multiplications are required. All multiplication operations are therefore collected together and named the multiplier block, shown in Fig. 2.

Fig. 2. Multiplier block that collects all multiplication operations together

2) CSA
This is used to obtain more precise results than existing methods. A carry select adder uses two Ripple Carry Adders (RCA) together with full adder circuits to produce the output.

This gives a complex circuit with the high propagation delay of the ripple carry adder. So instead of using two RCAs in the CSA, a combination of one RCA and one Binary to Excess Converter (BEC) is used. To replace an n-bit RCA, an (n+1)-bit BEC is required. Fig. 3 shows the implementation of the BEC for a 4-bit system.

Fig. 3. BEC implementation for 4 bit

To replace one RCA circuit with a BEC, the CSA is first implemented using two RCAs. This implementation is shown in Fig. 4.

Fig. 4. Implementation of CSA with two RCA
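The substitution of one RCA by a BEC can be modelled at bit level as a rough sketch. It assumes that the BEC simply adds 1 to the (n+1)-bit result of the carry-in-0 RCA and that a multiplexer picks between the two results; the gate-level details of Fig. 3 to Fig. 5 are not reproduced, and all function names are hypothetical.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Integer to a list of bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """List of bits (least significant first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, cin):
    """n-bit ripple carry adder built from full adders; returns (sum bits, carry out)."""
    carry, out = cin, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry


def bec(bits):
    """Assumed BEC behaviour: add 1 to the input word with an XOR/AND chain."""
    carry, out = 1, []
    for b in bits:
        out.append(b ^ carry)
        carry = b & carry
    return out


def csa_two_rca(a, b, cin, width=4):
    """Carry select adder with two RCAs: one for carry-in 0, one for carry-in 1."""
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    sum0, c0 = ripple_carry_add(a_bits, b_bits, 0)
    sum1, c1 = ripple_carry_add(a_bits, b_bits, 1)
    chosen = (sum1 + [c1]) if cin else (sum0 + [c0])  # multiplexer
    return from_bits(chosen)


def csa_rca_bec(a, b, cin, width=4):
    """Carry select adder with one RCA and an (n+1)-bit BEC in place of the second RCA."""
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    sum0, c0 = ripple_carry_add(a_bits, b_bits, 0)
    plus_one = bec(sum0 + [c0])                       # result for carry-in 1
    chosen = plus_one if cin else (sum0 + [c0])       # multiplexer
    return from_bits(chosen)


if __name__ == "__main__":
    print(csa_two_rca(9, 6, 1), csa_rca_bec(9, 6, 1))
```

In this model both structures return the same sums for every input, which is the property that lets the BEC stand in for the second RCA.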


To eliminate the disadvantages of the RCA in the circuit of Fig. 4 and meet the specifications described in this paper, one RCA is replaced with a BEC, and the circuit becomes that of Fig. 5.

Fig. 5. Replacement of RCA with BEC

3) D-latch
The RS flip-flop is the basic building block of the D flip-flop. It has a single data input (D) that drives the S input of the RS flip-flop, while its complement D̄ drives R. With this strategy, input combination errors are reduced. The enable input is the second input of the D flip-flop and lets the flip-flop hold its state; it is ANDed with the D input. The hold condition occurs when enable is 0, giving R = 0 and S = 0; when enable is 1, S = D and R = D̄, so the output of the system equals D. When enable returns to 0, the output remains in its previous state.

Its operation shows that it changes only on the rising or falling edge of the clock. It also stores data, which is advantageous for implementing the FIR filter. Although it only passes the output of the BEC and RCA combination to the multiplexer, this property of the latch reduces the delay of the circuit. The position of this component, directly below the BEC, can be seen in Fig. 6.

  Input      Output
  D    Q̄     Q
  0    0     0
  0    1     1
  1    0     0
  1    1     1

Fig. 6. D latch

4) Full adder and subtractor
These two components are used to implement the shifting and adding operations, since the FIR filter is formed using the MCM (Multiple Constant Multiplication) concept [8][18]. In this method the number of variables is reduced and the speed of multiplication improves. It also shows the multiplication of a set of variables by particular constants. In the FIR filter it detects the common addition and subtraction operations and reduces them to improve execution speed, which helps to reduce the delay to some extent.

a) Full Adder
It is used for the addition of binary numbers and produces two outputs, sum and carry. The gate structure is shown in Fig. 7. It has inputs A and B; Cin is the carry from the less significant bit. S and Cout are the sum and carry outputs respectively.

  Input           Output
  A   B   Cin     Cout   S
  0   0   0       0      0
  1   0   0       0      1
  0   1   0       0      1
  1   1   0       1      0
  0   0   1       0      1
  1   0   1       1      0
  0   1   1       1      0
  1   1   1       1      1

Fig. 7. Full Adder

To increase the number of bits (2n), single-bit adders must be placed in cascade to obtain the desired output. The relation between Carry and Sum is given in (3).

$$\mathrm{Sum} = 2 \times C_{out} + S \tag{3}$$

The full adder is implemented as (4) and (5),

$$S = A \oplus B \oplus C_{in} \tag{4}$$

$$C_{out} = (A \cdot B) + (C_{in} \cdot (A \oplus B)) \tag{5}$$

In this implementation, the final OR gate before the carry-out output may be replaced by an XOR gate without altering the resulting logic. Using only two types of gates is convenient if the circuit is implemented using simple IC chips that contain only one gate type per chip.

b) Full Subtractor
The 2's complement method is used to implement the subtraction operation (Fig. 8). It requires the D flip-flop to be initialized with 1 and the additional operation of d inverters. The operation of subtraction is shown in Fig. 8.
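The full-adder relations (3) to (5), the remark that the final OR gate can be swapped for an XOR gate, and the 2's complement subtraction with the carry register initialized to 1 can all be checked exhaustively with a short sketch. The bit-serial subtractor below is a simplified stand-in with digit size d = 1, which is an assumption and not the digit-serial circuit of the paper; the function names are hypothetical.

```python
from itertools import product


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Equations (4) and (5): returns (S, Cout)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def full_adder_xor(a: int, b: int, cin: int) -> tuple[int, int]:
    """Same adder with the final OR gate replaced by an XOR gate."""
    s = a ^ b ^ cin
    cout = (a & b) ^ (cin & (a ^ b))
    return s, cout


def serial_subtract(a: int, b: int, width: int) -> int:
    """Bit-serial a - b modulo 2**width using 2's complement.

    Each bit of b passes through an inverter and the carry register
    (the D flip-flop in the text) starts at 1 instead of 0.
    """
    carry, result = 1, 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_inverted = ((b >> i) & 1) ^ 1
        s, carry = full_adder(a_bit, b_inverted, carry)
        result |= s << i
    return result


if __name__ == "__main__":
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        # Relation (3): the two outputs together encode the arithmetic sum.
        assert a + b + cin == 2 * cout + s
        # The two product terms of (5) are never 1 together, so OR and XOR agree.
        assert (s, cout) == full_adder_xor(a, b, cin)
    print(serial_subtract(12, 5, 8))
```

The OR and XOR versions agree because A·B = 1 forces A⊕B = 0, so the two terms feeding the last gate are mutually exclusive.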
Fig. 8. Subtraction Operation

The full subtractor is a combinational circuit that performs subtraction on three bits A, B and C, as shown in Fig. 9. It has three inputs, A (minuend), B (subtrahend) and C (subtrahend, the borrow input), and two outputs, D (difference) and Borrow: D = A - B - C (neglecting the sign convention), and Borrow = 1 if A < (B + C).
The logic diagram and truth table are shown below.

  Input         Output
  A   B   C     Diff   Bor.
  0   0   0     0      0
  0   0   1     1      1
  0   1   0     1      1
  0   1   1     0      1
  1   0   0     1      0
  1   0   1     0      0
  1   1   0     0      0
  1   1   1     1      1

Fig. 9. Full subtractor

III. IMPLEMENTATION EXAMPLE
This section explains in detail how the FIR filter coefficients are solved to simplify the structure with the help of CSE. The filter coefficients h0 = 110011010101, h1 = 001010101011, h2 = 011010110100, h3 = 101010100011 result in (6),

$$2^{-1}X_3 + 2^{-5}X_1 + 2^{-6}X_7 + \{2^{-3}X_7 + 2^{-11}X_3\}\,h[-1] + \{2^{-2}X_6 + 2^{-7}X_6\}\,H[-2] + \{2^{-1}X_7 + 2^{-11}X_3\}\,H[-3] \tag{6}$$

Before being provided to the CSA, equation (6) was plotted (Fig. 10 and Fig. 11). It shows the different combinations with the minimum number of shift-and-add structures. For implementation, consider from the above equation the final values h0 = (100011000000)2 and h1 = (001000000010)2, converted to decimal for simplicity as h0 = (812)10 and h1 = (202)10. Showing the multiplication using GB is time and space consuming, so the example in Fig. 10 and Fig. 11 shows the exact method.

This equation is then provided to the CSA, which generates the final output. Running the algorithm in Xilinx 14.2 gives the RTL sketch of the final GB algorithm with CSA and the design summary.

Fig. 10. General GB Representation

Fig. 11. Simplified Structure of GB Representation

IV. RESULT AND DISCUSSION

In this paper, the primary focus is on reducing power consumption and delay along with area. The BEC method is used to simplify the addition, together with the GB method, which produces good output using fewer computational resources. The BEC method is implemented in VHDL and executed using the Xilinx software.

The above screen shot shows the difference between the two techniques (CSE and GB). It shows the number of cycles used and the amount of data used by particular variables.

Fig. 12. Output of GB using CSA+BEC

Although the GB algorithm has more variables, it still gives the minimum data utilization. Further results from implementing the above two methods for FIR in Xilinx are shown in the graphs below.
[Plot: delay generation (ns) versus number of bits (4, 8, 16, 32, 64) for CSE CSA, CSE CSA+BEC, GB CSA and GB CSA+BEC]

Fig. 13. Delay comparison of FIR filter

For the two techniques (CSE and GB), delay, area, power consumption and memory were plotted for the CSA and CSA + BEC components in Fig. 13 and Fig. 14 respectively. The graphs show the utilization and consumption of different parameters of the FIR filter implementation using the two techniques with the two components. Fig. 13 shows that the proposed method gives a 50% delay reduction.

[Plot: power consumption (nW) versus number of bits (4, 8, 16, 32, 64) for CSE (CSA), CSE (CSA+BEC), GB (CSA) and GB (CSA+BEC)]

Fig. 14. Power utilized by FIR filter

It was observed that area utilization decreased by 5% and power consumption decreased by 55% (Fig. 14). Additionally, memory utilization decreased by 4% compared with previous methods or structures, even though the GB method uses more steps.

IV. CONCLUSION
The FIR filter is an important part of DSP systems and is widely implemented in VLSI and communication circuits. In the current research, we have presented the GB method along with a CSA that uses a combination of RCA and BEC, which gives better results. It improves delay by 48% and 50% for the CSE and GB components respectively, compared with CSA. Power efficiency increased by 56% and 64% in CSE and GB respectively, compared with CSE and GB using CSA. This work was limited to simulation. An experimental implementation on a hardware board would add more impact to this research.

REFERENCES

[1] J. McClellan, T. Parks, and L. Rabiner, “A computer program for designing optimum FIR linear phase digital filters,” Audio Electroacoust. …, 1973.
[2] A. G. Dempster, S. Member, M. D. Macled, and A. G. Representation, “Blocks in FIR Digital Filters,” vol. 42, no. 9, 1995.
[3] W. L. Gallagher and E. E. Swartzlander, “High Radix Booth Multipliers Using Reduced Area Adder Trees,” pp. 545–549, 1995.
[4] R. I. Hartley, “Subexpression Sharing in Filters Using Canonic Signed Digit Multipliers,” vol. 43, no. October, 1996.
[5] H. T. Nguyen and A. Chatterjee, “Number-Splitting with Shift-and-Add Decomposition for Power and Hardware Optimization in Linear DSP Synthesis,” vol. 8, no. 4, pp. 419–424, 2000.
[6] M. Potkonjak, M. B. Srivastava, and A. P. Chandrakasan, “Multiple Constant Multiplications: Efficient and Versatile Framework and Algorithms for Exploring Common Subexpression Elimination,” IEEE Trans. Comput. Des. Integr. CIRCUITS Syst., vol. 15, no. 2, 1996.
[7] H. Applications, J. Park, W. Jeong, H. Mahmoodi-meimand, S. Member, Y. Wang, H. Choo, and K. Roy, “Computation Sharing Programmable FIR Filter for,” vol. 39, no. 2, pp. 348–357, 2004.
[8] Y. Voronenko and M. Püschel, “Multiplierless multiple constant multiplication,” ACM Trans. Algorithms, vol. V, pp. 1–39, 2005.
[9] L. Aksoy, S. Member, E. Costa, and P. Flores, “Exact and Approximate Algorithms for the Optimization of Area and Delay in Multiple Constant Multiplications,” vol. 27, no. 6, pp. 1013–1026, 2008.
[10] Z. Milivojević, “FIR Filter,” in Digital Filter Design, 1st ed., MikroElektronika, 2009.
[11] L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, “Efficient shift-adds design of digit-serial multiple constant multiplications,” Proc. 21st Ed. Gt. lakes Symp. Gt. lakes Symp. VLSI - GLSVLSI ’11, vol. 2, p. 61, 2011.
[12] I. Koren, Computer Arithmetic Algorithms. A.K.Peters ltd., 2002.
[13] T. Larrabee, “Test Pattern Generation Using Boolean Satisfiability,” vol. 11, no. 1, pp. 4–15, 1992.
[14] L. Aksoy, C. Lazzari, E. Costa, P. Flores, J. Monteiro, and S. Member, “Design of Digit-Serial FIR Filters: Algorithms, Architectures, and a CAD Tool,” vol. 21, no. 3, pp. 498–511, 2013.
[15] “Digital Filter Design.” [Online]. Available: http://www.mikroe.com/chapters/view/72/chapter-2-fir-filters/#id24.
[16] L. Tag and Jiang jean, Digital Signal processing. 2013, p. 896.
[17] J. Gurung, Signals and System. 2009, p. 636.
[18] L. Aksoy, C. Lazzari, E. Costa, P. Flores, and J. Monteiro, “Optimization of area in digit-serial Multiple Constant Multiplications at gate-level,” 2011 IEEE Int. Symp. Circuits Syst., vol. 2, pp. 2737–2740, May 2011.
