FIR Filter Using Distributed Arithmetic v3


FIR Filter Using Distributed Arithmetic

Introduction
Distributed Arithmetic (DA) is a different approach to implementing digital filters. The basic idea is to replace all multiplications and additions with a table and a shift-accumulator. DA relies on the fact that the filter coefficients are known, so each product c[n]x[n] becomes a multiplication by a constant. This is an important difference and a prerequisite for a DA design. Sysgen has a built-in DA token, but we will not use it to implement our design; instead we will learn how to integrate VHDL code into System Generator using black boxes and co-simulation tokens. Finally, we will download the DA design to the Virtex-II Pro board and use it to run hardware verification.

Distributed Arithmetic
Distributed Arithmetic (DA) can be used to compute sums of products. Many DSP algorithms, such as convolution and correlation, are formulated as a sum of products (SOP). Consider the following sum of products:

$$y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n]\,x[n] = c[0]x[0] + c[1]x[1] + \cdots + c[N-1]x[N-1]$$

Further assume that the coefficients c[n] are known values and that the variable x[n] can be represented by

$$x[n] = \sum_{b=0}^{B-1} x_b[n]\,2^b \quad \text{with } x_b[n] \in \{0, 1\},$$

where $x_b[n]$ represents the b-th bit position of the number's binary representation. The SOP can then be written as:

$$y = \langle c, x \rangle = \sum_{n=0}^{N-1} c[n] \sum_{b=0}^{B-1} x_b[n]\,2^b$$

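Expanding the summations yields:

$$\begin{aligned}
y = \langle c, x \rangle ={} & c[0]\left( x_{B-1}[0]\,2^{B-1} + x_{B-2}[0]\,2^{B-2} + \cdots + x_0[0]\,2^0 \right) \\
& + c[1]\left( x_{B-1}[1]\,2^{B-1} + x_{B-2}[1]\,2^{B-2} + \cdots + x_0[1]\,2^0 \right) \\
& + \cdots \\
& + c[N-1]\left( x_{B-1}[N-1]\,2^{B-1} + x_{B-2}[N-1]\,2^{B-2} + \cdots + x_0[N-1]\,2^0 \right)
\end{aligned}$$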

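Redistributing the terms, we have:

$$\begin{aligned}
y = \langle c, x \rangle ={} & \left( c[0]\,x_{B-1}[0] + c[1]\,x_{B-1}[1] + \cdots + c[N-1]\,x_{B-1}[N-1] \right) 2^{B-1} \\
& + \left( c[0]\,x_{B-2}[0] + c[1]\,x_{B-2}[1] + \cdots + c[N-1]\,x_{B-2}[N-1] \right) 2^{B-2} \\
& + \cdots \\
& + \left( c[0]\,x_0[0] + c[1]\,x_0[1] + \cdots + c[N-1]\,x_0[N-1] \right) 2^0
\end{aligned}$$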

In more compact form:

$$y = \langle c, x \rangle = \sum_{b=0}^{B-1} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n]$$
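The regrouping above can be checked numerically. The following is a minimal sketch, assuming unsigned B-bit inputs and integer coefficients; the function names and the sample values are hypothetical and only serve to compare the bit-level form with the direct sum of products.

```python
def sop_direct(c, x):
    """Direct sum of products: sum over n of c[n] * x[n]."""
    return sum(cn * xn for cn, xn in zip(c, x))


def sop_bitwise(c, x, B):
    """Bit-level form: sum over b of 2^b * (sum over n of c[n] * x_b[n]).

    Assumes every x[n] is an unsigned integer that fits in B bits.
    """
    y = 0
    for b in range(B):
        # Inner sum: add c[n] only where bit b of x[n] is 1.
        inner = sum(cn * ((xn >> b) & 1) for cn, xn in zip(c, x))
        # Multiplication by 2^b is a left shift.
        y += inner << b
    return y


# Hypothetical test data: 7 integer coefficients and 7 unsigned 8-bit samples.
c = [3, -1, 4, 1, -5, 9, 2]
x = [10, 200, 33, 0, 255, 7, 64]
print(sop_direct(c, x), sop_bitwise(c, x, 8))
```

Both calls print the same value, since the second function only changes the order in which the same terms are added.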

The key is to realize that the second summation can be mapped to a Look Up Table (LUT). Because the coefficients c[n] are known and the $x_b[n]$ values are either 1 or 0, each inner SOP is just a sum of a subset of the c[n], for which a truth table can be constructed. Suppose we have:

$$\left( c[0]\,x_{B-2}[0] + c[1]\,x_{B-2}[1] + \cdots + c[N-1]\,x_{B-2}[N-1] \right) 2^{B-2}$$

where each $x_{B-2}$ bit belongs to a different variable x[n]. Together these bits form an N-bit word that can take $2^N$ values. For example, with N = 7, one of the possible outcomes is:

$$\left( c[0]\cdot 0 + c[1]\cdot 1 + c[2]\cdot 1 + c[3]\cdot 0 + c[4]\cdot 1 + c[5]\cdot 0 + c[6]\cdot 0 \right) 2^{B-2} = \left( c[1] + c[2] + c[4] \right) 2^{B-2}$$
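The truth table itself is easy to tabulate. The sketch below is an illustration only: the coefficient values are hypothetical, and the convention that bit n of the address selects c[n] (so x[0] sits in the least significant address bit) is an assumption, not something fixed by the derivation.

```python
def build_da_lut(c):
    """Return a table with 2**N entries for N known coefficients.

    Entry `addr` holds the sum of every c[n] whose bit n is set in `addr`.
    """
    n_taps = len(c)
    return [
        sum(c[n] for n in range(n_taps) if (addr >> n) & 1)
        for addr in range(1 << n_taps)
    ]


# Hypothetical integer coefficients for N = 7.
c = [3, -1, 4, 1, -5, 9, 2]
lut = build_da_lut(c)

# The outcome from the text: the bits of x[1], x[2] and x[4] are 1, the rest 0.
addr = (1 << 1) | (1 << 2) | (1 << 4)
print(len(lut), bin(addr), lut[addr])
```

With N = 7 the table has 128 entries, and the selected entry equals c[1] + c[2] + c[4], which is what the multiplier-free inner sum has to return for that bit pattern.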

Multiplication by a power of 2 is no more than a bit shift, so what we need to do is slice and concatenate the bits of the different x[n] in order to address a table, which can be built in advance because the c[n] are all known. What is left is to show how to deal with signed implementations of DA. A minor modification is needed when working with signed two's complement numbers. In two's complement, the MSB determines the sign of the number and carries a negative weight. We therefore use the following B-bit representation:

$$x[n] = -2^{B-1}\,x_{B-1}[n] + \sum_{b=0}^{B-2} x_b[n]\,2^b$$

Then the output y is given by:

$$y = -2^{B-1} \sum_{n=0}^{N-1} c[n]\,x_{B-1}[n] + \sum_{b=0}^{B-2} 2^b \sum_{n=0}^{N-1} c[n]\,x_b[n]$$
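The signed case can be sketched in the same way. This is a minimal software model, not the lab's VHDL: it assumes integer coefficients, B-bit two's complement inputs, and a single table addressed by all N bits; the function name and test values are hypothetical.

```python
def da_sop_signed(c, x, B):
    """Sum of products for two's complement B-bit x[n] using one lookup table."""
    n_taps = len(c)
    # Table of all possible inner sums; bit n of the address selects c[n].
    lut = [
        sum(c[n] for n in range(n_taps) if (addr >> n) & 1)
        for addr in range(1 << n_taps)
    ]
    mask = (1 << B) - 1  # keeps the low B bits, i.e. the two's complement pattern
    acc = 0
    for b in range(B):
        addr = 0
        for n in range(n_taps):
            bit = ((x[n] & mask) >> b) & 1
            addr |= bit << n
        term = lut[addr] << b
        # The MSB has weight -2^(B-1), so its table output is subtracted.
        acc += -term if b == B - 1 else term
    return acc


# Hypothetical data: signed 8-bit samples, including both range limits.
c = [3, -1, 4, 1, -5, 9, 2]
x = [-128, 127, -1, 0, 5, -77, 100]
print(da_sop_signed(c, x, 8), sum(cn * xn for cn, xn in zip(c, x)))
```

The only difference from the unsigned case is the sign of the last term, which is the role of the +/- control on the accumulator.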

Finally, a block diagram for the DA implementation of a FIR filter is shown in figure 1.

[Figure: bit shift register (bits X_{B-1}[n] ... X_1[n], X_0[n] for n = 0 ... N-1) → Arith. Table → Scaling Accumulator (+/-)]

Fig.1 DA Block Diagram

SysGen Implementation
We will use VHDL to implement all major parts of the design. According to fig. 1, we need:
- Register
- SOP Table
- Pre-Adder (LUT Adder)
- Scaling Accumulator

Download lab2.zip and uncompress it in C:\DSP_Spring07\Lab2. The following files should appear:
- dsp_fir.mdl, register_1to7.mdl, filter_lut_a.mdl, filter_lut_b.mdl, lut_adder.mdl
- regne_config.m, filter_lut_a_config.m, filter_lut_b_config.m, lut_adder_config.m
- regne.vhd, filter_lut_a.vhd, filter_lut_b.vhd, lut_adder.vhd, scaling_accumulator.vhd

Black Boxes for HDL Co-Simulation


System Generator libraries provide high- and low-level functions for building systems. However, there may be instances when you need to build blocks using HDL modules. These HDL modules need to be simulated along with the other SysGen blocks. The black box block provides an interface between the Simulink model and the HDL source code. An HDL component associated with a black box must adhere to the following System Generator requirements and conventions:


- The entity name must not collide with any other entity name in the design.
- Bidirectional ports are not allowed on the top-level black box entity.
- For Verilog black boxes, the module and port names must be lower case and must follow standard VHDL naming conventions.
- Any port that is not a clock or clock enable must be of type std_logic_vector.
- Any port that is a clock or clock enable must be of type std_logic.
- Clocks and clock enables must appear as pairs.
- Each clock name must contain the substring CLK, and each clock enable name the substring CE.

A black box must describe its interface through a MATLAB M-function. System Generator generates this configuration M-function automatically, but some editing is needed to specify the characteristics of the black box entity. The M-function:
- Specifies the top-level entity name of the HDL component that should be associated with the black box.
- Selects the language, i.e. VHDL or Verilog.
- Describes the ports, including type, direction, bit width, binary point position, name, and sample rate.
- Defines any generics required by the black box HDL.
- Specifies the black box HDL and other files that are associated with the block.
- Defines the clocks and clock enables for the block.
- Declares whether the HDL has any combinational feed-through paths.

Let's proceed to create a black box for our Scaling Accumulator VHDL code.
1. Set your working directory to C:\DSP_Spring07\Lab2, open Simulink, create a new model and name it scaling_accumulator.mdl.
2. Add the ModelSim block from Xilinx Blockset → Tools.
3. Add the black box block from Xilinx Blockset → Basic Elements.
4. The Configuration Wizard detects HDL files and opens a new window. Select the scaling_accumulator.vhd file, which contains the entity description (figure 2).

Fig.2 Select the VHDL file that contains the black box description

5. Click OK on the Wizard notice. The configuration M-file will open.
6. Configure the input ports by editing the commented parts of the configuration M-file. Replace the comments in:
    if (this_block.inputTypesKnown)
      % do input type checking, dynamic output type and generic setup in this code block.

      % (!) Port 'LUT0' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'ALUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'BLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'CLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'DLUT' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT5' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT6' appeared to have dynamic type in the HDL -- please add type checking as appropriate;
      % (!) Port 'LUT7' appeared to have dynamic type in the HDL -- please add type checking as appropriate;

      % (!) Port 'Filter_out' appeared to have dynamic type in the HDL
      % --- you must add an appropriate type setting for this port
    end  % if(inputTypesKnown)

With the following piece of code:

    if (this_block.inputTypesKnown)
      % do input type checking, dynamic output type and generic setup in this code block.
      this_block.port('LUT0').useHDLVector(true);
      if (this_block.port('LUT0').width ~= 23)
        this_block.setError('Input data type for "LUT0" must have width of 23.');
      end
      this_block.port('ALUT').useHDLVector(true);
      if (this_block.port('ALUT').width ~= 23)
        this_block.setError('Input data type for "ALUT" must have width of 23.');
      end
      this_block.port('BLUT').useHDLVector(true);
      if (this_block.port('BLUT').width ~= 23)
        this_block.setError('Input data type for "BLUT" must have width of 23.');
      end
      this_block.port('CLUT').useHDLVector(true);
      if (this_block.port('CLUT').width ~= 23)
        this_block.setError('Input data type for "CLUT" must have width of 23.');
      end
      this_block.port('DLUT').useHDLVector(true);
      if (this_block.port('DLUT').width ~= 23)
        this_block.setError('Input data type for "DLUT" must have width of 23.');
      end
      this_block.port('LUT5').useHDLVector(true);
      if (this_block.port('LUT5').width ~= 23)
        this_block.setError('Input data type for "LUT5" must have width of 23.');
      end
      this_block.port('LUT6').useHDLVector(true);
      if (this_block.port('LUT6').width ~= 23)
        this_block.setError('Input data type for "LUT6" must have width of 23.');
      end
      this_block.port('LUT7').useHDLVector(true);
      if (this_block.port('LUT7').width ~= 23)
        this_block.setError('Input data type for "LUT7" must have width of 23.');
      end

7. To configure the output port, we need to specify the output bit width, binary point position, signed or unsigned data, and generic values. Add the following code after the previous block, inside the same if (this_block.inputTypesKnown) block, so that the closing end comes last:

      % (!) Port 'Filter_out' appeared to have dynamic type in the HDL
      % --- you must add an appropriate type setting for this port
      Filter_out_port = this_block.port('Filter_out');
      input_bitwidth = this_block.port('LUT0').width;

      % Set up the fixed parameters of the filter
      % Calculate the width of the output based on worst case values for data
      % and coefficients
      output_bitwidth = input_bitwidth+7;

      % Set the output data type
      Filter_out_port.makeSigned;
      Filter_out_port.width = output_bitwidth;
      Filter_out_port.binpt = 25;

      % (!) Customize the following generic settings as appropriate. If any settings depend
      % on input types, make the settings in the "inputTypesKnown" code block.
      this_block.addGeneric('Nb_in', this_block.port('LUT0').width);
      this_block.addGeneric('Nb_out', this_block.port('Filter_out').width);
    end  % if(inputTypesKnown)
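The constants 7 and 25 in the output configuration can be sanity-checked with a little arithmetic. The sketch below is one plausible reading, not taken from the generated file: it assumes the extra 7 bits come from the B - 1 left shifts applied to an 8-bit input, and that the binary point is the sum of the coefficient and input binary points (Fix 23_21 and Fix 8_4 in the filter specifications). The function name is hypothetical.

```python
def accumulator_format(lut_width, lut_binpt, in_width, in_binpt):
    """Estimate the scaling accumulator output format.

    Assumption: the table output (lut_width bits) is shifted left by at most
    in_width - 1 positions, and the fractional bits of the coefficients and
    of the input add up, as they would in a full-precision product.
    """
    out_width = lut_width + (in_width - 1)
    out_binpt = lut_binpt + in_binpt
    return out_width, out_binpt


# 23-bit table words with 21 fractional bits, 8-bit input with 4 fractional bits.
print(accumulator_format(23, 21, 8, 4))
```

Under these assumptions the result is a 30-bit word with 25 fractional bits, which matches the width of input_bitwidth+7 and the binpt of 25 set in the configuration code.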

8. Finally, delete the following lines:

    % (!) Custimize the following generic settings as appropriate. If any settings depend
    % on input types, make the settings in the "inputTypesKnown" code block.
    this_block.addGeneric('Nb_out','integer','30');
    this_block.addGeneric('Nb_in','integer','23');

9. Save and close the configuration M-file. In the Simulink model (mdl file), click on the name Black Box and change it to Scaling Accumulator.
10. Open the dsp_fir.mdl file. Copy and paste the newly created Scaling Accumulator into it. Open the register_1to7.mdl file and copy the Register block to dsp_fir.mdl.
11. Save and close the dsp_fir.mdl, register_1to7.mdl, and scaling_accumulator.mdl files. Close and relaunch MATLAB and Simulink to verify that the Scaling Accumulator block is under the DSP Spring 07 Library.

12. On the MATLAB menu, click on File → Set Path and add the folder C:\DSP_Spring07\Lab2. Click Save and then Close. Note: You may receive a warning message if you do not have write permission to update the MATLAB installation directory. Click Yes to save the file in your working directory (in this case, lab2). If you close MATLAB you will need to set the path again.

Now, when you load Simulink, you should be able to see five new blocks under the DSP Spring 07 Library: Register, LUT A, LUT B, LUT Adder, and Scaling Accumulator.

Fig.3.a Scaling Accumulator block and (3.b) dsp_fir.mdl Simulink model.

DA FIR Design Implementation and Code Generation


The FIR filter to be implemented in DA is the sixth-order FIR filter of the previous laboratory. Figure 4 presents a more detailed block diagram of the DA implementation. Notice:
a) The input to the table is an N-bit binary number formed from the b-th bit position of all N input numbers. Therefore, we need an entity that slices and concatenates the bit positions to form this new input number.
b) The design uses two LUTs because LUTs are 4-input blocks and our filter has 7 coefficients, so one LUT has a 4-bit-wide input and the other a 3-bit-wide input.

The BitBasher block performs slicing, concatenation and augmentation of the inputs attached to the block. The block may have up to four output ports, and the number of outputs is equal to the number of expressions specified in the BitBasher Expression dialog. The advantage of this block over others is that it costs nothing to implement in hardware.
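Splitting the table in two does not change the result, because the inner sum is additive over the taps. The following sketch illustrates this with hypothetical integer coefficients; it assumes the first table covers taps 0 to 3 and the second taps 4 to 6, and the names lut_a, lut_b and partial_product are illustrative rather than taken from the lab's VHDL files.

```python
def build_lut(c):
    """Table of all subset sums of c; bit n of the address selects c[n]."""
    return [
        sum(cn for n, cn in enumerate(c) if (addr >> n) & 1)
        for addr in range(1 << len(c))
    ]


# Hypothetical 7 coefficients.
c = [3, -1, 4, 1, -5, 9, 2]
lut_a = build_lut(c[:4])   # 4 address bits, 16 entries
lut_b = build_lut(c[4:])   # 3 address bits, 8 entries
lut_full = build_lut(c)    # 7 address bits, 128 entries


def partial_product(addr7):
    """Pre-adder: sum of the two small tables for one 7-bit address."""
    return lut_a[addr7 & 0xF] + lut_b[addr7 >> 4]


print(len(lut_a) + len(lut_b), len(lut_full))
```

The two small tables hold 24 words in total instead of 128, at the cost of one extra adder per bit position.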

[Figure: DA 7-Tap FIR Filter with inputs X0 to X6, two Partial Product ROMs, a Pre-Adder (+), and an add/subtract (+/-) accumulator with Z^-1 feedback]

Fig.4 Block Diagram of a 7-Tap DA FIR filter

Filter specifications:
- Low Pass Filter.
- Signed 8-bit input number, represented in Fix 8_4.
- Sampling frequency of 1 kHz.
- Coefficients quantized to 23 bits, represented in Fix 23_21.
- Full precision Adders and Mult. blocks.
- Filter output of 30 bits, represented in Fix 30_25.
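The Fix W_P notation denotes a signed word of W bits with P bits after the binary point. As a rough illustration of what the Fix 8_4 input format can represent, here is a small quantization sketch. The rounding and saturation behaviour are assumptions made for this example (Python's round, which rounds halves to even, followed by clipping); the function names are hypothetical and the Gateway In block may be configured differently.

```python
def to_fix(value, width, binpt):
    """Quantize a real value to a signed fixed-point integer code.

    Assumes round-to-nearest (Python's round) and saturation on overflow.
    """
    scaled = int(round(value * (1 << binpt)))
    lo = -(1 << (width - 1))
    hi = (1 << (width - 1)) - 1
    return max(lo, min(hi, scaled))


def from_fix(raw, binpt):
    """Convert a fixed-point integer code back to a real value."""
    return raw / (1 << binpt)


# Fix 8_4: codes from -128 to 127, i.e. values from -8.0 to 7.9375 in steps of 0.0625.
for v in (1.5, 0.3, 2.0, 100.0, -100.0):
    raw = to_fix(v, 8, 4)
    print(v, raw, from_fix(raw, 4))
```

The sum of two unit-amplitude sine waves stays within ±2, so it fits comfortably in this range; the resolution of 0.0625 is what limits the accuracy of the input.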

Filter implementation:
1. Set your working directory to C:\DSP_Spring07\Lab2\ and open a new model.
2. Add the Register block from the DSP Spring 07 Library.
3. Add two Sine Wave inputs, a Sum block and a Xilinx Gateway In block, and connect them to the Register block. The input to the register is the sum of the two waves. Double-click the Gateway In block and set the output type to Signed, Number of Bits to 8, Binary Point to 4, and Sample Period to 0.001 sec.
4. Set the frequency of one sine wave to 2*pi*5 rad/sec and the other to 2*pi*300 rad/sec.
5. From Simulink → Commonly Used Blocks add a Constant block. Set the constant value to 1. Add a Xilinx Gateway In and set the output type to Unsigned, Number of Bits to 1, Binary Point to 0, and Sample Period to 0.001. So far the model should look like figure 4.
6. From Xilinx Blockset → Basic Elements add the BitBasher block. Double-click the BitBasher block and enter the following expressions:
   outLSB1={d[0],c[0],b[0],a[0]};
   outMSB1={g[0],f[0],e[0]};
   outLSB2={d[1],c[1],b[1],a[1]};
   outMSB2={g[1],f[1],e[1]};
   This block slices and concatenates the first two LSB positions of the 8-bit register outputs. Add three more BitBasher blocks to the design and configure them accordingly. Hint: Copy and paste the previous block expressions and change the bit positions.
7. Connect the BitBasher block to the Register: click on the Register, then hold down the Ctrl key while left-clicking on the BitBasher block. When connecting the remaining three blocks, be sure to connect Reg1 to a, Reg2 to b, Reg3 to c, Reg4 to d, Reg5 to e, Reg6 to f, and Reg7 to g.
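The effect of the four BitBasher expressions in step 6 can be modelled in a few lines. This is a software illustration only, assuming the expressions follow Verilog-style concatenation in which the first listed bit becomes the most significant bit of the output; the function name and the register values are hypothetical.

```python
def bitbasher_pair(regs, lo_bit):
    """Model one BitBasher block covering bit positions lo_bit and lo_bit + 1.

    regs maps the input names 'a' .. 'g' to 8-bit register values (as
    unsigned integers). Returns (outLSB1, outMSB1, outLSB2, outMSB2).
    """
    def concat(names, b):
        # {x, y, z}: the first name listed ends up in the most significant bit.
        word = 0
        for name in names:
            word = (word << 1) | ((regs[name] >> b) & 1)
        return word

    outputs = []
    for b in (lo_bit, lo_bit + 1):
        outputs.append(concat("dcba", b))  # 4-bit address for the first LUT
        outputs.append(concat("gfe", b))   # 3-bit address for the second LUT
    return tuple(outputs)


# Hypothetical register contents.
regs = {"a": 1, "b": 2, "c": 3, "d": 0, "e": 1, "f": 0, "g": 3}
print(bitbasher_pair(regs, 0))  # bit positions 0 and 1
```

Calling the function with lo_bit set to 2, 4 and 6 models the three additional BitBasher blocks, so that all eight bit positions of the registers are covered.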

Fig.4 Input blocks to the Register block in the DA implementation.

8. Add the filter_lut_a and filter_lut_b blocks from the DSP Spring 07 Library and connect them to the BitBasher blocks. Also add the lut_adder block to the model and connect it to the outputs of the filter_lut_a and filter_lut_b blocks. Your design should look like figure 5.
9. When designs grow big, it is good practice to add registers in order to reduce the propagation delay among the components of the design. From Xilinx Blockset → Basic Elements add Delays (registers) and connect them to the outputs of all the lut_adder blocks.
10. Add the Scaling Accumulator block from the DSP Spring 07 Library and connect it to the outputs of the delay blocks. Add a Gateway Out block and a Scope block. Set the Number of Axes to 2 in the Scope properties and connect one axis to the Scaling Accumulator output and the other to the output of the summation block (figure 6).
11. Add the System Generator block and set the Simulink system period to 0.001. Verify the following settings:
   Compilation: HDL Netlist
   Part: Virtex2p xc2vp30-7ff896
   Target directory: ./netlist
   Synthesis Tool: XST
12. Set the simulation time to 2 seconds. Double-click on each of the filter_lut_a, filter_lut_b, and lut_adder blocks, as well as on the Scaling_accumulator block, and verify that ISE Simulator is selected in the Simulation mode field.
13. Run the simulation.
14. Double-click on the System Generator block and click on Generate.

Fig. 5 BitBasher, LUTs and LUTs adder in the DA FIR implementation

Fig. 6 Final connections for the DA Filter

Verifying through Hardware Co-Simulation


In this final, simple step we will create a hardware co-simulation block and perform both hardware and software HDL co-simulation.
1. Create a subsystem of the DA Filter. Select all components except the Gateway In/Out blocks, input sources, System Generator token and Scope, then right-click and select Create Subsystem. Rearrange the input and output connections so the design looks like figure 7. Note: To edit the names of the inputs and outputs of the subsystem, double-click on the block and edit the names accordingly.

Fig. 7 Compact form of the DA filter

2. Save the model as da_fir_filter_hwcosim.mdl.
3. Double-click the System Generator block, click Compilation and select Hardware Co-Simulation → xup → xup_virtex_ii_pro.
4. Enter ./netlis_hw as the Target Directory and click Apply to accept the changes.
5. Click Generate and wait for the compilation process to finish.
6. Set the Number of Axes to 3 in the Scope properties. Copy the da_fir_filter_hwcosim block to the design and connect it to the input and output so it looks like figure 8.

Fig. 8 Hardware co-simulation

7. Right-click the DA Filter subsystem, select Block Properties and type 10 in the Priority field.
8. Right-click the da_fir_filter_hwcosim block, select Block Properties and type 0 in the Priority field.
9. Save the model.
10. Connect the power cable and the USB cable of the hardware board. Turn on the power. Wait for Windows to finish the New Hardware installation.
11. Double-click on the hardware model and set the cable to Platform USB.
12. Click the Run button in the Simulink window to run the simulation.

IIR Implementation
Implementing IIR filters is not that different from implementing FIR filters. In fact, there are more structures dedicated to IIR than to FIR filters. Read about IIR structures (chapter 6 of Oppenheim and Schafer is a good place to start), then turn in a simulation of the implementation that you pick for the following system function:

$$H(z) = \frac{0.0465829 + 0.1863316z^{-1} + 0.2794974z^{-2} + 0.1863316z^{-3} + 0.0465829z^{-4}}{1 - 0.782095z^{-1} + 0.6799785z^{-2} - 0.1826756z^{-3} + 0.0301188z^{-4}}$$

1. The inputs are two sine waves of frequencies 2*pi*5 rad/sec and 2*pi*450 rad/sec. Set both amplitudes to 1.
2. In the Gateway In block, set Number of bits to 16 and Binary Point to 12.
3. Be sure to set the Simulink System Period to 0.001 sec in the System Generator token and the Sample Period of the Gateway In to 0.001 sec.
4. To implement the coefficients of the filter, use the Constant block from Xilinx Blockset → Basic Elements and set the Number of bits to 27 and Binary Point to 22.
5. Configure the Mult. block to Number of bits 27, Binary Point 22, Quantization to Round, Overflow to Saturate, and Latency to 0.
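Before building the Simulink model, the expected behaviour can be previewed with a floating-point reference. The sketch below uses a Direct Form I difference equation, which is only one of the structures you might pick, and it ignores the fixed-point settings listed above. It takes the denominator signs as 1, -, +, -, +, which gives unity gain at DC; the function and variable names are hypothetical.

```python
import math

# Coefficients of H(z): numerator b[k] and denominator a[k], with a[0] = 1.
B_COEFFS = [0.0465829, 0.1863316, 0.2794974, 0.1863316, 0.0465829]
A_COEFFS = [1.0, -0.782095, 0.6799785, -0.1826756, 0.0301188]


def iir_direct_form_1(x, b=B_COEFFS, a=A_COEFFS):
    """y[n] = (sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k]) / a[0]."""
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc / a[0])
    return y


# Two unit-amplitude sine waves at 5 Hz and 450 Hz, sampled at 1 kHz for 2 s.
fs = 1000.0
t = [n / fs for n in range(2000)]
x = [math.sin(2 * math.pi * 5 * tn) + math.sin(2 * math.pi * 450 * tn) for tn in t]
y = iir_direct_form_1(x)
print(max(abs(v) for v in y[500:]))
```

After the initial transient, the output should be close to the 5 Hz component alone, since the numerator has its zeros near z = -1 and strongly attenuates the 450 Hz component. A fixed-point implementation will differ slightly because of coefficient and product quantization.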

Your report must include:
- A short description of your design. Explain why you chose it.
- Printouts of your Simulink model and the filter output.
- From the previous section: printouts of the outputs of the DA FIR implementation and the configuration file of the scaling_accumulator block.
