DSP Course File2020-21 With 5 Units
Course File
Course Name : DIGITAL SIGNAL PROCESSING
Department of
ELECTRONICS AND COMMUNICATION ENGINEERING
COURSE FILE CONTENTS
Sl. No.  Topic
1. Department Vision and Mission
2. Course Description
3. Course Overview
4. Course Pre-requisites
5. Marks Distribution
6. POs and PSOs
7. Course Outcomes (COs)
8. CO mapping with POs and PSOs
9. Syllabus, Textbooks and Reference Books
10. Gaps in Syllabus
11. Course Plan/Lesson Plan
12. Lecture Notes
    Unit-I
    Unit-II
    Unit-III
    Unit-IV
    Unit-V
13. Unit wise Question Bank
    a. Short answer questions
    b. Long answer questions
14. Previous University Question Papers
15. Internal Question Papers with Key
16. Unit wise Assignment Questions
17. Content Beyond Syllabus
18. Methodology used to identify Weak and Bright students
    Support extended to weak students
    Efforts to engage bright students
CERTIFICATE
Verifying authority:
1. Head of the Department: Dr. N. Srinivasa Rao
2.
3.
PRINCIPAL
MATRUSRI ENGINEERING COLLEGE
Saidabad, Hyderabad-500 059.
(Approved by AICTE & Affiliated to Osmania University)
DEPARTMENT MISSION
COURSE DESCRIPTOR
Course Code: PC502EC
Programme: BE (ECE)
Semester: V
Theory: 3   Practical: 1   Total: 4
I. COURSE OVERVIEW:
Digital Signal Processing (DSP) is concerned with the representation, transformation and manipulation of
signals on a computer. After half a century of advances, DSP has become an important field and has
penetrated a wide range of application systems, such as consumer electronics, digital communications,
medical imaging and so on. With the dramatic increase in the processing capability of signal processing
microprocessors, the importance and role of DSP are expected to accelerate and expand.
Discrete-Time Signal Processing is a general term that includes DSP as a special case. This course will
introduce the basic concepts and techniques for processing discrete-time signals. By the end of this course,
the students should be able to understand the most important principles in DSP. The course emphasizes
understanding and implementation of theoretical concepts, methods and algorithms.
Subject | SEE Examination | CIA Examination | Total Marks
Digital Signal Processing | 70 | 30 | 100
CO1: Necessity and use of digital signal processing and its applications.
VIII. SYLLABUS:

UNIT I: Discrete Fourier Transform and Fast Fourier Transform (No. of Hrs: 10)
Discrete Fourier Transform (DFT), Computation of DFT: Linear and Circular Convolution, FFT algorithms: Radix-2 case, Decimation in Time and Decimation in Frequency algorithms, in-place computation, bit reversal.

UNIT II: Infinite Impulse-response Filters (IIR) (No. of Hrs: 12)
Introduction to filters, comparison between practical and theoretical filters, Butterworth and Chebyshev approximation, IIR digital filter design techniques: Impulse Invariant technique, Bilinear transformation technique, Digital Butterworth & Chebyshev filters, Implementation.

UNIT III: Finite Impulse-response Filters (FIR) (No. of Hrs: 10)
Linear phase filters, Windowing techniques for design of Linear phase FIR filters: Rectangular, triangular, Bartlett, Hamming, Hanning, Kaiser windows, Realization of filters, Finite word length effects, Comparison between FIR and IIR.

UNIT IV: Multirate Digital Signal Processing (No. of Hrs: 8)
Introduction, Decimation by a factor D and interpolation by a factor I, Sampling Rate conversion by a Rational factor I/D, Implementation of Sampling Rate conversion, Multistage implementation of Sampling Rate conversion, Sampling Rate conversion by an Arbitrary factor, Applications of Multirate Signal Processing.

UNIT V: Introduction to DSP Processors (No. of Hrs: 10)
Difference between DSP and other microprocessor architectures, their comparison and the need for ASP, RISC and CPU; General Purpose DSP processors: TMS 320C54XX processors, architecture, addressing modes, instruction set.
TEXT BOOKS:
1. Alan V. Oppenheim and Ronald W. Schafer, “Digital Signal Processing”, 2/e, PHI, 2010.
2. John G. Proakis and Dimitris G. Manolakis, “Digital Signal Processing: Principles, Algorithms and Applications”, 4/e, PHI, 2007.
3. Avtar Singh and S. Srinivasan, “Digital Signal Processing using DSP Microprocessor”, 2/e, Thomson Books, 2004.
4. John G. Proakis and Vinay K. Ingle, “Digital Signal Processing using MATLAB”, 3/e, Cengage Learning, 1997.
5. Richard G. Lyons, “Understanding Digital Signal Processing”, 3/e, Prentice Hall.
REFERENCES:
IX. GAPS IN THE SYLLABUS - TO MEET INDUSTRY / PROFESSION REQUIREMENTS:
COURSE PLAN/LESSON PLAN:

Lecture No. | Topics to be covered | PPT/BB/OHP/e-material | No. of Hrs | Relevant COs | Text Book/Reference
1 | Discrete Fourier Transform and Fast Fourier Transform: Introduction | BB/e-material | 1 | CO1 | DSP by Ramesh Babu
2 | Discrete Fourier Transform (DFT) | BB | 1 | CO1 | DSP by Ramesh Babu
3 | Computation of DFT: Linear Convolution | BB | 1 | CO1 | DSP by Ramesh Babu
4 | Computation of DFT: Circular Convolution | BB | 1 | CO1 | DSP by Ramesh Babu
5 | Introduction to FFT and its types | BB | 1 | CO1 | DSP by Ramesh Babu
6 | FFT algorithms: Radix-2 case | BB | 1 | CO1 | DSP by Ramesh Babu
7 | Decimation in Time algorithms | BB | 1 | CO1 | DSP by Ramesh Babu
8 | Decimation in Frequency algorithms | BB | 2 | CO1 | DSP by Ramesh Babu
9 | In-place computation, bit reversal | BB | 1 | CO1 | DSP by Ramesh Babu
10 | Problems on DFT and FFT | BB | 2 | CO1 | DSP by Ramesh Babu
11 | Infinite Impulse-response Filters (IIR): Introduction to filters | BB | 1 | CO2 | DSP by Ramesh Babu
12 | Comparison between practical and theoretical filters | BB | 1 | CO2 | DSP by Ramesh Babu
13 | Butterworth approximation | BB | 2 | CO2 | DSP by Ramesh Babu
14 | Chebyshev approximation | BB | 2 | CO2 | DSP by Ramesh Babu
15 | Problems on filters | BB | 1 | CO2 | DSP by Ramesh Babu
16 | IIR digital filter design techniques | BB | 1 | CO2 | DSP by Ramesh Babu
17 | Impulse Invariant technique | BB | 1 | CO2 | DSP by Ramesh Babu
18 | Bilinear transformation technique | BB | 1 | CO2 | DSP by Ramesh Babu
19 | Digital Butterworth & Chebyshev filters | BB | 2 | CO2 | DSP by Ramesh Babu
43 | Need for ASP, RISC and CPU | BB | 1 | CO4 | DSP by Ramesh Babu
44 | General purpose DSP processors | BB | 1 | CO4 | DSP by Ramesh Babu
45 | TMS 320C54XX processors, architecture | BB, PPT | 1 | CO4 | DSP by Ramesh Babu
46 | TMS 320C54XX addressing modes | BB, PPT | 1 | CO4 | DSP by Ramesh Babu
47 | TMS 320C54XX instruction set | BB, PPT | 1 | CO4 | DSP by Ramesh Babu
48 | TMS 320C54XX Applications | BB, PPT | 1 | CO4 | DSP by Ramesh Babu
49 | Revision | BB | 1 | CO4 | DSP by Ramesh Babu
50 | Test | BB | 1 | CO4 | DSP by Ramesh Babu
LECTURE NOTES
UNIT-I
DISCRETE FOURIER TRANSFORM AND FAST FOURIER TRANSFORM
Introduction:
Let’s start with the individual meaning of the words defining Digital Signal Processing
in its entirety.
o Digital: In digital communication, we use discrete signals to represent data
using binary numbers.
o Signal: A signal is anything that carries some information. It’s a physical
quantity that conveys data and varies with time, space, or any other independent variable. It
can be in the time or frequency domain, and it can be one-dimensional or two-dimensional.
o Processing: Performing operations on data in accordance with some protocol
or instruction is known as processing.
o System: A system is a physical entity that is responsible for the processing. It
has the necessary hardware to perform the required arithmetic or logical operations on a
signal.
Putting all these together: Digital Signal Processing is the processing, by a digital system, of
signals represented in discrete (digital) form.
The first step is to get an electrical signal. The transducer (in our case, a microphone)
converts sound into an electrical signal. You can use any transducer depending upon the case.
Once you have an analog electrical signal, we pass it through an operational amplifier
(Op-Amp) to condition the analog signal: basically, we amplify the signal, or limit it to
protect the next stages.
The anti-aliasing filter is an essential step in the conversion of an analog signal to a
digital one. It is a low-pass filter, meaning it allows frequencies up to a certain threshold to
pass and attenuates all frequencies above this threshold. These unwanted high frequencies
would otherwise make it impossible to sample the analog signal without aliasing.
The next stage is a simple analog-to-digital converter (ADC). This unit takes in analog
signals and outputs a stream of binary digits.
The heart of the system is the digital signal processor. These days we use CMOS chips
(even ULSI) to make digital signal processors. In fact, modern processors, like the Cortex M4
have DSP units built inside the SoC. These processor units have high-speed, high data
throughputs, and dedicated instruction sets.
The next stages are sort of the opposite of the stages preceding the digital signal
processor.
The digital-to-analog converter does what its name implies. It’s necessary for the slew
rate of the DAC to match the acquisition rate of the ADC.
The smoothing filter is another low-pass filter that smoothes the output by removing
unwanted high-frequency components.
The last op-amp is just an amplifier.
The output transducer is a speaker in our case. You can use anything else according to
your requirements.
even detect these signals. However, digital signal processing systems are adept at picking up
even the tiniest of disturbances and can also process them easily.
Cost
o When working at scale, DSPs are cheaper.
Disadvantages of a Digital Signal processing system:
Complexity
o As we saw in the block diagram above, there are a lot of elements preceding
and following a Digital Signal Processor. Stuff like filters and converters add to the
complexity of a system.
Power
o A digital signal processor is made up of transistors. Transistors consume more
power since they are active components. A typical digital signal processor may contain
millions of transistors. This increases the power that the system consumes.
Learning curve and design time
o Learning the ins and outs of Digital Signal processing involves a steep learning
curve. Setting up digital processing systems thus takes time. And if not pre-equipped with the
right knowledge and tools, teams can spend a lot of time in setting up.
Loss of information
o Sampling and quantization discard information: according to rate-distortion
theory, representing a signal with a finite number of bits necessarily introduces some distortion.
Cost
o For small systems, DSP is an expensive endeavor, costing more than necessary.
In this section we will encapsulate the differences between the Discrete Fourier Transform (DFT)
and the Discrete-Time Fourier Transform (DTFT). Fourier transforms are a core component of
this digital signal processing course, so make sure you understand them properly.
DTFT:
DTFT stands for Discrete-Time Fourier Transform. We can represent it using the following
equation (read it carefully):
X(e^jω) = Σ (n = −∞ to ∞) x(n) e^(−jωn)
For comparison, the N-point DFT is
X(k) = Σ (n = 0 to N−1) x(n) e^(−j2πkn/N), k = 0, 1, …, N−1
Probably the only things that you can notice in this equation are that the summation is now
over a finite series and that the exponential function has gotten a bit more complicated.
Let’s address what these differences actually translate to.
What are twiddle factors?
Twiddle factors (represented with the letter W) are a set of values that are used to speed up
DFT and IDFT calculations. The twiddle factor is defined as W_N = e^(−j2π/N).
For a discrete sequence x(n), we can calculate its Discrete Fourier Transform (DFT) and
Inverse Discrete Fourier Transform (IDFT) using the following equations.
DFT: X(k) = Σ (n = 0 to N−1) x(n) e^(−j2πkn/N)
IDFT: x(n) = (1/N) Σ (k = 0 to N−1) X(k) e^(j2πkn/N)
Rewriting the equations for calculating DFT and IDFT using twiddle factors we get:
DFT: X(k) = Σ (n = 0 to N−1) x(n) W_N^(kn)
IDFT: x(n) = (1/N) Σ (k = 0 to N−1) X(k) W_N^(−kn)
(From Euler’s formula: e^(−jθ) = cos θ − j sin θ)
Similarly, calculating the remaining values, we get the series below.
As you can see, the value starts repeating at the 4th instant; for N = 4, W_N^(k+N) = W_N^k.
This periodic property is shown in the diagram below.
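The periodicity of the twiddle factor can be checked numerically. A minimal Python sketch (not part of the original notes; the function name is illustrative):

```python
import cmath

def twiddle(N, k):
    # W_N^k = e^(-j*2*pi*k/N)
    return cmath.exp(-2j * cmath.pi * k / N)

# For N = 4 the values repeat every 4 steps: W_4^0 == W_4^4, W_4^1 == W_4^5, ...
for k in range(8):
    w = twiddle(4, k)
    print(k, complex(round(w.real, 3), round(w.imag, 3)))
```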
Problems on DFT:
1. Find the DFT of the sequence x[n] = {1, 0, −1, 0}.
N = 4
Working through the four summations X(k) = Σ (n = 0 to 3) x(n) e^(−j2πkn/4) gives
X(k) = {0, 2, 0, 2}.
Similarly, an IDFT can be calculated in matrix form using the following equation:
x(n) = (1/N) W_N* X(k)
Here, W_N* is the complex conjugate of the twiddle-factor matrix. To get the values of the complex
conjugate, just invert the signs of the complex components of the twiddle factor. For example:
The complex conjugate of 0.707+0.707j will become 0.707-0.707j.
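As a sketch of the matrix approach, assuming NumPy is available, we can build the 4×4 twiddle-factor matrix, take the DFT of the sequence x[n] = {1, 0, −1, 0} from the problem above, and recover it with the conjugate matrix:

```python
import numpy as np

N = 4
n = np.arange(N)
# Twiddle-factor matrix: W[k, n] = e^(-j*2*pi*k*n/N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)

x = np.array([1, 0, -1, 0])
X = W @ x                      # DFT as a linear transformation
x_back = (W.conj() @ X) / N    # IDFT via the complex conjugate of W

print(np.round(X, 6))          # the DFT of {1, 0, -1, 0}
```

Note that no explicit matrix inverse is needed: the conjugate of W, scaled by 1/N, acts as the inverse.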
DFT as linear transform:
The matrix W_N is known as the matrix of linear transformation, and in matrix form the DFT is
X(k) = W_N x(n)
We also have the formula for calculating the IDFT using a matrix:
x(n) = (1/N) W_N* X(k)
which follows from the fact that W_N W_N* = N·I. Here, ‘I’ is an identity matrix of order N.
This equation represents the fact that the DFT displays linear transformation characteristics.
Now that we understand the twiddle factor, we can also see how it is used practically in
the calculation of the IDFT using the Decimation in Frequency FFT algorithm.
DFT Linear Filtering:
DFT provides an alternative approach to time domain convolution. It can be used to perform
linear filtering in frequency domain.
Thus, y(n) = x(n)*h(n) ⟷ Y(ω) = X(ω).H(ω).
The problem in this frequency domain approach is that Y(ω), X(ω) and H(ω) are continuous
function of ω, which is not fruitful for digital computation on computers. However, DFT
provides sampled version of these waveforms to solve the purpose.
The advantage is that, with knowledge of faster DFT techniques such as the FFT, a
computationally more efficient algorithm can be developed for digital computation in
comparison with the time domain approach.
Consider a finite duration sequence x(n) [x(n) = 0 for n < 0 and n ≥ L] which excites a
linear filter with impulse response h(n) [h(n) = 0 for n < 0 and n ≥ M]. Then
y(n) = x(n)*h(n)
From the convolution analysis, it is clear that, the duration of y(n) is L+M−1.
In frequency domain,
Y(ω)=X(ω).H(ω)
Now, Y(ω) is a continuous function of ω and it is sampled at a set of discrete frequencies with
number of distinct samples which must be equal to or exceeds L+M−1.
DFT size = N ≥ L+M−1
With ω = 2πk/N,
Y(k) = X(k).H(k), where k = 0, 1, …, N−1
Where, X(k) and H(k) are N-point DFTs of x(n) and h(n) respectively. x(n) & h(n) are padded
with zeros up to the length N. It will not distort the continuous spectra X(ω) and H(ω).
Since N≥L+M−1, N-point DFT of output sequence y(n) is sufficient to represent y(n) in
frequency domain and these facts infer that the multiplication of N-point DFTs of X(k) and
H(k), followed by the computation of N-point IDFT must yield y(n).
This implies that the N-point circular convolution of x(n) and h(n) with zero padding is equal
to the linear convolution of x(n) and h(n).
Thus, DFT can be used for linear filtering.
Caution − N should always be greater than or equal to L+M−1. Otherwise, aliasing effect
would corrupt the output sequence.
Graphical interpretation:
Reflection of h(k) resulting in h(-k)
Shifting of h(-k) resulting in h(n-k)
Element wise multiplication of the sequences x(k) and h(n-k)
Summation of the product sequence x(k)h(n-k) resulting in the convolution value for
y(n).
Example:
x(n) = {1, 2, 3, 1}
h(n) = {1, 1, 1}
length(y(n)) = length(x(n)) + length(h(n)) − 1 = 4 + 3 − 1 = 6
The linear convolution output is y(n) = {1, 3, 6, 6, 4, 1}
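The rule N ≥ L+M−1 can be demonstrated with NumPy’s FFT routines (a sketch, not part of the notes), using the same x(n) = {1, 2, 3, 1} and h(n) = {1, 1, 1}:

```python
import numpy as np

x = np.array([1, 2, 3, 1])
h = np.array([1, 1, 1])
N = len(x) + len(h) - 1        # N >= L + M - 1 avoids time-domain aliasing

# Zero-pad both sequences to length N, multiply the N-point DFTs, take the IDFT
X = np.fft.fft(x, N)
H = np.fft.fft(h, N)
y = np.fft.ifft(X * H).real.round(6)

print(y)                       # matches the direct linear convolution {1, 3, 6, 6, 4, 1}
```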
Now, we will try to find another sequence x3(n), whose DFT X3(k) is given as
X3(k) = X1(k) × X2(k)
By taking the IDFT of the above we get
x3(n) = IDFT[X3(k)]
Example:
Let’s take x1(n) = {1, 1, 2, 1} and x2(n) = {1, 2, 3, 4}
Arrange x1(n) and x2(n) in circular fashion as shown below.
x3(1) = Σ (m = 0 to 3) x1(m) x2((1−m) mod 4)
= x1(0) x2(1) + x1(1) x2(0) + x1(2) x2(3) + x1(3) x2(2)
= 2+ 1+8+3
= 14
To get x3(2) rotate x2(1-m) by one sample in anti-clockwise direction.
x2(2-m)
x3(n) = IDFT[X3(k)]
A circular convolution can also be computed by the matrix method: arrange one sequence as a
circular (circulant) matrix and multiply it by the other sequence. For x1(n) = {1, 2, 3, 1} and
x2(n) = {1, 2, 2, 1}:

        | 1 1 3 2 | | 1 |   | 11 |
x3(n) = | 2 1 1 3 | | 2 | = |  9 |
        | 3 2 1 1 | | 2 |   | 10 |
        | 1 3 2 1 | | 1 |   | 12 |

i.e. x3(n) = {11, 9, 10, 12}.
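The multiplication property X3(k) = X1(k)·X2(k) gives the same circular convolution results; a small NumPy sketch (not from the notes) checking both worked examples above:

```python
import numpy as np

def circular_convolution(x1, x2):
    """N-point circular convolution via the DFT multiplication property."""
    X3 = np.fft.fft(x1) * np.fft.fft(x2)    # X3(k) = X1(k) * X2(k)
    return np.fft.ifft(X3).real.round(6)    # x3(n) = IDFT[X3(k)]

print(circular_convolution([1, 1, 2, 1], [1, 2, 3, 4]))
print(circular_convolution([1, 2, 3, 1], [1, 2, 2, 1]))
```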
Suppose the input sequence x(n), of long duration, is to be processed with a system having a
finite duration impulse response by convolving the two sequences. Since the linear filtering
performed via the DFT involves operations on a fixed-size data block, the input sequence is
divided into fixed-size data blocks before processing.
The successive blocks are then processed one at a time and the results are combined to
produce the net result.
As the convolution is performed by dividing the long input sequence into different fixed size
sections, it is called sectioned convolution. A long input sequence is segmented to fixed size
blocks, prior to FIR filter processing.
Two methods are used to evaluate the discrete convolution −
Overlap-save method
Overlap-add method
By appending L−1 zeros, the impulse response of the FIR filter is increased in length to N,
and its N-point DFT is calculated and stored.
Multiplication of the two N-point DFTs H(k) and Xm(k): Y′m(k) = H(k).Xm(k), where
k = 0, 1, 2, …, N−1
Then, IDFT[Y′m(k)] = y′m(n) = [y′m(0), y′m(1), y′m(2), …, y′m(M−1),
y′m(M), …, y′m(N−1)]
where N−1 = L+M−2.
The first M−1 points are corrupted due to aliasing and hence they are discarded, because
the data record is of length N.
The last L points are exactly the same as the result of linear convolution, so
y′m(n) = ym(n) for n = M−1, M, …, N−1
To avoid aliasing, the last M−1 elements of each data record are saved, and these points are
carried forward to the subsequent record, where they become its first M−1 elements.
In the result of the IDFT, the first M−1 points are discarded to nullify aliasing, and the
remaining L points constitute the desired result, as in a linear convolution.
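The overlap-save steps above can be sketched in Python with NumPy; the block length N and the sequences here are illustrative choices, not values from the notes:

```python
import numpy as np

def overlap_save(x, h, N=8):
    """Filter a long sequence x with FIR h using N-point DFTs (overlap-save)."""
    M = len(h)
    L = N - M + 1                              # new samples consumed per block
    H = np.fft.fft(h, N)                       # h zero-padded to N, DFT stored once
    # prepend M-1 zeros so the first block has valid history
    xp = np.concatenate([np.zeros(M - 1), np.asarray(x, dtype=float)])
    y = []
    for start in range(0, len(xp) - (M - 1), L):
        block = xp[start:start + N]
        if len(block) < N:                     # zero-pad the final partial block
            block = np.concatenate([block, np.zeros(N - len(block))])
        ym = np.fft.ifft(np.fft.fft(block) * H).real
        y.extend(ym[M - 1:])                   # first M-1 points are aliased: discard
    return np.array(y)

x = np.array([1, 2, 3, 1, 2, 1, 4, 2, 1, 3], dtype=float)
h = np.array([1, 1, 1], dtype=float)
print(overlap_save(x, h).round(6))
```

Each block of N samples shares its first M−1 samples with the previous block; only the last L = N−M+1 output points of each block are kept.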
Circular frequency shift:
x(n) e^(j2πln/N) ⟷ X((k − l))N
x(n) e^(−j2πln/N) ⟷ X((k + l))N
Parseval’s theorem: Σ (n = 0 to N−1) x(n) y*(n) = (1/N) Σ (k = 0 to N−1) X(k) Y*(k)
Let us take an example to understand it better. We have considered eight points named
from x0 to x7. We will choose the even terms in one group and the odd terms in the other.
A diagrammatic view of the above has been shown below.
Here, points x0, x2, x4 and x6 have been grouped into one category and, similarly, points x1,
x3, x5 and x7 have been put into another category. Now, we can further split them into groups
of two and proceed with the computation. Let us see how this repeated splitting helps the
computation.
Initially, we took an eight-point sequence, but later we broke that one into two parts G[k] and
H[k]. G[k] stands for the even part whereas H[k] stands for the odd part. If we want to realize
it through a diagram, then it can be shown as below
Fig.(3): Butterfly structure of DIT-FFT
Fig.(4): Butterfly structure of DIT-FFT
Example
Consider the sequence x[n]={ 2,1,-1,-3,0,1,2,1}. Calculate the FFT.
Solution − The given sequence is x[n]={ 2,1,-1,-3,0,1,2,1}
Arrange the terms as shown below;
In-Place Computation
This efficient use of memory is important for designing fast hardware to calculate the FFT.
The term in-place computation is used to describe this memory usage.
Decimation in Time Sequence:
In this structure, we represent all the point indices in binary format, i.e. in 0s and 1s, and then
reverse those bits. The sequence we get after that is known as the bit-reversed sequence. This
is also known as the decimation in time sequence. In-place computation of an eight-point DFT
is shown in a tabular format below.
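Bit reversal itself is easy to compute; a small Python sketch (the function name is illustrative):

```python
def bit_reversed_order(N):
    """Indices of an N-point sequence in bit-reversed order (N a power of 2)."""
    bits = N.bit_length() - 1
    return [int(format(i, f'0{bits}b')[::-1], 2) for i in range(N)]

# For an 8-point DIT FFT the input is reordered as:
print(bit_reversed_order(8))   # e.g. index 1 = 001 becomes 100 = 4
```

These are exactly the input positions used by an 8-point decimation-in-time FFT.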
Now let us make one group of sequence number 0 to 3 and another group of sequence 4 to 7.
Now, mathematically this can be shown as;
We take the first four points x[0], x[1], x[2], x[3] initially, and try to represent them
mathematically as follows
We can further break each of these into two more parts, which means that instead of working
with 4-point sequences, we can break them down into 2-point sequences.
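The even/odd splitting described above can be written directly as a recursive radix-2 DIT FFT. This is a sketch for illustration, applied to the example sequence x[n] = {2, 1, −1, −3, 0, 1, 2, 1}:

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (len(x) must be a power of 2)."""
    N = len(x)
    if N == 1:
        return list(x)
    G = fft_dit(x[0::2])               # even-indexed samples -> G[k]
    H = fft_dit(x[1::2])               # odd-indexed samples  -> H[k]
    X = [0] * N
    for k in range(N // 2):
        W = cmath.exp(-2j * cmath.pi * k / N)   # twiddle factor W_N^k
        X[k] = G[k] + W * H[k]                  # butterfly
        X[k + N // 2] = G[k] - W * H[k]
    return X

X = fft_dit([2, 1, -1, -3, 0, 1, 2, 1])
print([complex(round(v.real, 4), round(v.imag, 4)) for v in X])
```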
How can we use the FFT algorithm to calculate inverse DFT (IDFT)?
Check out the formulae for calculating the DFT and inverse DFT below.
DFT: X(k) = Σ (n = 0 to N−1) x(n) W_N^(kn)
IDFT: x(n) = (1/N) Σ (k = 0 to N−1) X(k) W_N^(−kn)
As you can see, there are only three main differences between the formulae.
In the DFT we calculate the frequency-domain sequence X(k) from the time-domain
sequence x(n), whereas in the IDFT it’s the opposite.
In the IDFT formula, we have two different multiplying factors.
o The factor 1/N
o The factor W_N^(−kn), which is the complex conjugate of the twiddle factor W_N^(kn).
Thus if we multiply with a factor of 1/N and replace the twiddle factor with its
complex conjugate in the DIF algorithm’s butterfly structure, we can get the IDFT
using the same method as the one we used to calculate FFT.
In this case, DIF and DIT algorithms are the same.
We’ll see the modified butterfly structure for the DIF FFT algorithm being used to
calculate IDFT.
From the above butterfly diagram, we can notice the changes that we have
incorporated. The inputs are multiplied by a factor of 1/N, and the twiddle factors are
replaced by their complex conjugates.
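One standard way to code this, equivalent to replacing every twiddle factor with its conjugate, is to conjugate the input, run a forward FFT, conjugate again and scale by 1/N. A NumPy sketch (not part of the notes):

```python
import numpy as np

def idft_via_fft(X):
    """IDFT using only a forward FFT: conjugate, FFT, conjugate, scale by 1/N."""
    N = len(X)
    return np.conj(np.fft.fft(np.conj(X))) / N

X = np.fft.fft([2, 1, -1, -3, 0, 1, 2, 1])
x = idft_via_fft(X)
print(np.round(x.real, 6))     # recovers the original sequence
```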
W_8^0 = 1                W_8^0* = 1
W_8^1 = 0.707 − 0.707j   W_8^1* = 0.707 + 0.707j
W_8^2 = −j               W_8^2* = j
W_8^3 = −0.707 − 0.707j  W_8^3* = −0.707 + 0.707j
W_8^4 = −1               W_8^4* = −1
W_8^5 = −0.707 + 0.707j  W_8^5* = −0.707 − 0.707j
W_8^6 = j                W_8^6* = −j
W_8^7 = 0.707 + 0.707j   W_8^7* = 0.707 − 0.707j
W_8^8 = 1                W_8^8* = 1
UNIT II
INFINITE IMPULSE- RESPONSE FILTERS (IIR)
Introduction to filters
Comparison between practical and theoretical filters
Butterworth and Chebyshev approximation
IIR digital filter design techniques
Impulse Invariant technique
Bilinear transformation technique
Digital Butterworth & Chebyshev filters
Implementation
INTRODUCTION TO FILTERS
Ideal Filter Types, Requirements, and Characteristics
What is an ideal filter?
You’ve heard of filter coffee. It’s a very popular type of coffee where you filter out the
coarse coffee grains to get the silky-smooth concoction that is enjoyed all over the world.
Similarly, in digital signal processing, we send a signal with a mix of several different
frequencies through a filter to get rid of the unwanted frequencies and give us the desired
frequency response.
The picture below shows us the response when a low pass filter is applied, and we get
a practical response from it. No filter can entirely get rid of all the frequencies that we set as a
limit. There will be certain discrepancies, and some unwanted frequencies will give rise to a
transition band.
However, for calculation purposes, we consider an ideal filter which does not take into
account these practical discrepancies. Hence, it has a steep transition from the pass-band to the
stop-band. Like the picture above. The practical filter will look more like the graph below.
Fig.(3): Block Diagram of Filter
Where A is the amplitude of the signal, ω is the angular frequency of the input signal,
and Φ is the phase of the system.
The frequency response of a filter is the ratio of the steady-state output of the system
to the sinusoidal input. It is required to realize the dynamic characteristic of a system. It
represents the magnitude and phase of a system in terms of the frequency.
The phase response is the difference between the phase of the output and the phase of the
input of any electrical device that takes in an input, modifies it, and produces an output.
Hence, the operating frequency of the filter can be used to further categorize these
filters into low pass filter, high pass filter, band-pass filter, band-rejection filter, multi-pass
filter, and multi-stop filter.
Low-pass filter:
A certain cut-off frequency, ωc radians per second is chosen as the limit, and as the
name suggests, the portion with low frequency is allowed to pass. Hence, the frequencies
before ωc constitute the pass-band, and the frequencies after ωc are attenuated as part
of the stop-band. This is pictorially depicted below:
Fig.(6): Circuit of Low Pass Filter
As these filters are ideal, there will be no transition band, only a vertical line at the
cut-off frequency. Low pass filters are often used to recover the continuous original signal
from its discrete samples. Ideal versions tend to be unstable and are not realizable.
High-Pass filter:
The High Pass Filter allows the frequencies above the cut off frequency to pass, which
will be the pass band, and attenuates the frequencies below the cut off frequency, consisting of
the stop-band. An ideal high pass filter response ought to look like the figure below.
Once again, there will be no transition band due to the precise cutting off of the signal
in an ideal filter. An ideal high pass filter passes high frequencies, though the strength of
those frequencies is lower compared to frequencies near the cut-off. High pass filters are
used to sharpen images.
Band-pass filter:
The band-pass filter is actually a combination of the low pass and high pass filter: both
filters are used strategically to allow only a portion of frequencies to pass through, forming
the pass-band, and all frequencies that do not fall in this range are attenuated.
This band has two cut-off frequencies, the lower cut off frequency, and the upper cut
off frequency. So, it is known as a second-order filter. Band-Pass filters can also be formed
using inverting operational amplifiers. These types of filters are used primarily in wireless
transmitters and receivers.
Band-pass filters are widely used in wireless transmitters and receivers. It limits the
frequency at which the signal is transmitted, ensuring the signal is picked up by the receiver.
Limiting the bandwidth to the particular value discourages other signals from interfering.
Fig.(15): Circuit for Band Rejection Filter
The band rejection filter is heavily used in public address systems, speaker systems
and other devices that require filtering to obtain good audio. It is also used to reduce static
in radio-related devices: the frequencies that bring in the static are attenuated by giving their
frequencies as the limits.
Multi-Pass filter:
Multi-pass filters are basically several band-pass filters put together. It is used when
you want only certain bands of frequencies to pass through and want to attenuate the majority.
The band pass filter itself is achieved by having the circuit of a high pass filter
connected to that of a low pass filter in series; connecting this to another pair of filters will
give us two bands of frequencies that are allowed to pass through, with two second-order cut
off frequencies each, making its response look something like the following graph.
Multi-Stop filter:
Multi-stop filters are several band-stop filters put together. It is used when you want
certain bands of frequencies to not be passed and when you want the majority of the
frequencies to be attenuated.
The Band Stop Filter is achieved by having a circuit of high pass filter and low pass
filter connected in parallel. Hence, this would give us two pairs of second-order cut off
frequencies as well, as depicted by the graph below.
The sinc function suggests that the frequency response exists for all values from -∞ to
∞. Therefore, it is non-realizable as it requires infinite memory.
So, to avoid waiting for an eternity to get a frequency response, we do not design ideal
filters.
Analog filter | Digital filter
An analog filter processes analog inputs and generates analog outputs. | A digital filter processes and generates digital data.
Analog filters are constructed from active or passive electronic components. | A digital filter consists of elements like adder, multiplier and delay unit.
An analog filter is described by a differential equation. | A digital filter is described by a difference equation.
The frequency response of an analog filter can be modified by changing the components. | The frequency response of a digital filter can be modified by changing the filter coefficients.
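As an illustration of the last two rows, a first-order difference equation y(n) = b0·x(n) + b1·x(n−1) − a1·y(n−1) can be evaluated directly; the coefficient values here are arbitrary, chosen only for the sketch:

```python
def filter_first_order(x, b0=0.5, b1=0.5, a1=-0.2):
    """y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1] (direct-form difference equation)."""
    y = []
    x_prev = y_prev = 0.0
    for xn in x:
        yn = b0 * xn + b1 * x_prev - a1 * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

print(filter_first_order([1, 0, 0, 0]))   # impulse response of the recursion
```

Changing b0, b1 and a1 changes the frequency response, with no change to the hardware at all.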
Hence, realistic filters may have ripples in the pass-band or the stop-band, and in some
cases even both. But one thing is for sure: it is impossible to design an entirely flat response
or such a steep transition.
However, to try and overcome this problem of ripples or change the output to the
desired response, we can use one of the five filter approximation techniques available.
Filter approximation techniques are special processes using which we can design our
target filter with the closest response to an ideal filter. Each of these techniques has a close-to-
ideal response in certain regions of the frequency band. Some approximation methods give
you a flat pass-band, whereas some give you a sharp transition band. Let’s get acquainted with
some of the filter approximation techniques.
First, we have the Butterworth filter, which gives a flat pass-band (also called the
maximally flat approximation for this reason), a stop-band with no ripples, and a roll-off of
-20n dB/decade, where n is the order of the filter. These filters are usually used for audio
processing.
Secondly, we have the Chebyshev filter which gives a pass-band with ripples, a stop-
band with no ripples and a roll-off faster than the Butterworth filter. This is the type of filter
one should go for if a flat response is not their priority but a faster transition instead.
The number of ripples is equal to the order of the filter divided by 2. The amplitude of
each of these ripples is also the same. Hence, it is also known as an Equal Ripple Filter.
Next, we have the inverse Chebyshev which has the same roll-off as Chebyshev and
also a flat pass-band, fast transition but ripples in the stop-band. This filter can be opted for
when there is no concern with the attenuated band. However, one should ensure the ripples in
the stop-band do not surpass the rejection threshold.
Then, we have the Elliptic filter which has a ripple in both pass-band and stop-band
but the fastest transition out of all the filters.
The Bessel filter has the opposite response, which is a flat response in the pass-band,
no ripple in the stop-band but the slowest roll-off among all approximation filters.
Here is a summary.
Table 1: Filter Approximation Table

Filter | Pass-band | Stop-band | Roll-off
Butterworth | Flat, no ripple | No ripple | Slow (-20n dB/decade)
Chebyshev | Ripples | No ripple | Faster than Butterworth
Inverse Chebyshev | Flat | Ripples | Fast
Elliptic | Ripples | Ripples | Fastest
Bessel | Flat | No ripple | Slowest
Filter Approximation and its types – Butterworth and Chebyshev
What is Filter approximation and why do we need it?
To achieve a realizable and a practical filter, we introduce some approximations to the filter
design. Let us compare an ideal filter with a practical one:
The cut-off frequencies are the only specifications we needed to define an ideal filter. A
practical realization requires a few more specifications:
Cut-off Frequency ωc
Stop band frequency ωs
Pass-band Ripple 1 − Ap: Amount of variation (fluctuation) in the magnitude response
of the filter. You can expect the pass-band frequencies of your signal to be attenuated
by a factor within the pass-band ripple.
Stop-band attenuation As is the maximum attenuation to the frequencies in stop-band.
Let us play around with these specifications to define different types of approximations.
What are the different types of Filter approximations?
Butterworth filter:
As smooth as butter
Stephen Butterworth was known for solving impossible mathematical problems and he
took up the challenge of making the pass-band ripple-free, flat and as smooth as possible.
The gain Gn(ω) of an nth-order low pass Butterworth filter as a function of frequency ω
is given as:
Gn(ω) = 1 / √(1 + (ω/ωc)^(2n))
Advantages of Butterworth filter approximation
No pass-band ripple, that means, all pass-band frequencies have identical magnitude
response.
Low complexity.
Disadvantages of Butterworth filter approximation
Bad selectivity, not really applicable for designs that require a small gap between pass-
band and stop-band frequencies.
Elliptic:
As sharp as a whip
Has the sharpest (fastest) roll-off but has ripple in both the pass-band and the stop-band.
The gain for a low pass elliptic filter is given by:
Gn(ω) = 1 / √(1 + ε² Rn²(ξ, ω/ωc))
where ε is the ripple factor derived from the pass-band ripple, Rn is the nth-order
elliptical rational function, and ξ is the selectivity factor derived from the stop-band attenuation.
Advantages of Elliptic filter approximation
Best selectivity among the three. Ideal for applications that want to effectively
eliminate the frequencies in the immediate neighborhood of pass-band.
Disadvantages of Elliptic filter approximation
Ripples in both the bands and hence, all frequencies experience non-identical changes
in magnitude.
Non-linear phase, that leads to phase distortion.
High complexity.
Chebyshev:
Jack of all trades, Master of none
Faster roll-off than Butterworth, but not as fast as elliptic. Ripples appear in only one of the bands: Chebyshev type-1 filters have ripples in the pass-band, while Chebyshev type-2 filters have ripples in the stop-band.
The gain for a low pass Chebyshev (type-1) filter is given by:
Gn(w) = 1 / sqrt(1 + ε² Tn²(w/wc))
where ε is the ripple factor and Tn is the nth-order Chebyshev polynomial.
Advantages of Chebyshev filter approximation
Moderate complexity.
Disadvantages of Chebyshev filter approximation
Ripples in one of the bands.
Non-linear phase, which leads to phase distortion.
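The ripple placement of the three approximations can be seen directly in SciPy; the order, ripple, and cut-off values below are illustrative assumptions:

```python
import numpy as np
from scipy import signal

# Order 4, 1 dB pass-band ripple, 40 dB stop-band attenuation, cutoff 0.3 (x Nyquist)
b1, a1 = signal.cheby1(4, 1, 0.3)     # Chebyshev-1: ripple in the pass-band
b2, a2 = signal.cheby2(4, 40, 0.3)    # Chebyshev-2: ripple in the stop-band
be, ae = signal.ellip(4, 1, 40, 0.3)  # Elliptic: ripple in both bands

# Chebyshev-2 is flat in the pass-band, so its DC gain is exactly 1; an
# even-order Chebyshev-1 starts at the bottom of its ripple channel instead
_, h1 = signal.freqz(b1, a1)
_, h2 = signal.freqz(b2, a2)
print(abs(h1[0]), abs(h2[0]))
```

Plotting the three magnitude responses makes the trade-offs in the text immediately visible.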
In this we will study the steps for Butterworth Filter approximation and design analog and
digital Butterworth filters using the impulse invariance and bilinear transform techniques to
design IIR filters.
Let us examine a low pass Butterworth filter. The amplitude response of a low-pass Butterworth filter is:
|H(jw)| = 1 / sqrt(1 + (w/wc)^2n)
where wc is the filter's cut-off frequency and n is the order of the filter.
Fig.(19): Amplitude Response of Butterworth Filter
The graph above represents the magnitude of the frequency response of the filter. From the graph, we can infer that at low frequencies the gain is very close to 1, but as the frequency increases, the gain rolls off and the filter starts rejecting the higher frequency components, as a low pass filter should.
The different colors signify different values of the filter order n. Observe that when the filter order is low, the response rolls off smoothly; when the filter order is high, it almost looks like a step function. Hence, to get a smaller transition band, we have to increase the order of the filter.
To simplify the process of designing the Butterworth filter, we are going to be
observing a normalized filter first. Then we will be scaling the values to our desired cut-off.
The transfer function of a filter, H(s), is the Laplace transform of its impulse response, where s = σ + jw.
The frequency response of the filter is obtained by evaluating H(s) on the imaginary axis, i.e. at s = jw (σ = 0).
The inverse is also possible: given H(jw), you can recover H(s) by substituting w = s/j.
The magnitude-squared function Hn(s)·Hn(-s) has 2n poles, which occur when 1 + (s/jwc)^2n = 0; they are spaced evenly around a circle of radius wc in the s-plane.
As you can see from the image above, the number of poles on each side of the plane depends on the value of n. For example, if n = 4, the four poles in the left-half plane correspond to Hn(s), which gives a causal and stable filter; the four poles in the right-half plane correspond to the conjugate factor Hn(-s).
Hn(s) can be computed by working it out directly or by using a Butterworth lookup table; we will look into this in detail while solving the problems.
Scaling Normalized Filter
Now, we know the transfer function for a normalized filter. What if we want a cut off
frequency other than unity?
Depending on the type of filter we would be designing, we would replace s by the
respective transformations shown below to find H.
Low Pass: s → s/wc
High Pass: s → wc/s
Band Pass: s → (s² + w1·w2) / (s(w2 - w1))
Band Stop: s → s(w2 - w1) / (s² + w1·w2)
We also need to have specifications of the gain definitions; there are specifically four values
that we have to define which can be obtained from the graph below.
Here
GP is the Pass band gain at Pass band frequency WP
GS is the Stop band gain at Stop band frequency WS
From these filter specifications; we can determine the order of the filter. The gain is usually
represented in dB.
At frequency WX, the gain of the filter would be,
Hence, Pass band Gain and Stop band Gain can be determined
Rearranging the equation in terms of n so that we can identify the filter order,
Substituting the particular values gives
n = 3.142 ≈ 4
When you get a decimal number always round up, as this will give a better roll off than
rounding down.
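The order computation can be reproduced numerically. The specifications below (3 dB at 20 rad/s in the pass-band, 25 dB at 50 rad/s in the stop-band) are my reading of the worked example, chosen because they yield exactly n = 3.142 and a pass-band-matched cut-off of about 20.01 rad/s:

```python
import math

# Assumed specs, consistent with the worked example
Ap, As = 3.0, 25.0    # pass-band / stop-band attenuation in dB
wp, ws = 20.0, 50.0   # pass-band / stop-band edge frequencies in rad/s

# Butterworth order: n = log10[(10^(As/10)-1)/(10^(Ap/10)-1)] / (2*log10(ws/wp))
n_exact = math.log10((10**(As/10) - 1) / (10**(Ap/10) - 1)) / (2 * math.log10(ws / wp))
n = math.ceil(n_exact)   # always round up

# Cut-off frequency matched to the pass-band spec
wc = wp / (10**(Ap/10) - 1)**(1 / (2 * n))
print(round(n_exact, 3), n, round(wc, 2))   # 3.142 4 20.01
```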
Now, with the filter order and the filter requirements we can identify the cut off
frequency
We will use both the pass band values and stop band values to carry out this calculation
Now, why are there two different values for the same cutoff frequency? Well, we did
round off the filter order number for easier design purposes, that’s why!
Let us choose to match the pass band cut-off frequency and carry on calculating the transfer function, taking wc = 20.01 rad/s.
Next up, we have to design the normalized function before moving on to scaling.
You can either look it up on the Butterworth filter table or you can compute the H(S) using
the equation below
Where
But why do all that calculating when you can just copy the numbers from the table? (Don't worry, I'll show you how to do it in the following examples.)
Table 2: The coefficients in the system function of a normalized Butterworth filter (wc = 1) for orders 1 ≤ N ≤ 8
So, since N=4 in this case,
Therefore, the coefficients would be a1=2.6131, a2=3.4142, a3=2.6131, a4=1.0000
Putting these coefficients into the normalized transfer function
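The lookup-table coefficients can be cross-checked by multiplying out the left-half-plane poles of the normalized (wc = 1) filter — a small sketch using the standard Butterworth pole-angle formula:

```python
import numpy as np

N = 4
k = np.arange(1, N + 1)
# Left-half-plane poles of a normalized (wc = 1) Butterworth filter
poles = np.exp(1j * np.pi * (2 * k + N - 1) / (2 * N))

# Expanding the product of (s - p_k) gives the denominator polynomial
coeffs = np.poly(poles).real
print(np.round(coeffs, 4))   # [1. 2.6131 3.4142 2.6131 1.]
```

The result matches a1 = 2.6131, a2 = 3.4142, a3 = 2.6131, a4 = 1.0000 from the table.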
Now, we are going to scale the normalized function to obtain the actual transfer function: we've got to replace s with s/wc, taking wc = 20.01 rad/s.
According to the specification, the stop band attenuation should be 25dB, which is the limit for this filter. Looking at the red curve, which depicts the frequency response of the filter if we choose to go with the stop band cut-off frequency, it touches -25dB, exactly on the limit. Over-attenuation is totally fine too, as long as the filter does not under-attenuate, which would allow signals beyond the frequency range to enter.
Over-attenuation (when the gain drops below -25dB) will not lead to this problem, so over-attenuation and exact attenuation are both a thumbs up for your low pass filter. The blue curve, which depicts the frequency response if we choose to go with the pass band cut-off frequency, over-attenuates.
I know the question popping up in your mind right now. If the red curve gives exact
attenuation, why not just go with the stop band cut off frequency?
The problem is with the transition band. At 20 rad/s, the graph should be down by 3dB
already but the red curve hardly dipped, it only came down by 1 dB. This is not advisable for
the Low Pass Filter. So, the only thing you have to keep in mind is that when choosing the cut
off frequency to scale your transfer function equation, go with the pass band cut off
frequency.
Conditions to design a stable Infinite Impulse Response (IIR) filter:
To design a stable and causal IIR filter, the following requirements are necessary:
1. The transfer function (H(z)) should be a rational function of z, and the coefficients of z
should be real.
2. The poles (values, where the denominator turns 0 / output is infinite) should lie inside
the unit circle of the z-plane.
3. The number of zeros should be less than or equal to the number of poles.
4. For the effective conversion from an analog filter to a digital one, the imaginary axis
of the s-plane should map into the unit circle of the z-plane. This establishes a direct
relationship between the analog frequency and digital frequency in two domains.
5. The left half of the s-plane should map into the inside of the unit circle of the z-plane.
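Condition 2 is easy to verify numerically for any candidate H(z); the denominator coefficients below are a made-up example:

```python
import numpy as np

# H(z) = B(z)/A(z) with A(z) = 1 - 1.5 z^-1 + 0.7 z^-2 (hypothetical coefficients)
a = [1.0, -1.5, 0.7]
poles = np.roots(a)

# Condition 2: every pole must lie strictly inside the unit circle
stable = bool(np.all(np.abs(poles) < 1))
print(np.abs(poles), stable)
```

Here the complex-conjugate pole pair has magnitude sqrt(0.7) ≈ 0.837 < 1, so the filter is stable.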
We can design this filter by finding out one very important piece of information i.e.,
the impulse response of the analog filter. By sampling the response we will get the time-
domain impulse response of the discrete filter.
When observing the impulse responses of the continuous and discrete responses, it is
hard to miss that they correspond with each other. The analog filter can be represented by a
transfer function, Hc(s).
Zeros are the roots of the numerator and poles are the roots of the denominator. Every pole of the analog filter's transfer function can be mapped to a pole of the IIR digital filter's transfer function H(z). In partial-fraction form,
Hc(s) = Σ_{k=1}^{N} A_k / (s - p_k)
where the A_k are real coefficients and the p_k are the poles, k = 1, 2, …, N.
h(t) is the impulse response of the same analog filter in the time domain. Since Hc(s) is a Laplace-domain function, it can be converted to h(t) by taking the inverse Laplace transform:
h(t) = Σ_{k=1}^{N} A_k e^{p_k t}, t ≥ 0
Sampling h(t) at t = nT, we obtain
h(n) = Σ_{k=1}^{N} A_k e^{p_k nT}
Now, to obtain the transfer function of the IIR digital filter in the 'z' operator, we perform the z-transform on the newly found sampled impulse response h(n). For a causal system (one that depends only on current and past inputs), H(z) is given by
H(z) = Σ_{n=0}^{∞} h(n) z^-n
We have already obtained the equation for h(n); hence, substitute eqn (2) into the above equation.
Based on the standard summation formula, (3) is modified and written as the required transfer
function of the IIR filter.
Hence (4) is obtained from (1), by mapping the poles of the analog filter to that of the digital
filter.
Compare the real and imaginary parts separately, where the component with ‘j’ is imaginary.
and
Hence, we can make the inference that z = e^sT, which gives r = e^σT and ω = ΩT.
To understand the relationship between the s-plane and Z-plane, we need to picture how they
will be plotted on a graph. If we were to plot (7) in the ‘s’ domain, σ would be the X-
coordinates and jΩ would be the Y-coordinate. Now, if we were to plot (8) in the ‘Z’ domain,
the real portion would be the X-coordinate, and the imaginary part would be the Y-coordinate.
Let us take a closer look at equation (9), r = e^σT.
There are a few conditions that could help us identify where it is going to be mapped on the s-
plane.
Case 1:
When σ < 0, r is the reciprocal of 'e' raised to a positive constant, which limits the range of r to 0 < r < 1.
Since σ < 0, the point lies in the left-hand side of the 's' domain.
Since 0 < r < 1, the point falls within the unit circle, which has a radius of 1, in the 'z' domain.
Case 2:
When σ = 0, r = e^0 = 1. When the radius is 1, the point lies on the unit circle.
Since σ =0, which indicates the Y-axis of the ‘s’ domain.
Since r=1, the point would be on the unit circle in the ‘z’ domain.
Case 3:
When σ>0, since it is positive, r would be equal to ‘e’ raised to a particular constant,
which means r would also be a positive value greater than 1.
Since σ>0, the positive value would be mapped onto the right-hand side of the ‘s’ domain.
Since r>1, the point would be mapped outside the unit circle in the ‘z’ domain.
Here is a pictorial representation of the three cases:
Mapping of poles located at the imaginary axis of the s-plane onto the unit circle of the z-
plane. This is an important condition for accurate transformation.
Mapping of the stable poles on the left-hand side of the imaginary s-plane axis into the unit
circle on the z-plane. Another important condition.
Poles on the right-hand side of the imaginary axis of the s-plane lie outside the unit circle of
the z-plane when mapped.
4. You will have your transfer function in terms of H(z), which is the frequency transfer
function of the IIR digital filter.
How about we try an example to make sure you get the hang of it?
Solved example using Impulse Invariance method to find the transfer function of an IIR filter
Problem:
Given , that has a sampling frequency of 5Hz. Find the transfer function of
Step 2:
Applying partial fractions on H(s),
Step 3:
Step 4:
That is how you obtain the transfer function of the IIR digital filter.
Try out more questions to get the hang of solving these problems. Once you do, the impulse invariance method is pretty straightforward. Two points to remember before going to the next topic: the impulse invariance method is used for frequency-selective filters, and it is used to transform analog filter designs into digital ones.
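The pole-mapping recipe can be scripted; the analog prototype H(s) = 1/((s+1)(s+2)) and T = 0.5 s below are illustrative choices, not the values from the worked example:

```python
import numpy as np
from scipy import signal

T = 0.5
# Partial fractions of H(s) = 1/((s+1)(s+2)) = 1/(s+1) - 1/(s+2)
r, p, _ = signal.residue([1], np.polymul([1, 1], [1, 2]))

# Impulse invariance: A_k/(s - p_k)  ->  A_k/(1 - e^{p_k T} z^-1)
(b0, a0), (b1, a1) = [([rk], [1.0, -np.exp(pk * T)]) for rk, pk in zip(r, p)]

# Recombine the two first-order sections over a common denominator
b = np.polyadd(np.polymul(b0, a1), np.polymul(b1, a0)).real
a = np.polymul(a0, a1).real
print(np.round(b, 4), np.round(a, 4))
```

The result is the classic H(z) = (e^-0.5 - e^-1) z^-1 / (1 - (e^-0.5 + e^-1) z^-1 + e^-1.5 z^-2).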
These methods can only be used to realize low pass filters and a limited class of band-
pass filters. We can’t design high pass filters or certain band-reject filters using these two
methods.
Moreover, the many to one mapping in the impulse invariance method (s-domain to z-
domain) causes the issue of aliasing, which is highly undesirable. Bilinear transform removes
that issue by using one-to-one mapping. Let’s check out the method.
Applying the inverse Laplace transform to move the equation from the 's' domain to the time domain, we get
The integral can also be solved by using the trapezoidal rule for finding the area. Let us say we have a trapezoid of parallel sides a and b and height h. The area is given by
A = (1/2)(a + b)h
Similarly,
Fig.(24): Example of Trapezoidal Rule
Using the trapezoidal rule(4), the area of the graph between t and t0 is given by,
Hold on to that equation for a minute, and let us rearrange equation (2) as shown.
Arranging this to get a transfer function (output over input →Y(Z) over X(Z)) for the IIR
Digital Filter,
Simplifying this equation,
Comparing equation (10) with equation(1),
We can infer that s = (2/T)·(1 - z^-1)/(1 + z^-1) —(11)
Bilinear Approximation
Here is a little trick.
All the points on the left-hand side (LHS) of the ‘s’ plane are mapped to points inside
the unit circle in the ‘z’ plane.
All the points on the right-hand side of the ‘s’ plane are mapped to points outside the
unit circle.
All points on the imaginary axis of the 's' plane are mapped to points right on the unit circle.
Mapping the point 0 + j0 of the 's' plane onto the 'z' plane gives Z = e^0 = 1.
Hence, it will fall right on the unit circle, as shown in the picture below.
Hence, taking the modulus would give us
So, the value of alpha determines whether the point lies outside or inside the unit circle.
For α < 0
|Z| will be less than 1, as 'e' raised to a negative value gives a number less than 1, mapping the point inside the unit circle.
Fig.(26): Mapping of points inside the unit circle in the 'z' plane
For α > 0
|Z| will be greater than 1, as 'e' raised to a positive value is always greater than 1, mapping the point outside the unit circle.
and —(13)
Revisiting the difference equation that we derived in (11) and substituting (12) for Z,
Remember the Euler formula we used before? We're going to use it again here.
If r is less than 1, then a number less than 1 minus 1 (the numerator) gives a negative value; hence r < 1 → σ < 0.
Similarly, when r is greater than 1, a number greater than 1 minus 1 (the numerator) gives a positive value; hence r > 1 → σ > 0.
When r is equal to 1, however, 1 minus 1 gives us σ = 0.
Frequency Warping
For an analog filter, the filter frequency F and the sampling frequency FS give the digital frequency as ω = ΩT = 2πF/FS.
Fig.(28): Linear Characteristics of Filter
But the relationship we determined through the bilinear transformation in equation (18), ω = 2 tan^-1(ΩT/2), is non-linear.
Example:
Let us say we have to design a digital IIR filter of cut off frequency 500Hz and sampling
frequency 10KHz.
FC=500Hz and FS=10KHz
ΩC = 2πFC = 2π × 500 = 1000π = 3141.59 rad/s
ωC = 2πFC/FS = 1000π/(10×10³) = 0.1π = 0.314159 rad (required cut-off frequency)
ω = 2 tan^-1(ΩC·T/2) = 2 tan^-1(0.05π) = 0.3116 rad (mapped cut-off frequency after bilinear transformation)
Therefore, the mapped frequency differs from the required one.
This change in the frequency value is frequency warping. If we design an analog filter with ΩC and then perform the bilinear transformation, we do not get ωC in the digital domain, so we cannot design an accurate filter for the given frequency requirement.
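These numbers are easy to reproduce, using the standard inverse of the bilinear relation, ω = 2 tan⁻¹(ΩT/2):

```python
import math

Fc, Fs = 500.0, 10_000.0
T = 1 / Fs

Wc = 2 * math.pi * Fc               # analog cutoff: 1000*pi rad/s
wc_required = 2 * math.pi * Fc / Fs # desired digital cutoff: 0.1*pi rad

# The bilinear transform actually maps the analog cutoff to:
wc_mapped = 2 * math.atan(Wc * T / 2)
print(round(wc_required, 6), round(wc_mapped, 6))
```

The mapped cut-off (≈0.3116) is slightly below the required 0.3142 — that gap is the warping.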
Pre-warping:
Wait, hold up. You can remove the warping problem using a simple technique. All you have to do to get a digital IIR filter with the desired cut-off frequency ωC is to design an analog filter with a cut-off frequency that maps to ωC after the bilinear transformation, i.e. Ω = (2/T)·tan(ωC/2). We call this process pre-warping. We need to pre-warp the analog filters.
Look at the graph we have above, the blue line represents the frequency response after
Bilinear Transformation, and the red line represents the Linear characteristics of Ω and ω. To
obtain the expected response, which is the red line, we are going to merge the blue line with
the green line so that it cancels out and gives us the red line. This is basically what pre-
warping does.
How exactly do we calculate this,
With Bilinear Transformation:
After pre-warping, we see that the mapped cut-off frequency is the same as the required cut-off frequency. Hence, pre-warping ensures we have the same cut-off frequency for both the analog filter and the digital IIR filter, which redeems the bilinear transformation.
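In code, pre-warping is a single line, assuming the standard pre-warp formula Ω = (2/T)·tan(ωC/2):

```python
import math

Fs = 10_000.0
T = 1 / Fs
wc = 0.1 * math.pi               # desired digital cutoff (rad/sample)

# Pre-warp: design the analog prototype at this frequency instead of 2*pi*Fc
W_prewarped = (2 / T) * math.tan(wc / 2)

# The bilinear transform now lands exactly on the desired cutoff
wc_mapped = 2 * math.atan(W_prewarped * T / 2)
print(round(W_prewarped, 2), round(wc_mapped, 6))
```

The analog design frequency moves from 3141.59 rad/s to about 3167.7 rad/s, and the mapped digital cut-off becomes exactly 0.1π.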
Now, is it necessary to go through so much trouble and perform Bilinear
Transformation, why not just go with the other two methods?
How about we discuss the pros and cons of this method before coming to any conclusions?
Advantages of the Bilinear Transformation method for designing IIR filters:
All characteristics of the amplitude response of the analog filter are preserved when designing the digital filter. Consequently, the pass band ripple and the minimum stop band attenuation are preserved.
Disadvantages of the Bilinear Transformation method for designing IIR filters:
Frequency warping is the only disadvantage; since the mapping is non-linear, we have to perform pre-warping.
What is the difference between the Bilinear Transform and Impulse Invariance
methods?
Impulse Invariance:
- Used to design IIR filters with the unit sample response h(n) obtained by sampling the impulse response of an analog filter.
- The value selected for T is kept small to reduce the effects of aliasing.
- Can be used to design low pass and a limited class of band pass IIR digital filters.
- The frequency relationship is linear.
- Though all poles are mapped from the s-plane to the z-plane, the zeros do not satisfy the same relationship.

Bilinear Transformation:
- Used to design IIR filters using the trapezoidal rule in place of numerical integration, to get an equation that represents s in terms of z.
- Avoids aliasing of frequency components, as it is a one-to-one conformal mapping of the jΩ axis onto the unit circle in the z-plane.
- Can be used to design all kinds of IIR digital filters: low pass, high pass, band pass, and band stop.
- Frequency warping takes place, as the frequency relationship is non-linear.
- All zeros and poles are mapped.
Solution:
We know that
Comparing the given H(s) equation with the Laplace Transform equation below
Therefore, to find T
hence, T=1/2 sec
Substituting this,
Substituting s in terms of z,
Conclusion
We hope you understood the technique of Bilinear Transformation, along with its
effects and properties, and pre-warping. This is an important method for designing digital IIR
filters.
Infinite Impulse Response (IIR) filters:
- All the infinite samples of the impulse response are considered in the design of IIR filters.
- The construction of an IIR filter involves designing an analog filter first for the desired specifications and then converting it into a digital IIR filter. Thus, IIR filters have an analog equivalent.
- An IIR filter requires past output samples in addition to current and past inputs to obtain its current output.
- An IIR filter's design specifications only specify the desired characteristics of its magnitude response.
- Physically realizable IIR filters don't have linear phase characteristics.
- IIR filters are recursive.
- The transfer functions of IIR filters have both poles and zeros.
- IIR filters are/have LESS: computational power, power consumption, stability, memory, set-up time, delay.
- IIR filters are/have MORE: filter coefficients, on-the-fly tweakability, efficiency, sensitivity, ease of use.
- IIR filters are used in band stop and band pass filters.

Finite Impulse Response (FIR) filters:
- Only N samples of the impulse response are considered in the design of FIR filters.
- The construction of an FIR filter with the desired specifications can be done directly using certain methods (like windowing). Thus, FIR filters don't have an analog equivalent.
- An FIR filter requires only past and current inputs to obtain its current output; it doesn't care about past outputs.
- An FIR filter's design specifications specify both the magnitude and the phase response.
- Physically realizable FIR filters can be designed with linear phase characteristics easily.
- FIR filters are non-recursive; however, it is possible to design recursive FIR filters too.
- The transfer functions of FIR filters have only zeros.
- FIR filters are/have MORE: computational power, power consumption, stability, memory, set-up time, delay.
- FIR filters are/have LESS: filter coefficients, on-the-fly tweakability, efficiency, sensitivity, ease of use.
- FIR filters are used in anti-aliasing, low pass, and baseband filters.
Where
Step 5. Now, this is the step that determines which method you are using; we can use a table to do the digital transformation.
Solved Example
Butterworth IIR Low Pass Filter using Impulse Invariant Transformation, T=1 sec
Solution:
1.
where
k=N/2=2/2=1
Therefore,
Ok, so now we have the cut-off frequency, and we can do the substitution.
Simplifying this we will obtain the transfer function using impulse invariance method
To perform the digital transform, we need to make sure it is in a form that can be transformed
to Z. So we need to make a few changes, starting with completing the square.
Problem
Design a Butterworth digital IIR high pass filter using Bilinear Transformation by taking
T=0.1 second, to satisfy the following specifications.
Solution:
1. For a High Pass Filter
Where
k=N/2=2/2=1
Therefore,
4. Time for the actual transfer function now and we need the cut off frequency again
5. This is the little twist that I warned y’all about at the beginning, the last step.
So, for this step, we have to replace
which gives us the transfer function of the High Pass Filter in the ‘z’ domain.
UNIT - III
FINITE IMPULSE-RESPONSE FILTERS (FIR)
Finite Impulse Response (FIR) Filters
Introduction
- In many digital signal processing applications, FIR filters are preferred over their IIR counterparts. The following are the main advantages of FIR filters over IIR filters.
1. FIR filters are always stable.
2. FIR filters with exactly linear phase can easily be designed.
3. FIR filters can be realized in both recursive and non-recursive structures.
4. FIR filters are free of limit cycle oscillations, when implemented on a finite word length
digital system.
5. Excellent design methods are available for various kinds of FIR filters.
where h(k), k = 0, 1, …, N-1, are the impulse response coefficients of the filter, H(z) is the transfer function, and N is the length of the filter.
2. FIR filters can have an exactly linear phase response.
3. FIR filters are simple to implement with all DSP processors available having architectures
that are suited to FIR filtering.
Consider a signal that consists of several frequency components passing through a filter.
1. The phase delay (Tp) of the filter is the amount of time delay each frequency component of
the signal suffers in going through the filter.
2. The group delay (Tg) is the average time delay the composite signal suffers at each
frequency.
3. Mathematically,
Tp = -θ(w)/w —(3)
Tg = -dθ(w)/dw —(4)
where θ(w) is the phase angle.
A filter is said to have a linear phase response if,
θ(w) = -αw
i.e. the magnitude response |H(e^jw)| = k is constant and the phase ∠H(e^jw) = -αw is linear in w.
It follows that y[n] = k·x[n - α]: linear phase implies that the output is a replica of x[n] with a time shift of α (and gain k).
h(n) = h(N-n-1) (symmetric impulse response), with α = (N-1)/2,
where n = 0, 1, …, (N-1)/2 (N odd) or n = 0, 1, …, (N/2)-1 (N even);
h(n) = -h(N-n-1) (antisymmetric impulse response),
where n = 0, 1, …, (N-1)/2 (N odd) or n = 0, 1, …, (N/2)-1 (N even).
Example: For N = 11, h(n) = h(10-n), giving
h(0) = h(10)
h(1) = h(9)
h(2) = h(8)
h(3) = h(7)
h(4) = h(6)
h(5) = h(5)
For N = 5, h(n) = h(5-n-1) = h(4-n), giving
h(0) = h(4)
h(1) = h(3)
h(2) = h(2)
Now,
H(z) = h[2]z^-2 + [h[1]z^1 + h[3]z^-1]z^-2 + [h[0]z^2 + h[4]z^-2]z^-2
H(e^jw) = H(z)|_{z = e^jw} (T = 1)
H(e^jw) = h[2]e^-j2w + [h[1]e^jw + h[3]e^-jw]e^-j2w + [h[0]e^j2w + h[4]e^-j2w]e^-j2w
H(e^jw) = e^-j2w {h[2] + 2h[1] cos w + 2h[0] cos 2w}
H(e^jw) = e^-j2w {h[2] + Σ_{n=0}^{1} 2h[n] cos(w(n - 2))}
so H(e^jw) is e^-j2w times a purely real amplitude function.
Phase = -2w
Group delay = -dθ(w)/dw = 2
Group delay is constant over the pass band for linear phase filters.
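A quick check with `scipy.signal.group_delay`, using an arbitrary symmetric 5-tap impulse response — linear phase makes the group delay a flat (N-1)/2 = 2 samples:

```python
import numpy as np
from scipy import signal

# Symmetric impulse response h[n] = h[N-1-n]  ->  exactly linear phase
h = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
w, gd = signal.group_delay((h, [1.0]))
print(gd.min(), gd.max())   # flat delay of (N-1)/2 = 2 samples
```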
Types of FIR linear phase systems

1. Type I FIR linear phase system
The impulse response is positive symmetric and N is an odd integer: h(n) = h(N-n-1).
H(e^jw) = e^{-jw(N-1)/2} Σ_{n=0}^{(N-1)/2} a[n] cos(wn)
where
a[0] = h[(N-1)/2]
a[n] = 2h[((N-1)/2) - n], n = 1, 2, …, (N-1)/2

2. Type II FIR linear phase system
The impulse response is positive symmetric and N is an even integer.
H(e^jw) = e^{-jw(N-1)/2} Σ_{n=1}^{N/2} b[n] cos[w(n - 1/2)]
where
b[n] = 2h[N/2 - n], n = 1, 2, …, N/2

3. Type III FIR linear phase system
The impulse response is negative symmetric and N is an odd integer: h(n) = -h(N-n-1).
H(e^jw) = j e^{-jw(N-1)/2} Σ_{n=1}^{(N-1)/2} a[n] sin(wn)
where
a[n] = 2h[((N-1)/2) - n], n = 1, 2, …, (N-1)/2
4. Type IV FIR linear phase system
The impulse response is negative symmetric and N is an even integer.
h(n) = -h(N-n-1) , 0 ≤ n ≤ (N/2)-1
H(e^jw) = j e^{-jw(N-1)/2} Σ_{n=1}^{N/2} b[n] sin[w(n - 1/2)]
where
b[n] = 2h[N/2 - n], n = 1, 2, …, N/2
A comparison of the impulse responses of the four types of linear phase FIR filters
Design of FIR filters using windows:
where
hd(n) = (1/2π) ∫_{-π}^{π} Hd(e^jw) e^jwn dw ……(2)
- To reduce these oscillations, the Fourier coefficients of the filter are modified by multiplying the infinite impulse response with a finite weighting sequence w(n), called a window, where
w(n) = w(-n) ≠ 0, for |n| ≤ (N-1)/2
     = 0, for |n| > (N-1)/2 ……(3)
- After multiplying window sequence w(n) with hd(n), we get a finite duration sequence h(n)
that satisfies the desired magnitude response.
h(n) = hd(n)·w(n), for |n| ≤ (N-1)/2
     = 0, for |n| > (N-1)/2 ……(4)
- The frequency response of the filter can be obtained by convolution of Hd(e^jw) & W(e^jw), given by
H(e^jw) = (1/2π) ∫_{-π}^{π} Hd(e^jθ) W(e^{j(w-θ)}) dθ ……(5)
H(e^jw) = Hd(e^jw) * W(e^jw) ……(6)
- Because both Hd(e^jw) & W(e^jw) are periodic functions, the operation is often called periodic convolution.
- The desired frequency response & its Fourier coefficients are shown in fig. 1(a) & 1(b)
respectively.
- The fig. 1(c) & 1(d) show a finite window sequence w(n) & its Fourier transform W(e^jw).
-The Fourier transform of a window consists of a central lobe and side lobes.
- The central lobe contains most of the energy of the window.
- To get an FIR filter, the sequence hd(n) & w(n) are multiplied and a finite length of non-
causal sequence h(n) is obtained.
- The fig. 1(f) & 1(e) show h(n) & its Fourier transform H(e^jw).
- The frequency response H(e^jw) is obtained using eq. (5) & (6).
- The realizable sequence g(n) in fig. 1(g) can be obtained by shifting h(n) by α samples, where α = (N-1)/2.
- From eq. (5), we find that the frequency response of the filter H(e^jw) depends on the frequency response of the window W(e^jw).
- Therefore, the window, chosen for truncating the infinite impulse response should have
some desirable characteristics.
They are
1. The central lobe of the frequency response of the window should contain most of the
energy & should be narrow.
2. The highest side lobe level of the frequency response should be small.
3. The side lobes of the frequency response should decrease in energy rapidly as ‘w’ tends to
π.
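These properties can be measured numerically; the sketch below estimates the peak side-lobe level of a few common windows (the length N = 51 is an arbitrary choice):

```python
import numpy as np
from scipy import signal

N = 51
results = {}
for name in ['boxcar', 'hann', 'hamming', 'blackman']:
    wdw = signal.get_window(name, N, fftbins=False)   # symmetric window
    W = np.abs(np.fft.fft(wdw, 8192))[:4096]
    W /= W.max()
    edge = np.argmax(np.diff(W) > 0)                  # first null = main-lobe edge
    results[name] = 20 * np.log10(W[edge:].max())     # peak side-lobe level, dB
    print(f'{name:9s} {results[name]:6.1f} dB')
```

The measured levels follow the familiar ordering: rectangular (boxcar) is worst at about -13 dB, and each smoother window trades a wider main lobe for lower side lobes.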
Different types of windows
1. Rectangular window:
The rectangular window sequence is given by,
wR(n) = 1, for -(N-1)/2 ≤ n ≤ (N-1)/2
      = 0, otherwise
The spectrum of the rectangular window is given by,
WR(e^jw) = sin(Nw/2) / sin(w/2)
- The frequency response is real & its zeros occur when Nw/2 = kπ, (or) w = 2πk/N, where k is an integer.
- The response for w between -2π/N & 2π/N is called the main lobe, and the other lobes are known as side lobes.
- The main lobe width for the rectangular window is equal to 4π/N.
- The highest side lobe level is approximately 22% of the main lobe amplitude, (or) -13dB relative to the maximum value at w=0.
- The triangular (Bartlett) window has the following disadvantages when compared to the magnitude response obtained using the rectangular window.
1. The transition region is more.
2. The attenuation in stop band is less.
- Because of these characteristics, the triangular window is not usually a good choice.
- The raised cosine window multiplies the central Fourier coefficients by approximately unity
& smoothly truncates the Fourier coefficients toward the ends of the filter.
wRC(n) = α + (1 - α) cos(2πn/(N-1)), for |n| ≤ (N-1)/2
       = 0, otherwise
Its spectrum is
WRC(e^jw) = α·sin(Nw/2)/sin(w/2) + ((1-α)/2)·{ sin[(Nw/2) - (Nπ/(N-1))] / sin[(w/2) - (π/(N-1))] + sin[(Nw/2) + (Nπ/(N-1))] / sin[(w/2) + (π/(N-1))] }
4. Hanning Window
- The Hanning window sequence can be obtained by substituting α=0.5 in Raised cosine
window,
wHn(n) = 0.5 + 0.5 cos(2πn/(N-1)), for -(N-1)/2 ≤ n ≤ (N-1)/2
       = 0, otherwise
WHn(e^jw) = 0.5·sin(Nw/2)/sin(w/2) + 0.25·{ sin[(Nw/2) - (Nπ/(N-1))] / sin[(w/2) - (π/(N-1))] + sin[(Nw/2) + (Nπ/(N-1))] / sin[(w/2) + (π/(N-1))] }
- The window sequence & its frequency response are shown in fig.
- The main lobe width of Hanning window is twice that of the rectangular window, which
results in a doubling of the transition region of the filter.
- The magnitude of the side lobe level is -31dB, which is 18dB lower than that of rectangular
window.
- That is, the first side lobe of Hanning window spectrum is approximately one tenth that of
the rectangular window.
- This results in smaller ripples in both pass band and stop band of the LPF designed using
Hanning window.
- The minimum stop band attenuation of the filter is 44dB, which is 23dB more than that of the filter designed using the rectangular window.
- At higher frequencies the stop band attenuation is even greater.
5. Hamming window
- The Hamming window sequence can be obtained by substituting α=0.54 in Raised cosine
window,
wH(n) = 0.54 + 0.46 cos(2πn/(N-1)), for -(N-1)/2 ≤ n ≤ (N-1)/2
      = 0, otherwise
WH(e^jw) = 0.54·sin(Nw/2)/sin(w/2) + 0.23·{ sin[(Nw/2) - (Nπ/(N-1))] / sin[(w/2) - (π/(N-1))] + sin[(Nw/2) + (Nπ/(N-1))] / sin[(w/2) + (π/(N-1))] }
- The window sequence & its magnitude response are shown in fig.
- The peak side lobe level is down about 41dB from the main lobe peak, an improvement of 10dB relative to the Hanning window.
6. Blackman window
- The Blackman window sequence is given by,
wB(n) = 0.42 + 0.5 cos(2πn/(N-1)) + 0.08 cos(4πn/(N-1)), for -(N-1)/2 ≤ n ≤ (N-1)/2
      = 0, otherwise
- The additional cosine term (compared with the Hanning & Hamming windows) reduces the
side lobes, but increases the main lobe width to 12π/N.
- The peak side lobe level is down about 57dB from the main lobe peak, an improvement of
16dB relative to the Hamming window.
Figure 5: Blackman window (time and frequency responses)
Summary of window parameters:
From table, we can find that a trade-off exists between the main lobe width & the side lobe
amplitude.
Window | Peak amplitude of side lobe (dB) | Main lobe width | Minimum stop band attenuation (dB)
Rectangular | -13 | 4π/N | -21
Bartlett | -25 | 8π/N | -25
Hanning | -31 | 8π/N | -44
Hamming | -41 | 8π/N | -53
Blackman | -57 | 12π/N | -74
7. Kaiser window
where α is the adjustable shape parameter & I0(x) is the modified Bessel function of the first kind and order zero, given by
I0(x) = 1 + Σ_{k=1}^{∞} [ (1/k!) (x/2)^k ]²
      = 1 + (0.25x²)/(1!)² + (0.25x²)²/(2!)² + (0.25x²)³/(3!)² + ……
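SciPy automates the Kaiser parameter choice via its empirical design formulas; the 60 dB attenuation and 0.2 transition-width figures below are illustrative:

```python
from scipy import signal

ripple_db = 60.0   # desired stop-band attenuation in dB
width = 0.2        # transition width, normalized so that 1.0 = Nyquist
numtaps, beta = signal.kaiserord(ripple_db, width)

# Low pass FIR at half Nyquist, shaped by the Kaiser window
taps = signal.firwin(numtaps, 0.5, window=('kaiser', beta))
print(numtaps, round(beta, 3), round(taps.sum(), 6))
```

`kaiserord` returns both the filter length and the shape parameter (called α in the text, β in SciPy), which is why the Kaiser window is the usual choice when an attenuation spec must be met exactly.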
Examples:
Ans: The desired frequency response gives
hd(n) = (1/2π) [ ∫_{-π}^{-π/4} 1·e^jwn dw + ∫_{π/4}^{π} 1·e^jwn dw ]
      = (1/2π) [ (e^jwn / jn) |_{-π}^{-π/4} + (e^jwn / jn) |_{π/4}^{π} ]
hd(n) = (1/πn) [ sin(πn) - sin(πn/4) ], n ≠ 0
- Truncating hd(n) to 11 samples, we have
h(n) = hd(n), for |n| ≤ 5
     = 0, otherwise
- From the given frequency response, we can find that α=0. Therefore, the filter coefficients
are symmetrical about n=0 satisfying the condition h(n)=h(-n).
n = 0: h(0) = lim_{n→0} (1/πn)[sin(πn) - sin(πn/4)] = 1 - 1/4 = 3/4 = 0.75
n = 1: h(1) = h(-1) = (1/π)[sin π - sin(π/4)] = -0.225
n = 2: h(2) = h(-2) = (1/2π)[sin 2π - sin(π/2)] = -0.159
n = 3: h(3) = h(-3) = (1/3π)[sin 3π - sin(3π/4)] = -0.075
n = 4: h(4) = h(-4) = (1/4π)[sin 4π - sin π] = 0
n = 5: h(5) = h(-5) = (1/5π)[sin 5π - sin(5π/4)] = 0.045
- The transfer function of the filter is given by
H(z) = Σ_{n=-5}^{5} h(n) z^-n = h(0) + Σ_{n=1}^{5} h(n)(z^n + z^-n)
- Shifting h(n) by (N-1)/2 = 5 samples gives the realizable (causal) filter. In the Type I form,
H(e^jw) = e^{-j5w} Σ_{n=0}^{(N-1)/2} a[n] cos(wn)
where (using the shifted, causal coefficients)
a[0] = h[(N-1)/2] = h(5) = 0.75
a[n] = 2h[((N-1)/2) - n]:
a(1) = 2h(5-1) = 2h(4) = -0.45
a(2) = 2h(5-2) = 2h(3) = -0.318
a(3) = 2h(5-3) = 2h(2) = -0.15
a(4) = 2h(5-4) = 2h(1) = 0
a(5) = 2h(5-5) = 2h(0) = 0.09
Σ_{n=0}^{(N-1)/2} a[n] cos(wn) = 0.75 - 0.45 cos w - 0.318 cos 2w - 0.15 cos 3w + 0.09 cos 5w
Realizations of FIR filters:
1. Transversal structure:
- The system function of an FIR filter can be written as,
H(z) = Σ_{n=0}^{N-1} h(n) z^-n
H(z) = Y(z)/X(z) = h(0) + h(1)z^-1 + h(2)z^-2 + …… + h(N-1)z^-(N-1) —(1)
Y(z) = h(0)X(z) + h(1)z^-1 X(z) + h(2)z^-2 X(z) + …… + h(N-1)z^-(N-1) X(z) —(2)
This structure is known as transversal structure (or) direct form realization. The transversal
structure requires N Multipliers, (N-1) Adders & (N-1) delay elements.
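A minimal sketch of the transversal structure as code, using the coefficients of problem 1 below as the tap weights:

```python
import numpy as np

def fir_direct(h, x):
    """Transversal (direct-form) FIR: y[n] = sum_k h[k] * x[n-k]."""
    N = len(h)
    delay = np.zeros(N - 1)               # the (N-1) delay elements
    y = np.empty(len(x))
    for n, xn in enumerate(x):
        taps = np.concatenate(([xn], delay))
        y[n] = np.dot(h, taps)            # N multiplications, N-1 additions
        delay = taps[:-1]                 # shift the delay line
    return y

# Tap weights of y(n) = x(n) - 2x(n-1) - 2x(n-2) + 3x(n-3)
h = np.array([1.0, -2.0, -2.0, 3.0])
print(fir_direct(h, np.array([1.0, 0, 0, 0, 0])))   # impulse in -> coefficients out
```

Feeding in a unit impulse recovers the coefficients themselves, exactly as the finite impulse response structure predicts.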
Cascade realization:
- The eq.(1) can be realized in cascade form from the factored form of H(z).
- For N odd,
H(z) = ∏_{k=1}^{(N-1)/2} (b_k0 + b_k1 z^-1 + b_k2 z^-2) —(3)
- For N odd, (N-1) will be even & H(z) will have (N-1)/2 second order factors.
- Each second order factored form of H(z) is realized in direct form & is cascaded to realize
H(z) as shown in fig.(2)
Fig.(2) Cascade realization of eq.(3)
- For N even,
H(z) = (b0 + b1 z^-1) ∏_{k=1}^{(N-2)/2} (b_k0 + b_k1 z^-1 + b_k2 z^-2)
- When N is even, (N-1) is odd & H(z) will have one first order factor & (N-2)/2 second order factors.
- Now each factored form in H(z) is realized in direct form & is cascaded to obtain the
realization of H(z) as shown in fig.(3).
1. Draw the Direct form structure for the FIR represented by following difference equation.
y(n)=x(n)-2x(n-1)-2x(n-2)+3x(n-3)
98
2. Draw the direct form structure for the FIR filter represented by the following transfer function.
H(z) = 4 + 2z^{−1} + 2z^{−2} + 3z^{−3}
Solution:
H(z) = Y(z)/X(z) = 4 + 2z^{−1} + 2z^{−2} + 3z^{−3}
Y(z) = 4X(z) + 2z^{−1}X(z) + 2z^{−2}X(z) + 3z^{−3}X(z)
3. Using the cascade structure, realize the FIR filter represented by the following transfer function.
H(z) = (1 + (1/2)z^{−1} + z^{−2})(1 + (1/4)z^{−1} + z^{−2})
Solution:
H(z) = H1(z)·H2(z)
H1(z) = 1 + (1/2)z^{−1} + z^{−2};  H2(z) = 1 + (1/4)z^{−1} + z^{−2}
H1(z) = Y1(z)/X1(z) = 1 + (1/2)z^{−1} + z^{−2};  H2(z) = Y2(z)/X2(z) = 1 + (1/4)z^{−1} + z^{−2}
Y1(z) = X1(z) + (1/2)z^{−1}X1(z) + z^{−2}X1(z);  Y2(z) = X2(z) + (1/4)z^{−1}X2(z) + z^{−2}X2(z)
y1(n) = x1(n) + (1/2)x1(n−1) + x1(n−2);  y2(n) = x2(n) + (1/4)x2(n−1) + x2(n−2)
99
Fig.(4) Cascade form of realization for the given example
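The cascade in problem 3 can be checked by multiplying out the two second-order sections: convolving their coefficient vectors must reproduce the coefficients of the overall H(z). A sketch:

```python
def conv(a, b):
    # Polynomial (coefficient) multiplication = discrete convolution
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

h1 = [1, 0.5, 1]     # H1(z) = 1 + 0.5 z^-1 + z^-2
h2 = [1, 0.25, 1]    # H2(z) = 1 + 0.25 z^-1 + z^-2
print(conv(h1, h2))  # overall H(z): [1.0, 0.75, 2.125, 0.75, 1.0]
```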
- For N even, the linear-phase symmetry h(n) = h(N−1−n) lets the terms of H(z) be paired:
H(z) = Σ_{n=0}^{(N/2)−1} h(n) [z^{−n} + z^{−(N−1−n)}]   (8)
100
Fig.(5) Direct-form realization of a linear phase FIR system for N even
- For N odd, the middle coefficient h((N−1)/2) has no pair:
H(z) = Σ_{n=0}^{(N−3)/2} h(n) [z^{−n} + z^{−(N−1−n)}] + h((N−1)/2) z^{−(N−1)/2}   (9)
Fig.(6) Direct-form realization of a linear phase FIR system for N odd
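The pairing in eq.(9) shares each multiplier between two taps, roughly halving the multiplications. A minimal sketch of the N-odd structure (the coefficient values here are illustrative, not from the text):

```python
def fir_linear_phase_odd(h, x):
    """Linear-phase FIR (N odd, h symmetric): one multiply serves the
    two taps x(n-k) and x(n-(N-1-k)), as in eq.(9)."""
    N = len(h)
    assert N % 2 == 1 and all(h[k] == h[N - 1 - k] for k in range(N))
    mid = (N - 1) // 2
    y = []
    for n in range(len(x)):
        xt = lambda k: x[n - k] if 0 <= n - k < len(x) else 0
        acc = h[mid] * xt(mid)                       # unpaired middle tap
        for k in range(mid):
            acc += h[k] * (xt(k) + xt(N - 1 - k))    # add taps first, then multiply
        y.append(acc)
    return y

h = [1, -2, 5, -2, 1]                                # symmetric, N = 5
print(fir_linear_phase_odd(h, [1, 0, 0, 0, 0]))      # [1, -2, 5, -2, 1]
```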
Let us consider an FIR filter with impulse response having N coefficients. The system
function of such a filter is given by,
H(z) = Σ_{n=0}^{N−1} h(n) z^{−n}
101
- If N = 11, then
H(z) = h(0) + h(1)z^{−1} + h(2)z^{−2} + h(3)z^{−3} + h(4)z^{−4} + h(5)z^{−5} + h(6)z^{−6} + h(7)z^{−7} + h(8)z^{−8} + h(9)z^{−9} + h(10)z^{−10}
- The above transfer function can be partitioned into two sub-signals, one sub-signal
containing the even-indexed coefficients & the other containing the odd-indexed coefficients.
That is
H(z) = P0(z²) + z^{−1} P1(z²)   (1)
where
P0(z²) = h(0) + h(2)z^{−2} + h(4)z^{−4} + h(6)z^{−6} + h(8)z^{−8} + h(10)z^{−10}
P1(z²) = h(1) + h(3)z^{−2} + h(5)z^{−4} + h(7)z^{−6} + h(9)z^{−8}
i.e.,
P0(z) = h(0) + h(2)z^{−1} + h(4)z^{−2} + h(6)z^{−3} + h(8)z^{−4} + h(10)z^{−5}
P1(z) = h(1) + h(3)z^{−1} + h(5)z^{−2} + h(7)z^{−3} + h(9)z^{−4}
102
- Now H(z) can be written as,
H(z) = Σ_{m=0}^{M−1} z^{−m} Pm(z^M)   (2)
where
Pm(z) = Σ_{n=0}^{⌊(N−1)/M⌋} h(Mn + m) z^{−n},  0 ≤ m ≤ M−1
and ⌊(N−1)/M⌋ denotes the integer part of (N−1)/M.
- The decomposition of H(z) in the form of eqs.(1) & (2) is known as the type-1 polyphase
decomposition of the transfer function of order N.
- We know that for a general case of M sub-signals,
H(z) = Σ_{m=0}^{M−1} z^{−m} Pm(z^M)   (3)
where
Pm(z) = Σ_{n=0}^{⌊(N−1)/M⌋} h(Mn + m) z^{−n},  0 ≤ m ≤ (M−1)   (4)
103
Now eq.(3) can be realized as shown in the fig., where each sub-filter Pm(z^M)
can be realized in the direct form.
- If we replace 'm' by M−1−m in eq.(3), we get the type-2 polyphase decomposition.
H(z) = Σ_{m=0}^{M−1} z^{−(M−1−m)} P_{M−1−m}(z^M)   (5)
     = Σ_{m=0}^{M−1} z^{−(M−1−m)} Qm(z^M)   (6)
where
Qm(z^M) = P_{M−1−m}(z^M)
- A further form is
H(z) = Σ_{m=0}^{M−1} z^{m} Rm(z^M)
where
R0(z^M) = P0(z^M) & Rm(z^M) = z^{−M} P_{M−m}(z^M) for 1 ≤ m ≤ M−1
104
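The type-1 decomposition can be verified numerically: splitting h(n) into M sub-filters Pm and recombining them as Σ z^{−m} Pm(z^M) must reproduce H(z). A sketch that evaluates both sides at an arbitrary point on the unit circle (the coefficients are illustrative):

```python
import cmath

def polyphase(h, M):
    # Type-1 components: P_m collects the coefficients h(Mn + m)
    return [h[m::M] for m in range(M)]

def evalz(coeffs, z):
    # Evaluate sum_n c[n] z^{-n}
    return sum(c * z**-n for n, c in enumerate(coeffs))

h = [1, 2, 3, 4, 5, 6, 7]        # arbitrary FIR coefficients
M = 3
P = polyphase(h, M)
z = cmath.exp(1j * 0.7)          # test point on the unit circle
lhs = evalz(h, z)                                        # H(z)
rhs = sum(z**-m * evalz(P[m], z**M) for m in range(M))   # sum z^-m P_m(z^M)
print(abs(lhs - rhs) < 1e-9)     # True
```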
Types of Windows
Window: w(n), for 0 ≤ n ≤ N−1
Rectangular: w(n) = 1
Hanning: w(n) = 0.5 − 0.5 cos(2πn/(N−1))
Hamming: w(n) = 0.54 − 0.46 cos(2πn/(N−1))
Blackman: w(n) = 0.42 − 0.5 cos(2πn/(N−1)) + 0.08 cos(4πn/(N−1))
Kaiser: w(n) = I0(β√(1 − (2n/(N−1) − 1)²)) / I0(β), where I0(·) is the zeroth-order modified Bessel function
105
There is minimal computation involved in executing this technique.
No matter how much we increase the filter order m, the maximum magnitude of the
ripple is fixed (except when using the Kaiser window).
Due to the convolution of the desired frequency response and the spectrum of the
windowing function, it is not possible to find the exact values of the passband and the
stopband edge frequencies.
When the filter order m is increased, the main lobe of the window spectrum becomes
narrower and less smoothing is applied, so the transition width decreases; however, large
side lobes remain near the discontinuity, which is the Gibbs phenomenon previously
mentioned. The Blackman-Harris window helps to reduce the effects of the Gibbs
phenomenon and keep them to a minimum.
Step 2:
Now, we find the desired (infinite-duration) impulse response hd(n) by taking the inverse
discrete-time Fourier transform of the desired response H(ω).
106
Hence, the integral term will be cancelled off.
Step3:
Now, you have to identify the windowing function based on what window you are using as
you see from the table. Substitute the values of n into the function to get the coefficients
pertaining to the window function. You can form a table similar to the one below for better
understanding
Value of n | Desired impulse response, hd(n) | Windowing function, w(n) | After windowing, h(n) = hd(n)·w(n)
0 | hd(0) | w(0) | h(0) = hd(0)·w(0)
1 | hd(1) | w(1) | h(1) = hd(1)·w(1)
… | … | … | …
m−1 | hd(m−1) | w(m−1) | h(m−1) = hd(m−1)·w(m−1)
Step 4:
Now, we have to find the final equation after the windowing technique has been applied. To
identify which of the values from the table above should be included, you need to solve this
equation and substitute the respective values from the table above:
Simplifying and solving this equation will give you the final transfer function
Step 5:
We are not done just yet; we still have to transform this final equation into the z-domain
using the z-transform.
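The five steps can be sketched end to end for an ideal low-pass prototype with a Hamming window. The cutoff ωc = π/4 and order m = 7 are illustrative choices, as is the function name:

```python
import math

def window_method_lowpass(wc, m):
    """Steps 1-4: ideal low-pass hd(n), Hamming window w(n), h(n) = hd(n)*w(n).
    alpha = (m-1)/2 centres the response so the result has linear phase."""
    alpha = (m - 1) / 2
    h = []
    for n in range(m):
        k = n - alpha
        # Ideal low-pass impulse response, shifted by alpha
        hd = wc / math.pi if k == 0 else math.sin(wc * k) / (math.pi * k)
        # Hamming window
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (m - 1))
        h.append(hd * w)
    return h

h = window_method_lowpass(math.pi / 4, 7)
print([round(c, 4) for c in h])   # symmetric about n = 3, so linear phase
```

Note that the centre coefficient comes out as h(3) = hd(0)·w(3) = (ωc/π)·1 = 0.25 for ωc = π/4.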
Solution
Taking into account the limits, we can graph the frequency response as shown below.
107
Drawing the waveform, we see that it is a low-pass filter, as it allows the low-frequency
range to pass.
Now apply the inverse discrete-time Fourier transform, changing the limits accordingly.
Outside the limits the expressions tend to 0; cancelling those terms leaves us with
108
Using the formula
So, hope you understand what really is the whole deal with the rectangular window. But
there is more to it. As mentioned above, there are several other window types that are used to
get an optimum response from the filter. The rectangular window does not do enough: as
mentioned, it causes many ripples in the passband and stopband.
Also, while using the rectangular window, remember how an infinite response is turned into
a finite one; unfortunately, some information may be lost in this process, because the
convolution of the two functions may lead to a flattening of the waveform.
This flattening of the waveform can be reduced by increasing the value of M, as the main
lobe will get narrower. These discrepancies lead to large side lobes, which bring about the
ringing effect in the FIR filter and make it inaccurate.
Now hold up, what is this ringing effect? No, it is not the bells you hear when you know you
are in trouble during an exam. In fact, it is called the Gibbs phenomenon. It is just a fancy name
for the annoying distortions around sharp edges in photos and videos. To avoid this ringing
effect we have the other window types that we previously mentioned.
109
Let us look at a problem where one of the other windowing functions is used, so you can
understand how to implement them when the window function is not simply equal to 1.
Solution
Hence,
110
Since 0 ≤ n ≤ m−1, and from the given response we can deduce that α = 3. Since
α = (m−1)/2,
therefore m = 7.
Find w(n)
For a Hamming window, we know the function is
w(n) = 0.54 − 0.46 cos(2πn/(m−1)), 0 ≤ n ≤ m−1
Calculate h[n]
h[n] = hd[n].w[n]
we can deduce
h(3)=0.25
Figure out the magnitude response
----(4)
Now that we have the values for n = 0 to 6, we can substitute them back into equation (4)
to get
Infinite Impulse Response (IIR) Filters vs. Finite Impulse Response (FIR) Filters

IIR: All the infinite samples of the impulse response are considered in the designing of IIR filters.
FIR: Only N samples of the impulse response are considered in the designing of FIR filters.

IIR: The construction of an IIR filter involves designing an analog filter first for the desired specifications and then converting it into a digital IIR filter.
FIR: The construction of an FIR filter with the desired specifications can be done directly using certain methods (like windowing).

IIR: Thus we can say that IIR filters have an analog equivalent.
FIR: Thus we can say that FIR filters don't have an analog equivalent.

IIR: The IIR filter requires past output samples in addition to current and past inputs to obtain its current outputs.
FIR: The FIR filter requires only past and current inputs to obtain its current output; it doesn't depend on past outputs.

IIR: An IIR filter's design specifications only specify the desired characteristics of its magnitude response.
FIR: An FIR filter's design specifications specify both the magnitude as well as the phase response.

IIR: Physically realizable infinite impulse response filters don't have linear phase characteristics.
FIR: Physically realizable FIR filters can be designed with linear phase characteristics easily.
112
IIR: IIR filters are recursive.
FIR: FIR filters are non-recursive. However, it is possible to design recursive FIR filters too.

IIR: The transfer functions of infinite impulse response filters have both poles and zeros.
FIR: The transfer functions of finite impulse response filters have only zeros.

IIR filters are/have LESS:
• Powerful (in terms of computational prowess)
• Power-hungry (in terms of power supply)
• Stable
• Memory
• Time to set up
• Delay

FIR filters are/have MORE:
• Powerful
• Power-hungry
• Stable
• Memory
• Time to set up
• Delay

IIR filters are/have MORE:
• Filter coefficients
• Tweakable on the fly
• Efficient
• Sensitive
• Easy to use

FIR filters are/have LESS:
• Filter coefficients
• Tweakable on the fly
• Efficient
• Sensitive
• Easy to use

IIR: IIR filters are used in Band Stop and Band Pass filters.
FIR: FIR filters are used in Anti-aliasing, low pass, and baseband filters.
113
114
UNIT - IV
MULTIRATE DIGITAL SIGNAL PROCESSING
Introduction
Decimation by factor D and interpolation by a factor I
Sampling Rate conversion by a Rational factor I/D
Implementation of Sampling Rate conversion
Multistage implementation of Sampling Rate conversion
Sampling rate conversion by an arbitrary factor
Application of Multirate Signal Processing
115
Introduction:
Multirate simply means “multiple sampling rates”. A multirate DSP system uses multiple
sampling rates within the system. Whenever a signal at one rate has to be used by a system
that expects a different rate, the rate has to be increased or decreased, and some processing is
required to do so. Therefore “Multirate DSP” really refers to the art or science
of changing sampling rates.
The most immediate reason is when you need to pass data between two systems which use
incompatible sampling rates. For example, professional audio systems use a 48 kHz rate, but
consumer CD players use 44.1 kHz; when audio professionals transfer their recorded music to
CDs, they need to do a rate conversion.
But the most common reason is that multirate DSP can greatly increase processing efficiency
(even by orders of magnitude!), which reduces DSP system cost. This makes the subject of
multirate DSP vital to all professional DSP practitioners.
Multirate DSP consists of two basic operations: decimation and interpolation.
APPLICATIONS
THE UP-SAMPLER
The up-sampler, represented by the diagram,
116
The up-sampler simply inserts zeros between samples. For example, if x(n) is the sequence
{x(0), x(1), x(2), ……}, then [↑2] x(n) = {x(0), 0, x(1), 0, x(2), 0, ……}.
It is clear that
Y(z) = Z{[↑2] x(n)} = X(z²)   (5)
When sketching the Fourier transform of an up-sampled signal, it is easy to make a mistake.
When the Fourier transform of x(n) is as shown in the following figure, it is easy to incorrectly
think that the Fourier transform of y(n) is given by the second figure. This is not correct, because
the Fourier transform is 2π-periodic: even though it is usually graphed in the range
−π ≤ ω ≤ π, the spectrum repeats outside that range.
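Zero insertion itself is a one-liner in code. This sketch applies [↑I] to a short sequence (the function name is illustrative):

```python
def upsample(x, I):
    """Insert I-1 zeros after every sample: y(n) = x(n/I) when I divides n, else 0."""
    y = []
    for s in x:
        y.append(s)
        y.extend([0] * (I - 1))
    return y

print(upsample([1, 2, 3, 4], 2))  # [1, 0, 2, 0, 3, 0, 4, 0]
```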
117
Decimation:
Sampling rate reduction by integer factors after band-limiting the input signal.
The sampling rate of a sequence can be reduced by “sampling” it, i.e. by defining a new
sequence
xd[n] = x[nM] = xc(nMT)
The sampling rate compressor reduces the sampling rate from Fs to Fs/M. In the frequency
domain, this is equivalent to stretching the original spectrum, multiplying the normalized
bandwidth by a factor M. Hence, if the signal is not band-limited to π/M, down-sampling
results in aliasing, as shown in the following example. If this happens, the original signal
cannot be recovered from the decimated version.
Avoiding Aliasing:
Aliasing can be avoided if x(n) is a low-pass signal band-limited to the region |w| < π / M.
In most applications, the down-sampler is preceded by a low-pass digital filter called
“decimation filter”.
To prevent aliasing at a lower rate, the digital filter h[k] is used to band-limit the input signal
to less than Fs /2M beforehand. Sampling rate reduction is achieved by discarding M-1
samples for every M samples of the filtered signal w[n].
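The filter-then-discard pipeline described above can be sketched as follows. The three-tap moving average used here is only a crude placeholder for a properly designed decimation filter with cutoff π/M:

```python
def decimate(x, M, h):
    """Band-limit x with FIR h, then keep every M-th sample of the result."""
    # w(n) = sum_k h(k) x(n-k): the anti-alias (decimation) filter
    w = [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
         for n in range(len(x))]
    return w[::M]                  # discard M-1 out of every M samples

x = list(range(12))                # a slowly varying test signal
h = [1/3, 1/3, 1/3]                # placeholder low-pass (moving average)
print(decimate(x, 3, h))
```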
Introduction:
- In many practical applications of DSP, one is faced with the problem of changing the
sampling rate of a signal, either increasing it (or) decreasing it by some amount.
Ex: In Telecommunication systems that transmit & receive different types of signals (i.e.,
teletype, facsimile, speech, video etc), there is a requirement to process the various signals at
different rates commensurate with the corresponding bandwidths of the signals.
- The process of converting a signal from a given rate to a different rate is called sampling rate
conversion.
- In turn, systems that employ multiple sampling rates in the processing of digital signals are
called multirate digital signal processing systems.
- Sampling rate conversion of a digital signal can be accomplished in one of two general
methods.
- One method is to pass the digital signal through a D/A converter, filter it if necessary & then
to resample the resulting analog signal at desired rate (i.e., to pass the analog signal through
an A/D converter).
- The second method is to perform the sampling rate conversion entirely in the digital domain.
118
- One advantage of the first method is that the new sampling rate can be arbitrarily selected
and need not have any special relationship to the old sampling rate.
- A major disadvantage however is the signal distortion, introduced by the D/A converter in
the signal reconstruction & by the quantization effects in the A/D conversion.
- Sampling rate conversion performed in the digital domain avoids this major
disadvantage.
- The discrete time systems that process data at more than one sampling rate are known
as multirate systems.
- The two basic operations in multirate signal processing are decimation & interpolation.
- Decimation reduces the sampling rate, whereas interpolation increases the sampling rate.
- Consider two special cases.
1. Sampling rate reduction by an integer factor ‘D’ &
2. Sampling rate increase by an integer factor ‘I’
- The process of reducing the sampling rate by a factor ‘D’ (down sampling by ‘D’) is
called Decimation.
- The process of increasing the sampling rate by an integer factor ‘I’ (up sampling by
‘I’) is called Interpolation.
- Down sampling (or) decimation is the process of reducing the samples of the discrete
time signal.
- Let x(n)=Discrete time signal
D= Sampling rate reduction factor (and ‘D’ is an integer)
- Now, x(Dn)=Down sampled version of x(n)
- The device which performs the process of down sampling is called a down sampler
(or) decimator.
- Symbolically, the down sampler can be represented as shown in fig.(1)
Fig.(1) Decimator
- Define the periodic impulse train
p(n) = 1, n = 0, ±D, ±2D, ……
p(n) = 0, otherwise
- Now, x′(n) = x(n)p(n) retains the samples of x(n) at multiples of D and is zero elsewhere:
x′(n) = x(n)p(n), n = 0, ±D, ±2D, ……   (2)
- The decimated signal y(n) = x′(Dn) = x(Dn) is obtained by removing the zeros from x′(n).
- By the definition of the Z-transform,
Y(z) = Σ_{m=−∞}^{∞} y(m) z^{−m} = Σ_{m=−∞}^{∞} x′(mD) z^{−m}
- Let m = n/D, i.e., n = mD; when m = ±∞, n = ±∞. Then
Y(z) = Σ_{n=−∞}^{∞} x′(n) z^{−n/D} = Σ_{n=−∞}^{∞} x(n)p(n) z^{−n/D}   (from (2))
(the sum may be taken over all n because x′(n) = 0 when n is not a multiple of D)
- The impulse train can be expressed as p(n) = (1/D) Σ_{k=0}^{D−1} e^{j2πnk/D}, so
Y(z) = Σ_{n=−∞}^{∞} x(n) [(1/D) Σ_{k=0}^{D−1} e^{j2πnk/D}] z^{−n/D}
120
Y(z) = (1/D) Σ_{k=0}^{D−1} Σ_{n=−∞}^{∞} x(n) (e^{−j2πk/D} z^{1/D})^{−n}   (3)
- In eq.(3), the inner sum is the Z-transform of x(n) evaluated at e^{−j2πk/D} z^{1/D};
hence Y(z) can be written as in eq.(4):
Y(z) = (1/D) Σ_{k=0}^{D−1} X(e^{−j2πk/D} z^{1/D})   (4)
- On substituting z = e^{jω} in eq.(4),
Y(e^{jω}) = (1/D) Σ_{k=0}^{D−1} X(e^{j(ω−2πk)/D})   (5)
- The eq.(5) gives the frequency spectrum of the output signal of the decimator, i.e.,
frequency spectrum of decimated signal.
Y(e^{jω}) = (1/D) [X(e^{jω/D}) + X(e^{j(ω−2π)/D}) + X(e^{j(ω−4π)/D}) + ……]   (6)
- Hence, we can say that the output spectrum of a decimator is sum of scaled, stretched
& shifted version of the input spectrum.
- Since, the output spectrum is sum of scaled, stretched & shifted version of the input
spectrum, the components of output will overlap & exhibit the phenomenon of
aliasing, if the input is not bandlimited to π/D.
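Eq.(5) can be spot-checked numerically for a finite-length x(n): evaluate the DTFT of y(n) = x(Dn) directly and compare it with the sum of stretched and shifted copies of X(e^{jω}). The signal values and test frequency below are arbitrary:

```python
import cmath

def dtft(x, w):
    # X(e^jw) = sum_n x(n) e^{-jwn}  (x starts at n = 0)
    return sum(xn * cmath.exp(-1j * w * n) for n, xn in enumerate(x))

x = [1, 3, -2, 4, 0, 2, 5, -1]
D = 2
y = x[::D]                      # decimated signal y(n) = x(Dn)
w = 1.3                         # arbitrary test frequency

lhs = dtft(y, w)                # spectrum of the decimated signal
rhs = sum(dtft(x, (w - 2 * cmath.pi * k) / D) for k in range(D)) / D
print(abs(lhs - rhs) < 1e-9)    # True: eq.(5) holds
```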
121
Fig.(2) (a) Frequency spectrum of a signal band limited to w=π/M
(b) Frequency spectrum of down sampled signal with respect to Wx
(c) Frequency spectrum of down sampled signal with respect to Wy
Anti-aliasing filter:
- When the input signal to the decimator is not band limited, then the spectrum of
decimated signal has aliasing.
- In order to avoid aliasing, the input signal should be band limited to π/D for
decimation by a factor ‘D’.
- Hence, the input signal is passed through a low pass filter with a bandwidth of π/D
before decimation.
- Since, this low pass filter is designed to avoid aliasing in the output spectrum of
decimator, it is called anti-aliasing filter.
Fig.(4) Interpolator
Problems:
1. Obtain the up-sampled versions of the signal x(n) = {1, 2, 3, 4} for the sampling rate
multiplication factors (a) I = 2 and (b) I = 3.
(a) Sampling rate multiplication factor, I = 2
y1(n) = x(n/I) = x(n/2)
n = 0: y1(0) = x(0/2) = x(0) = 1
n = 1: y1(1) = x(1/2) = x(0.5) = 0
n = 2: y1(2) = x(2/2) = x(1) = 2
n = 3: y1(3) = x(3/2) = x(1.5) = 0
n = 4: y1(4) = x(4/2) = x(2) = 3
n = 5: y1(5) = x(5/2) = x(2.5) = 0
n = 6: y1(6) = x(6/2) = x(3) = 4
n = 7: y1(7) = x(7/2) = x(3.5) = 0
y1(n) = {1, 0, 2, 0, 3, 0, 4, 0}
123
(b) Sampling rate multiplication factor, I = 3
y2(n) = x(n/I) = x(n/3)
n = 0: y2(0) = x(0/3) = x(0) = 1
n = 1: y2(1) = x(1/3) = x(0.33) = 0
n = 2: y2(2) = x(2/3) = x(0.66) = 0
n = 3: y2(3) = x(3/3) = x(1) = 2
…………
y2(n) = {1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 0}
Note: Discrete time signals are defined only for integer values of ‘n’. Therefore, the value of
discrete time signal for non-integer value of ‘n’ will be zero.
Spectrum of upsampler:
- Let x(n) be the input signal to the upsampler & y(n) be the output signal.
- Let y(n) = x(n/I) be the upsampled version of x(n) by an integer factor I:
y(n) = x(n/I), n = 0, ±I, ±2I, ……; y(n) = 0 otherwise   (1)
- By the definition of the Z-transform, y(n) can be expressed as
Y(z) = Σ_{n=−∞}^{∞} y(n) z^{−n} = Σ_{n=−∞}^{∞} x(n/I) z^{−n}
- Substitute m = n/I, i.e., n = mI:
Y(z) = Σ_{m=−∞}^{∞} x(m) z^{−mI} = X(z^I)   (2)
- On substituting z = e^{jω} in eq.(2),
Y(e^{jω}) = X(e^{jωI})   (3)
- The eq.(3) is the frequency spectrum of the output signal of the interpolator i.e.,
frequency spectrum of upsampled signal.
- The term X(e^{jωI}) is the frequency-compressed version of X(e^{jω}) by a factor ‘I’.
Since the frequency response is periodic with a periodicity of 2π, X(e^{jωI}) will
repeat ‘I’ times in a period of 2π in the spectrum of the upsampled signal.
124
Anti-imaging filter:
Sample-rate conversion is the process of changing the sampling rate of a discrete signal
to obtain a new discrete representation of the underlying continuous signal. Application
areas include image scaling and audio/visual systems, where different sampling rates
may be used for engineering, economic, or historical reasons.
An example of sampling-rate conversion would take place when data from a CD is transferred
onto a DAT. Here the sampling-rate is increased from 44.1 kHz to 48 kHz. To enable this
process the non-integer factor has to be approximated by a rational number:
L/M = 48/44.1 = 160/147 ≈ 1.08844
Hence, the sampling-rate conversion is achieved by interpolating by L i.e. from 44.1 kHz to
[44.1x160] = 7056 kHz.
125
Then decimating by M i.e. from 7056 kHz to [7056/147] = 48 kHz.
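The rational factor above can be recovered mechanically; this sketch uses Python's Fraction to reduce 48000/44100 to lowest terms and confirm the intermediate and final rates:

```python
from fractions import Fraction

ratio = Fraction(48000, 44100)    # target rate / source rate, auto-reduced
L, M = ratio.numerator, ratio.denominator
print(L, M)                       # 160 147
print(44100 * L)                  # intermediate rate after interpolation: 7056000 Hz
print(44100 * L // M)             # final rate after decimation: 48000 Hz
```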
Multistage Approach
When the sampling-rate changes are large, it is often better to perform the operation in
multiple stages, where Mi (or Li), an integer, is the factor for stage i:
M = M1·M2·……·M_I (or) L = L1·L2·……·L_I
An example of the multistage approach for decimation is shown in Figure 9.8. The multistage
approach allows a significant relaxation of the anti-alias and anti-imaging filters, with a
consequent reduction in the filter complexity. The optimum number of stages is one that leads
to the least computational effort in terms of either the multiplications per second (MPS), or
the total storage requirement (TSR).
The anti-aliasing filter is used to avoid the aliasing caused by down-sampling the signal x(n).
The anti-imaging filter removes the unwanted images produced by up-sampling.
Fig.(7) A Decimator
- Let us assume that the anti-aliasing filter is an FIR filter with N coefficients. Then the
FIR filter can be realized using direct form structure shown in fig.(8). But this
126
realization is very inefficient. So, we go for an efficient realization.
- The output of the FIR filter is obtained using convolution:
v(n) = x(n) * h(n) = Σ_{k=0}^{N−1} h(k) x(n−k)
- To avoid unnecessary calculation of the values v(m), m≠nD, the original structure of
decimator in fig.(2) is replaced by an efficient transversal structure shown in fig.(9).
- Note that here the multiplications and additions are performed at the reduced sampling rate.
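The saving can be sketched by comparing the naive filter-then-discard pipeline with a version that evaluates the convolution only at the retained instants n = 0, D, 2D, …; the filter taps and signal here are illustrative:

```python
def decimate_naive(x, h, D):
    # Full-rate convolution, then throw away D-1 out of every D outputs
    v = [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
         for n in range(len(x))]
    return v[::D]

def decimate_efficient(x, h, D):
    # Evaluate v(n) only at n = 0, D, 2D, ... : D times fewer MAC operations
    return [sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))
            for n in range(0, len(x), D)]

x = [float(i % 5) for i in range(20)]
h = [0.25, 0.5, 0.25]
print(decimate_efficient(x, h, 4) == decimate_naive(x, h, 4))  # True
```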
127
Fig.(9) Efficient realization for Decimator
Fig.(10) Interpolator
128
Fig.(11) Transposed direct form realization for interpolator
129
Polyphase structure of Decimator:
- For a general case of D branches & a sampling rate reduction by a factor ‘D’, the
structure of polyphase decimator is shown in fig.(13)
- The splitting of x(n) into the low-rate sub-sequences x0(n), x1(n), ……, x_{D−1}(n) is often
represented by a commutator.
- In the configuration shown in fig.(13), the input values x(n) enter the delay chain at the
high rate.
- In fig.(14), to produce the output y(0), the commutator must rotate in the counter-clockwise
direction starting from m = D−1, ……, m = 2, m = 1, m = 0 & give the input
values x(−D+1), ……, x(−2), x(−1), x(0) to the filters P_{D−1}(n), ……, P2(n), P1(n), P0(n)
130
Polyphase Decimation using the Z-Transform:
- The sub-filters P0(z), P1(z), ……, P_{M−1}(z) are FIR filters & when combined in the right
phase sequence they produce the original filter H(z).
- Now replacing the transfer function H(z) in fig.(1) by the polyphase structure results as
in fig.(16).
Fig.(16)
- The 3rd identity can be used to derive the version in fig.(18) in which both the no. of
filter operations & the amount of memory required are reduced by a factor of ‘D’.
131
Fig.(18)
Fig.(20)
- The output y(n) also can be obtained by combining the signals using a commutator as
shown in fig.(21).
Fig.(21)
Fig.(22)
133
Fig.(23)
Fig.(24)
134
analogue waveform by passing it through a 14-bit DAC. Then the output from this device is
passed through an analogue low-pass filter before it is sent to the speakers.
The effect of oversampling also has some other desirable features. Firstly, it causes the image
frequencies to be much higher and therefore easier to filter out. The anti-alias filter
specification can therefore be very much relaxed i.e. the cutoff frequency of the filter for the
135
previous example increases from [44.1 / 2] = 22.05 kHz to [44.1x8 / 2] = 176.4 kHz after the
interpolation.
Filter banks:
Filter banks are of two types.
1. Analysis filter bank
2. Synthesis filter bank
1. Analysis filter bank
- It consists of D sub-filters. The individual sub-filter H_k(z) is known as an analysis filter.
- All the sub-filters are equally spaced in frequency & each has the same bandwidth.
- The spectrum of the input signal X(e^{jω}) lies in the range 0 ≤ ω ≤ π. The filter bank
splits the signal into a number of sub-bands, each having a bandwidth of π/D.
- As the spectrum of each sub-band signal is band-limited to π/D, the sampling rate can be
reduced by a factor ‘D’. The down-sampling moves all the sub-band signals into the baseband
0 ≤ ω ≤ π/D.
136
2. Synthesis filter bank
- The D-channel synthesis filter bank is the dual of the D-channel analysis filter bank.
- In this case, each U_m(z) is fed to an upsampler. The upsampling process produces the
signal U_m(z^D).
- These signals are applied to the filters G_m(z) & finally added to get the output signal X̂(z).
- The filters G0(z), ……, G_{D−1}(z) have the same characteristics as the analysis filters
H0(z), ……, H_{D−1}(z).
- The analysis filter bank splits the broadband input signal x(n) into ‘D’ non-overlapping
frequency band signals X0(z), X1(z), ……, X_{D−1}(z) of equal bandwidth.
- These outputs are coded & transmitted.
- The synthesis filter bank is used to reconstruct output signal xˆ ( n ) , which should
approximate the original signal.
- An application of sub-band coding is in speech signal processing.
137
UNIT- V
INTRODUCTION TO DSP PROCESSORS
138
Difference between DSP and other microprocessor architectures and their
comparison:
Compare DSP processor and Microprocessor
139
Key difference: A microprocessor incorporates the functions of a computer's central
processing unit (CPU) on a single or few integrated circuits. The purpose of a microprocessor
is to accept digital data as input, process it as per the instructions, and then provide the output.
Most general purpose microprocessors are present in personal computers. They are often used
for computation, text editing, multimedia display, and communication over a network. The
DSP processor, on the other hand, is a particular type of microprocessor. DSP stands for
digital signal processing. It is basically any signal processing that is done on a digital signal or
information signal.
A microprocessor incorporates the functions of a computer's central processing unit
(CPU) on a single or few integrated circuits. The purpose of a microprocessor is to accept
digital data as input, process it as per the instructions, and then provide the output. This is
known as sequential digital logic. The microprocessor has internal memory and operates
basically on the binary system.
A general purpose microprocessor is a processor that is not tied to or integrated with a
particular language or piece of software. Most general purpose microprocessors are present in
personal computers. They are often used for computation, text editing, multimedia display,
and communication over a network. Other microprocessors are part of embedded systems.
These provide digital control over practically any technology, such as appliances,
automobiles, cell phones, industrial process control, etc.
The DSP processor, on the other hand, is a particular type of microprocessor. DSP
stands for digital signal processing. It is basically any signal processing that is done on a
digital signal or information signal. A DSP processor is a specialized microprocessor that has
an architecture optimized for the operational needs of digital signal processing.
DSP aims to modify or improve the signal. It is characterized by the representation of
discrete units, such as discrete time, discrete frequency, or discrete domain signals. DSP
includes subfields like communication signals processing, radar signal processing, sensor
array processing, digital image processing, etc.
The main goal of a DSP processor is to measure, filter and/or compress digital or
analog signals. It does this by converting the signal from a real-world analog signal to a digital
form. In order to convert the signal it uses an analog-to-digital converter (ADC). However, the
required output signal is often another real-world analog signal. This in turn requires a
digital-to-analog converter (DAC).
Digital signal processing algorithms can run on various platforms, such as general
purpose microprocessors and standard computers; specialized processors called digital signal
processors (DSPs); purpose-built hardware such as application-specific integrated circuit
(ASICs) and field-programmable gate arrays (FPGAs); digital signal controllers; and stream
processors for traditional DSP or graphics processing applications, such as image and video.
The main difference between a DSP and a microprocessor is that a DSP processor has
features designed to support high-performance, repetitive, numerically intensive tasks. DSP
processors are designed specifically to perform large numbers of complex arithmetic
calculations and as quickly as possible. They are often used in applications such as image
processing, speech recognition and telecommunications. As compared to general
microprocessors, DSP processors are more efficient at performing basic arithmetic operations,
especially multiplication.
Most general-purpose microprocessors and operating systems can execute DSP algorithms
successfully. However, they are not suitable for use in portable devices such as mobile
phones. Hence, specialized digital signal processors are used. Digital Signal Processors have
approximately the same level of integration and the same clock frequencies as general purpose
140
microprocessors, but they tend to have better performance, lower latency, and no requirements
for specialized cooling or large batteries. This allows them to be a lower-cost alternative to
general-purpose microprocessors.
DSPs also tend to be two to three times as fast as general-purpose microprocessors. This is
because of architectural differences: DSPs tend to have a specialized arithmetic unit
architecture; dedicated units such as multipliers; a regular instruction cycle and a RISC-like
architecture; parallel processing; a Harvard bus architecture; a tailored internal memory
organization; and support for multiprocessing, local links, and interconnected memory banks.
RISC Processor Features:
• RISC chips have fewer components and a smaller instruction set, allowing faster
access to “common” instructions
• RISC chips execute an instruction in one machine cycle
• Simple addressing modes
• Fewer data types
• Fewer instructions
• A bit harder to design a compiler for
• Fewer transistors required
• Requires more memory to store the instructions
• Mostly single-clock-cycle instructions
141
CISC: It has a small number of general purpose registers. Pipelining and superscalar
features are not the basis of such designs, although many CISC microprocessors use
several RISC features, such as pipelining, to increase performance.
RISC: It has several general purpose registers and a large cache memory. The instruction
set of RISC microprocessors typically includes only register-to-register operations plus
load and store.
The TMS320C54x™ DSP is a fixed-point digital signal processor (DSP) in the TMS320™ DSP
family. The C54x™ DSP meets the specific needs of real-time embedded applications, such as
telecommunications. The C54x central processing unit (CPU), with its modified Harvard
architecture, features minimized power consumption and a high degree of parallelism. In
addition to these features, the versatile addressing modes and instruction set in the C54x
improve the overall system performance.
In 1982, Texas Instruments introduced the TMS32010 — the first fixed-point DSP in the
TMS320 DSP family. Before the end of the year, Electronic Products magazine awarded the
TMS32010 the title “Product of the Year”. The TMS32010 became the model for future
TMS320 DSP generations.
Today, the TMS320 DSP family consists of three supported DSP platforms:
TMS320C2000™, TMS320C5000™, and TMS320C6000™. Within the C5000™ DSP platform
there are three generations: the TMS320C5x™, TMS320C54x™, and TMS320C55x™.
Devices within the C5000 DSP platform use a similar CPU structure that is combined
with a variety of on-chip memory and peripheral configurations. These various configurations
satisfy a wide range of needs in the worldwide electronics market. When memory and
peripherals are integrated with a CPU onto a single chip, overall system cost is greatly
reduced and circuit board space is reduced.
Figure 1–1 shows the performance gains of the TMS320 DSP family of devices.
142
Fig.(1) Evolution of the TMS320 DSP family
Table 1–1 lists some typical applications for the TMS320 family of DSPs. The TMS320 DSPs
offer more adaptable approaches to traditional signal-processing problems such as vocoding
and filtering than standard microprocessor/microcomputer devices. They also support
complex applications that often require multiple operations to be performed simultaneously.
The C54x™ DSP has a high degree of operational flexibility and speed. It combines an
advanced modified Harvard architecture (with one program memory bus, three data memory
buses, and four address buses), a CPU with application-specific hardware logic, on-chip
memory, on-chip peripherals, and a highly specialized instruction set. Spinoff devices that
combine the C54x CPU with customized on-chip memory and peripheral configurations have
been, and continue to be, developed for specialized areas of the electronics market.
Enhanced Harvard architecture built around one program bus, three data buses, and
four address buses for increased performance and versatility.
Advanced CPU design with a high degree of parallelism and application specific
hardware logic for increased performance.
A highly specialized instruction set for faster algorithms and for optimized high-level
language operation.
Modular architecture design for fast development of spinoff devices.
Advanced IC processing technology for increased performance and low power
consumption.
Low power consumption and increased radiation hardness because of new static
design techniques.
Table 1-1: Typical applications for TMS320 DSPs
Each computational block of a DSP should be optimized for functionality and speed; at
the same time, the design should be sufficiently general that it can be easily integrated
with other blocks to implement complete DSP systems.
Multipliers
The advent of single-chip multipliers paved the way for implementing DSP functions on a
VLSI chip. Parallel multipliers have now replaced the traditional shift-and-add multipliers.
A parallel multiplier takes a single processor cycle to fetch and execute the instruction and
to store the result; such multipliers are also called array multipliers. The key features to be
considered for a multiplier are:
a. Accuracy
b. Dynamic range
c. Speed
The number of bits used to represent the operands decides the accuracy and the
dynamic range of the multiplier, whereas speed is decided by the architecture employed. If
the multiplier is implemented in hardware, the speed of execution will be very high, but
the circuit complexity also increases considerably. Thus there is a tradeoff between the
speed of execution and the circuit complexity, and the choice of architecture normally
depends on the application.
Parallel Multipliers
Consider the multiplication of two unsigned numbers A and B. Let A be represented using m
bits as (A(m-1) A(m-2) ... A1 A0) and B be represented using n bits as (B(n-1) B(n-2) ... B1 B0).
Then the product of these two numbers is given by

P = SUM(i = 0 to m-1) SUM(j = 0 to n-1) Ai Bj 2^(i+j)

which is at most (m+n) bits long. This operation can be implemented in parallel using the
Braun multiplier, whose hardware structure is as shown in the figure 2.
structure is as shown in the figure 2.
Fig.(2) Braun Multiplier for a 4X4 Multiplication
In the Braun multiplier the signs of the numbers are not taken into account. In
order to implement a multiplier for signed numbers, additional hardware is required to modify
the Braun multiplier. The modified multiplier is called the Baugh-Wooley multiplier.
Speed
Bus Widths
Consider the multiplication of two n-bit numbers X and Y. The product Z can be at
most 2n bits long. In order to perform the whole operation in a single execution cycle, we
require two buses of width n bits each to fetch the operands X and Y, and a bus of width 2n
bits to store the result Z to memory. Although this performs the operation faster, it is
expensive and hence not an efficient implementation. Many alternatives to this method
have been proposed. One such method is to use the program bus itself to fetch one of the
operands after fetching the instruction, thus requiring only one bus to fetch the operands;
the result Z can then be stored back to memory using the same operand bus. The problem
with this is that the result Z is 2n bits long, whereas the operand bus is only n bits wide. There
are two alternatives to solve this problem:
a. Use the n-bit operand bus and save Z at two successive memory locations. Although this
stores the exact value of Z in memory, it takes two cycles to store the result.
b. Discard the lower n bits of the result Z and store only the higher-order n bits into
memory. This is not applicable for applications where an accurate result is required.
Another alternative can be used for applications where speed is not a major concern: latches
are used for the inputs and outputs, so a single bus suffices both to fetch the operands and to
store the result (Fig 3).
Shifters
Shifters are used to scale operands or results up or down. The
following scenarios show why a shifter is necessary:
a. While performing the addition of N numbers, each n bits long, the sum can grow up to
n + log2 N bits long. If the accumulator is only n bits long, an overflow error will occur. This
can be overcome by using a shifter to scale down the operands by an amount of log2 N bits.
b. Similarly, while calculating the product of two n-bit numbers, the product can grow up to 2n
bits long. Generally the lower n bits get neglected and the result is shifted to save the sign of
the product.
c. Finally, in the case of addition of two floating-point numbers, one of the operands
has to be shifted appropriately to make the exponents of the two numbers equal.
From the above cases it is clear that a shifter is required in the architecture of a DSP.
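Case (a) above can be checked with a short sketch. The function below is illustrative only (it is not part of any DSP instruction set); it computes the worst-case sum of N unsigned n-bit numbers and the number of bits needed to hold it, confirming the n + log2 N growth:

```python
def bit_growth_demo(n, N):
    """Bits needed to hold the worst-case sum of N unsigned n-bit numbers."""
    worst_case_sum = N * (2**n - 1)     # every operand at its maximum value
    return worst_case_sum.bit_length()

# Adding N = 16 numbers of n = 8 bits each needs up to 8 + log2(16) = 12 bits,
# so an 8-bit accumulator would overflow unless the operands are scaled down
# by log2(16) = 4 bits first.
```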
Barrel Shifters
In conventional microprocessors, normal shift registers are used for shift operations. As
these require one clock cycle per shift, they are not desirable for DSP applications, which
generally involve many shifts. In other words, since speed is the crucial
issue for DSP applications, several shifts have to be accomplished in a single execution
cycle. This can be done using a barrel shifter, which connects the input lines representing a
word to a group of output lines, with the required shift determined by its control inputs. For an
input of length n, log2 n control lines are required, and an additional control line indicates
the direction of the shift. The block diagram of a typical barrel shifter is as shown in
figure 4.
Figure 5 depicts the implementation of a 4 bit shift right barrel shifter. Shift to right by 0,
1, 2 or 3 bit positions can be controlled by setting the control inputs appropriately.
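The staged-multiplexer behavior of the 4-bit right barrel shifter in figure 5 can be sketched in software. This is a behavioral model, not hardware: each control bit k enables a fixed shift of 2^k, so any shift of 0-3 completes in log2(4) = 2 stages rather than one cycle per bit:

```python
def barrel_shift_right(word, shift, width=4):
    """Model of a right barrel shifter: log2(width) mux stages select the shift."""
    n_stages = (width - 1).bit_length()     # number of control lines = log2(width)
    for k in range(n_stages):
        if shift & (1 << k):                # control bit k enables a shift by 2**k
            word >>= (1 << k)
    return word & ((1 << width) - 1)        # keep the result to `width` bits
```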
MAC Unit
Most DSP applications require the computation of the sum of the products of a
series of successive multiplications. In order to implement such functions, a special unit called
a Multiply and Accumulate (MAC) unit is required. A MAC consists of a multiplier and a
special register called the accumulator, and is used to implement functions of the type
A + BC. A typical MAC unit is as shown in figure 6.
Fig.(6) A MAC Unit
Although addition and multiplication are two different operations, they can be performed in
parallel: while the multiplier computes the current product, the accumulator can add in the
product of the previous multiplication. Thus if N products are to be accumulated, N-1
multiplications can overlap with N-1 additions. During the very first multiplication the
accumulator is idle, and during the last accumulation the multiplier is idle, so N+1
clock cycles are required to compute the sum of N products.
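The A + BC function of a MAC reduces, over a loop, to a sum of products. A minimal sketch of the arithmetic (ignoring the cycle-level overlap described above):

```python
def mac(coefficients, samples):
    """Sum of products computed by repeated multiply-accumulate (A + B*C)."""
    acc = 0                          # models the accumulator register
    for b, c in zip(coefficients, samples):
        acc = acc + b * c            # multiplier output feeds the accumulator
    return acc

# Example: a 3-tap sum of products, as in an FIR filter inner loop.
# mac([1, 2, 3], [4, 5, 6]) accumulates 1*4 + 2*5 + 3*6.
```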
While designing a MAC unit, attention has to be paid to the word sizes encountered at
the input of the multiplier and the sizes of the add/subtract unit and the accumulator, as there
is a possibility of overflow and underflow. Overflow/underflow can be avoided by using any
of the following methods:
Shifters
Shifters can be provided at the input of the MAC to normalize the data and at the
output to denormalize it.
Guard bits
As the normalization process does not yield accurate results, it is not desirable for some
applications. In such cases there is another alternative: providing additional bits, called
guard bits, in the accumulator so that overflow errors do not occur. Here the
add/subtract unit also has to be modified appropriately to manage the additional bits of the
accumulator.
Saturation Logic
Overflow/underflow occurs if the result goes beyond the most positive number or
below the least negative number the accumulator can handle. The overflow/underflow
error can therefore be resolved by loading the accumulator with the most positive number it
can handle at the time of overflow, and the least negative number it can handle at the time of
underflow. This method is called saturation logic. A schematic diagram of saturation logic
is as shown in figure 7. In saturation logic, as soon as an overflow or underflow condition is
detected, the accumulator is loaded with the most positive or least negative number,
overriding the result computed by the MAC unit.
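The clamping rule can be sketched as a small function. This is a generic model of saturation for an n-bit 2s-complement accumulator, not the circuit of figure 7:

```python
def saturate(value, n_bits):
    """Clamp a 2s-complement result to an n-bit accumulator's range."""
    max_pos = (1 << (n_bits - 1)) - 1    # most positive n-bit number
    min_neg = -(1 << (n_bits - 1))       # least negative n-bit number
    if value > max_pos:
        return max_pos                   # overflow: load most positive value
    if value < min_neg:
        return min_neg                   # underflow: load least negative value
    return value                         # in range: result passes through

# For a 16-bit accumulator, any result above 32767 saturates to 32767 and
# any result below -32768 saturates to -32768.
```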
Arithmetic and Logic Unit (ALU)
A typical DSP device should be capable of handling arithmetic instructions like ADD,
SUB, INC, and DEC, and logical operations like AND, OR, NOT, and XOR. The block
diagram of a typical ALU for a DSP is as shown in figure 8.
Status Flags
ALU includes circuitry to generate status flags after arithmetic and logic operations.
These flags include sign, zero, carry and overflow.
Overflow Management
Depending on the status of overflow and sign flags, the saturation logic can be used to
limit the accumulator content.
Register File
For better speed, instead of moving data in and out of memory during the operation, a
large set of general-purpose registers is provided to store intermediate results.
Memory Architecture
In order to increase the speed of operation, separate memories are used to store
program and data, with a separate set of data and address buses for each memory; this
arrangement is called the Harvard architecture. It is shown in figure 10.
Fig.(10) Harvard Architecture
Although the use of separate memories for data and instructions speeds up the
processing, it does not completely solve the problem. As many DSP instructions require
more than one operand, use of a single data memory forces the operands to be fetched one
after the other, increasing the processing delay. This problem can be overcome by using two
separate data memories to store the operands, so that in a single clock cycle both
operands can be fetched together (Figure 11).
Fig.(11) Harvard Architecture with Dual Data Memory
Although the above architecture improves the speed of operation, it requires more
hardware and interconnections, increasing the cost and complexity of the system.
Therefore there is a tradeoff between cost and speed when selecting the memory
architecture for a DSP.
On-chip Memories
In order to have a faster execution of the DSP functions, it is desirable to have some
memory located on chip. As dedicated buses are used to access the memory, on chip
memories are faster. Speed and size are the two key parameters to be considered with respect
to the on-chip memories.
Speed
On-chip memories should match the speeds of the ALU operations in order to maintain
the single cycle instruction execution of the DSP.
Size
In a given area of the DSP chip, it is desirable to implement as many DSP functions as
possible. Thus the area occupied by the on-chip memory should be minimum so that there
will be a scope for implementing more number of DSP functions on- chip.
Ideally, the whole memory required for the implementation of a DSP algorithm should
reside on-chip so that the whole processing can be completed in a single execution cycle.
Although this looks like the best solution, it consumes more space on chip, reducing the
scope for implementing other functional blocks on-chip, which in turn reduces the speed of
execution. Hence other alternatives have to be considered. The following are some other
ways in which the on-chip memory can be organized.
b. The access times for memories on-chip should be sufficiently small so that it can be
accessed more than once in every execution cycle.
c. On-chip memories can be configured dynamically so that they can serve different
purpose at different times.
Register Addressing Mode
In this mode, one of the registers holds the data, and the register is
specified in the instruction.
Direct Addressing Mode
In this addressing mode, the instruction holds the memory location of the operand.
For the implementation of some real-time applications in DSP, the normal addressing
modes do not completely serve the purpose. Thus some special addressing modes are
required for such applications.
Circular Addressing Mode
While processing data samples that arrive continuously in a sequential manner,
circular buffers are used. In a circular buffer the data samples are stored sequentially from the
initial location until the buffer is filled; subsequent data samples are then stored once again
starting from the initial location. This process can continue indefinitely, as long as the data
samples are processed at a rate faster than the incoming data rate.
There are four special cases in this addressing mode. The buffer length in the first two
cases is (EAR - SAR + 1), whereas for the other two cases it is (SAR - EAR + 1), where SAR
and EAR denote the start and end address registers.
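The wrap-around store behavior can be sketched as a small class. This models only the sequential-store-and-wrap idea, not the SAR/EAR register mechanics of any particular device:

```python
class CircularBuffer:
    """Sequential store that wraps back to the initial location when full."""
    def __init__(self, length):
        self.buf = [0] * length
        self.index = 0                  # models the buffer address pointer

    def store(self, sample):
        self.buf[self.index] = sample
        # wrap to the initial location after the last address
        self.index = (self.index + 1) % len(self.buf)

# Storing 5 samples into a length-4 buffer overwrites the oldest sample:
# the fifth sample lands back at index 0.
```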
Bit-Reversed Addressing Mode
To implement FFT algorithms we need to access the data in a bit-reversed order.
Hence a special addressing mode, called bit-reversed addressing mode, is used to calculate the
index of the next data element to be fetched. It works as follows: start with index 0; the next
index is calculated by adding half the FFT length to the current index in a bit-reversed
manner, with the carry being propagated from MSB to LSB.
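The reverse-carry addition described above can be sketched directly. This is a behavioral model (the names are illustrative, not from any instruction set); for an 8-point FFT it reproduces the familiar bit-reversed sequence 0, 4, 2, 6, 1, 5, 3, 7:

```python
def reverse_carry_add(index, half_fft):
    """Add half the FFT length to `index`, carry rippling from MSB to LSB."""
    bit = half_fft
    while index & bit:          # carry: clear this bit, move to the next lower one
        index &= ~bit
        bit >>= 1
    return index | bit

def bit_reversed_order(fft_length):
    """Generate the full bit-reversed index sequence for an FFT of given length."""
    order, index = [], 0
    for _ in range(fft_length):
        order.append(index)
        index = reverse_carry_add(index, fft_length // 2)
    return order
```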
Address Generation Unit
The main job of the address generation unit is to generate the addresses of the operands
required to carry out an operation. It has to work fast in order to meet the timing
constraints, and as it has to perform some mathematical operations to calculate the
operand addresses, it is provided with a separate ALU.
The block diagram of a typical address generation unit is as shown in figure 12.
Fig.(12) Address generation unit
Program Control
Like a microprocessor, a DSP requires a control unit to provide the necessary control
and timing signals for the proper execution of instructions. In microprocessors, control is
typically microcoded: each instruction is divided into microinstructions stored in a
micro memory. As this mechanism is slow, it is not suitable for DSP applications.
Hence in DSPs the control is hardwired, with the control unit designed as a single,
comprehensive hardware unit. Although more complex, it is faster.
Instruction set
Single-instruction repeat and block repeat operations
Block memory move instructions for better program and data management
Instructions with a 32-bit long operand
Instructions with 2- or 3-operand simultaneous reads
Arithmetic instructions with parallel store and parallel load
Conditional-store instructions
Fast return from interrupt
On-chip peripherals
Software-programmable wait-state generator
Programmable bank-switching logic
On-chip phase-locked loop (PLL) clock generator with internal oscillator or external
clock source. With the external clock source, there are several multiplier values
available from one of the following device options:
† The C541B, C545A, C546A, C548, C549, C5402, C5410, and C5420 have a software-
programmable PLL and two additional saturation modes.
Each device offers selection of clock modes from one option list only.
External bus-off control to disable the external data bus, address bus, and control
signals
Data bus with a bus holder feature
Programmable timer
Power
Power consumption control with IDLE 1, IDLE 2, and IDLE 3 instructions for power-
down modes
Control to disable the CLKOUT signal
Emulation: IEEE Standard 1149.1 boundary scan logic interfaced to on-chip scan-
based emulation logic
Architectural Overview
The architecture of the TMS320C54x DSP comprises the central processing unit (CPU),
memory, and on-chip peripherals.
The C54x DSPs use an advanced modified Harvard architecture that maximizes processing
power with eight buses.
Separate program and data spaces allow simultaneous access to program instructions and data,
providing a high degree of parallelism. For example, three reads and one write can be
performed in a single cycle. Instructions with parallel store and application-specific
instructions fully utilize this architecture.
In addition, data can be transferred between data and program spaces. Such parallelism
supports a powerful set of arithmetic, logic, and bit-manipulation operations that can all be
performed in a single machine cycle. Also, the C54x DSP includes the control mechanisms to
manage interrupts, repeated operations, and function calling.
Figure 13 shows a functional block diagram of the C54x DSP, which includes the principal
blocks and bus structure.
Fig.(13): Block diagram of TMS320C54X DSP Internal Hardware
Fig.(14): TMS320C54x block diagram
Bus Structure
The C54x DSP architecture is built around eight major 16-bit buses (four program/data
buses and four address buses):
The program bus (PB) carries the instruction code and immediate operands from
program memory.
Three data buses (CB, DB, and EB) interconnect to various elements, such as the
CPU, data address generation logic, program address generation logic, on-chip
peripherals, and data memory.
The CB and DB carry the operands that are read from data memory.
The EB carries the data to be written to memory.
Four address buses (PAB, CAB, DAB, and EAB) carry the addresses needed for
instruction execution.
The C54x DSP can generate up to two data-memory addresses per cycle using the two
auxiliary register arithmetic units (ARAU0 and ARAU1).
The PB can carry data operands stored in program space (for instance, a coefficient table) to
the multiplier and adder for multiply/accumulate operations or to a destination in data space
for data move instructions (MVPD and READA). This capability, in conjunction with the
feature of dual-operand read, supports the execution of single-cycle, 3-operand instructions
such as the FIRS instruction.
The C54x DSP also has an on-chip bidirectional bus for accessing on-chip peripherals. This
bus is connected to DB and EB through the bus exchanger in the CPU interface. Accesses that
use this bus can require two or more cycles for reads and writes, depending on the peripheral’s
structure.
Internal Memory Organization
The C54x DSP memory is organized into three individually selectable spaces: program,
data, and I/O space. The C54x devices can contain random-access memory (RAM) and
read-only memory (ROM). Among the devices, the following types of RAM are represented:
dual-access RAM (DARAM), single-access RAM (SARAM), and two-way shared RAM. The
DARAM or SARAM can be shared within subsystems of a multiple-CPU-core device. You
can configure the DARAM and SARAM as data memory or program/data memory.
Table 2–2 shows how much ROM, DARAM, and SARAM are available on some C54x
devices. The C54x DSP also has 26 CPU registers plus peripheral registers that are mapped in
data-memory space. The C54x DSP memory types and features are introduced in the sections
following this paragraph.
On-Chip ROM
The on-chip ROM is part of the program memory space and, in some cases, part of the data
memory space. The amount of on-chip ROM available on each device varies, as shown in
Table 2–2.
On most devices, the ROM contains a bootloader that is useful for booting to faster
on-chip or external RAM. On devices with large amounts of ROM, a portion of the ROM may
be mapped into both data and program space. The larger ROMs are also custom ROMs: you
provide the code or data to be programmed into the ROM in object file format, and Texas
Instruments generates the appropriate process mask to program the ROM.
Memory-Mapped Registers
The data memory space contains memory-mapped registers for the CPU and the on-chip
peripherals. These registers are located on data page 0, simplifying access to them. The
memory-mapped access provides a convenient way to save and restore the registers for
context switches and to transfer information between the accumulators and the other registers.
Central Processing Unit (CPU)
The CPU is common to all C54x devices. The C54x CPU contains:
40-bit arithmetic logic unit (ALU)
Two 40-bit accumulators
Barrel shifter
17 × 17-bit multiplier
40-bit adder
Compare, select, and store unit (CSSU)
Data address generation unit
Program address generation unit
The ALU can also function as two 16-bit ALUs and perform two 16-bit operations
simultaneously.
The 40-bit ALU, shown in Figure 15, implements a wide range of arithmetic and logical
functions, most of which execute in a single clock cycle. After an operation is performed in
the ALU, the result is usually transferred to a destination accumulator (accumulator A or B).
Instructions that perform memory to memory operations (ADDM, ANDM, ORM, and
XORM) are exceptions.
Fig.(15) ALU Functional Diagram
ALU Input
ALU input takes several forms from several sources.
The X input source to the ALU is either of two values:
The shifter output (a 32-bit or 16-bit data-memory operand or a shifted accumulator
value)
A data-memory operand from data bus DB
The Y input source to the ALU is any of three values:
The value in one of the accumulators (A or B)
A data-memory operand from data bus CB
The value in the T register
When a 16-bit data-memory operand is fed through data bus CB or DB, the 40-bit ALU input
is constructed in one of two ways:
If bits 15 through 0 contain the data-memory operand, bits 39 through 16 are zero
filled (SXM = 0) or sign-extended (SXM = 1).
If bits 31 through 16 contain the data-memory operand, bits 15 through 0 are zero
filled, and bits 39 through 32 are either zero filled (SXM = 0) or sign extended (SXM
= 1).
Table 4–4 shows how the ALU inputs are obtained for the ADD instructions, depending on
the type of syntax used. The ADD instructions execute in one cycle, except for cases 4, 7, and
8 that use two words and execute in two cycles.
Overflow Handling
The ALU saturation logic prevents a result from overflowing by keeping the result at a
maximum (or minimum) value. This feature is useful for filter calculations. The logic is
enabled when the overflow mode bit (OVM) in status register ST1 is set.
When a result overflows:
If OVM = 0, the accumulators are loaded with the ALU result without modification.
If OVM = 1, the accumulators are loaded with either the most positive 32-bit value (00
7FFF FFFFh) or the most negative 32-bit value (FF 8000 0000h), depending on the
direction of the overflow.
The overflow flag (OVA/OVB) in status register ST0 is set for the destination
accumulator and remains set until one of the following occurs:
A reset is performed.
A conditional instruction (such as a branch, a return, a call, or an execute) is
executed on an overflow condition.
The overflow flag (OVA/OVB) is cleared.
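The OVM saturation rule above can be modeled as follows. The saturation bounds are the C54x values from the text (00 7FFF FFFFh and FF 8000 0000h interpreted as signed 32-bit numbers); the function itself is an illustrative sketch, not the actual accumulator-load datapath:

```python
MAX32 = 0x7FFFFFFF       # 00 7FFF FFFFh viewed as a signed value
MIN32 = -0x80000000      # FF 8000 0000h viewed as a signed value

def load_accumulator(alu_result, ovm):
    """Load a signed ALU result into the accumulator under the OVM mode bit."""
    if ovm and alu_result > MAX32:
        return MAX32     # OVM = 1: saturate on positive overflow
    if ovm and alu_result < MIN32:
        return MIN32     # OVM = 1: saturate on negative overflow
    return alu_result    # OVM = 0: result loaded without modification
```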
Accumulators A and B
Accumulators A and B store the output from the ALU or the multiplier/adder block. They can
also provide a second input to the ALU; accumulator A can be an input to the
multiplier/adder. Each accumulator is divided into three parts:
Guard bits (bits 39–32)
High-order word (bits 31–16)
Low-order word (bits 15–0)
Instructions are provided for storing the guard bits, for storing the high- and the low-order
accumulator words in data memory, and for transferring 32-bit accumulator words in or out of
data memory. Also, either of the accumulators can be used as temporary storage for the other.
Accumulator A and accumulator B can be configured as the destination registers for either the
multiplier/adder unit or the ALU. In addition, they are used for MIN and MAX instructions or
for the parallel instruction LD||MAC, in which one accumulator loads data and the other
performs computations.
Each accumulator is split into three parts, as shown in Figure 16 and Figure 17.
Fig.(16):Accumulator A
Fig.(17): Accumulator B
The guard bits are used as a head margin for computations. Head margins allow you to
prevent some overflow in iterative computations such as autocorrelation.
AG, BG, AH, BH, AL, and BL are memory-mapped registers that can be pushed onto and
popped from the stack for context saves and restores by using PSHM and POPM instructions.
These registers can also be used by other instructions that use memory-mapped registers
(MMR) for page 0 addressing. The only difference between accumulators A and B is that bits
32–16 of A can be used as an input to the multiplier in the multiplier/adder unit.
To store the 16 LSBs of the accumulator in memory with a shift, use the STL
instruction. For right-shift operations, bits from AH and BH shift into AL and BL,
respectively, and the LSBs are lost. For left-shift operations, the bits in AL and BL are filled
with zeros. Since shift operations are performed in the shifter, the contents of the accumulator
remain unchanged.
Example 4–3 shows the result of accumulator store operations with shift; it assumes
that accumulator A = 0FF 4321 1234h.
Example 4–3. Accumulator Store With Shift
STH A,8,TEMP ; TEMP = 2112h
STH A,-8,TEMP ; TEMP = FF43h
STL A,8,TEMP ; TEMP = 3400h
STL A,-8,TEMP ; TEMP = 2112h
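Example 4–3 can be reproduced with a behavioral model of STH/STL on a 40-bit accumulator. This sketch assumes the shifted value is arithmetic (sign-extended on right shifts) and that STH stores bits 31–16 while STL stores bits 15–0, consistent with the example values:

```python
MASK40 = (1 << 40) - 1

def _shift40(acc, shift):
    """Arithmetic shift of a 40-bit accumulator value (negative shift = right)."""
    if shift >= 0:
        return (acc << shift) & MASK40
    if acc & (1 << 39):              # sign-extend before the right shift
        acc -= 1 << 40
    return (acc >> -shift) & MASK40

def sth(acc, shift):
    """STH: store bits 31-16 of the shifted accumulator."""
    return (_shift40(acc, shift) >> 16) & 0xFFFF

def stl(acc, shift):
    """STL: store bits 15-0 of the shifted accumulator."""
    return _shift40(acc, shift) & 0xFFFF

# With A = 0FF 4321 1234h this model reproduces the four TEMP values
# of Example 4-3: 2112h, FF43h, 3400h, and 2112h.
```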
In SFTA and SFTL, the shift count is defined as –16 ≤ SHIFT ≤ 15. SFTA is affected by the
SXM bit. When SXM = 1 and SHIFT is a negative value, SFTA performs an arithmetic right
shift and maintains the sign of the accumulator. When SXM = 0, the MSBs of the accumulator
are zero filled. SFTL is not affected by the SXM bit; it performs the shift operation for bits
31–0, shifting 0s into the MSBs or LSBs, depending on the direction of the shift.
SFTC performs a 1-bit left shift when both bits 31 and 30 are 1 or both are 0. This normalizes
32 bits of the accumulator by eliminating the most significant nonsign bit.
ROL rotates each bit of the accumulator to the left by one bit, shifts the value of the carry bit
into the LSB of the accumulator, shifts the value of the MSB of the accumulator into the carry
bit, and clears the accumulator’s guard bits.
ROR rotates each bit of the accumulator to the right by one bit, shifts the value of the carry bit
into the MSB of the accumulator, shifts the value of the LSB of the accumulator into the carry
bit, and clears the accumulator’s guard bits.
The ROLTC instruction (rotate accumulator left with TC) rotates the accumulator to the left
and shifts the test control (TC) bit into the LSB of the accumulator.
Application-Specific Instructions
Each accumulator is dedicated to specific operations in application-specific instructions with
parallel operations. These include symmetrical FIR filter operations using the FIRS
instruction, adaptive filter operations using the LMS instruction, Euclidean distance
calculations using the SQDST instruction, and other parallel operations:
FIRS performs operations for symmetric FIR filters by using multiply/ accumulates
(MACs) in parallel with additions.
LMS performs a MAC and a parallel add with rounding to efficiently update the
coefficients in an FIR filter.
SQDST performs a MAC and a subtract in parallel to calculate Euclidean distance.
In the LMS instruction, accumulator B stores the interim results of the input sequence
convolution and filter coefficients; accumulator A updates the filter coefficients. Accumulator
A can also be used as an input for MAC, which contributes to single-cycle execution of
instructions with parallel operations.
The SQDST instruction computes the square of the distance between two vectors.
Accumulator A(32–16) is squared and the product is added to accumulator B. The result is
stored in accumulator B. At the same time, Ymem is subtracted from Xmem and the
difference is stored in accumulator A. The value that is squared is the value of the
accumulator before the subtraction, Ymem – Xmem, is executed.
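The square-previous-difference-while-subtracting pattern of SQDST can be sketched as a loop. This is a software model of the data flow only (accumulator A holds the latest difference, accumulator B the running sum of squares); it is not the single-cycle instruction itself:

```python
def sqdst(x_vec, y_vec):
    """Accumulate the squared Euclidean distance between two equal-length vectors."""
    acc_b = 0      # models accumulator B: running sum of squared differences
    acc_a = 0      # models accumulator A: holds the most recent difference
    for xm, ym in zip(x_vec, y_vec):
        acc_b += acc_a * acc_a    # square the difference formed last iteration
        acc_a = xm - ym           # new difference stored in accumulator A
    return acc_b + acc_a * acc_a  # fold in the final difference
```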
Fig.(18): Compare, Select, and Store Unit (CSSU)
The CSSU allows the C54x device to support various Viterbi butterfly algorithms used in
equalizers and channel decoders. The add function of the Viterbi operator is performed by the
ALU. This function consists of a double addition function (Met1±D1 and Met2 ± D2). Double
addition is completed in one machine cycle if the ALU is configured for dual 16-bit mode by
setting the C16 bit in ST1. With the ALU configured in dual 16-bit mode, all the long-word
(32-bit) instructions become dual 16-bit arithmetic instructions. T is connected to the ALU
input (as a dual 16-bit operand) and is used as local storage in order to minimize memory
access.
Barrel Shifter
The C54x DSP barrel shifter has a 40-bit input connected to the accumulators or to data
memory (using CB or DB), and a 40-bit output connected to the ALU or to data memory
(using EB). The barrel shifter can produce a left shift of 0 to 31 bits and a right shift of 0 to 16
bits on the input data. The shift requirement is defined in the shift count field of the
instruction, the shift count field (ASM) of status register ST1, or in temporary register T (when
it is designated as a shift count register).
The barrel shifter and the exponent encoder normalize the values in an accumulator in
a single cycle. The LSBs of the output are filled with 0s, and the MSBs can be either zero
filled or sign extended, depending on the state of the sign-extension mode bit (SXM) in ST1.
Additional shift capabilities enable the processor to perform numerical scaling, bit extraction,
extended arithmetic, and overflow prevention operations.
Multiplier/Adder Unit
The multiplier/adder unit performs 17 × 17-bit 2s-complement multiplication with a 40-bit
addition in a single instruction cycle. The multiplier/adder block consists of several elements:
a multiplier, an adder, signed/unsigned input control logic, fractional control logic, a zero
detector, a rounder (2s complement), overflow/saturation logic, and a 16-bit temporary
storage register (T).
The multiplier has two inputs: one input is selected from T, a data-memory operand, or
accumulator A; the other is selected from program memory, data memory, accumulator A, or
an immediate value.
The fast, on-chip multiplier allows the C54x DSP to perform operations efficiently
such as convolution, correlation, and filtering. In addition, the multiplier and ALU together
execute multiply/accumulate (MAC) computations and ALU operations in parallel in a single
instruction cycle. This function is used in determining the Euclidean distance and in
implementing symmetrical and LMS filters, which are required for complex DSP algorithms.
Data Addressing
The C54x DSP offers seven basic data addressing modes:
Immediate addressing uses the instruction to encode a fixed value.
Absolute addressing uses the instruction to encode a fixed address.
Accumulator addressing uses accumulator A to access a location in program memory
as data.
Direct addressing uses seven bits of the instruction to encode the lower seven bits of
an address. The seven bits are used with the data page pointer (DP) or the stack pointer
(SP) to determine the actual memory address.
Indirect addressing uses the auxiliary registers to access memory.
Memory-mapped register addressing uses the memory-mapped registers without
modifying either the current DP value or the current SP value.
Stack addressing manages adding and removing items from the system stack.
Pipeline Operation
An instruction pipeline consists of a sequence of operations that occur during the execution of
an instruction. The C54x DSP pipeline has six levels: prefetch, fetch, decode, access, read,
and execute. At each of the levels, an independent operation occurs. Because these operations
are independent, from one to six instructions can be active in any given cycle, each instruction
at a different stage of completion. Typically, the pipeline is full with a sequential set of
instructions, each at one of the six stages. When a PC discontinuity occurs, such as during a
branch, call, or return, one or more stages of the pipeline may be temporarily unused.
On-Chip Peripherals
All the C54x devices have a common CPU, but different on-chip peripherals are connected
to their CPUs. The C54x devices may have these, or other, on-chip peripheral options:
General-purpose I/O pins
Software-programmable wait-state generator
Programmable bank-switching logic
Clock generator
Timer
Direct memory access (DMA) controller
Standard serial port
Time-division multiplexed (TDM) serial port
Buffered serial port (BSP)
Multichannel buffered serial port (McBSP)
Host-port interface
8-bit standard (HPI)
8-bit enhanced (HPI8)
16-bit enhanced (HPI16)
ST0 and ST1 contain the status of various conditions and modes; PMST contains memory-
setup status and control information. Because these registers are memory-mapped, they can be
stored into and loaded from data memory; the status of the processor can be saved and
restored for subroutines and interrupt service routines (ISRs).
The architecture of TMS320C54xx digital signal processors:
TMS320C54xx processors retain the basic Harvard architecture of their
predecessor, the TMS320C25, but have several additional features that improve their
performance. Figure 4.1 shows a functional block diagram of TMS320C54xx
processors. They have one program bus pair and three data bus pairs, which
provide simultaneous access to a program instruction and two data operands and enable a
result to be written at the same time. Part of the memory is implemented on-chip and consists of
combinations of ROM, dual-access RAM, and single-access RAM. Transfers between the
memory spaces are also possible.
The central processing unit (CPU) of TMS320C54xx processors consists of a 40- bit
arithmetic logic unit (ALU), two 40-bit accumulators, a barrel shifter, a 17x17 multiplier, a
40-bit adder, data address generation logic (DAGEN) with its own arithmetic unit, and
program address generation logic (PAGEN). These major functional units are supported by a
number of registers and logic in the architecture. A powerful instruction set with
hardware-supported single-instruction repeat and block repeat operations, block memory move
instructions, instructions that perform two or three simultaneous reads, and arithmetic instructions
with parallel store and load makes these devices very efficient for running high-speed DSP
algorithms.
Several peripherals, such as a clock generator, a hardware timer, a wait state generator,
parallel I/O ports, and serial I/O ports, are also provided on-chip. These peripherals make it
convenient to interface the signal processors to the outside world. In the following sections,
we examine in detail the various architectural features of the TMS320C54xx family of
processors.
Fig.(19): Functional architecture for TMS320C54xx processors.
Bus Structure:
The performance of a processor gets enhanced with the provision of multiple buses to
provide simultaneous access to various parts of memory or peripherals. The 54xx architecture
is built around four pairs of 16-bit buses with each pair consisting of an address bus and a data
bus. As shown in Figure 4.1, these are the program bus pair (PAB, PB), which carries the
instruction code from the program memory, and three data bus pairs (CAB, CB; DAB, DB; and
EAB, EB), which interconnect the various units within the CPU. The pairs CAB, CB and
DAB, DB are used to read from the data memory, while the pair EAB, EB carries
the data to be written to the memory. The '54xx can generate up to two data-memory
addresses per cycle using the two auxiliary register arithmetic units (ARAU0 and ARAU1) in
the DAGEN block. This enables accessing two operands simultaneously.
Central Processing Unit (CPU):
The ‘54xx CPU is common to all the ‘54xx devices. The ’54xx CPU contains a 40-bit
arithmetic logic unit (ALU); two 40-bit accumulators (A and B); a barrel shifter; a
17 x 17-bit multiplier; a 40-bit adder; a compare, select and store unit (CSSU); an exponent
encoder(EXP); a data address generation unit (DAGEN); and a program address generation
unit (PAGEN).
The ALU performs 2’s complement arithmetic operations and bit-level Boolean
operations on 16, 32, and 40-bit words. It can also function as two separate 16-bit ALUs and
perform two 16-bit operations simultaneously. Figure 3.2 shows the functional diagram of the
ALU of the TMS320C54xx family of devices.
Accumulators A and B store the output from the ALU or the multiplier/adder block and
provide a second input to the ALU. Each accumulator is divided into three parts: guard bits
(bits 39-32), high-order word (bits 31-16), and low-order word (bits 15-0), which can be
stored and retrieved individually. Each accumulator is memory-mapped and partitioned, and
either one can be configured as the destination register of an instruction. The guard bits serve
as a head margin for computations.
Fig.(20): Functional diagram of the central processing unit of the TMS320C54xx processors.
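The three-part accumulator layout described above can be illustrated with a small sketch (the field boundaries follow the bit ranges given in the text):

```python
# Sketch: splitting a 40-bit accumulator value into guard bits (39-32),
# high-order word (31-16) and low-order word (15-0).
def split_accumulator(acc40):
    guard = (acc40 >> 32) & 0xFF    # bits 39-32
    high = (acc40 >> 16) & 0xFFFF   # bits 31-16
    low = acc40 & 0xFFFF            # bits 15-0
    return guard, high, low

g, h, l = split_accumulator(0x7F12345678)
# g = 0x7F, h = 0x1234, l = 0x5678
```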
Barrel shifter: provides the capability to scale the data during an operand read or write. No
overhead is required to implement the shift needed for the scaling operations. The’54xx barrel
shifter can produce a left shift of 0 to 31 bits or a right shift of 0 to 16 bits on the input data.
The shift count is specified in the ASM field of status register ST1 or in the temporary
register T. Figure 4.3 shows the functional diagram of the barrel shifter of TMS320C54xx
processors. The barrel shifter and the exponent encoder normalize the values in an
accumulator in a single cycle. The LSBs of the output are filled with 0s, and the MSBs can be
either zero filled or sign extended, depending on the state of the sign-extension mode bit in
status register ST1. An additional shift capability enables the processor to perform numerical
scaling, bit extraction, extended arithmetic, and overflow prevention operations.
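A minimal sketch of this shift behaviour, assuming a 40-bit data path; `sxm` here stands in for the sign-extension mode bit of ST1:

```python
# Sketch of barrel-shifter scaling on a 40-bit value.  Left shifts fill
# the LSBs with 0s; right shifts fill the MSBs with 0s or the sign bit,
# depending on the sign-extension mode (sxm).
MASK40 = (1 << 40) - 1

def barrel_shift(value, shift, sxm=1):
    """shift > 0: left shift (0..31); shift < 0: right shift (0..16)."""
    if shift >= 0:
        return (value << shift) & MASK40
    if sxm and (value >> 39) & 1:   # negative value: sign extend
        value -= 1 << 40            # reinterpret as signed
    return (value >> -shift) & MASK40
```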
Multiplier/adder unit: The kernel of the DSP device architecture is the multiplier/adder unit.
The multiplier/adder unit of TMS320C54xx devices performs a 17 x 17-bit 2's-complement
multiplication with a 40-bit addition in a single instruction cycle.
In addition to the multiplier and adder, the unit consists of control logic for integer and
fractional computations and a 16-bit temporary storage register, T. Figure 4.4 shows the
functional diagram of the multiplier/adder unit of TMS320C54xx processors. The compare,
select, and store unit (CSSU) is a hardware unit specifically incorporated to accelerate the
add/compare/select operation. This operation is essential to implement the Viterbi algorithm
used in many signal-processing applications. The exponent encoder unit supports the EXP
instruction, which stores in the T register the number of leading redundant bits of the
accumulator content. This information is useful while shifting the accumulator content for the
purpose of scaling.
Fig.(22): Functional diagram of the multiplier/adder unit of TMS320C54xx processors.
The amount and the types of memory of a processor have direct relevance to the efficiency
and performance obtainable in implementations with the processors. The ‘54xx memory is
organized into three individually selectable spaces: program, data, and I/O spaces. All ‘54xx
devices contain both RAM and ROM. RAM can be either dual-access type (DARAM) or
single-access type (SARAM). The on-chip RAM for these processors is organized in pages
having 128 word locations on each page.
The ‘54xx processors have a number of CPU registers to support operand addressing and
computations. The CPU registers and peripherals registers are all located on page 0 of the data
memory. Figures 4.5(a) and (b) show the internal CPU registers and peripheral registers with
their addresses. The processor mode status (PMST) register is used to configure the
processor. It is a memory-mapped register located at address 1Dh on page 0 of the RAM. A
part of the on-chip ROM may contain a boot loader and look-up tables for functions such as
sine, cosine, μ-law, and A-law.
Fig.(23)(a) Internal memory-mapped registers of TMS320C54xx processors.
Fig.(24)(b): Peripheral registers for the TMS320C54xx processors
ST0: Contains the status of flags (OVA, OVB, C, TC) produced by arithmetic
operations & bit manipulations.
ST1: Contains the status of various conditions and modes. Bits of the ST0 and ST1 registers
can be set or cleared with the SSBX and RSBX instructions.
OVA: Overflow flag for accumulator A.
OVB: Overflow flag for accumulator B.
DP: Data-memory page pointer.
HM: Hold mode; indicates whether the processor continues internal execution or halts when
it acknowledges an external hold request.
0: Always read as 0
Processor Mode Status Register (PMST):
IPTR: Interrupt vector pointer; points to the 128-word program page where the interrupt
vectors reside.
OVLY: RAM overlay; enables the on-chip dual-access data RAM blocks to be mapped into
program space.
AVIS: It enables/disables the internal program address to be visible at the address pins.
DROM: Data ROM, DROM enables on-chip ROM to be mapped into data space.
The TMS320C54xx processors provide the following data addressing modes:
1. Immediate addressing.
2. Absolute addressing.
3. Accumulator addressing.
4. Direct addressing.
5. Indirect addressing.
6. Memory mapped addressing
7. Stack addressing.
1. Immediate addressing:
The instruction contains the specific value of the operand. The operand can be short
(3, 5, 8, or 9 bits in length) or long (16 bits in length). An instruction with a short immediate
operand occupies one memory location, while one with a 16-bit operand occupies two.
2. Absolute Addressing:
The instruction contains the full 16-bit address of the operand, so the instruction occupies
two words.
3. Accumulator Addressing:
The accumulator content is used as the address to transfer data between program and data
memory. Ex: READA *AR2
4. Direct Addressing:
Base address + 7-bit offset contained in the instruction = 16-bit address. A page of 128
locations can be accessed without a change in DP or SP. The compiler mode bit (CPL) in the
ST1 register selects the base register:
CPL = 0 selects DP,
CPL = 1 selects SP.
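The address computation can be sketched as follows (a simplified model; the 9-bit DP supplies the upper address bits when CPL = 0, while SP-relative addressing is a plain 16-bit add):

```python
# Sketch of direct-addressing effective-address generation: a 7-bit
# offset from the instruction is combined with DP (CPL = 0) or added
# to SP (CPL = 1).
def direct_address(offset7, dp, sp, cpl):
    assert 0 <= offset7 < 128
    if cpl == 0:
        return ((dp & 0x1FF) << 7) | offset7   # DP gives the upper 9 bits
    return (sp + offset7) & 0xFFFF             # SP-relative
```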
5. Indirect Addressing:
The TMS320C54xx has eight 16-bit auxiliary registers (AR0-AR7) and two auxiliary register
arithmetic units (ARAU0 and ARAU1), which are used to access memory locations in fixed
step sizes. The AR0 register is additionally used for the indexed and bit-reversed addressing
modes.
For single– operand addressing
6. Circular Addressing:
A circular buffer is a sliding window containing the most recent data. A circular buffer of
size R must start on an N-bit boundary, where 2^N > R.
The circular buffer size register (BK): specifies the size of circular buffer.
Effective base address (EFB): By zeroing the N LSBs of a user selected AR (ARx).
End of buffer address (EOB): By replacing the N LSBs of ARx with the N LSBs of BK.
Fig.(29). Circular Addressing Block Diagram
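The EFB/EOB mechanism above amounts to wrapping the N LSBs of ARx modulo BK; a sketch, assuming the buffer is aligned on a 2^N boundary as required:

```python
# Sketch of circular post-increment: the N LSBs of ARx wrap modulo BK,
# where N is the smallest integer with 2**N > BK.
def circular_increment(arx, bk, step=1):
    n = bk.bit_length()              # smallest N with 2**N > BK
    base = arx & ~((1 << n) - 1)     # effective base address (EFB)
    index = (arx - base + step) % bk # wrap the N LSBs modulo BK
    return base + index

ar = 0x0200                          # buffer of size BK = 5, aligned
for _ in range(5):
    ar = circular_increment(ar, bk=5)
# after 5 steps the pointer has wrapped back to 0x0200
```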
7. Bit-Reversed Addressing:
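In this mode, AR0 holds one-half of the FFT size and addresses are updated with a reverse-carry add; the net effect, sketched below, is that the low-order address bits appear in reversed order, which is the order FFT butterflies need:

```python
# Sketch of bit-reversed index generation (reverse-carry addressing).
def bit_reverse(index, bits):
    """Reverse the low 'bits' bits of index."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (index & 1)
        index >>= 1
    return out

# For an 8-point FFT the natural-order indices map to:
order = [bit_reverse(i, 3) for i in range(8)]
# order == [0, 4, 2, 6, 1, 5, 3, 7]
```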
8. Dual-Operand Addressing:
Dual data-memory operand addressing is used for instructions that simultaneously perform
two reads (32-bit read), or a single read (16-bit read) and a parallel store (16-bit store),
indicated by two vertical bars, ||. These instructions access operands using the indirect
addressing mode.
If, in an instruction with a parallel store, the source operand and the destination operand
point to the same location, the source is read before writing to the destination. Only 2 bits are
available in the instruction code for selecting each auxiliary register in this mode. Thus, just
four of the auxiliary registers, AR2-AR5, can be used. The ARAUs, together with these
registers, provide the capability to access two operands in a single cycle. Figure 4.11 shows
how an address is generated using dual data-memory operand addressing.
Fig.(32). Indirect Addressing Block Diagram for Dual Data-Memory Operands
Stack Addressing:
• Used to automatically store the program counter during interrupts and subroutines.
• Can be used to store additional items of context or to pass data values.
• Uses a 16-bit memory-mapped register, the stack pointer (SP).
• PSHD X2
Fig.(34). Stack and Stack Pointer Before and After a Push Operation
Program Control
It contains the program counter (PC), the program-counter-related hardware, a hardware
stack, repeat counters, and status registers.
PC addresses memory in several ways namely:
Branch: The PC is loaded with the immediate value following the branch instruction
Subroutine call: The PC is loaded with the immediate value following the call
instruction
Interrupt: The PC is loaded with the address of the appropriate interrupt vector.
Instructions such as BACC, CALA, etc.: the PC is loaded with the contents of the
accumulator low word.
End of a block repeat loop: The PC is loaded with the contents of the block repeat
program address start register.
Return: The PC is loaded from the top of the stack.
Problems:
1. Assuming the current content of AR3 to be 200h, what will be its contents after
each of the following TMS320C54xx addressing modes is used? Assume that
the contents of AR0 are 20h.
a. *AR3+0
b. *AR3-0
c. *AR3+
d. *AR3
e. *AR3
f. *+AR3 (40h)
g. *+AR3 (-40h)
Solution (starting from AR3 = 200h and AR0 = 20h in each case):
a. *AR3+0: AR3 = AR3 + AR0 = 200h + 20h = 220h
b. *AR3-0: AR3 = AR3 - AR0 = 200h - 20h = 1E0h
c. *AR3+: AR3 = AR3 + 1 = 201h
d. *AR3: AR3 unchanged = 200h
e. *AR3: AR3 unchanged = 200h
f. *+AR3 (40h): AR3 = AR3 + 40h = 240h (AR3 is modified before the access)
g. *+AR3 (-40h): AR3 = AR3 - 40h = 1C0h
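The auxiliary-register update rules involved here can be modelled with a small sketch (a simplified subset of the C54x indirect modes; the mode names are illustrative strings, not assembler syntax):

```python
# Sketch of auxiliary-register updates for a few indirect modes.
def update_ar(ar, mode, ar0=0x20, k=0):
    """Return the new auxiliary-register value after the access."""
    if mode == "*ARx+0":
        return (ar + ar0) & 0xFFFF   # post-increment by AR0
    if mode == "*ARx-0":
        return (ar - ar0) & 0xFFFF   # post-decrement by AR0
    if mode == "*ARx+":
        return (ar + 1) & 0xFFFF     # post-increment by 1
    if mode == "*ARx-":
        return (ar - 1) & 0xFFFF     # post-decrement by 1
    if mode == "*ARx":
        return ar                    # no modification
    if mode == "*+ARx(k)":
        return (ar + k) & 0xFFFF     # modified before the access
    raise ValueError(mode)
```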
The TMS320C54xx DSP instruction set can be divided into four basic types of operations:
• Arithmetic operations
• Logical operations
• Program-control operations
• Load and store operations
1. Arithmetic operations:
The instructions within the following functional groups:
• Add instructions
• Subtract instructions
• Multiply instructions
• Multiply-accumulate instructions
• Multiply-subtract instructions
• Double (32-bit operand) instructions
• Application-specific instructions
a) Add instructions
b) Subtract instructions
c) Multiply instructions
f) Application-specific instructions
2. Logical operations
The instructions within the following functional groups:
• AND instructions
• OR instructions
• XOR instructions
• Shift instructions
• Test instructions
o AND instructions
o OR instructions
o XOR instructions
o Shift instructions
o Test instructions
3. Program-Control Operations
• Branch instructions
• Call instructions
• Interrupt instructions
• Return instructions
• Repeat instructions
• Stack-manipulating instructions
• Miscellaneous program-control instructions
o Branch instructions
o Call instructions
o Interrupt instructions
o Return instructions
o Repeat instructions
o Stack-manipulating instructions
o Miscellaneous program-control instructions
QUESTION BANK
Short answer questions.
Long answer questions.
UNIT -1
8. Compute the DFT of the sequence x(n) = {1, -1, 1, -1} using the DIT algorithm. (Apply, CO1)
UNIT –III
PART-A: SHORT ANSWER QUESTIONS
(S.No., Question, Bloom's Taxonomy Level, Course Outcome)
1. What are the advantages and disadvantages of FIR filters? (Remember, CO2)
2. Explain the procedure for designing FIR filters using windows. (Understand, CO2)
5. Give the equations specifying the Rectangular and Bartlett windows. (Remember, CO2)
9. Give the equations specifying the Hanning and Blackman windows. (Remember, CO2)
10. What are the conditions for an FIR system to have linear phase? (Remember, CO2)
PART-B: LONG ANSWER QUESTION
(S.No., Question, Bloom's Taxonomy Level, Course Outcome)
1. Design an ideal HPF whose desired frequency response is
   Hd(e^jω) = 1 for π/4 ≤ |ω| ≤ π
            = 0 for |ω| ≤ π/2
   using the Hanning window. (Apply, CO2)
2. Design an ideal HPF whose desired frequency response is
   Hd(e^jω) = 1 for π/4 ≤ |ω| ≤ π
            = 0 for |ω| ≤ π/2
   using the Hamming window. (Apply, CO2)
3. Design an ideal LPF whose desired frequency response is
   Hd(e^jω) = 1 ; ≥ ω ≥ -
            = 0 ; ≥ |ω| ≥ -
   using the Hanning window for N = 9. (Apply, CO2)
4. Design an ideal LPF whose desired frequency response is
   Hd(e^jω) = 1 ; ≥ ω ≥ -
            = 0 ; ≥ |ω| ≥ -
   using the Rectangular window for N = 9. (Apply, CO2)
5. Design an ideal HPF whose desired frequency response is
   Hd(e^jω) = 1 ; ≥ |ω| ≥
            = 0 ; otherwise
   using the Bartlett window for N = 9. (Apply, CO2)
UNIT –IV
PART-A: SHORT ANSWER QUESTIONS
(S.No., Question, Bloom's Taxonomy Level, Course Outcome)
1. Show that the decimator is a time-variant system. (Understand, CO3)
3. What is the need for an anti-aliasing filter prior to down-sampling? (Remember, CO3)
4. What is the need for an anti-imaging filter after up-sampling a signal? (Remember, CO3)
(S.No., Question, Bloom's Taxonomy Level, Course Outcome)
1. What is the modified Harvard architecture employed in digital signal processors? (Remember, CO4)
2. What are the special features of digital signal processors? (Remember, CO4)
PREVIOUS UNIVERSITY QUESTION PAPERS
(Attach image of 05 old University Question Papers)
UNIT WISE ASSIGNMENT QUESTIONS
UNIT –I ASSIGNMENT
1. List any three properties of the DFT.
2. What is the decimation-in-time algorithm?
3. What is the decimation-in-frequency algorithm?
4. What are the similarities and differences between DIT and DIF radix-2 FFT algorithms?
5. Compute the DFT of the sequence x(n)={1,-1,1,-1} using the DIT algorithm.
6. Differentiate linear convolution and circular convolution.
7. Find the DFT of x[n]={1,2,2,1} using the DIT-FFT algorithm.
8. List the properties of the twiddle factor.
9. What are the advantages of FFT algorithms?
10. Obtain linear convolution of the following sequences via circular convolution
x1(n)={1,1,1,1} and x2(n)={1,2,3,4}.
11. Find the convolution of the sequences using overlap-save method
x(n)={1,-1,2,-2,3,-3,4,-4}; h(n)={-1,1}
12. Find the convolution of the sequences using overlap-add method
x(n)={1,-1,2,-2,3,-3,4,-4}; h(n)={-1,1}
13. Find the frequency response of the given causal system,
y(n) = (1/2)x(n) + x(n-1) + (1/2)x(n-2)
Plot magnitude and phase response
14. Determine the 8-point DFT of the sequence x(n)={1,2,1,3,5,4,0,2}
15. Find the DFT of the following Discrete time sequence x(n)={2,1,2,1,1,2,1,2} using radix-
2 DIT-FFT algorithm.
16. Determine IFFT using DIF method for X(K)={2,-6,8,-10,9,0,1,7}
17. Compute circular convolution of the following two sequences using DFT
x1(n)={0,1,0,1} and x2(n)={1,2,1,2}
18.Find the DFT of the following Discrete time sequence x(n)={2,1,4,1,3,2,2,3} using radix-2
DIF-FFT algorithm.
19. Determine IFFT using DIT method for X(K)={2,-6,3,-2,9,0,0,7}
4. A filter has a transfer function H(S) = . Design a digital filter for this function using
impulse invariant method.
5. Design a digital IIR filter from the analog filter transfer function H(S) = 1/(S+1)(S+2) by
using bilinear transformation. Assume T=1sec.
4. Show that the decimator is a linear system.
UNIT –V ASSIGNMENT
1. What is the modified Harvard architecture employed in digital signal processors?
2. List any three arithmetic instructions of TMS320C54XX processors.
3. Explain various CPU components of TMS320C54XX processor with the help of neat
diagram.
4. Draw the architecture of TMS320C54XX processor.
5. Explain the addressing modes of TMS320C54XX processor.
6. Explain any two data transfer instructions of TMS320C54XX processor.
INTERNAL QUESTION PAPERS WITH KEY
MATRUSRI ENGINEERING COLLEGE
SAIDABAD, HYDERABAD – 500 059
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
I Internal Assessment (2019-20)
Class/Branch: BE V -SEM ECE (A & B) Max Marks :20
Subject: Digital signal processing Duration : 1Hour
Answer all questions from PART-A and any two from PART-B
MATRUSRI ENGINEERING COLLEGE
SAIDABAD, HYDERABAD – 500 059
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
II Internal Assessment (2019-20)
Class/Branch: BE V -SEM ECE (A & B) Max Marks :20
Subject: Digital signal processing Duration : 1Hour
Answer all questions from PART-A and any two from PART-B
CO- Course Outcomes.
Note: Answer all questions from Part-A and any two questions from Part-B
5. Find the DFT and IDFT of the given x(n) = [1 2 3 4 4 3 2 1] using FFT DIT/DIF. [7M] (CO1)[L1]
6. Define the procedure and design the Butterworth low-pass filter that has a 2 dB passband
attenuation at a frequency of 20 rad/sec and at least 10 dB stopband attenuation at 30 rad/sec.
[7M] (CO2)[L6]
7. Define the procedure to design a digital filter using the bilinear transformation method and
apply it to H(s) = 2/((s+1)(s+2)) to find H(z). [7M] (CO2)[L1]
Note: Answer all questions from Part-A and any two questions from Part-B
4. Obtain linear convolution of the following sequences using DFT and IDFT
x(n)={1 2 3 4} ; h(n)={2 3}
5. Determine 8-point DFT of the sequence x(n) ={1,2,1,3,5,4,0,2} using DIF-FFT
algorithm
6. Design a Digital Butterworth filter using bilinear transformation technique with T=1
sec for the following specifications.
0.707≤│H(ejω)│≤1.0; for 0 ≤ ω ≤ π/2
MATRUSRI ENGINEERING COLLEGE
SAIDABAD, HYDERABAD – 500 059
DEPARTMENT OF ELECTRONICS &COMMUNICATION ENGINEERING
Note: Answer all questions from Part-A and any two questions from Part-B
CONTENT BEYOND SYLLABUS
(Sl.No., Topic, Mode of Teaching, Relevance with POs and PSOs)
TOPIC Description:
Introduction to Adaptive Filters
An adaptive filter is a computational device that attempts to model the relationship between
two signals in real time in an iterative manner. Adaptive filters are often realized either as a
set of program instructions running on an arithmetical processing device such as a
microprocessor or DSP chip, or as a set of logic operations implemented in a field-
programmable gate array (FPGA) or in a semicustom or custom VLSI integrated circuit.
However, ignoring any errors introduced by numerical precision effects in these
implementations, the fundamental operation of an adaptive filter can be characterized
independently of the specific physical realization that it takes. For this reason, we shall focus
on the mathematical forms of adaptive filters as opposed to their specific realizations in
software or hardware. Descriptions of adaptive filters as implemented on DSP chips and on a
dedicated integrated circuit can be found in [1, 2, 3], and [4], respectively.
An adaptive filter is defined by four aspects:
1. The signals being processed by the filter
2. The structure that defines how the output signal of the filter is computed from its input
signal
3. The parameters within this structure that can be iteratively changed to alter the filter’s
input-output relationship
4. The adaptive algorithm that describes how the parameters are adjusted from one time
instant to the next
By choosing a particular adaptive filter structure, one specifies the number and type of
parameters that can be adjusted. The adaptive algorithm used to update the parameter values
of the system can take on a myriad of forms and is often derived as a form of optimization
procedure that minimizes an error criterion that is useful for the task at hand.
In this section, we present the general adaptive filtering problem and introduce the
mathematical notation for representing the form and operation of the adaptive filter. We then
discuss several different structures that have been proven to be useful in practical applications.
We provide an overview of the many and varied applications in which adaptive filters have
been successfully used.
Figure 18.1 shows a block diagram in which a sample from a digital input signal x(n) is fed
into a device, called an adaptive filter, that computes a corresponding output signal sample
y(n) at time n. For the moment, the structure of the adaptive filter is not important, except for
the fact that it contains adjustable parameters whose values affect how y(n) is computed. The
output signal is compared to a second signal d(n), called the desired response signal, by
subtracting the two samples at time n. This difference signal, given by
e(n) = d(n) - y(n)    (1)
is known as the error signal. The error signal is fed into a procedure which alters or adapts the
parameters of the filter from time n to time (n+1) in a well-defined manner. This process of
adaptation is represented by the oblique arrow that pierces the adaptive filter block in the
figure. As the time index n is incremented, it is hoped that the output of the adaptive filter
becomes a better and better match to the desired response signal through this adaptation
process, such that the magnitude of e(n) decreases over time. In this context, what is meant by
“better” is specified by the form of the adaptive algorithm used to adjust the parameters of the
adaptive filter.
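As one concrete example of such an adaptive algorithm (the LMS algorithm, a standard choice that the text above does not single out), the coefficient vector is nudged in proportion to the error e(n) at each time step:

```python
import random

# Hedged sketch of an LMS adaptive FIR filter: at each n it computes
# y(n) = W^T(n) X(n), the error e(n) = d(n) - y(n), and updates the
# parameters from time n to time n+1.
def lms(x, d, L, mu):
    """Return the final weight vector after adapting over x, d."""
    w = [0.0] * L
    for n in range(len(x)):
        X = [x[n - i] if n - i >= 0 else 0.0 for i in range(L)]
        y = sum(wi * xi for wi, xi in zip(w, X))       # filter output
        e = d[n] - y                                   # error signal
        w = [wi + mu * e * xi for wi, xi in zip(w, X)] # LMS update
    return w

# System identification: the desired response comes from a known FIR
# "unknown system" h, so w should converge toward h.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
h = [0.5, -0.3, 0.2]
d = [sum(h[i] * (x[n - i] if n - i >= 0 else 0.0) for i in range(3))
     for n in range(len(x))]
w = lms(x, d, L=3, mu=0.01)
```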
In the adaptive filtering task, adaptation refers to the method by which the parameters of the
system are changed from time index n to time index (n+1). The number and types of parameters
within this system depend on the computational structure chosen for the system. We now
discuss different filter structures that have been proven useful for adaptive filtering tasks.
Filter Structures
In general, any system with a finite number of parameters that affect how y(n) is computed
from x(n) could be used for the adaptive filter in Fig. 18.1. Define the parameter or coefficient
vector W(n) as
W(n) = [w_0(n), w_1(n), ..., w_{L-1}(n)]^T    (2)
FIGURE 1: The general adaptive filtering problem
where w_i(n), 0 ≤ i ≤ L-1, are the L parameters of the system at time n. With this definition,
we could define a general input-output relationship for the adaptive filter as
y(n) = f(W(n), y(n-1), y(n-2), ..., y(n-N), x(n), x(n-1), ..., x(n-M+1))    (3)
where f(.) represents any well-defined linear or nonlinear function and M and N are positive
integers. Implicit in this definition is the fact that the filter is causal, such that future values of
x(n) are not needed to compute y(n). While noncausal filters can be handled in practice by
suitably buffering or storing the input signal samples, we do not consider this possibility.
Although (3) is the most general description of an adaptive filter structure, we are interested in
determining the best linear relationship between the input and desired response signals for
many problems. This relationship typically takes the form of a finite-impulse-response (FIR)
or infinite impulse-response (IIR) filter. Figure 2 shows the structure of a direct-form FIR
filter, also known
as a tapped-delay-line or transversal filter, where z−1 denotes the unit delay element and each
wi(n) is a multiplicative gain within the system. In this case, the parameters in W(n)
correspond to the impulse response values of the filter at time n. We can write the output
signal y(n) as
y(n) = Σ_{i=0}^{L-1} w_i(n) x(n-i)    (4)
     = W^T(n) X(n)    (5)
where
X(n) = [x(n), x(n-1), ..., x(n-L+1)]^T
denotes the input signal vector and (·)^T denotes vector transpose. Note that this system requires
L multiplies and L−1 adds to implement, and these computations are easily performed by a
processor or circuit so long as L is not too large and the sampling period for the signals is not
too short. It also requires a total of 2L memory locations to store the L input signal samples
and the L coefficient values, respectively.
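Equations (4) and (5) can be checked with a direct sketch (the coefficient values below are illustrative):

```python
# Sketch of the direct-form FIR output: y(n) = sum_{i=0}^{L-1} w[i]*x[n-i],
# with x[k] = 0 for k < 0 (causal filter, zero initial conditions).
def fir_output(w, x, n):
    return sum(w[i] * (x[n - i] if n - i >= 0 else 0.0)
               for i in range(len(w)))

w = [0.25, 0.5, 0.25]            # example 3-tap coefficient vector W
x = [1.0, 2.0, 3.0, 4.0]
y = [fir_output(w, x, n) for n in range(len(x))]
# Each output sample costs L multiplies and L-1 adds, as noted above.
```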
The structure of a direct-form IIR filter is shown in Fig.3. In this case, the output of the
system can be represented mathematically as
y(n) = Σ_{i=1}^{N} a_i(n) y(n-i) + Σ_{j=0}^{N} b_j(n) x(n-j)    (6)
although the block diagram does not explicitly represent this system in such a fashion. We
could easily write (6) using vector notation as
y(n) = W^T(n) U(n)    (7)
where the (2N+1)-dimensional vectors W(n) and U(n) are defined as
W(n) = [a_1(n), a_2(n), ..., a_N(n), b_0(n), b_1(n), ..., b_N(n)]^T    (8)
U(n) = [y(n-1), y(n-2), ..., y(n-N), x(n), x(n-1), ..., x(n-N)]^T    (9)
respectively. Thus, for purposes of computing the output signal y(n), the IIR structure
involves a fixed number of multiplies, adds, and memory locations not unlike the direct-form
FIR structure.
A third structure that has proven useful for adaptive filtering tasks is the lattice filter. A lattice
filter is an FIR structure that employs L − 1 stages of preprocessing to compute a set of
auxiliary signals b_i(n), 0 ≤ i ≤ L-1, known as backward prediction errors. These signals have
the special property that they are uncorrelated, and they represent the elements of X(n)
through a linear transformation. Thus, the backward prediction errors can be used in place of
the delayed input signals in a structure similar to that in Fig.2, and the uncorrelated nature of
the prediction errors can provide improved convergence performance of the adaptive filter
coefficients with the proper choice of algorithm.
A critical issue in the choice of an adaptive filter’s structure is its computational
complexity. Since the operation of the adaptive filter typically occurs in real time, all of the
calculations for the system must occur during one sample time. The structures described
above are all useful because y(n) can be computed in a finite amount of time using simple
arithmetical operations and finite amounts of memory.
In addition to the linear structures above, one could consider nonlinear systems for which the
principle of superposition does not hold when the parameter values are fixed. Such systems
are useful when the relationship between d(n) and x(n) is not linear in nature. Two such
classes of systems are the Volterra and bilinear filter classes that compute y(n) based on
polynomial representations of the input and past output signals. In addition, many of the
nonlinear models developed in the field of neural networks, such as the multilayer perceptron,
fit the general form of (3), and many of the algorithms used for adjusting the parameters of
neural networks are related to the algorithms used for FIR and IIR adaptive filters.
Perhaps the most important driving forces behind the developments in adaptive filters
throughout their history have been the wide range of applications in which such systems can
be used. We now discuss the forms of these applications in terms of more-general problem
classes that describe the assumed relationship between d(n) and x(n). Our discussion
illustrates the key issues in selecting an adaptive filter for a particular task. Extensive details
concerning the specific issues and problems associated with each problem genre can be found
in the references at the end.
System Identification
Consider Fig.4, which shows the general problem of system identification. In this diagram, the
system enclosed by dashed lines is a “black box,” meaning that the quantities inside are not
observable from the outside. Inside this box are (1) an unknown system which represents a
general input-output relationship and (2) the observation noise signal, so called because it
corrupts the observations of the signal at the output of the unknown system.
Channel Identification
In communication systems, useful information is transmitted from one point to another across
a medium such as an electrical wire, an optical fiber, or a wireless radio link. Nonidealities of
the transmission medium or channel distort the fidelity of the transmitted signals, making the
deciphering of the received information difficult. In cases where the effects of the distortion
can be modeled as a linear filter, the resulting “smearing” of the transmitted symbols is known
as inter-symbol interference (ISI). In such cases, an adaptive filter can be used to model the
effects of the channel ISI for purposes of deciphering the received information in an optimal
manner. In this problem scenario, the transmitter sends to the receiver a sample sequence x(n)
that is known to both the transmitter and receiver. The receiver then attempts to model the
received signal d(n) using an adaptive filter whose input is the known transmitted sequence
x(n). After a suitable period of adaptation, the parameters of the adaptive filter in W(n) are
fixed and then used in a procedure to decode future signals transmitted across the channel.
Channel identification is typically employed when the fidelity of the transmission channel is
severely compromised or when simpler techniques for sequence detection cannot be used.
Plant Identification
In many control tasks, knowledge of the transfer function of a linear plant is required by the
physical controller so that a suitable control signal can be calculated and applied. In such
cases, we can characterize the transfer function of the plant by exciting it with a known signal
x(n) and then attempting to match the output of the plant d(n) with a linear adaptive filter.
After a suitable period of adaptation, the system has been adequately modeled, and the
resulting adaptive filter coefficients in W(n) can be used in a control scheme to enable the
overall closed-loop system to behave in the desired manner.
In certain scenarios, continuous updates of the plant transfer function estimate provided by
W(n) are needed to allow the controller to function properly.
END