Digital Signal Processing with Selected Topics
Basic Theory and Applications
Ljubiša Stanković
ISBN-13: 978-1514179987
ISBN-10: 1514179989
All rights reserved. Printed and bound in the United States of America.
No part of this book may be reproduced or utilized in any form or by any means, electronic or
mechanical, including photocopying, recording, or by any information storage and retrieval
system, without permission in writing from the copyright holder.
To
my parents
Božo and Cana,
my wife Snežana,
and our
Irena, Isidora, and Nikola.
Contents
I Review of Continuous-Time Signals and Systems
Chapter 1 Continuous-Time Signals and Systems
1.1 Continuous-Time Signals
1.2 Linear Systems
1.3 Periodic Signals and Fourier Series
1.3.1 Fourier Series of Real-Valued Signals
1.4 Fourier Transform
1.4.1 Fourier Transform and Linear Time-Invariant Systems
1.4.2 Properties of the Fourier Transform
1.4.3 Relationship Between the Fourier Series and the Fourier Transform
1.5 Fourier Transform and the Stationary Phase Method
1.6 Laplace Transform
1.6.1 Properties of the Laplace Transform
1.6.2 Table of the Laplace Transform
1.6.3 Linear Systems Described by Differential Equations
1.7 Butterworth Filter
This book is a result of the author's thirty-three years of experience in teaching and research in signal processing. It is written for students and engineers as a first book in digital signal processing, assuming that the reader is familiar with basic mathematics, including integral and differential calculus and linear algebra. Although a review of continuous-time analysis is presented in the first chapter, a prerequisite for the presented content is a basic knowledge of continuous-time signal processing.
The book consists of three parts. After an introductory review part, the basic principles
of digital signal processing are presented within Part two of the book. This part starts with
Chapter two which deals with basic definitions, transforms, and properties of discrete-time
signals. The sampling theorem, providing an essential relation between continuous-time
and discrete-time signals, is presented in this chapter as well. Discrete Fourier transform
and its applications to signal processing are the topics of the third chapter. Other common
discrete transforms, such as the cosine, sine, Walsh-Hadamard, and Haar transforms, are also presented in this chapter. The z-transform, as a powerful tool for the analysis of discrete-time systems, is
the topic of Chapter four. Various methods for transforming a continuous-time system into
a corresponding discrete-time system are derived and illustrated in Chapter five. Chapter six is dedicated to the forms of discrete-time system realizations. Basic definitions and properties of random discrete-time signals are given in Chapter seven. Systems to process random discrete-time signals are considered in that chapter as well. Chapter seven concludes with a short study of quantization effects.
The presentation is supported by numerous illustrations and examples. Chapters within
Part two are followed by a number of solved and unsolved problems for practice. The
theory is explained in a simple way, with the necessary mathematical rigor. The book provides
simple examples and explanations for every presented transform, method, algorithm or
approach. Sophisticated results in signal processing theory are illustrated by simple numerical
examples.
Part three of the book contains a few selected topics in digital signal processing: adaptive
discrete-time systems, time-frequency signal analysis, and processing of discrete-time sparse
signals. This part could be studied within an advanced course in digital signal processing,
following the basic course. Some parts from the selected topics may be included in tailoring
a more extensive first course in digital signal processing as well.
The author would like to thank colleagues: prof. Zdravko Uskoković, prof. Srdjan
Stanković, prof. Igor Djurović, prof. Veselin Ivanović, prof. Miloš Daković, prof. Božo
Krstajić, prof. Vesna Popović-Bugarin, prof. Slobodan Djukanović, prof. Irena Orović, dr.
Nikola Žarić, dr Marko Simeunović, M.Sc. Predrag Raković, M.Sc. Andjela Draganić and
M.Sc. Isidora Stanković for careful reading of the initial version of this book and for the
comments that helped to improve the presentation.
The author thanks the colleagues that helped in preparing the special topics part of the
book. Many thanks to Miloš Daković who coauthored all three chapters of Part three of
this book and to other coauthors of chapters in this part: Thayaparan Thayananthan, Srdjan
Stanković, and Irena Orović. Special thanks to M.Sc. Miloš Brajović and M.Sc. Stefan
Vujović for their careful double-check of the presented theory and examples, numerous
comments, and for the help in proofreading the final version of the book.
London,
July 2013 - July 2015.
Author
The book has been slightly edited, keeping the main structure unchanged. The
chapter dealing with random signals is updated to provide the basis for machine learning,
compressive sensing, graph signal processing, and other modern signal processing areas.
Podgorica, Montenegro
March - June 2020.
Author
A signal is a representation of information. Signal theory and processing are the areas dealing with the efficient generation, description, transformation, transmission, reception, and interpretation of signals. In the beginning, the most common physical processes used for these purposes were electric signals, for example, varying currents or electromagnetic waves. Signal theory is most commonly studied within electrical engineering. Signal theory
tools are strongly related to applied mathematics and information theory. Examples of
signals include speech, music, image, video, medical, biological, geophysical, sonar, radar,
biomedical, car engine, financial, and molecular data. In terms of signal generation, the
main topics are in sensing, acquisition, synthesis, and reproduction of information. Various
mathematical transforms, representations, and algorithms are used for describing signals.
Signal transformations are a set of methods for decomposition, filtering, estimation, and
detection. Modulation, demodulation, detection, coding, and compression are the most
important aspects of signal transmission. In the process of interpretation, various approaches
may be used, including adaptive and learning-based tools and analysis.
Mathematically, signals are represented by functions of one or more variables. Examples of one-dimensional signals are speech and music signals. A typical example of a two-dimensional signal is an image, while a video sequence is an example of a three-dimensional signal. Some signals, for example, geophysical, medical, biological, radar, or sonar signals, may be represented and interpreted as one-dimensional, two-dimensional, or multidimensional.
Signals may be continuous functions of independent variables, for example, functions of
time or space. Independent variables may also be discrete, with the signal values being defined
only over an ordered set of discrete values of the independent variable. Such a signal is a discrete-time signal. Discrete-time signals, after being stored in a general-purpose computer or special-purpose hardware, are discretized (quantized) in amplitude as well, so that they can be stored within registers of finite length. These kinds of signals are referred to as digital signals,
Fig. 1. A continuous-time and continuous amplitude (analog) signal is transformed into a
discrete-time and discrete-amplitude (digital) signal using analog-to-digital (A/D) converters,
Fig. 2. Their processing is known as digital signal processing. In modern systems, the amplitude quantization errors are very small. Common A/D converters operate with sampling frequencies of up to a megasample (some even up to a few gigasamples) per second, with 8 to 24 bits of resolution in amplitude. Digital signals are usually mathematically treated as continuous (nondiscretized) in amplitude, while the quantization error is studied, if needed,
Figure 1 A continuous-time analog signal (left) and its discrete-time (middle) and digital version (right).
as a small disturbance in processing, reduced to a noise in the input signal. Digital signals
are transformed back into analog form by digital-to-analog (D/A) converters.
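As a small illustration of the amplitude discretization performed by an A/D converter, the following minimal Python sketch quantizes a finely sampled signal to a B-bit grid; the test signal and the full-scale range [-1, 1) are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Uniform B-bit quantization over an assumed full-scale range [-1, 1):
# a minimal sketch of the amplitude discretization in an A/D converter.
def quantize(x, bits):
    step = 2.0 / 2**bits                  # quantization step
    return np.clip(np.round(x / step) * step, -1.0, 1.0 - step)

t = np.linspace(0, 1, 1000)               # finely sampled "analog" signal
x = 0.9 * np.sin(2 * np.pi * 5 * t)
xq = quantize(x, bits=4)                  # digital amplitude grid
print("max quantization error:", np.max(np.abs(x - xq)))  # about step/2
```

With more bits, the quantization error quickly becomes negligible, which is why digital signals are usually treated as continuous in amplitude.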
Figure 2 Illustration of an analog and a digital system used to process an analog signal.
According to the nature of their behavior, all signals could be deterministic or stochastic.
For deterministic signals, the values are known in the past and future, while the stochastic
signals are described by probabilistic methods. The deterministic signals are commonly used
for theoretical description, analysis, and synthesis of systems for signal processing.
The advantages of processing signals in digital form lie in their flexibility and adaptability: any transformation that can be described by an algorithm can be implemented on a computer. The time required for processing in real time (all calculations have to be completed between two signal samples) is a limitation as compared to analog systems, which are limited only by the physical delay of electrical components and circuits.
Part I
Review of Continuous-Time
Signals and Systems
Chapter 1
Continuous-Time Signals and Systems
1.1 Continuous-Time Signals

The unit-step (Heaviside) signal is defined as
$$u(t) = \begin{cases} 1, & t \geq 0\\ 0, & t < 0. \end{cases} \tag{1.1}$$
In the Heaviside function definition, the value u(0) = 1/2 is also used. Note that the independent variable t is continuous, while the signal itself is not a continuous function; it has a discontinuity at t = 0.
The boxcar signal (rectangular window) is formed as b(t) = u(t + 1/2) − u(t − 1/2),
that is, b(t) = 1 for −1/2 ≤ t < 1/2 and b(t) = 0 elsewhere. The signal obtained by
multiplying the unit-step signal by t is called the ramp signal, with notation R(t) = tu(t).
The impulse signal (or delta function) is defined as
$$\delta(t) = 0 \ \text{ for } t \neq 0, \qquad \int_{-\infty}^{\infty} \delta(t)\,dt = 1. \tag{1.2}$$
The impulse signal is equal to 0 everywhere except at t = 0, where it takes an infinitely large value, so that its area is 1. From the definition of the impulse signal, it follows that δ(at) = δ(t)/|a|.
This function cannot be implemented in real-world systems due to its infinitely short duration
and infinitely large amplitude at t = 0.
Using the definition of the impulse signal, any signal x(t) can be written as
$$x(t) = \int_{-\infty}^{\infty} x(t-\tau)\,\delta(\tau)\,d\tau = \int_{-\infty}^{\infty} x(\tau)\,\delta(t-\tau)\,d\tau. \tag{1.3}$$
The unit-step signal can be related to the impulse signal using the previous relation as
$$u(t) = \int_{-\infty}^{\infty} \delta(\tau)\,u(t-\tau)\,d\tau = \int_{-\infty}^{t} \delta(\tau)\,d\tau$$
or
$$\frac{du(t)}{dt} = \delta(t). \tag{1.4}$$
The sinusoidal signal, with amplitude A, frequency Ω0 , and initial phase ϕ, is a signal
of the form
$$x(t) = A\sin(\Omega_0 t + \varphi). \tag{1.5}$$
This signal is periodic in time, since it satisfies the periodicity condition
$$x(t+T) = x(t) \tag{1.6}$$
with the period T = 2π/Ω₀; it is also periodic with any integer multiple of this period. Fig. 1.1 depicts basic continuous-time signals.
Figure 1.1 Continuous-time signals: (a) unit-step signal, (b) impulse signal, (c) boxcar signal, and (d) sinusoidal
signal.
Example 1.1. Find the periods of the signals: x1 (t) = sin(2πt/36), x2 (t) = cos(4πt/15 + 2),
x3 (t) = exp( j0.1t), x4 (t) = x1 (t) + x2 (t), and x5 (t) = x1 (t) + x3 (t).
⋆ Periods are calculated according to (1.6). For x1(t), the period follows from 2πT1/36 = 2π as T1 = 36. Similarly, T2 = 15/2 and T3 = 20π. The period of x4(t) is the smallest interval that contains an integer number of periods T1 and T2. It is T4 = 180 (5 periods of x1(t) and 24 periods of x2(t)). For the signal x5(t), whose components have periods T1 = 36 and T3 = 20π, there is no common interval T5 that contains both T1 and T3 an integer number of times. Thus, the signal x5(t) is not periodic.
$$P_{AV} = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}|x(t)|^2\,dt. \tag{1.11}$$
The average power is a time average of energy. Energy signals are signals with a finite
energy, while power signals have finite and nonzero power. The average signal power
of energy signals is zero.
Example 1.3. Find the magnitude, energy, instantaneous power, and average power of the signal
x(t) given by
$$x(t) = te^{-t}u(t). \tag{1.12}$$
⋆ The signal x (t) is a nonnegative continuous function with the initial and the final value equal
to x (0) = 0 and limt→∞ x (t) = 0, respectively. The magnitude of this signal is obtained as its
maximum, from
$$\frac{dx(t)}{dt} = e^{-t} - te^{-t} = (1-t)e^{-t} = 0, \quad \text{for } t > 0. \tag{1.13}$$
The maximum Mx = 1/e is achieved at t = 1. The energy of this signal is equal to
$$E_x = \int_0^{\infty} t^2e^{-2t}\,dt = \left.-t^2\frac{e^{-2t}}{2}\right|_0^{\infty} + \int_0^{\infty} te^{-2t}\,dt = \left.-t\frac{e^{-2t}}{2}\right|_0^{\infty} + \frac{1}{2}\int_0^{\infty} e^{-2t}\,dt = \frac{1}{4}, \tag{1.14}$$
where integration by parts is used twice, with lim_{t→∞} t²e^{−2t} = 0. The instantaneous power of the signal x(t) is Px(t) = t²e^{−2t}u(t). The average power of this signal is P_AV = 0.
1.2 Linear Systems

A system transforms one signal (the input signal) into another signal (the output signal). Assume that x(t) is the input signal. The system transformation will be denoted by an operator T{◦}. The output signal can then be written as
$$y(t) = T\{x(t)\}. \tag{1.15}$$
A system is linear if, for any two signals x1(t) and x2(t) and arbitrary constants a1 and a2, the relation
$$y(t) = T\{a_1x_1(t) + a_2x_2(t)\} = a_1T\{x_1(t)\} + a_2T\{x_2(t)\} \tag{1.16}$$
holds.
A system is time-invariant if its properties and parameters do not change over time. For a time-invariant system, the relation
$$T\{x(t-t_0)\} = y(t-t_0) \tag{1.17}$$
holds for any t₀. If the response of a linear time-invariant system to the impulse signal δ(t) is denoted by
$$h(t) = T\{\delta(t)\},$$
then for any signal x (t) at the system input, the output can be obtained using (1.3), as
$$y(t) = T\{x(t)\} = T\left\{\int_{-\infty}^{\infty}x(\tau)\delta(t-\tau)\,d\tau\right\}$$
$$\overset{\text{Linearity}}{=} \int_{-\infty}^{\infty}x(\tau)\,T\{\delta(t-\tau)\}\,d\tau \overset{\text{Time-invariance}}{=} \int_{-\infty}^{\infty}x(\tau)\,h(t-\tau)\,d\tau.$$
The last integral is of particular importance in the theory of signals and systems. It is called the convolution in time of x(t) and h(t), with the notation
$$y(t) = x(t)*_t h(t) = \int_{-\infty}^{\infty}x(\tau)\,h(t-\tau)\,d\tau. \tag{1.18}$$
The commutativity property x(t) ∗t h(t) = h(t) ∗t x(t) holds.
Example 1.4. Find the convolution of the two boxcar signals x(t) = u(t) − u(t − 5) and h(t) = u(t) − u(t − 2).
⋆ The signals x(τ) and h(t − τ) are shown in Fig. 1.2 for t = 0 and t = 1.25. For example, the convolution value at t = 0 is obtained using the integral of the product of x(τ) and h(−τ), that is,
$$y(0) = \int_{-\infty}^{\infty}x(\tau)\,h(-\tau)\,d\tau = 0. \tag{1.20}$$
For t < 0, the nonzero values of x(τ) and h(t − τ) do not overlap, resulting in y(t) = 0. For 0 ≤ t < 2, the output signal is $y(t) = \int_0^t d\tau = t$, while for 2 ≤ t < 5, y(t) = 2. For 5 ≤ t < 7, the value of y(t) is y(t) = 7 − t. Finally, for t ≥ 7 the convolution value is equal to zero, y(t) = 0, as shown in Fig. 1.2.
The duration of the convolution, y(t) = x(t) ∗t h(t), is equal to the sum of the durations of x(t) and h(t), that is, Ty = Tx + Th, where Tx, Th, and Ty are the respective durations of x(t), h(t), and y(t).
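The piecewise result of Example 1.4 can also be verified numerically. A minimal Python sketch approximates the convolution integral by a Riemann sum, using np.convolve scaled by the time step:

```python
import numpy as np

# Numerical check of Example 1.4: y(t) = (u(t)-u(t-5)) * (u(t)-u(t-2)).
dt = 0.001
t = np.arange(0, 10, dt)
x = ((t >= 0) & (t < 5)).astype(float)
h = ((t >= 0) & (t < 2)).astype(float)
y = np.convolve(x, h)[:len(t)] * dt      # Riemann-sum convolution on the t grid

for t0, expected in [(1.0, 1.0), (3.0, 2.0), (6.0, 1.0), (8.0, 0.0)]:
    print(t0, round(y[int(t0 / dt)], 3), expected)
```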
Example 1.5. Find the convolution of the two signals x(t) = u(t + 1) − u(t − 1) and h(t) = e^{−t}u(t).
Figure 1.2 Calculation of the convolution, y(t) = x(t) ∗t h(t), of the signals x(t) = u(t) − u(t − 5) and h(t) = u(t) − u(t − 2).
A system is causal if there is no response before the input signal appears. For causal systems, h(t) = 0 for t < 0. In general, a signal that could be the impulse response of a causal system is referred to as a causal signal.
A system is stable if any input signal with a finite magnitude M_x = max_{−∞<t<∞}|x(t)| produces an output y(t) whose values are finite, |y(t)| < ∞. A sufficient condition for a linear time-invariant system to be stable is
$$\int_{-\infty}^{\infty}|h(\tau)|\,d\tau < \infty \tag{1.21}$$
since
$$|y(t)| = \left|\int_{-\infty}^{\infty}x(t-\tau)h(\tau)\,d\tau\right| \leq \int_{-\infty}^{\infty}|x(t-\tau)h(\tau)|\,d\tau = \int_{-\infty}^{\infty}|x(t-\tau)||h(\tau)|\,d\tau \leq M_x\int_{-\infty}^{\infty}|h(\tau)|\,d\tau < \infty,$$
if (1.21) holds. It can be shown that the absolute integrability of the impulse response is also a necessary condition for a linear time-invariant system to be stable.
1.3 Periodic Signals and Fourier Series

Consider a periodic signal x(t) with a period T. It can be expanded onto the periodic complex sinusoidal functions φn(t) = e^{j2πnt/T}, −∞ < n < ∞, as
$$x(t) = \sum_{n=-\infty}^{\infty}X_ne^{j2\pi nt/T}. \tag{1.22}$$
The basis functions satisfy
$$\left\langle\phi_n(t),\phi_m(t)\right\rangle = \frac{1}{T}\int_{-T/2}^{T/2}e^{j2\pi nt/T}e^{-j2\pi mt/T}\,dt = \begin{cases}1, & n = m\\ 0, & n \neq m.\end{cases}$$
It means that the inner product of any two different basis functions is zero (an orthogonal set), while the self-inner product of each basis function is 1 (a normalized set). For an orthonormal set of basis functions, it is easy to show that the weighting coefficients Xn can be calculated as the projections of x(t) onto the basis functions, here φn(t) = e^{j2πnt/T}, −∞ < n < ∞,
$$X_n = \left\langle x(t),\,e^{j2\pi nt/T}\right\rangle = \frac{1}{T}\int_{-T/2}^{T/2}x(t)e^{-j2\pi nt/T}\,dt. \tag{1.23}$$
This relation follows after a simple multiplication of both sides of (1.22) by e^{−j2πmt/T} and a normalized integration over the period, that is, $\frac{1}{T}\int_{-T/2}^{T/2}(\cdot)\,dt$.
Normalization is achieved using the factor 1/T in the scalar product definition. If this factor were not used, then the orthonormal set of basis functions would be defined by φn(t) = e^{j2πnt/T}/√T, −∞ < n < ∞, with the same conclusions and relations.
Since the signal and the basis functions are periodic with period T, we can use
$$\frac{1}{T}\int_{-T/2}^{T/2}x(t)e^{-j2\pi nt/T}\,dt = \frac{1}{T}\int_{-T/2+\Lambda}^{T/2+\Lambda}x(t)e^{-j2\pi nt/T}\,dt \tag{1.24}$$
for any Λ.
Example 1.6. What are the Fourier series coefficients of the periodic signal x(t) = cos²(πt/4)? What will the coefficient values be if the period T = 8 is assumed?
⋆ The signal x(t) can be written as x(t) = (1 + cos(πt/2))/2. The period is T = 4. Assuming that the Fourier series coefficients are calculated with T = 4, after transforming the signal into the form (1.22), we get
$$x(t) = \frac{1}{4}e^{-j2\pi t/4} + \frac{1}{2} + \frac{1}{4}e^{j2\pi t/4}.$$
The Fourier series coefficients are recognized as X−1 = 1/4, X0 = 1/2 and X1 = 1/4 (without
the calculation defined by (1.23)). Other coefficients are equal to zero. In the above transformation,
the relation cos(πt/2) = (e jπt/2 + e− jπt/2 )/2 is used.
If the period T = 8 is used, then the signal is decomposed into complex sinusoids of the
form e j2πnt/8 (see relation (1.22)). The signal can be written as
$$x(t) = \frac{1}{4}e^{-j2\pi 2t/8} + \frac{1}{2} + \frac{1}{4}e^{j2\pi 2t/8}. \tag{1.25}$$
Thus, by comparing the signal definition with the basis functions e^{j2πnt/8}, we may write X₋₂ = 1/4, X₀ = 1/2, and X₂ = 1/4. The remaining coefficients Xn are equal to zero.
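The coefficients in Example 1.6 are easily confirmed by a numerical evaluation of (1.23); a minimal Python sketch:

```python
import numpy as np

# Numerical Fourier series coefficients (1.23) of x(t) = cos^2(pi*t/4), T = 4;
# expected: X_{-1} = X_1 = 1/4, X_0 = 1/2, all others zero.
T = 4.0
t = np.linspace(-T / 2, T / 2, 100000, endpoint=False)
x = np.cos(np.pi * t / 4)**2

for n in range(-2, 3):
    Xn = np.mean(x * np.exp(-1j * 2 * np.pi * n * t / T))  # (1/T) * integral
    print(n, round(Xn.real, 4))
```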
Example 1.7. Calculate the Fourier series coefficients of a periodic signal x (t) defined as
$$x(t) = \sum_{n=-\infty}^{\infty}x_0(t+2n)$$
with
$$x_0(t) = u(t+1/4) - u(t-1/4). \tag{1.26}$$
⋆The signal x (t) is a periodic extension of x0 (t), with period T = 2. This signal is equal to 1
for −1/4 ≤ t < 1/4, within its basic period. The Fourier series coefficients are obtained from
$$X_n = \frac{1}{2}\int_{-1/4}^{1/4}1\cdot e^{-j2\pi nt/2}\,dt = \frac{\sin(\pi n/4)}{\pi n}, \tag{1.27}$$
with X₀ = 1/4 as the limit value for n = 0.
Figure 1.3 Periodic signal, x (t), (left) and its Fourier series coefficients, Xn , (right).
Figure 1.4 Reconstruction of the signal x (t) using a finite Fourier series with: (a) the coefficients Xn within
−1 ≤ n ≤ 1, (b) the coefficients Xn within −2 ≤ n ≤ 2, (c) the coefficients Xn within −6 ≤ n ≤ 6, and (d) the
coefficients Xn within −30 ≤ n ≤ 30.
1.3.1 Fourier Series of Real-Valued Signals

For a real-valued signal x(t), the Fourier series coefficients can be written in the form
$$X_n = \frac{1}{T}\int_{-T/2}^{T/2}x(t)\cos\Big(\frac{2\pi nt}{T}\Big)dt - j\frac{1}{T}\int_{-T/2}^{T/2}x(t)\sin\Big(\frac{2\pi nt}{T}\Big)dt = \frac{A_n - jB_n}{2}, \tag{1.28}$$
where An/2 and −Bn/2 are the real and imaginary parts of Xn. Since X*n = X₋n holds for real-valued signals, the values of An and Bn are equal to
$$A_n = X_n + X_{-n} = \frac{2}{T}\int_{-T/2}^{T/2}x(t)\cos\Big(\frac{2\pi nt}{T}\Big)dt,$$
$$B_n = \frac{X_n - X_{-n}}{-j} = \frac{2}{T}\int_{-T/2}^{T/2}x(t)\sin\Big(\frac{2\pi nt}{T}\Big)dt. \tag{1.29}$$
The signal x(t) can now be written as
$$x(t) = \sum_{n=-\infty}^{-1}X_ne^{j2\pi nt/T} + X_0 + \sum_{n=1}^{\infty}X_ne^{j2\pi nt/T}$$
$$= X_0 + \sum_{n=1}^{\infty}\Big(X_ne^{j2\pi nt/T} + X_{-n}e^{-j2\pi nt/T}\Big)$$
$$= X_0 + \sum_{n=1}^{\infty}\Big[(X_n + X_{-n})\cos\Big(\frac{2\pi nt}{T}\Big) + j(X_n - X_{-n})\sin\Big(\frac{2\pi nt}{T}\Big)\Big]$$
$$= \frac{A_0}{2} + \sum_{n=1}^{\infty}A_n\cos\Big(\frac{2\pi nt}{T}\Big) + \sum_{n=1}^{\infty}B_n\sin\Big(\frac{2\pi nt}{T}\Big), \tag{1.30}$$
with $|X_n| = \sqrt{A_n^2 + B_n^2}/2$. For real-valued signals, the integrals in (1.29), corresponding to An and Bn, are even and odd functions of n, respectively. Therefore, it is possible to calculate
$$H_n = \frac{1}{T}\int_{-T/2}^{T/2}x(t)\Big[\cos\Big(\frac{2\pi nt}{T}\Big) + \sin\Big(\frac{2\pi nt}{T}\Big)\Big]dt = \frac{1}{T}\int_{-T/2}^{T/2}x(t)\,\mathrm{cas}\Big(\frac{2\pi nt}{T}\Big)dt \tag{1.31}$$
and to get
$$A_n = H_n + H_{-n}, \qquad B_n = H_n - H_{-n}.$$
The coefficients calculated by (1.31) are the Hartley series coefficients. For a real-valued and even signal, x(t) = x(−t), the Hartley series reduces to the Fourier cosine series with the coefficients
$$C_n = X_n = \frac{A_n}{2} = \frac{1}{T}\int_{-T/2}^{T/2}x(t)\cos\Big(\frac{2\pi nt}{T}\Big)dt = \frac{2}{T}\int_{0}^{T/2}x(t)\cos\Big(\frac{2\pi nt}{T}\Big)dt.$$
A similar expression is obtained for an odd and real-valued signal x(t), when the Fourier series reduces to the Fourier sine series.
Example 1.8. Consider the Fourier series based reconstruction of the signal
$$x(t) = t[u(t) - u(t-1/2)],$$
whose duration (nonzero values) is limited to 0 ≤ t < 1/2. For the Fourier series expansion, a periodic extension of the signal must be formed. The rate of convergence of the Fourier series coefficients depends on how the periodic extension of this signal is formed.
(a) Calculate the Fourier series of the original signal extended periodically with period T = 1/2,
$$x_p(t) = \sum_{n=-\infty}^{\infty}x\Big(t+\frac{n}{2}\Big).$$
(b) Calculate the Fourier series of the signal extended with a zero interval to the period T = 1.
(c) Calculate the Fourier series of the signal extended first with its time-reversed version,
$$x_c(t) = x(t) + x(1-t),$$
and then extended periodically with the period T = 1. Find the Fourier series coefficients and the reconstruction formula.
(d) Comment on the coefficients' convergence in all cases.
⋆ (a) In the first case, the reconstruction with 2M + 1 coefficients is
$$x_M(t) = \frac{1}{4} + \sum_{n=1}^{M}\Big(-\frac{1}{4j\pi n}e^{j4\pi nt} + \frac{1}{4j\pi n}e^{-j4\pi nt}\Big) = \frac{1}{4} - \sum_{n=1}^{M}\frac{\sin(4\pi nt)}{2\pi n}.$$
(b) In the second case, the coefficients are
$$X_n = \int_0^{1/2}te^{-j2\pi nt}\,dt = \left.\frac{te^{-j2\pi nt}}{-j2\pi n}\right|_0^{1/2} + \left.\frac{e^{-j2\pi nt}}{(2\pi n)^2}\right|_0^{1/2} = \frac{(-1)^n}{-j4\pi n} + \frac{(-1)^n - 1}{(2\pi n)^2},$$
with X₀ = 1/8. Note that the relation between the Fourier coefficients in (a) and (b) is $2X_{2n}^{(b)} = X_n^{(a)}$. The reconstruction is given in Fig. 1.6.
Figure 1.5 Reconstruction of the signal x (t) using the Fourier series. Reconstructed signal is denoted by x M (t),
where M indicates the number of coefficients used in reconstruction.
Figure 1.6 Reconstruction of the periodic signal x (t), with a zero interval extension before the Fourier series is
used.
(c) For the signal xc(t), extended with its reversed version, it follows that
$$X_n = C_n = \int_0^{1/2}te^{-j2\pi nt}\,dt + \int_{1/2}^{1}(1-t)e^{-j2\pi nt}\,dt = 2\int_0^{1/2}t\cos(2\pi nt)\,dt = \frac{(-1)^n - 1}{2\pi^2n^2}$$
and the reconstruction with 2M + 1 coefficients is
$$x_M(t) = \frac{1}{4} + 2\sum_{n=1}^{M}\frac{(-1)^n - 1}{2\pi^2n^2}\cos(2\pi nt) = \frac{1}{4} - \frac{2}{\pi^2}\sum_{n=1}^{M}\frac{\cos\big(2\pi(2n-1)t\big)}{(2n-1)^2}.$$
Figure 1.7 Reconstruction of a periodic signal after an even extension before using the Fourier series (cosine
Fourier series).
(d) The coefficients' convergence in cases (a) and (b) is of order 1/n, while the convergence in the last case (c) is of order 1/n². The best signal reconstruction with a given number of coefficients is achieved in case (c). Also, for a given reconstruction error, the smallest number of reconstruction terms M is required in case (c). This kind of signal extension (even signal extension) will later be used as the basis for the definition of the so-called cosine signal transforms. From these periodic extensions, we can also conclude that an extension that avoids signal discontinuities at the interval-ending instants improves the series convergence.
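The different convergence rates discussed in (d) are easy to observe numerically by computing the coefficient magnitudes of the two periodic extensions; a minimal Python sketch:

```python
import numpy as np

# Coefficient decay for Example 1.8: case (a), the sawtooth extension with
# T = 1/2, decays as 1/n; case (c), the even (triangle) extension with T = 1,
# decays as 1/n^2.
def fs_coeff(x, T, n, N=200000):
    t = np.linspace(0, T, N, endpoint=False)
    return np.mean(x(t) * np.exp(-1j * 2 * np.pi * n * t / T))

xa = lambda t: t % 0.5                           # case (a): sawtooth
xc = lambda t: np.minimum(t % 1, 1 - t % 1)      # case (c): triangle

for n in (1, 5, 25):
    print(n, abs(fs_coeff(xa, 0.5, n)), abs(fs_coeff(xc, 1.0, n)))
```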
Example 1.9. Show that the Fourier series coefficients Xn of a periodic signal x(t) can be obtained by minimizing the mean squared error between the signal and $\sum_{n=-N}^{N}X_ne^{j2\pi nt/T}$ within the period T.
⋆ The reconstruction error is
$$e(t) = x(t) - \sum_{n=-N}^{N}X_ne^{j2\pi nt/T},$$
and the mean squared error within the period is
$$I = \frac{1}{T}\int_{-T/2}^{T/2}\left|x(t) - \sum_{n=-N}^{N}X_ne^{j2\pi nt/T}\right|^2dt.$$
From ∂I/∂X*m = 0 it follows that
$$\frac{1}{T}\int_{-T/2}^{T/2}e^{-j2\pi mt/T}\left(x(t) - \sum_{n=-N}^{N}X_ne^{j2\pi nt/T}\right)dt = 0,$$
that is,
$$X_m = \frac{1}{T}\int_{-T/2}^{T/2}x(t)e^{-j2\pi mt/T}\,dt. \tag{1.33}$$
Note: The derivatives of a complex function F(z) = u(x, y) + jv(x, y) of z = x + jy, where u(x, y) and v(x, y) are real-valued functions, are defined by
$$\frac{\partial F(z)}{\partial z} = \Big(\frac{\partial}{\partial x} - j\frac{\partial}{\partial y}\Big)F(x,y), \qquad \frac{\partial F(z)}{\partial z^*} = \Big(\frac{\partial}{\partial x} + j\frac{\partial}{\partial y}\Big)F(x,y).$$
Commonly, a half of these values is used in the definition.
In order to justify the complex derivative ∂I/∂X*m in (1.33), let us denote: (i) the complex-valued variable Xm by z = x + jy, (ii) all terms in $x(t) - \sum_{n=-N}^{N}X_ne^{j2\pi nt/T} = f(z)$ which do not depend on z = Xm = x + jy by a + jb, and (iii) the value of −e^{j2πmt/T} by e^{jα}. Now we have to show that
$$\frac{\partial F(z)}{\partial z^*} = \frac{\partial|f(z)|^2}{\partial z^*} = 2e^{-j\alpha}f(z).$$
In our case,
$$|f(z)|^2 = \left|a + jb + e^{j\alpha}(x+jy)\right|^2 = (a + x\cos\alpha - y\sin\alpha)^2 + (b + x\sin\alpha + y\cos\alpha)^2.$$
For the minimization of the real-valued function |f(z)|² of two variables x and y, we need the partial derivatives
$$\frac{\partial|f(z)|^2}{\partial x} = 2\cos\alpha\,(a + x\cos\alpha - y\sin\alpha) + 2\sin\alpha\,(b + x\sin\alpha + y\cos\alpha) = 2\,\mathrm{Re}\{e^{-j\alpha}f(z)\} \tag{1.34}$$
and
$$\frac{\partial|f(z)|^2}{\partial y} = 2\,\mathrm{Im}\{e^{-j\alpha}f(z)\}. \tag{1.35}$$
Therefore, all calculations with the two real-valued equations (1.34) and (1.35) are the same as with the one complex-valued relation
$$\frac{\partial|f(z)|^2}{\partial x} + j\frac{\partial|f(z)|^2}{\partial y} = \Big(\frac{\partial}{\partial x} + j\frac{\partial}{\partial y}\Big)|f(z)|^2 = \Big(\frac{\partial}{\partial x} + j\frac{\partial}{\partial y}\Big)F(z) = \frac{\partial F(z)}{\partial z^*}.$$
1.4 Fourier Transform

The Fourier series has been introduced and presented for periodic signals with a period T. Assume now that the signal is of limited duration and that the period for its expansion is extended toward infinity, while the signal itself is not changed. This case corresponds to the analysis of an aperiodic signal x(t). The limit of the Fourier series coefficients, normalized by the period,
$$\lim_{T\to\infty}X_nT = \lim_{T\to\infty}\int_{-T/2}^{T/2}x(t)e^{-j2\pi nt/T}\,dt = \int_{-\infty}^{\infty}x(t)e^{-j\Omega t}\,dt, \tag{1.36}$$
that is, the transform
$$X(\Omega) = \int_{-\infty}^{\infty}x(t)e^{-j\Omega t}\,dt, \tag{1.37}$$
is called the Fourier transform (FT) of a signal x(t). For the existence of the Fourier transform, it is sufficient that the signal is absolutely integrable, that is,
$$\int_{-\infty}^{\infty}|x(t)|\,dt < \infty. \tag{1.38}$$
There are some signals that do not satisfy this condition, such as the unit-step signal, whose
Fourier transform exists in the form of generalized functions.
The inverse Fourier transform (IFT) can be obtained by multiplying both sides of (1.37)
by e jΩτ and integrating over Ω,
$$\int_{-\infty}^{\infty}X(\Omega)e^{j\Omega\tau}\,d\Omega = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x(t)e^{j\Omega(\tau-t)}\,dt\,d\Omega.$$
Condition (1.38) is, for example, satisfied for the signal x(t) = Ae^{−at}u(t), a > 0, since
$$\int_{-\infty}^{\infty}|x(t)|\,dt = A\int_0^{\infty}e^{-at}\,dt = A\left.\frac{e^{-at}}{-a}\right|_0^{\infty} = \frac{A}{a} < \infty.$$
Example 1.11. Find the Fourier transform of the signal x(t) = sign(t).
⋆ Since a direct calculation of the Fourier transform for this signal is not possible, let us consider the signal
$$x_a(t) = \begin{cases}e^{-at}, & t > 0\\ 0, & t = 0\\ -e^{at}, & t < 0,\end{cases}$$
where a > 0 is a real-valued constant. It is obvious that the signal x(t) can be obtained as the limit
$$\lim_{a\to 0}x_a(t) = x(t).$$
The Fourier transform of x(t) can be calculated from X(Ω) = lim_{a→0} Xa(Ω), where
$$X_a(\Omega) = -\int_{-\infty}^{0}e^{at}e^{-j\Omega t}\,dt + \int_0^{\infty}e^{-at}e^{-j\Omega t}\,dt = \frac{-j2\Omega}{a^2+\Omega^2}. \tag{1.41}$$
It results in
$$X(\Omega) = \frac{2}{j\Omega}. \tag{1.42}$$
Based on the definitions of the Fourier transform and the inverse Fourier transform, it is easy to conclude that the duality property holds: if X(Ω) is the Fourier transform of x(t), then the Fourier transform of X(t) is 2πx(−Ω).
Example 1.12. Find the Fourier transform of the signals δ(t), x(t) = 1, and u(t).
⋆ Directly from the definition, FT{δ(t)} = 1 and, by duality, FT{1} = 2πδ(Ω). Since u(t) = (1 + sign(t))/2, combining this with Example 1.11 gives FT{u(t)} = πδ(Ω) + 1/(jΩ).
1.4.1 Fourier Transform and Linear Time-Invariant Systems

Consider a linear time-invariant system with an impulse response h(t) and the input signal x(t) = Ae^{j(Ω₀t+φ)}. The output signal is
$$y(t) = x(t)*_t h(t) = \int_{-\infty}^{\infty}Ae^{j(\Omega_0(t-\tau)+\varphi)}h(\tau)\,d\tau = Ae^{j(\Omega_0t+\varphi)}\int_{-\infty}^{\infty}h(\tau)e^{-j\Omega_0\tau}\,d\tau = H(\Omega_0)x(t), \tag{1.47}$$
where
$$H(\Omega) = \int_{-\infty}^{\infty}h(t)e^{-j\Omega t}\,dt \tag{1.48}$$
is the Fourier transform of h(t). The linear time-invariant system does not change the form of an input complex harmonic signal x(t) = Ae^{j(Ω₀t+φ)}. It remains a complex harmonic signal after passing through the linear time-invariant system, with the same frequency Ω₀. The amplitude of the input signal x(t) is scaled by |H(Ω₀)|, and the phase is changed by arg{H(Ω₀)}.
1.4.2 Properties of the Fourier Transform

1. Linearity: The Fourier transform of a linear combination of signals is
$$\mathrm{FT}\{a_1x_1(t) + a_2x_2(t)\} = a_1X_1(\Omega) + a_2X_2(\Omega), \tag{1.49}$$
where X1(Ω) and X2(Ω) are the Fourier transforms of the signals x1(t) and x2(t), respectively.
2. Realness: The Fourier transform of a signal is real-valued (that is, X*(Ω) = X(Ω)) if x*(−t) = x(t), since
$$X^*(\Omega) = \int_{-\infty}^{\infty}x^*(t)e^{j\Omega t}\,dt \overset{t\to -t}{=} \int_{-\infty}^{\infty}x^*(-t)e^{-j\Omega t}\,dt = X(\Omega) \tag{1.50}$$
if x*(−t) = x(t).
3. Modulation: If the signal x(t) is modulated by e^{jΩ₀t}, the Fourier transform of the modulated signal is shifted in frequency, that is,
$$\mathrm{FT}\{x(t)e^{j\Omega_0t}\} = \int_{-\infty}^{\infty}x(t)e^{j\Omega_0t}e^{-j\Omega t}\,dt = X(\Omega-\Omega_0), \tag{1.51}$$
and consequently
$$\mathrm{FT}\{2x(t)\cos(\Omega_0t)\} = X(\Omega-\Omega_0) + X(\Omega+\Omega_0).$$
4. Shift in time: The Fourier transform of the signal x(t) shifted in time by t₀ is modulated in the frequency domain,
$$\mathrm{FT}\{x(t-t_0)\} = \int_{-\infty}^{\infty}x(t-t_0)e^{-j\Omega t}\,dt = X(\Omega)e^{-jt_0\Omega}. \tag{1.52}$$
5. Time-scaling: For a signal scaled in time by a factor a, the Fourier transform is given by
$$\mathrm{FT}\{x(at)\} = \int_{-\infty}^{\infty}x(at)e^{-j\Omega t}\,dt = \frac{1}{|a|}X\Big(\frac{\Omega}{a}\Big). \tag{1.53}$$
6. Convolution: The Fourier transform of the convolution of signals x(t) and h(t) is equal to the product of their corresponding Fourier transforms, that is,
$$\mathrm{FT}\{x(t)*_t h(t)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x(\tau)h(t-\tau)e^{-j\Omega t}\,d\tau\,dt \tag{1.54}$$
$$\overset{t-\tau\to u}{=} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x(\tau)h(u)e^{-j\Omega(\tau+u)}\,d\tau\,du = X(\Omega)H(\Omega).$$
Parseval's theorem: For signals x(t) and y(t), whose Fourier transforms are X(Ω) and Y(Ω),
$$\int_{-\infty}^{\infty}x(t)y^*(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}X(\Omega)Y^*(\Omega)\,d\Omega \tag{1.56}$$
holds, with the special case y(t) = x(t) giving the energy relation
$$\int_{-\infty}^{\infty}|x(t)|^2\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty}|X(\Omega)|^2\,d\Omega.$$
Integration in time: The integral of a signal,
$$\int_{-\infty}^{t}x(\tau)\,d\tau,$$
can be considered as the convolution of the signals x(t) and u(t),
$$x(t)*_t u(t) = \int_{-\infty}^{\infty}x(\tau)u(t-\tau)\,d\tau = \int_{-\infty}^{t}x(\tau)\,d\tau.$$
Then, the Fourier transform of the signal integral is obtained as
$$\mathrm{FT}\left\{\int_{-\infty}^{t}x(\tau)\,d\tau\right\} = \mathrm{FT}\{x(t)\}\,\mathrm{FT}\{u(t)\} = \Big(\frac{1}{j\Omega}+\pi\delta(\Omega)\Big)X(\Omega) = \frac{1}{j\Omega}X(\Omega) + \pi X(0)\delta(\Omega). \tag{1.58}$$
If the mean value of the signal x(t) is zero, that is, X(0) = 0, a multiplication by 1/(jΩ) in the Fourier transform domain corresponds to the signal integration in the time domain.
11. Analytic part: The analytic part of a signal x(t), whose Fourier transform is X(Ω), is a signal with the Fourier transform defined by
$$X_a(\Omega) = \begin{cases}2X(\Omega), & \Omega > 0\\ X(0), & \Omega = 0\\ 0, & \Omega < 0.\end{cases} \tag{1.59}$$
It can be written as
$$X_a(\Omega) = X(\Omega)\big(1+\mathrm{sign}(\Omega)\big) = X(\Omega) + jX_h(\Omega), \tag{1.60}$$
where Xh(Ω) = −j sign(Ω)X(Ω) is the Fourier transform of the Hilbert transform of the signal x(t). From Example 1.11, with the signal x(t) = sign(t), and the duality property of the Fourier transform pair, the inverse Fourier transform of sign(Ω) is obviously j/(πt). Therefore, the analytic part of the signal x(t), in the time domain, takes the form
$$x_a(t) = x(t) + jx_h(t) = x(t) + x(t)*_t\frac{j}{\pi t} = x(t) + j\,\frac{1}{\pi}\,\mathrm{p.v.}\!\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau, \tag{1.61}$$
where p.v. stands for the Cauchy principal value of the considered integral.
1.4.3 Relationship Between the Fourier Series and the Fourier Transform
Consider an aperiodic signal x (t), with the Fourier transform X (Ω). Assume that the signal
is of limited duration (that is, x (t) = 0 for |t| > T0 /2). Then,
$$X(\Omega) = \int_{-T_0/2}^{T_0/2}x(t)e^{-j\Omega t}\,dt. \tag{1.62}$$
If we make a periodic extension of x (t), with the period T, we get the periodic signal
$$x_p(t) = \sum_{n=-\infty}^{\infty}x(t+nT).$$
This periodic signal x p (t) can be expanded into Fourier series with the coefficients
$$X_n = \frac{1}{T}\int_{-T/2}^{T/2}x_p(t)e^{-j2\pi nt/T}\,dt. \tag{1.63}$$
Since xp(t) = x(t) within |t| < T/2 for T > T₀,
$$\int_{-T/2}^{T/2}x_p(t)e^{-j2\pi nt/T}\,dt = \left.\int_{-T_0/2}^{T_0/2}x(t)e^{-j\Omega t}\,dt\right|_{\Omega=2\pi n/T}$$
or
$$X_n = \left.\frac{1}{T}X(\Omega)\right|_{\Omega=2\pi n/T}. \tag{1.64}$$
It means that the Fourier series coefficients are equal to the samples of the Fourier transform,
divided by T. The only condition in the derivation of this relation is that the signal duration
is shorter than the period of its periodic extension (that is, T > T₀). The sampling interval in frequency is
$$\Delta\Omega = \frac{2\pi}{T}, \qquad \Delta\Omega < \frac{2\pi}{T_0}.$$
This sampling interval should be smaller than 2π/T₀, where T₀ is the duration of the signal x(t). This is a form of the sampling theorem in the frequency domain. It states that the values of X(Ω) can be recovered for any Ω from the samples X(2πn/T) = XnT, if T > T₀. The sampling theorem in the time domain will be discussed later.
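Relation (1.64) can be checked numerically on Example 1.7, where the Fourier transform of the boxcar x0(t) is X(Ω) = 2 sin(Ω/4)/Ω; a minimal Python sketch:

```python
import numpy as np

# Check of (1.64) for Example 1.7: the Fourier series coefficients of the
# periodic extension (T = 2) of x0(t) = u(t+1/4) - u(t-1/4) equal the
# samples X(2*pi*n/T)/T of X(Omega) = 2*sin(Omega/4)/Omega.
T = 2.0
for n in range(1, 6):
    Omega = 2 * np.pi * n / T
    Xn_from_FT = (2 * np.sin(Omega / 4) / Omega) / T
    Xn_direct = np.sin(np.pi * n / 4) / (np.pi * n)   # coefficients (1.27)
    print(n, round(Xn_from_FT, 6), round(Xn_direct, 6))
```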
In order to write the Fourier series coefficients in the Fourier transform form, note that
the periodic signal x p (t), formed by a periodic extension of x (t) with period T, can be
written as
$$x_p(t) = \sum_{n=-\infty}^{\infty}x(t+nT) = x(t)*_t\sum_{n=-\infty}^{\infty}\delta(t+nT). \tag{1.65}$$
The Fourier transform of this periodic signal is
$$X_p(\Omega) = \mathrm{FT}\left\{x(t)*_t\sum_{n=-\infty}^{\infty}\delta(t+nT)\right\} \tag{1.66}$$
$$= X(\Omega)\cdot\frac{2\pi}{T}\sum_{n=-\infty}^{\infty}\delta\Big(\Omega-\frac{2\pi}{T}n\Big) = \frac{2\pi}{T}\sum_{n=-\infty}^{\infty}X\Big(\frac{2\pi}{T}n\Big)\delta\Big(\Omega-\frac{2\pi}{T}n\Big),$$
since
$$\mathrm{FT}\left\{\sum_{n=-\infty}^{\infty}\delta(t+nT)\right\} = \int_{-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\delta(t+nT)e^{-j\Omega t}\,dt = \sum_{n=-\infty}^{\infty}e^{j\Omega nT} = \frac{2\pi}{T}\sum_{n=-\infty}^{\infty}\delta\Big(\Omega-\frac{2\pi}{T}n\Big). \tag{1.67}$$
1.5 Fourier Transform and the Stationary Phase Method

When a signal
$$x(t) = A(t)e^{j\phi(t)} \tag{1.68}$$
is not of a simple analytic form, it may be possible, in some cases, to obtain an approximate expression for its Fourier transform using the method of stationary phase.
The method of stationary phase states that if the phase function φ(t) is monotonous and the amplitude A(t) is a sufficiently smooth function, then
$$\int_{-\infty}^{\infty}A(t)e^{j\phi(t)}e^{-j\Omega t}\,dt \simeq A(t_0)e^{j\phi(t_0)}e^{-j\Omega t_0}\sqrt{\frac{2\pi j}{|\phi''(t_0)|}}, \tag{1.69}$$
where t₀ is the stationary phase instant. The derivative of the phase,
$$\Omega_i(t) = \phi'(t),$$
is called the instantaneous frequency of a signal. Around the stationary phase instant t₀, the following relation holds:
$$\left.\frac{d\big(\phi(t)-\Omega t\big)}{dt}\right|_{t=t_0} = 0, \qquad \text{that is,}\qquad \phi'(t_0)-\Omega = 0.$$
In the vicinity of the stationary phase instant t₀, the phase can be expanded into a Taylor series,
$$\phi(t)-\Omega t = [\phi(t_0)-\Omega t_0] + [\phi'(t_0)-\Omega](t-t_0) + \frac{1}{2}\phi''(t_0)(t-t_0)^2 + \dots$$
Since φ′(t₀) − Ω = 0, the integral in (1.69) can be written in the form
$$\int_{-\infty}^{\infty}A(t)e^{j(\phi(t)-\Omega t)}\,dt \cong A(t_0)e^{j(\phi(t_0)-\Omega t_0)}\int_{-\infty}^{\infty}e^{j\frac{1}{2}\phi''(t_0)(t-t_0)^2}\,dt,$$
where A(t) ≅ A(t₀) is also used.
With
$$\int_{-\infty}^{\infty}e^{j\frac{1}{2}at^2}\,dt = \sqrt{\frac{2\pi j}{|a|}},$$
the approximation (1.69) follows.

Example 1.13. Consider the signal
$$x(t) = e^{-(t-1)^2t^2/2}\,e^{j(4\pi t^2+10\pi t)}.$$
Find its Fourier transform approximation using the stationary phase method.
⋆ According to the stationary phase method, the instant of stationary phase follows from φ′(t₀) − Ω = 0, that is,
$$8\pi t_0 + 10\pi = \Omega, \qquad t_0 = \frac{\Omega-10\pi}{8\pi},$$
and
$$\phi''(t_0) = 8\pi. \tag{1.70}$$
The amplitude of X(Ω) is
$$|X(\Omega)| \simeq A(t_0)\sqrt{\frac{2\pi}{\phi''(t_0)}} = \sqrt{\frac{2\pi}{8\pi}}\exp\Big(-\frac{(t_0-1)^2t_0^2}{2}\Big)$$
$$= \frac{1}{2}\exp\left(-\frac{1}{2}\Big(\frac{\Omega-10\pi}{8\pi}-1\Big)^2\Big(\frac{\Omega-10\pi}{8\pi}\Big)^2\right). \tag{1.71}$$
The signal, the stationary phase approximation of the Fourier transform amplitude, and the numerically computed Fourier transform amplitude are shown in Fig. 1.8.
Example 1.14. Consider the signal x(t) = A(t)e^{jat^{2N}}, where A(t) is a slowly varying non-negative function. Find its Fourier transform approximation using the stationary phase method.
⋆ According to the stationary phase method, the stationary phase point follows from
$$2Nat_0^{2N-1} = \Omega, \qquad t_0 = \Big(\frac{\Omega}{2Na}\Big)^{1/(2N-1)},$$
and
$$\phi''(t_0) = 2N(2N-1)a\Big(\frac{\Omega}{2Na}\Big)^{(2N-2)/(2N-1)}. \tag{1.72}$$
The amplitude and phase of X(Ω), according to (1.69), are
$$|X(\Omega)|^2 \simeq A^2(t_0)\frac{2\pi}{\phi''(t_0)} = A^2\!\left(\Big(\frac{\Omega}{2Na}\Big)^{1/(2N-1)}\right)\frac{2\pi}{(2N-1)\Omega}\Big(\frac{\Omega}{2aN}\Big)^{1/(2N-1)} \tag{1.73}$$
$$\arg\{X(\Omega)\} \simeq \phi(t_0)-\Omega t_0 + \pi/4 = \frac{1-2N}{2N}\Omega\Big(\frac{\Omega}{2aN}\Big)^{1/(2N-1)} + \pi/4.$$
Figure 1.8 The signal (top), along with the stationary phase method approximation of its Fourier transform and
the Fourier transform obtained by numeric calculation with a high precision (bottom).
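The approximation (1.71) can be compared against a direct numerical computation of the Fourier transform of the signal from Example 1.13; a minimal Python sketch:

```python
import numpy as np

# Stationary phase approximation versus numerical Fourier transform for
# x(t) = exp(-(t-1)^2 t^2 / 2) exp(j(4*pi*t^2 + 10*pi*t)), phi''(t) = 8*pi.
dt = 0.0005
t = np.arange(-8, 8, dt)
x = np.exp(-(t - 1)**2 * t**2 / 2) * np.exp(1j * (4*np.pi*t**2 + 10*np.pi*t))

Omega = np.linspace(-20, 80, 6)                 # test frequencies (rad/s)
X_num = np.array([np.sum(x * np.exp(-1j * w * t)) * dt for w in Omega])

t0 = (Omega - 10 * np.pi) / (8 * np.pi)         # stationary phase instants
X_sp = np.sqrt(2*np.pi / (8*np.pi)) * np.exp(-(t0 - 1)**2 * t0**2 / 2)
print(np.round(np.abs(X_num), 4))               # numeric |X(Omega)|
print(np.round(X_sp, 4))                        # approximation (1.71)
```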
The method of stationary phase may be defined in the frequency domain as well. For a Fourier transform
$$X(\Omega) = B(\Omega)e^{j\theta(\Omega)}, \tag{1.74}$$
the method of stationary phase states that if the Fourier transform phase θ(Ω) is monotonous and the amplitude B(Ω) is a sufficiently smooth function, then
$$x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}B(\Omega)e^{j\theta(\Omega)}e^{j\Omega t}\,d\Omega \simeq \frac{1}{2\pi}B(\Omega_0)e^{j\theta(\Omega_0)}e^{j\Omega_0t}\sqrt{\frac{2\pi j}{|\theta''(\Omega_0)|}}, \tag{1.75}$$
where Ω₀ is the stationary phase point, θ′(Ω₀) + t = 0.
Example 1.15. For a system with the frequency response H(Ω) = |H(Ω)|e^{−j(aΩ²+bΩ)}, find the impulse response using the stationary phase method.
⋆ The signal amplitude is delayed by b. The second-order parameter a in the phase scales the time axis of the impulse response. This is an undesirable effect in common systems.
Example 1.16. For a system with the frequency response H(Ω) = |H(Ω)|e^{j0}, the impulse response is h(t). Find the impulse responses of the systems whose transfer functions are:
(a) Ha(Ω) = |H(Ω)|e^{−j4Ω},
(b) Hb(Ω) = |H(Ω)|e^{−j2πΩ²}, and
(c) Hc(Ω) = |H(Ω)|[3/4 + (1/4)cos(2πΩ²)]e^{j0}.
1.6 Laplace Transform

The Fourier transform can be considered as a special case of the Laplace transform. In the beginning, Fourier's work was not even published as an original contribution, mainly due to this fact. The Laplace transform is defined by
$$X(s) = \mathcal{L}\{x(t)\} = \int_{-\infty}^{\infty}x(t)e^{-st}\,dt, \tag{1.76}$$
where s = σ + jΩ is a complex variable.
Example 1.17. Calculate the Laplace transform of x(t) = e^{at}u(t), for a real-valued constant a.
⋆ By definition,
$$X(s) = \int_0^{\infty}e^{at}e^{-st}\,dt = \left.\frac{e^{-(s-a)t}}{-(s-a)}\right|_0^{\infty} = \frac{1}{s-a},$$
if lim_{t→∞} e^{−(s−a)t} = 0 or σ − a > 0, that is, σ > a. The region of convergence of this Laplace transform, X(s), is the part of the complex s-plane where σ > a. The point s = a is the pole of the Laplace transform. The region of convergence cannot include any poles, and it is limited by a vertical line in the complex s-plane passing through the pole, as shown in Fig. 1.9.
The Laplace transform may be considered as the Fourier transform of the signal x(t) multiplied by exp(−σt), with a varying parameter σ, that is,
$$\mathrm{FT}\{x(t)e^{-\sigma t}\} = \int_{-\infty}^{\infty}x(t)e^{-\sigma t}e^{-j\Omega t}\,dt = \int_{-\infty}^{\infty}x(t)e^{-st}\,dt = X(s). \tag{1.77}$$
Figure 1.9 The region of convergence (ROC) of the Laplace transform of the signal x (t) = e at u(t) for a = −1.
In this way, we may calculate the Laplace transform of signals that are not absolutely integrable, that is, that do not satisfy the condition $\int_{-\infty}^{\infty}|x(t)|\,dt < \infty$ for the Fourier transform convergence. For some values of σ, the new signal x(t)e^{−σt} may be absolutely integrable, and the Laplace transform then exists.
In the previous example, the Fourier transform does not exist for a > 0, while for a = 0 it exists only in the sense of generalized functions. The Laplace transform of the considered signal always exists, with the region of convergence σ > a. If a < 0, then the region of convergence σ > a includes the line σ = 0, meaning that the Fourier transform exists as well.
Example 1.18. Find the Laplace transform of x(t) = −e^{at}u(−t) and the region of its convergence.
⋆ By definition,
$$X(s) = -\int_{-\infty}^{0}e^{at}e^{-st}\,dt = \frac{1}{s-a},$$
if lim_{t→−∞} e^{−(s−a)t} = 0 or σ − Re{a} < 0, that is, σ < Re{a}, where Re{a} is the real part of a. The Laplace transform X(s) in this example has the same form as X(s) in the previous example, but with a different region of convergence. The Fourier transform of x(t) = −e^{at}u(−t) exists if σ = 0 is within the region of convergence, that is, if Re{a} > 0.
The inverse Laplace transform is
$$x(t) = \frac{1}{2\pi j}\int_{\gamma-j\infty}^{\gamma+j\infty}X(s)e^{st}\,ds,$$
where the integration is performed along a vertical path, with γ within the region of convergence of X(s).
1.6.1 Properties of the Laplace Transform

Properties of the Laplace transform may easily be generalized from those presented for the Fourier transform in Section 1.4.2, such as the linearity property,
$$\mathcal{L}\{a_1x_1(t) + a_2x_2(t)\} = a_1X_1(s) + a_2X_2(s),$$
and the convolution property,
$$\mathcal{L}\{x(t)*_t h(t)\} = \mathcal{L}\{x(t)\}\,\mathcal{L}\{h(t)\} = X(s)H(s).$$
Since the Laplace transform will be used to analyze linear systems described by linear differential equations, we will pay special attention to the relation between the signal derivatives and the corresponding forms in the Laplace domain. In general, the Laplace transform of the first derivative, dx(t)/dt, of a signal x(t) is
$$\int_{-\infty}^{\infty}\frac{dx(t)}{dt}e^{-st}\,dt = \left.x(t)e^{-st}\right|_{-\infty}^{\infty} + s\int_{-\infty}^{\infty}x(t)e^{-st}\,dt = sX(s). \tag{1.78}$$
This relation follows from integration by parts, with the assumption that the values of x(t)e^{−st} tend to zero as t → ±∞.
Unilateral Laplace transform. In many applications, causal systems are assumed, with the corresponding causal signals used in calculations. In these cases, x(t) = 0 for t < 0, that is, x(t) = x(t)u(t). Then, the so-called one-sided Laplace transform (unilateral Laplace transform) is used. Its definition is
$$X(s) = \int_0^{\infty}x(t)e^{-st}\,dt.$$
The region of convergence for the unilateral Laplace transform is the right-sided part of the s
plane. This topic is discussed in Section 1.6.3.
When dealing with the derivatives of causal signals, we have to take care about a possible discontinuity at t = 0. In general, for the first derivative of the function x(t)u(t),
$$\int_0^{\infty}\frac{dx(t)}{dt}e^{-st}\,dt = \left.x(t)e^{-st}\right|_0^{\infty} + s\int_0^{\infty}x(t)e^{-st}\,dt = sX(s) - x(0). \tag{1.79}$$
The previous relation can easily be generalized to the higher-order derivatives of the signal x(t) and the corresponding Laplace transforms,
$$\int_0^{\infty}\frac{d^nx(t)}{dt^n}e^{-st}\,dt = s^n\int_0^{\infty}x(t)e^{-st}\,dt - s^{n-1}x(0) - s^{n-2}x'(0) - \dots - x^{(n-1)}(0)$$
$$= s^nX(s) - \sum_{m=1}^{n}s^{n-m}x^{(m-1)}(0).$$
The unilateral Laplace transform of an integral with a variable upper limit of the signal x(t) is
$$\mathcal{L}\left\{\int_0^tx(\tau)\,d\tau\right\} = \mathcal{L}\{u(t)*_t x(t)\} = \mathcal{L}\{u(t)\}\,\mathcal{L}\{x(t)\} = \frac{1}{s}X(s),$$
since
$$\mathcal{L}\{u(t)\} = \int_0^{\infty}e^{-st}\,dt = \frac{1}{s}.$$
The signal that corresponds to the derivative of the Laplace transform is obtained from
$$\frac{dX(s)}{ds} = \frac{d}{ds}\int_0^{\infty}x(t)e^{-st}\,dt = \int_0^{\infty}(-t)x(t)e^{-st}\,dt.$$
Example 1.20. Find the Laplace transform of the signal x(t) = e^{jΩ₀t}u(t).
⋆ By definition,
$$X(s) = \int_0^{\infty}e^{j\Omega_0t}e^{-st}\,dt = \frac{1}{s-j\Omega_0},$$
for σ > 0. The Laplace transforms of cos(Ω₀t)u(t) and sin(Ω₀t)u(t) follow from the last relation as
$$\mathcal{L}\{\cos(\Omega_0t)u(t)\} = \frac{1}{2}\mathcal{L}\{e^{j\Omega_0t}u(t)\} + \frac{1}{2}\mathcal{L}\{e^{-j\Omega_0t}u(t)\} = \frac{s}{s^2+\Omega_0^2},$$
$$\mathcal{L}\{\sin(\Omega_0t)u(t)\} = \frac{1}{2j}\mathcal{L}\{e^{j\Omega_0t}u(t)\} - \frac{1}{2j}\mathcal{L}\{e^{-j\Omega_0t}u(t)\} = \frac{\Omega_0}{s^2+\Omega_0^2}.$$
The initial value theorem and the final value theorem for the signal x(t) are
$$x(0^+) = \lim_{s\to\infty}sX(s) \qquad\text{and}\qquad \lim_{t\to\infty}x(t) = \lim_{s\to 0}sX(s),$$
respectively. Both of them follow from (1.79). The requirement is that the Laplace transforms of the signal x(t) and its derivative dx(t)/dt exist. The final value of the signal does not exist if the poles of sX(s) are: (a) on the right side of the s-plane, (b) a pair of conjugate-complex poles on the imaginary axis, or (c) at the origin. The initial value theorem requires that the signal does not contain delta pulses at the origin.
1.6.3 Linear Systems Described by Differential Equations

After we have established the relation between the Laplace transform and the signal derivatives, we may use it to analyze the systems described by differential equations. Consider a causal system described by the differential equation
$$a_N\frac{d^Ny(t)}{dt^N} + \dots + a_1\frac{dy(t)}{dt} + a_0y(t) = b_M\frac{d^Mx(t)}{dt^M} + \dots + b_1\frac{dx(t)}{dt} + b_0x(t), \tag{1.80}$$
with zero initial conditions. The Laplace transform of both sides of this differential equation gives the transfer function
$$H(s) = \frac{Y(s)}{X(s)} = \frac{b_Ms^M + \dots + b_1s + b_0}{a_Ns^N + \dots + a_1s + a_0}. \tag{1.81}$$
Stability and causality. A linear time-invariant system is stable if its impulse response h(t) satisfies the condition $\int_{-\infty}^{\infty}|h(t)|\,dt < \infty$. Within the Laplace transform framework, this condition means that the line σ = 0 in the complex s-plane belongs to the region of convergence of the transfer function H(s).
A system whose impulse response is of the form
$$h(t) = \sum_{n=1}^{N}A_ne^{a_nt}u(t)$$
is causal. Although this is not the most general form of a causal system, it is important in system analysis. The transfer function of this system is given by
$$H(s) = \sum_{n=1}^{N}\frac{A_n}{s-a_n}, \tag{1.82}$$
with the region of convergence defined by the set of inequalities σ > Re{a₁}, σ > Re{a₂}, ..., σ > Re{a_N}. These inequalities can be written in the compact form σ > max_n Re{a_n}. The region of convergence of the causal system in (1.82) is thus the right side of the vertical line σ = max_n Re{a_n}, passing through the pole with the largest real part.
The system defined by (1.81) can be written in the form given by (1.82) if the polynomial order of the denominator is higher than the numerator polynomial order, that is, if N > M. This system is causal and stable if all poles a_n reside in the left side of the complex s-plane, that is, if Re{a_n} < 0 for all n = 1, 2, ..., N, and the region of convergence is defined by
σ > maxn Re{ an }, as illustrated in Fig. 1.10. Possible higher-order (multiple) poles in H (s)
would not change the conclusion about its causality.
Figure 1.10 Poles of a stable and causal system, with its region of convergence, σ > maxn Re{ an }, that includes
the line σ = 0.
Example 1.21. A causal system with a proportional regulator is described with the transfer function,
K
H (s) = ,
s2 + 4s + K
where K is the constant of the regulator. Find the system response to the input signal x (t) = u(t),
for K = 4, K = 3 < 4, and K = 20 > 4.
⋆ The Laplace transform of the input signal is X (s) = L{u(t)} = 1/s. The poles of the transfer
function H (s) are obtained from s2 + 4s + K = 0 as
$$s_{1,2} = -2 \pm \sqrt{4-K}.$$
For K = 3, the poles are s₁ = −1 and s₂ = −3, and the output transform is
$$Y(s) = \frac{3}{s(s+1)(s+3)} = \frac{A}{s} + \frac{B}{s+1} + \frac{C}{s+3} = \frac{1}{s} - \frac{3/2}{s+1} + \frac{1/2}{s+3},$$
and
$$y(t) = \Big(1 - \frac{3}{2}e^{-t} + \frac{1}{2}e^{-3t}\Big)u(t).$$
In this case, the slowest converging term is of the form e^{−t}. The convergence of this term toward the steady state is slower than in the case with K = 4 (overdamped response).
Finally, for K = 20, the output in the Laplace domain and in the time domain is of the form
$$Y(s) = \frac{20}{s(s+2+j4)(s+2-j4)} = \frac{1}{s} - \frac{(2+j)/4}{s+2+j4} - \frac{(2-j)/4}{s+2-j4},$$
and
$$y(t) = \Big(1 - \frac{2+j}{4}e^{-2t(1+2j)} - \frac{2-j}{4}e^{-2t(1-2j)}\Big)u(t) = \Big(1 - e^{-2t}\cos(4t) - \frac{1}{2}e^{-2t}\sin(4t)\Big)u(t).$$
The convergence toward the steady state is governed by the function e^{−2t}, with the oscillatory term sin(4t) (underdamped response that overshoots its final value).
The responses of the system for the three considered cases are shown in Fig. 1.11.
Figure 1.11 The responses of the system for the three considered cases: K = 4 (critically damped), K = 3 (overdamped), and K = 20 (underdamped).
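The three step responses are easily reproduced with scipy.signal; a minimal sketch:

```python
import numpy as np
from scipy import signal

# Step responses of H(s) = K / (s^2 + 4*s + K) from Example 1.21 for the
# critically damped, overdamped, and underdamped cases.
t = np.linspace(0, 5, 1000)
for K in (4, 3, 20):
    sys = signal.TransferFunction([K], [1, 4, K])
    _, y = signal.step(sys, T=t)
    print("K =", K, " y(5) =", round(float(y[-1]), 3))   # all settle near 1
```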
Example 1.22. A system has the transfer function
$$H(s) = \frac{1}{s+2} + \frac{1}{s+1} + \frac{1}{s-1}.$$
Find its impulse response such that the system is stable.
⋆ The poles of this transfer function are s₁ = −2, s₂ = −1, and s₃ = 1. The system is stable if the line σ = 0 belongs to the region of convergence; here, this region of convergence is defined by −1 < σ < 1. For this region of convergence, the impulse response is
$$h(t) = \big(e^{-2t} + e^{-t}\big)u(t) - e^{t}u(-t).$$
Solution of differential equations using the Laplace transform. The output of a linear time-invariant system described by (1.80) can be found by solving the corresponding differential equation. The Laplace transform approach to solving differential equations is of crucial importance in engineering. In general, if the initial conditions are included in (1.80), the corresponding Laplace-domain equation is
$$a_Ns^NY(s) + \dots + a_1sY(s) + a_0Y(s) - \sum_{n=0}^{N}a_n\sum_{m=1}^{n}s^{n-m}y^{(m-1)}(0) = b_Ms^MX(s) + \dots + b_1sX(s) + b_0X(s).$$
The Laplace transform of the solution (output signal) can be written in the form
$$Y(s) = \frac{B(s)}{A(s)}X(s) + \frac{C(s)}{A(s)},$$
where A(s) = a_Ns^N + ⋯ + a₁s + a₀, B(s) = b_Ms^M + ⋯ + b₁s + b₀, and
$$C(s) = \sum_{n=0}^{N}a_n\sum_{m=1}^{n}s^{n-m}y^{(m-1)}(0).$$
The output consists of two parts, Y(s) = Yp(s) + Yh(s), defined as follows. The first part is driven by the input signal,
$$Y_p(s) = \frac{B(s)}{A(s)}X(s),$$
and it is called the forced response (in mathematics, the particular part of the differential equation solution).
The second part,
$$Y_h(s) = \frac{C(s)}{A(s)},$$
is independent of the input signal, and it is called the natural response (in mathematics, the homogeneous part of the solution).
For example, for an output signal with the Laplace transform
$$Y(s) = \frac{3}{s(s+2)} = \frac{A}{s} + \frac{B}{s+2},$$
the coefficients follow from 3 = A(s + 2) + Bs as A = 3/2 and B = −3/2, so that the output signal is
$$y(t) = \frac{3}{2}\big(1-e^{-2t}\big)u(t).$$
Example 1.24. Find the output of the causal system described by the differential equation
$$\frac{d^2y(t)}{dt^2} + 3\frac{dy(t)}{dt} + 2y(t) = x(t),$$
with the input x(t) = e^{−4t}u(t) and the initial conditions y(0) = 0 and y′(0) = 1.
⋆ The Laplace transform of the differential equation is
$$s^2Y(s) - sy(0) - y'(0) + 3sY(s) - 3y(0) + 2Y(s) = X(s),$$
or
$$Y(s)(s^2+3s+2) = X(s) + sy(0) + y'(0) + 3y(0).$$
The Laplace transform of x(t) = e^{−4t}u(t) is X(s) = 1/(s + 4). The Laplace transform of the output signal is equal to
$$Y(s) = \frac{s+5}{(s+4)(s^2+3s+2)} = \frac{A_1}{s+4} + \frac{A_2}{s+2} + \frac{A_3}{s+1}.$$
The coefficients can be calculated as Ai = (s − si)Y(s)|_{s=si}. For example,
$$A_1 = \left.(s+4)\frac{s+5}{(s+4)(s^2+3s+2)}\right|_{s=-4} = \frac{1}{6}.$$
The other two coefficients are A₂ = −3/2 and A₃ = 4/3.
The output signal, y(t), is the inverse Laplace transform of Y(s), that is,
$$y(t) = \frac{1}{6}e^{-4t}u(t) - \frac{3}{2}e^{-2t}u(t) + \frac{4}{3}e^{-t}u(t).$$
Note that $\frac{1}{6}e^{-4t}u(t) = y_p(t)$ is the forced response and $-\frac{3}{2}e^{-2t}u(t) + \frac{4}{3}e^{-t}u(t) = y_h(t)$ is the natural response in the time domain.
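The partial fraction coefficients of Example 1.24 can be cross-checked with scipy.signal.residue; a minimal sketch:

```python
import numpy as np
from scipy import signal

# Partial fraction expansion of Y(s) = (s+5)/((s+4)(s^2+3s+2)),
# expecting A1 = 1/6, A2 = -3/2, A3 = 4/3 at the poles -4, -2, -1.
num = [1, 5]
den = np.polymul([1, 4], [1, 3, 2])
r, p, _ = signal.residue(num, den)
for ri, pi in zip(r, p):
    print("pole", round(pi.real, 1), " residue", round(ri.real, 4))
```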
1.7 Butterworth Filter

The most common processing systems in communications and signal processing are filters, used to selectively pass the part of the input signal within a predefined band in the frequency domain and to reduce possible interferences in this way. The basic filter form is the lowpass filter. Here we will present the simple Butterworth lowpass filter.
The squared frequency response of the Butterworth lowpass filter is defined by
$$|H(j\Omega)|^2 = \frac{1}{1+\left(\dfrac{\Omega}{\Omega_c}\right)^{2N}}.$$
It is shown in Fig. 1.12, for various N. This filter definition contains two parameters. The order of the filter is N. It is a measure of the sharpness of the transition from the passband to the stopband region. For N → ∞, the amplitude form of an ideal lowpass filter is achieved. The second parameter is the critical frequency Ωc. At the frequency Ω equal to the critical frequency, Ω = Ωc, we get
$$|H(j\Omega_c)|^2 = \frac{1}{2}|H(0)|^2 = \frac{1}{2},$$
corresponding to a gain of 10 log(1/2) ≈ −3 dB, for any filter order N.
Since |H(jΩ)|² = H(jΩ)H(−jΩ), we can write
$$H(j\Omega)H(-j\Omega) = \frac{1}{1+\left(\dfrac{j\Omega}{j\Omega_c}\right)^{2N}}$$
or
$$H(s)H(-s) = \frac{1}{1+\left(\dfrac{s}{j\Omega_c}\right)^{2N}} \quad\text{for } s = j\Omega.$$
The poles of the product of the transfer functions, H(s)H(−s), are of the form
$$\left(\frac{s_k}{j\Omega_c}\right)^{2N} = -1 = e^{j(2\pi k+\pi)},$$
$$s_k = \Omega_ce^{j(2\pi k+\pi)/(2N)+j\pi/2} \quad\text{for } k = 0, 1, 2, \dots, 2N-1.$$
The poles of the product H(s)H(−s) of the transfer function H(s) of the Butterworth filter and its reversed version H(−s) are located on the circle whose radius is Ωc, at the positions defined by the phases
$$\alpha_k = \frac{2\pi k+\pi}{2N} + \frac{\pi}{2} \quad\text{for } k = 0, 1, 2, \dots, 2N-1.$$
For a given filter order N and critical frequency Ωc, the only remaining decision is to select the half of the poles sk that belong to H(s) and to declare that the remaining half of the poles belong to H(−s). Since we want the designed filter to be stable and causal, we choose the poles s₀, s₁, ..., s_{N−1} within the left side of the s-plane, where Re{s} < 0, that is, π/2 < αk < 3π/2. The symmetric poles with Re{s} > 0 are the poles of H(−s); they are not used in the filter design.
Figure 1.12 Squared amplitude of the frequency response of a Butterworth filter for various orders N.
Example 1.25. Design a lowpass Butterworth filter with the following filter order, N, and critical
frequency, Ωc ,
(a) N = 3 and Ωc = 1,
(b) N = 4 and Ωc = 3.
⋆ (a) The poles for the third-order filter, N = 3, and critical frequency Ωc = 1, have the phases
$$\alpha_k = \frac{2\pi k+\pi}{6} + \frac{\pi}{2}, \quad\text{for } k = 0, 1, 2.$$
The pole values are
$$s_0 = \cos\Big(\frac{2\pi}{3}\Big) + j\sin\Big(\frac{2\pi}{3}\Big) = -\frac{1}{2}+j\frac{\sqrt{3}}{2},$$
$$s_1 = \cos\Big(\frac{2\pi}{3}+\frac{\pi}{3}\Big) + j\sin\Big(\frac{2\pi}{3}+\frac{\pi}{3}\Big) = -1,$$
$$s_2 = \cos\Big(\frac{2\pi}{3}+\frac{2\pi}{3}\Big) + j\sin\Big(\frac{2\pi}{3}+\frac{2\pi}{3}\Big) = -\frac{1}{2}-j\frac{\sqrt{3}}{2},$$
with the third-order Butterworth filter transfer function
$$H(s) = \frac{1}{\Big(s+\frac{1}{2}-j\frac{\sqrt{3}}{2}\Big)\Big(s+\frac{1}{2}+j\frac{\sqrt{3}}{2}\Big)(s+1)} = \frac{1}{(s^2+s+1)(s+1)}.$$
(b) For N = 4 and Ωc = 3, the phases of the poles are
$$\alpha_k = \frac{2\pi k+\pi}{8} + \frac{\pi}{2}, \quad\text{for } k = 0, 1, 2, 3.$$
Their values are
$$s_0 = 3\cos\Big(\frac{\pi}{2}+\frac{\pi}{8}\Big) + j3\sin\Big(\frac{\pi}{2}+\frac{\pi}{8}\Big), \qquad s_1 = 3\cos\Big(\frac{\pi}{2}+\frac{3\pi}{8}\Big) + j3\sin\Big(\frac{\pi}{2}+\frac{3\pi}{8}\Big),$$
$$s_2 = 3\cos\Big(\frac{\pi}{2}+\frac{5\pi}{8}\Big) + j3\sin\Big(\frac{\pi}{2}+\frac{5\pi}{8}\Big), \qquad s_3 = 3\cos\Big(\frac{\pi}{2}+\frac{7\pi}{8}\Big) + j3\sin\Big(\frac{\pi}{2}+\frac{7\pi}{8}\Big).$$
In practice, we usually do not know the filter order N, but rather the passband frequency Ωp and the stopband frequency Ωs of the filter, with the maximum attenuation ap [dB] in the passband and the minimum attenuation as [dB] in the stopband, as shown in Fig. 1.14. Based on these values, we can calculate the filter order N and the critical frequency Ωc needed for the filter design.
The passband and stopband relations for N and Ωc are
$$\frac{1}{1+\left(\Omega_p/\Omega_c\right)^{2N}} \geq A_p^2 \tag{1.83}$$
$$\frac{1}{1+\left(\Omega_s/\Omega_c\right)^{2N}} \leq A_s^2, \tag{1.84}$$
where Ap and As are the required amplitudes of the frequency response at the respective passband and stopband frequencies, Ωp and Ωs. The relation
$$a = 20\log A$$
connects the attenuation a in [dB] with the amplitude A.
Figure 1.14 Specification of the Butterworth filter parameters in the passband and stopband.
Using the equality in both relations, (1.83) and (1.84), the order N follows as
$$N = \frac{\log\Big[\big(1/A_s^2-1\big)\big/\big(1/A_p^2-1\big)\Big]}{2\log(\Omega_s/\Omega_p)}.$$
The nearest greater integer is adopted for the filter order N. Next, we can use either of the relations (1.83) or (1.84), with the equality sign, to calculate Ωc. If we choose the first
one, then the critical frequency Ωc will satisfy |H(jΩp)|² = Ap², while if we use the second relation, the value of Ωc will satisfy |H(jΩs)|² = As². These two values differ; however, both of them are within the defined criteria for the transfer function passband and stopband.
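This design procedure maps directly onto scipy.signal.buttord and scipy.signal.butter for analog filters; the specification values below are illustrative assumptions, not taken from the text:

```python
import numpy as np
from scipy import signal

# Analog Butterworth design from passband/stopband specifications.
Wp, Ws = 1.0, 2.0        # passband and stopband edge frequencies (rad/s)
ap, As = 1.0, 20.0       # max passband / min stopband attenuation (dB)

N, Wc = signal.buttord(Wp, Ws, ap, As, analog=True)
b, a = signal.butter(N, Wc, btype='low', analog=True)
w, h = signal.freqs(b, a, worN=[Wp, Ws])
print("N =", N, " Wc =", round(Wc, 4))
print("attenuation at Wp, Ws [dB]:", np.round(-20*np.log10(np.abs(h)), 2))
```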
All other filter forms, such as bandpass and highpass, may be obtained from a lowpass filter with appropriate signal modulations. These modulations will be discussed for discrete-time filter forms in Chapter five.
Part II
Deterministic Discrete-Time
Signals and Systems
Chapter 2
Discrete-Time Signals and Transforms
Discrete-time signals (discrete signals) are represented in the form of an ordered set of numbers {x(n)}. Commonly, they are obtained by sampling continuous-time signals. There also exist discrete-time signals whose independent variable is inherently discrete in nature. In the case when a discrete-time signal is obtained by sampling a continuous-time signal, we can write (Fig. 2.1)
$$x(n) = x(t)|_{t=n\Delta t}, \tag{2.1}$$
where Δt is the sampling interval.
Figure 2.1 Signal discretization: continuous-time signal (left) and corresponding discrete-time signal (right).
Discrete-time signals are defined for integer values of the argument n. We will use the same notation for continuous-time and discrete-time signals, x(t) and x(n). However, we hope that this will not cause any confusion, since different sets of variables will be used: for example, t and τ for continuous time and n and m for discrete time. Also, we hope that the context will always make clear what kind of signal is considered. The notation x[n] is sometimes used in the literature for discrete-time signals, instead of x(n).
Figure 2.2 Illustration of discrete-time signals: (a) unit-step function, (b) discrete-time impulse signal, (c) boxcar signal b(n) = u(n + 2) − u(n − 3), and (d) discrete-time sinusoid.

The discrete-time impulse signal is defined as
$$\delta(n) = \begin{cases}1, & n = 0\\ 0, & n \neq 0.\end{cases} \tag{2.2}$$
It is presented in Fig. 2.2. In contrast to the continuous-time impulse signal, which cannot be practically implemented and used, the discrete-time unit impulse is a signal that can easily be implemented and used in realizations. In mathematical notation, this signal corresponds to the Kronecker delta function
$$\delta_{m,n} = \begin{cases}1, & m = n\\ 0, & m \neq n.\end{cases} \tag{2.3}$$
Any discrete-time signal can be written in the form of a sum of shifted and weighted discrete-time impulses,
$$x(n) = \sum_{k=-\infty}^{\infty}x(k)\delta(n-k), \tag{2.4}$$
as illustrated in Fig. 2.3.
Figure 2.3 Decomposition of a signal x(n) into shifted and weighted impulses: x(n), −2δ(n + 2), 3δ(n), and −δ(n − 1).
The discrete-time impulse and unit-step signals are related by
$$\delta(n) = u(n) - u(n-1), \qquad u(n) = \sum_{k=-\infty}^{n}\delta(k).$$
A discrete-time signal x(n) is periodic if
$$x(n+N) = x(n). \tag{2.7}$$
The smallest positive integer N that satisfies this equation is called the period of the discrete-time signal x(n). Note that a signal x(n) with a period N is also periodic in any integer multiple of N. Some basic discrete-time signals are presented in Fig. 2.2.
Example 2.1. Check the periodicity of discrete-time signals x1 (n) = sin(2πn/36), x2 (n) =
cos(4πn/15 + 2), x3 (n) = exp( j0.1n), x4 (n) = x1 (n) + x2 (n), and x5 (n) = x1 (n) + x3 (n).
⋆ The period of the discrete-time signal x1(n) = sin(2πn/36) is obtained from 2πN1/36 = 2πk, where k is an integer. It is N1 = 36, for k = 1. The period N2 follows from 4πN2/15 = 2πk as N2 = 15, with k = 2. The period of the signal x3(n) should be calculated from 0.1N3 = 2πk. Obviously, there is no integer k such that N3 is an integer. This signal is not periodic. The same holds for x5(n). The period of x4(n) is the common period of the signals x1(n) and x2(n), with N1 = 36 and N2 = 15. It is N4 = 180.
A signal is even if
$$x(n) = x(-n).$$
Example 2.2. Show that any signal x(n) can be written as the sum x(n) = xe(n) + xo(n), where xe(n) and xo(n) are its even and odd parts, respectively.
⋆ For a signal x(n), we can form its even and odd parts as
$$x_e(n) = \frac{x(n)+x(-n)}{2} \qquad\text{and}\qquad x_o(n) = \frac{x(n)-x(-n)}{2}.$$
Summing these two parts reconstructs the signal x(n). Note that xo(0) = 0.
A signal is Hermitian if x(n) = x*(−n).
The magnitude of a discrete-time signal is defined as the maximum value of the signal's absolute amplitude,
$$M_x = \max_{-\infty<n<\infty}|x(n)|.$$
The energy of a discrete-time signal is defined by
$$E_x = \sum_{n=-\infty}^{\infty}|x(n)|^2. \tag{2.8}$$
The instantaneous power of x(n) is Px(n) = |x(n)|², while the average signal power is
$$P_{AV} = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{n=-N}^{N}|x(n)|^2 = \left\langle|x(n)|^2\right\rangle, \tag{2.9}$$
where ⟨|x(n)|²⟩ denotes an average over a large number of signal values, as N → ∞.
The average power of signals with a finite energy (energy signals) is PAV = 0. For power
signals (when 0 < PAV < ∞) the energy is infinite, Ex → ∞.
Example 2.3. The energy of signal x (n) is Ex = 10. The energy of its even part is Exe = 3. Find
the energy of its odd part.
⋆Since x(n) = xe(n) + xo(n),
Ex = ∑_{n=−∞}^{∞} |xe(n) + xo(n)|² = ∑_{n=−∞}^{∞} ( |xe(n)|² + |xo(n)|² + xo(n)xe∗(n) + xe(n)xo∗(n) ).
The terms xo(n)xe∗(n) and xe(n)xo∗(n) in the last sum correspond to odd signals, whose sums over all n are zero. For the signals xe(n) and xo(n), satisfying the previous relation, we say that they are orthogonal. Therefore, for the energies Ex, Exe, and Exo,
Ex = Exe + Exo
holds, and the energy of the odd part is Exo = 10 − 3 = 7.
A discrete-time (discrete) system transforms one discrete-time signal (the input) into another (the output signal),
y(n) = T{x(n)}.  (2.10)
A discrete system T{·} is linear if, for any two signals x1(n) and x2(n) and any two constants a1 and a2,
T{a1x1(n) + a2x2(n)} = a1T{x1(n)} + a2T{x2(n)}
holds. A discrete system is time-invariant if, with y(n) = T{x(n)},
T{x(n − n0)} = y(n − n0)
holds for any n0.
For any input signal x (n) the signal at the output of a linear time-invariant discrete
system can be calculated if we know the output to the impulse signal. The output to the
impulse signal, h(n) = T {δ(n)}, is the impulse response.
The output to an input signal x (n) is
y(n) = T{x(n)} = T{ ∑_{k=−∞}^{∞} x(k) δ(n − k) } = ∑_{k=−∞}^{∞} x(k) h(n − k) = x(n) ∗n h(n).
The convolution is commutative,
x(n) ∗n h(n) = h(n) ∗n x(n).  (2.15)
Example 2.4. Calculate discrete-time convolution of signals x (n) and h(n) shown in Fig. 2.4.
Figure 2.4 The signals x(n) and h(n) whose convolution is calculated.
Figure 2.5 The signals x (k ), h(−k), h(1 − k), and h(2 − k ) used for the calculation of the output signal values
y(0), y(1), and y(2).
⋆The values y(0), y(1), and y(2) are obtained by summing the overlapping samples of x(k) and the reversed, shifted h(n − k), Fig. 2.5. In a similar way, y(−2) = 2, y(−1) = −1, y(2) = 6, y(3) = 2, y(4) = −1, y(5) = −1, and y(n) = 0, for all other n. The convolution y(n) is shown in Fig. 2.6.
Figure 2.6 The convolution y(n) = x(n) ∗n h(n).
Example 2.5. Calculate the convolution of signals x (n) = n[u(n) − u(n − 10)] and h(n) = u(n).
⋆Using the fact that u(k) − u(k − 10) = 1 for 0 ≤ k ≤ 9 and u(n − k) = 1 for k ≤ n, we get
y(n) = ∑_{0≤k≤9, k≤n} k = { ∑_{k=0}^{n} k = n(n + 1)/2, for 0 ≤ n ≤ 9;  ∑_{k=0}^{9} k = 45, for n > 9 }
= (n(n + 1)/2) [u(n) − u(n − 10)] + 45 u(n − 10).
Example 2.6. If the response of a linear time-invariant system to the unit-step is y(n) = T{u(n)} = e^{−n}u(n), find the impulse response h(n) of this system.
⋆Since δ(n) = u(n) − u(n − 1), the impulse response is h(n) = y(n) − y(n − 1) = e^{−n}u(n) − e^{−(n−1)}u(n − 1).
A discrete system is causal if there is no response before the input signal appears. For
causal linear time-invariant discrete systems h(n) = 0 for n < 0 holds. For a signal that may
be an impulse response of a causal system we say that it is a causal signal or one-sided signal.
A discrete system is stable if any input signal with a finite magnitude, Mx = max_{−∞<n<∞} |x(n)|, produces the output y(n) whose values are finite, |y(n)| < ∞. A
discrete linear time-invariant system is stable if
∑_{m=−∞}^{∞} |h(m)| < ∞.  (2.16)
Indeed, |y(n)| = |∑_{m} h(m)x(n − m)| ≤ Mx ∑_{m} |h(m)|, so |y(n)| < ∞ if (2.16) holds. It can be shown that the absolute summability of the impulse response h(n) is also a necessary condition for a linear time-invariant discrete system to be stable.
The Fourier transform of a discrete-time signal x(n) is defined by X(e^{jω}) = ∑_{n=−∞}^{∞} x(n)e^{−jωn}. Notation X(e^{jω}) is used to emphasize the fact that it is a periodic function of the normalized frequency ω. The period is 2π.
In order to establish the relation between the Fourier transform of discrete-time signals
and the Fourier transform of continuous-time signals,
X(Ω) = ∫_{−∞}^{∞} x(t) e^{−jΩt} dt,
multiply X(e^{jω}) by e^{jωm} and integrate over the basic period,
∑_{n=−∞}^{∞} x(n) ∫_{−π}^{π} e^{−jω(n−m)} dω = ∫_{−π}^{π} X(e^{jω}) e^{jωm} dω.
Since
∫_{−π}^{π} e^{−jω(n−m)} dω = 2π δ(n − m),
the inversion formula follows,
x(n) = (1/(2π)) ∫_{−π}^{π} X(e^{jω}) e^{jωn} dω.  (2.21)
Example 2.7. Find the Fourier transform of the two-sided exponential discrete-time signal x(n) = Ae^{−α|n|}, with α > 0.
⋆Summing the two geometric series, for n ≥ 0 and for n < 0, gives
X(e^{jω}) = A (1 − e^{−2α}) / (1 − 2e^{−α} cos(ω) + e^{−2α}).  (2.22)
Example 2.8. Find the inverse Fourier transform of a discrete-time signal if X(e^{jω}) = 2πδ(ω) for −π ≤ ω < π and X(e^{jω}) = 2π ∑_{k=−∞}^{∞} δ(ω + 2kπ) for any ω.
⋆By definition
x(n) = (1/(2π)) ∫_{−π}^{π} 2πδ(ω) e^{jωn} dω = 1.
Therefore, the Fourier transform of signal x (n) = 1 is
∑_{n=−∞}^{∞} e^{−jωn} = 2π ∑_{k=−∞}^{∞} δ(ω + 2kπ).  (2.23)
The equivalent form in the continuous-time domain is obtained (by using ω = ΩT and
δ( TΩ) = δ(Ω)/T) as
∑_{n=−∞}^{∞} e^{jΩnT} = 2π ∑_{k=−∞}^{∞} δ(ΩT + 2kπ) = (2π/T) ∑_{k=−∞}^{∞} δ(Ω + 2kπ/T).  (2.24)
2.3.1 Properties
Linearity: For any constants a1 and a2,
FT{a1 x(n) + a2 y(n)} = a1 X(e^{jω}) + a2 Y(e^{jω}),
where X(e^{jω}) and Y(e^{jω}) are the Fourier transforms of the discrete-time signals x(n) and y(n), respectively.
Shift and modulation: With respect to the signal shift and modulation the Fourier transform
of discrete-time signals behaves in the same way as the Fourier transform of continuous-time
signals,
FT{ x (n − n0 )} = X (e jω )e− jn0 ω (2.26)
and
FT{ x (n)e jω0 n } = X (e j(ω −ω0 ) ). (2.27)
Example 2.9. The Fourier transform of a discrete-time signal x (n) is X (e jω ). Find the Fourier
transform of y(n) = x (2n).
Example 2.10. Calculate the Fourier transform of the discrete-time signal (rectangular window),
w R ( n ) = u ( N + n ) − u ( n − N − 1). (2.29)
⋆By definition,
WR(e^{jω}) = ∑_{n=−N}^{N} e^{−jωn} = e^{jωN} (1 − e^{−jω(2N+1)})/(1 − e^{−jω}) = sin(ω(2N + 1)/2)/sin(ω/2).  (2.30)
As the window width increases in the time domain, the main lobe width in the Fourier domain narrows. The first zero value of the Fourier transform of a rectangular window is at ω(2N + 1)/2 = π, that is, at ω = 2π/(2N + 1), where 2N + 1 is the signal duration. In the case of a Hann(ing) window the main lobe is wider than for the rectangular window of the same width, but its convergence is much faster, with strongly reduced oscillations in the Fourier transform, Fig. 2.7.

Figure 2.7 Discrete-time signal in the form of a rectangular window of the widths 2N + 1 = 9 and 2N + 1 = 17 samples (top and middle), and a Hann(ing) window with 2N + 1 = 17 (bottom). The time-domain values are on the left, while the Fourier transforms of these discrete-time signals are on the right.
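The behavior shown in Fig. 2.7 can be reproduced numerically, for example, by evaluating the window transforms on a dense frequency grid with a zero-padded FFT; a minimal sketch, assuming Python with numpy:

# Rectangular and Hann(ing) windows of the same width 2N + 1 = 17: the
# Hann(ing) main lobe is about twice as wide, with much lower sidelobes.
import numpy as np

N = 8
n = np.arange(-N, N + 1)
w_rect = np.ones(2 * N + 1)
w_hann = 0.5 * (1 + np.cos(n * np.pi / N))

def spectrum(w, nfft=4096):
    # zero-padded FFT as a dense sampling of W(e^{jw})
    return np.abs(np.fft.fftshift(np.fft.fft(w, nfft)))

W_rect, W_hann = spectrum(w_rect), spectrum(w_hann)
print(2 * np.pi / (2 * N + 1))       # first zero of the rectangular window
print(W_rect.max(), W_hann.max())    # 17.0 and 8.0 (the window sums)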
Example 2.11. Find the output of a discrete linear time-invariant system with frequency response
H (e jω ) if the input signals are:
(a) x (n) = Ae jω0 n and
(b) x (n) = A cos(ω0 n + ϕ). What is the output if the impulse response h(n) is real-valued?
⋆(a) For the input x(n) = Ae^{jω0 n}, the output is y(n) = ∑_{k} h(k) A e^{jω0(n−k)} = A H(e^{jω0}) e^{jω0 n}.
(b) For a real-valued impulse response h(n),
H(e^{jω}) = H∗(e^{−jω})
holds, with the even amplitude and odd phase of the transfer function,
|H(e^{jω})| = |H(e^{−jω})|
arg{H(e^{jω})} = −arctan( ∑_{n=−∞}^{∞} h(n) sin(ωn) / ∑_{n=−∞}^{∞} h(n) cos(ωn) ) = −arg{H(e^{−jω})}.
The output signal for a real-valued impulse response and x (n) = A cos(ω0 n + ϕ) is of the form
y(n) = A |H(e^{jω0})| cos(ω0 n + ϕ + arg{H(e^{jω0})}).
h(n) = cos(πn)/n = (−1)^n/n
for n ≠ 0, and h(n) = 0 for n = 0. Using samples n = ±1, ±2, . . . , ±N, the approximation of the frequency response is
HN(e^{jω}) = ∑_{n=−N}^{N} h(n) e^{−jωn} = 2j ∑_{n=1}^{N} (−1)^{n−1} sin(ωn)/n.
Product of signals: The Fourier transform of a product of the two discrete-time signals x (n)
and h(n) is equal to the convolution of their Fourier transforms in the frequency domain,
FT{x(n)h(n)} = (1/(2π)) ∫_{−π}^{π} X(e^{jθ}) H(e^{j(ω−θ)}) dθ = X(e^{jω}) ∗ω H(e^{jω}).  (2.34)
Parseval's theorem: For two signals x(n) and y(n),
∑_{n=−∞}^{∞} x(n) y∗(n) = ∑_{n=−∞}^{∞} (1/(2π)) ∫_{−π}^{π} X(e^{jω}) e^{jωn} dω y∗(n)  (2.35)
= (1/(2π)) ∫_{−π}^{π} X(e^{jω}) ( ∑_{n=−∞}^{∞} (e^{−jωn} y(n))∗ ) dω = (1/(2π)) ∫_{−π}^{π} X(e^{jω}) Y∗(e^{jω}) dω.
For y(n) = x(n), it reduces to
∑_{n=−∞}^{∞} |x(n)|² = (1/(2π)) ∫_{−π}^{π} |X(e^{jω})|² dω = Ex.
Function |X(e^{jω})|² is the spectral energy density of signal x(n).
Since the average power of a signal x (n) is defined by
PAV = lim_{N→∞} (1/(2N + 1)) ∑_{n=−N}^{N} |x(n)|²,  (2.36)
the power spectral density of x(n) can be defined as
Pxx(e^{jω}) = lim_{N→∞} (1/(2N + 1)) |XN(e^{jω})|²,  (2.37)
where XN(e^{jω}) is the Fourier transform of the signal truncated to −N ≤ n ≤ N, that is,
Pxx(e^{jω}) = lim_{N→∞} (1/(2N + 1)) ∑_{n=−N}^{N} ∑_{m=−N}^{N} x(n) x∗(m) e^{−jω(n−m)}.  (2.38)
With r(k) denoting the signal autocorrelation at the lag k = n − m, the power spectral density can be written as
Pxx(e^{jω}) = lim_{N→∞} (1/(2N + 1)) ∑_{k=−2N}^{2N} (2N + 1 − |k|) r(k) e^{−jωk},
since the value r(0), for n − m = k = 0, appears 2N + 1 times along the diagonal in the (n, m) domain in (2.38). The value for n − m = k = ±1 appears 2N times, and so on. In general, the value r(k), for n − m = k, appears 2N + 1 − |k| times in the double summation (2.38). Note that Pxx(e^{jω}), in this case, is the Fourier transform of r(k) multiplied by the Bartlett window 1 − |k|/(2N + 1).
Consider a continuous-time signal, x(t), whose Fourier transform X(Ω) is nonzero only within a limited frequency band |Ω| ≤ Ωm, that is, X(Ω) = 0 for |Ω| > Ωm. The signal x(t) can be reconstructed at any t from the discrete-time samples, x(n∆t), acquired at the instants t = n∆t with the sampling interval ∆t such that
∆t < π/Ωm = 1/(2 fm),
where Ωm = 2π fm. The corresponding discrete-time signal is formed as
x(n) = x(n∆t)∆t.
Now we will prove this fundamental statement in the analog-to-digital signal conversion.
Since a limited duration of X (Ω) is assumed, we can make its periodic extension
Xp(Ω) = ∑_{m=−∞}^{∞} X(Ω + 2Ω0 m)  (2.40)
with the period in the frequency domain equal to 2Ω0 . For the reconstruction of the original
aperiodic Fourier transform, X (Ω), from this periodic extension, X p (Ω), it is of crucial
importance that the basic period of Xp(Ω) contains the undisturbed X(Ω), that is,
Xp(Ω) = X(Ω) for −Ω0 < Ω < Ω0.
This condition is satisfied if the extension period 2Ω0 and the maximum frequency in the
Fourier transform of the signal, Ωm , satisfy the inequality Ω0 > Ωm . In this case, it is
possible to make transition from X (Ω) to X p (Ω), and back, without losing any information.
Of course, that would not be the case if Ω0 > Ωm did not hold. In a periodic extension of X(Ω) with Ω0 ≤ Ωm, overlapping (aliasing) would occur in Xp(Ω), and the extension would not be reversible. A periodic extension of the Fourier transform with Ω0 > Ωm is illustrated
in Fig. 2.8.
The periodic function X p (Ω) can be expanded into the Fourier series with coefficients
X−n = (1/(2Ω0)) ∫_{−Ω0}^{Ω0} Xp(Ω) e^{jπΩn/Ω0} dΩ = (1/(2Ω0)) ∫_{−∞}^{∞} X(Ω) e^{jπΩn/Ω0} dΩ.  (2.41)
The integration limits are extended to the infinity since X (Ω) = X p (Ω) within the basic
period interval and X (Ω) = 0 outside this interval.
Figure 2.8 The Fourier transform of a signal, X (Ω), such that X (Ω) = 0 for |Ω| > Ωm (top) and its periodically
extended version, X p (Ω), with the period 2Ω0 > 2Ωm (bottom).
Since the inverse Fourier transform is
x(t) = (1/(2π)) ∫_{−∞}^{∞} X(Ω) e^{jΩt} dΩ,  (2.42)
comparing (2.41) with (2.42) gives
X−n = (π/Ω0) x(t)|_{t=πn/Ω0} = x(n∆t)∆t, with ∆t = π/Ω0,
meaning that the Fourier series coefficients of the periodically extended Fourier transform of
X (Ω) are the samples of the signal x (t), acquired at the instants t = n∆t, with the sampling
interval ∆t = π/Ω0 . Therefore, the samples x (n∆t) of the signal x (t) and the periodically
extended Fourier transform, X p (Ω) are the Fourier series pair
x(n∆t)∆t = X−n ←→ Xp(Ω) = ∑_{m=−∞}^{∞} X(Ω + 2Ω0 m)  (2.43)
with ∆t = π/Ω0 .
The reconstruction formula for x(t) from the samples x(n∆t) then follows from
x(t) = (1/(2π)) ∫_{−∞}^{∞} X(Ω) e^{jΩt} dΩ = (1/(2π)) ∫_{−Ω0}^{Ω0} Xp(Ω) e^{jΩt} dΩ.  (2.44)
The periodic Fourier transform X p (Ω) is now expanded into Fourier series to produce
x(t) = (1/(2π)) ∫_{−Ω0}^{Ω0} ( ∑_{n=−∞}^{∞} Xn e^{jπnΩ/Ω0} ) e^{jΩt} dΩ.  (2.45)
Using Xn = x(−n∆t)∆t, we get
x(t) = (1/(2π)) ∫_{−Ω0}^{Ω0} ( ∑_{n=−∞}^{∞} x(−n∆t)∆t e^{jπnΩ/Ω0} ) e^{jΩt} dΩ.  (2.46)
The reconstruction formula
x(t) = ∑_{n=−∞}^{∞} x(n∆t) sin(Ω0(t − n∆t)) / (Ω0(t − n∆t))  (2.47)
follows by evaluating the simple integral over Ω. In this way, the signal x(t) is expressed, for any t, in terms of its samples x(n∆t).
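A numerical illustration of the reconstruction formula (2.47), assuming Python with numpy; the signal and the truncation of the infinite sum are chosen arbitrarily for the illustration:

# Band-limited signal, sampled above the Nyquist rate, rebuilt between
# the samples from a truncated sum of sin(x)/x terms, as in (2.47).
import numpy as np

f_m = 3.0                            # highest frequency, Omega_m = 2*pi*f_m
dt = 1 / (2.5 * f_m)                 # dt < 1/(2*f_m)
n = np.arange(-200, 201)             # truncation of the infinite sum

x = lambda t: np.cos(2 * np.pi * 3.0 * t + 0.3) + np.sin(2 * np.pi * 1.2 * t)

t = np.linspace(-1, 1, 7)            # points between the samples
x_rec = np.sum(x(n * dt) * np.sinc(t[:, None] / dt - n), axis=1)
print(np.max(np.abs(x_rec - x(t))))  # small (the sinc terms decay slowly)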
Example 2.13. The sampling theorem and relation (2.47) can be used to prove that X (Ω) = X (e jω )
with Ω∆t = ω for |ω | < π for the signals x (t) sampled at the rate which satisfies the sampling
theorem.
⋆The signal x(t), which satisfies the sampling theorem, can be written in terms of its samples, according to (2.46), as
x(t) = (1/(2π)) ∫_{−Ω0}^{Ω0} ( ∑_{n=−∞}^{∞} x(n∆t)∆t e^{−j∆tnθ} ) e^{jθt} dθ.
The Fourier transform of x(t) is then
X(Ω) = ∫_{−∞}^{∞} (1/(2π)) ∫_{−Ω0}^{Ω0} ( ∑_{n=−∞}^{∞} x(n∆t)∆t e^{−j∆tnθ} ) e^{jθt} dθ e^{−jΩt} dt
= ∑_{n=−∞}^{∞} x(n∆t)∆t ∫_{−Ω0}^{Ω0} δ(θ − Ω) e^{−j∆tnθ} dθ = ∑_{n=−∞}^{∞} x(n∆t)∆t e^{−j∆tnΩ} for |Ω| < Ω0,  (2.48)
resulting in
X(Ω) = ∑_{n=−∞}^{∞} x(n) e^{−jωn} = X(e^{jω}) for |ω| < π,
with ω = Ω∆t and x(n) = x(n∆t)∆t.
Example 2.14. If the highest frequency in a signal x(t) is Ωm1 and the highest frequency in a signal y(t) is Ωm2, what should be the sampling interval for the signals x(t)y(t) and x(t − t1)y∗(t − t2)? The highest frequency Ωm in a signal is used in the sense that the Fourier transform of the signal is zero for |Ω| > Ωm.
⋆The Fourier transform of a product of two signals is proportional to the convolution of their Fourier transforms, so the highest frequency in x(t)y(t) is Ωm1 + Ωm2, requiring ∆t < π/(Ωm1 + Ωm2). Shifts and conjugation do not increase the width of the Fourier transform, so the same sampling interval applies to x(t − t1)y∗(t − t2).
Consider now the signal x(t) = exp(−|t|), whose Fourier transform is X(Ω) = 2/(1 + Ω²) and which is not strictly band-limited. After sampling the signal in the time domain with the sampling interval ∆t = 0.1, the Fourier transform is periodically extended with the period 2Ω0 = 2π/∆t = 20π.
(a) The periodic Fourier transform is
Xp(Ω) = · · · + 2/(1 + (Ω + 20π)²) + 2/(1 + Ω²) + 2/(1 + (Ω − 20π)²) + · · ·
The value of Xp(Ω) at the period ending points ±10π will approximately be equal to Xp(±10π) ≅ 2/(1 + 100π²) ≅ 0.002. By comparing this value with the maximum Fourier transform value X(0) = 2, we can conclude that the expected error due to the discretization of this signal (since it does not strictly satisfy the sampling theorem) will be of the order of 0.1%.
(b) The discrete-time signal obtained by sampling x (t) = exp(− |t|) with ∆t = 0.1 is
x (n) = 0.1e−0.1|n| . Its Fourier transform is already calculated with A = 0.1 and α = 0.1 in
equation (2.22). The result is
X(e^{jω}) = 0.1 (1 − e^{−0.2}) / (1 − 2e^{−0.1} cos(ω) + e^{−0.2}).  (2.50)
Therefore, the exact value of the infinite sum in Xp(Ω) is X(e^{jω}) with ω = Ω∆t = 0.1Ω,
Xp(Ω) = ∑_{k=−∞}^{∞} 2/(1 + (Ω + 20kπ)²) = 0.1 (1 − e^{−0.2}) / (1 − 2e^{−0.1} cos(0.1Ω) + e^{−0.2}).
In this way, we have solved an interesting mathematical problem of finding a sum of an infinite
series.
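This sum of the infinite series is easily checked numerically; a minimal sketch, assuming Python with numpy:

# Truncated series versus the closed form for Xp(Omega).
import numpy as np

Omega = np.linspace(-10 * np.pi, 10 * np.pi, 11)
k = np.arange(-20000, 20001)
lhs = np.sum(2 / (1 + (Omega[:, None] + 20 * np.pi * k) ** 2), axis=1)
rhs = 0.1 * (1 - np.exp(-0.2)) / (1 - 2 * np.exp(-0.1) * np.cos(0.1 * Omega) + np.exp(-0.2))
print(np.max(np.abs(lhs - rhs)))     # ~1e-7, limited by the truncation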
For Ω = 0, the original value of the Fourier transform is X(0) = 2, while the value obtained from the discretized signal is Xp(0) = 0.1(1 + e^{−0.1})/(1 − e^{−0.1}) = 2.00167. The increase of 0.00167 is due to the overlapping of the periods. This overlapping produces the aliasing error in X(0). The value of the error corresponds to our previous conclusion of about a 0.1% error order.
is sampled with the sampling interval ∆t = 1/100 and a discrete-time signal x (n) = x (n∆t)∆t
is formed. The discrete-time signal is processed using the system whose impulse response is
h(n) = (1/2)δ(n) + (1/4)δ(n − 2) + (1/4)δ(n + 2).
Find the output signal y(n) and the corresponding continuous-time signal y a (t).
⋆The frequency response of the system is H(e^{jω}) = 1/2 + (1/2) cos(2ω). Its values at the frequencies of nonzero values in X(e^{jω}), within the basic period −π ≤ ω < π, are H(e^{±jπ/4}) = 1/2 and H(e^{±jπ/2}) = 0. Therefore, the Fourier transform of the output signal is
Y(e^{jω}) = H(e^{jω}) X(e^{jω}) = (π/200)[δ(ω + π/4)e^{−jπ/4} + δ(ω − π/4)e^{jπ/4}] for −π ≤ ω < π.
The output discrete-time signal is
y(n) = (1/2) cos(nπ/4 + π/4) ∆t.
The corresponding continuous-time output signal is given by
y(t) = (1/2) cos((π/(4∆t)) t + π/4) = (1/2) cos(25πt + π/4).
Hint: Find the output signal for the same input and h(n) = ∑_{i=−2}^{2} δ(n − i).
2.5 PROBLEMS
Problem 2.1. Check the periodicity and find the period of signals:
(a) x (n) = sin(2πn/32),
(b) x (n) = cos(9πn/82),
(c) x (n) = e jn/32 , and
(d) x (n) = sin(πn/5) + cos(5πn/6) − sin(πn/4).
Problem 2.2. Check the linearity and time-invariance of the discrete system described by
equation
y(n) = x (n) + 2.
Problem 2.3. The output of a linear time-invariant discrete system to the input signal
x (n) = u(n) is y(n) = 2−n u(n). Find the impulse response h(n). Is the system stable?
Problem 2.4. Find the convolution
Figure 2.9 Problem 2.7, impulse response h(n) (left) and Problem 2.14, discrete signal x (n) (right).
Problem 2.7. The impulse response h(n) of a system is shown in Fig. 2.9 (left). Find h1(n) and y(n) = h(n) ∗n x(n), with x(n) = δ(n) − δ(n − 1).
Problem 2.8. Find the output of a discrete system whose impulse response is h(n) = ne^{−n/2}u(n).
Problem 2.10. In order to design a system whose output will produce an approximation of the input signal derivative, we may use a system with the impulse response h(n) = a[δ(n + 1) − δ(n − 1)] + b[δ(n + 2) − δ(n − 2)], such that
H(e^{jω}) ≅ jω for small ω,
that is,
dH(e^{jω})/dω|_{ω=0} = j  and  d²H(e^{jω})/dω²|_{ω=0} = 0.
Problem 2.11. Find the Fourier transform of the following discrete-time signal (triangular
window)
wT(n) = (1 − |n|/(N + 1)) [u(n + N) − u(n − N − 1)],
with N being an even number.
Problem 2.12. Calculate the integral
I = (1/(2π)) ∫_{−π}^{π} sin²((N + 1)ω/2) / sin²(ω/2) dω.
Problem 2.13. A window is formed as
w(n) = wH(n + N) + wH(n) + wH(n − N),
where wH(n) is the Hann(ing) window
wH(n) = (1/2)[1 + cos(nπ/N)] [u(N + n) − u(n − N − 1)].
Plot the window w(n) and express its Fourier transform as a function of the Fourier transform
of the Hann(ing) window WH (e jω ). Generalize the results for
w(n) = ∑_{k=−K}^{K} wH(n + kN).
Problem 2.14. A discrete-time signal x (n) is shown in Fig. 2.9 (right). Without calculating
its Fourier transform X (e jω ) find
X(e^{j0}), X(e^{jπ}), ∫_{−π}^{π} X(e^{jω}) dω, ∫_{−π}^{π} |X(e^{jω})|² dω,
and a signal whose Fourier transform is the real part of X (e jω ), denoted by Re{ X (e jω )}.
Problem 2.15. Find the Fourier transform of the signal x(n) = e^{−n/4}u(n). Using this Fourier transform, find the center of gravity of the signal, defined by
ng = ∑_{n=−∞}^{∞} n x(n) / ∑_{n=−∞}^{∞} x(n).
Problem 2.16. Three discrete systems have the impulse responses:
(a) h(n) = sin(nπ/3)/(nπ), with h(0) = 1/3,
(b) h(n) = sin²(nπ/3)/(nπ)², with h(0) = 1/9,
(c) h(n) = sin((n − 2)π/4)/((n − 2)π), with h(2) = 1/4.
Show that the frequency response of the system with the impulse response h(n) =
sin(nπ/3)/nπ is H (e jω ) = 1 for |ω | ≤ π/3 and H (e jω ) = 0 for π/3 < |ω | < π. Find
the frequency responses in the other two cases, (b) and (c). Find the output of these three systems
to the input signal x (n) = sin(nπ/6).
Problem 2.18. An analytic part x a (n) of a discrete-time signal x (n) is defined in the
frequency domain by
Xa(e^{jω}) = { 2X(e^{jω}), for 0 < ω < π;  X(e^{jω}), for ω = 0;  0, for −π ≤ ω < 0 }.
Show that xa(n) = x(n) + jxh(n), where xh(n) is the Hilbert transform of x(n). Find the impulse response of the system that transforms a signal x(n) into its Hilbert transform (Hilbert transformer).
Problem 2.19. The Fourier transform of a continuous signal x (t) is nonzero only within
3Ω1 < Ω < 5Ω1 . Find the maximum possible sampling interval ∆t such that the signal can
be reconstructed based on the samples x (n∆t).
Problem 2.20. For a signal whose Fourier transform is zero-valued for frequencies
|Ω| ≥ Ωm = 2π f m = π/∆t show that
x(t) = ∫_{−∞}^{∞} x(τ) sin(π(t − τ)/∆t) / (π(t − τ)) dτ.
2.6 EXERCISE
Exercise 2.1. Calculate the convolution of the signals x (n) = n[u(n) − u(n − 3)] and
h(n) = δ(n + 1) + 2δ(n) − δ(n − 2).
Exercise 2.2. Find the convolution of the signals x(n) = e^{−|n|} and h(n) = u(3 − n)u(3 + n).
Exercise 2.3. The output of a linear time-invariant discrete system to the input signal x(n) = u(n) is y(n) = (1/3^n + n)u(n). Find the impulse response h(n). Is the system stable?
Exercise 2.4. For the signal x(n) = 2n u(5 − n)u(n + 5) find the values of X(e^{j0}), X(e^{jπ}), ∫_{−π}^{π} X(e^{jω}) dω, and ∫_{−π}^{π} |X(e^{jω})|² dω without the Fourier transform calculation. Check the results by calculating the Fourier transform.
Exercise 2.5. For a signal x (n) at an instant m a signal y(n) = x (m − n) x ∗ (m + n)
is formed. Show that the Fourier transform of y(n) is real-valued. What is the Fourier
transform of y(n) if x(n) = A exp(jan²/4 + j2ω0n)? Find the Fourier transform of
z(m) = x (m − n) x ∗ (m + n) for a given n.
Note: The Fourier transform of y(n) is the Wigner distribution of x(n) for a given m, while the Fourier transform of z(m) is the ambiguity function of x(n) for a given n.
Exercise 2.6. For a signal x (n) with the Fourier transform X (e jω ) find the Fourier transform
of x(2n). Find the Fourier transform of y1(n) defined by y1(2n) = x(2n) and y1(2n + 1) = 0. What is the Fourier transform of x(2n + 1), and what is the Fourier transform of the signal y2(n) defined by y2(2n) = 0 and y2(2n + 1) = x(2n + 1)? Check the result by showing that Y1(e^{jω}) + Y2(e^{jω}) = X(e^{jω}).
Exercise 2.7. For a real-valued signal x (n) find the relation between its Fourier transform
X (e jω ) and the corresponding Hartley transform
H(e^{jω}) = ∑_{n=−∞}^{∞} x(n)[cos(ωn) + sin(ωn)].
Write this relation if the signal is real-valued and even, that is, x (n) = x (−n).
Exercise 2.8. Systems with impulse responses h1(n), h2(n), and h3(n) are connected in cascade. The impulse responses h2(n) = h3(n) = u(n) − u(n − 2) are known, and the resulting impulse response is h(n) = δ(n) + 5δ(n − 1) + 10δ(n − 2) + 11δ(n − 3) + 8δ(n − 4) + 4δ(n − 5) + δ(n − 6). Find the impulse response h1(n).
Exercise 2.9. Continuous-time signal x (t) = sin(100πt) + cos(180πt) + sin(200πt +
π/4) is sampled with the sampling interval ∆t = 1/125 and used as an input to the system
with the transfer function H (e jω ) = 1 for |ω | < 3π/4 and H (e jω ) = 0 for |ω | ≥ 3π/4.
What is the discrete-time output of this system? What is the corresponding continuous-time
output signal? What should be the sampling interval so that the continuous-time output signal
y(t) is equal to the input signal x (t)?
2.7 SOLUTIONS
Solution 2.1. (a) The signal shifted for N is given by x (n + N ) = sin(2π (n + N )/32).
The equality x (n + N ) = x (n) holds for 2πN/32 = 2kπ, k = 1, 2, . . . . The smallest
integer N satisfying the previous condition is N = 32, with k = 1. The period of this signal
is N = 32.
(b) For the signal x (n) = cos(9πn/82), the equality x (n) = x (n + N ) = cos(9πn/82 +
9πN/82) holds for 9πN/82 = 2kπ, k = 1, 2, . . . . The period follows from N = 164k/9
and it is equal to N = 164 for k = 9.
(c) In this case x (n + N ) = e j(n/32+ N/32) . The relation N/32 = 2kπ, k = 1, 2, . . . ,
produces N = 64kπ. This is not an integer for any k, meaning that the signal x (n) is not
periodic.
(d) The periods of the signal components are obtained from N1 = 10k, N2 = 12k/5,
and N3 = 8k. The smallest value of N when N1 = N2 = N3 = N is N = 120 containing 12
periods of sin(πn/5), 50 periods of cos(5πn/6), and 15 periods of sin(πn/4).
Solution 2.2. In order to establish if the linearity property holds, we have to check the system output to a linear combination of the input signals x1(n) and x2(n),
T{a1x1(n) + a2x2(n)} = a1x1(n) + a2x2(n) + 2 ≠ a1T{x1(n)} + a2T{x2(n)} = a1x1(n) + a2x2(n) + 2(a1 + a2),
so the system is not linear. The system is time-invariant, since
T{x(n − N)} = x(n − N) + 2 = y(n − N).
Solution 2.3. The impulse response is defined by h(n) = T{δ(n)}. Since δ(n) = u(n) − u(n − 1), it can be written as h(n) = 2^{−n}u(n) − 2^{−(n−1)}u(n − 1). The system is stable, since ∑_n |h(n)| < ∞.
Solution 2.4. For the signal x(n) = u(n) − u(n − 5), the values of the convolution y(n) = x(n) ∗n x(n) are
y(0) = ∑_{k=−∞}^{∞} x(k)x(−k) = x(0)x(0) = 1,
y(1) = ∑_{k=−∞}^{∞} x(k)x(1 − k) = x(0)x(1) + x(1)x(0) = 2,
y(−1) = ∑_{k=−∞}^{∞} x(k)x(−1 − k) = 0,
y(2) = ∑_{k=−∞}^{∞} x(k)x(2 − k) = 3,
and so on.
The calculation process is illustrated in Fig. 2.10, along with the final result y(n).
Figure 2.10 Illustration of the convolution calculation for a discrete-time signal x (n) = u(n) − u(n − 5).
Solution 2.5. The convolution of the signals x(n) = e^{−|n|} and h(n) = u(n + 5) − u(n − 6) is
y(n) = x(n) ∗n h(n) = ∑_{k=−∞}^{∞} x(k)h(n − k)  (2.51)
= ∑_{k=−∞}^{∞} e^{−|k|} (u((n − k) + 5) − u((n − k) − 6)).
With
u((n − k) + 5) = { 1, for k ≤ n + 5;  0, for k > n + 5 }  and  u((n − k) − 6) = { 1, for k ≤ n − 6;  0, for k > n − 6 },
we get
u((n − k) + 5) − u((n − k) − 6) = { 1, for n − 6 < k ≤ n + 5;  0, elsewhere }.
Therefore,
y(n) = ∑_{k=n−5}^{n+5} e^{−|k|}.
Since |k| = k, for k ≥ 0, and |k| = −k, for k < 0, we have three cases:
(1) For n + 5 ≤ 0, that is, n ≤ −5, we have k ≤ 0 for all terms. Therefore |k| = −k, and
y(n) = ∑_{k=n−5}^{n+5} e^{k} = e^{n−5}(1 − e^{11})/(1 − e) = e^{n}(e^{−5} − e^{6})/(1 − e) = e^{n}(e^{−5.5} − e^{5.5})/(e^{−0.5} − e^{0.5}) = e^{n} sinh(5.5)/sinh(0.5).
(2) For n − 5 ≥ 0, the lowest index k = n − 5 is greater than or equal to 0. Then k ≥ 0 for all terms and
y(n) = ∑_{k=n−5}^{n+5} e^{−k} = e^{−n+5}(1 − e^{−11})/(1 − e^{−1}) = e^{−n}(e^{5.5} − e^{−5.5})/(e^{0.5} − e^{−0.5}) = e^{−n} sinh(5.5)/sinh(0.5).
(3) For −5 < n < 5, the index k takes both positive and negative values. The convolution is split into two sums as
y(n) = ∑_{k=n−5}^{n+5} e^{−|k|} = ∑_{k=n−5}^{−1} e^{k} + ∑_{k=0}^{n+5} e^{−k} = ∑_{k=1}^{5−n} e^{−k} + ∑_{k=0}^{n+5} e^{−k}
= e^{−1}(1 − e^{−(5−n)})/(1 − e^{−1}) + (1 − e^{−(n+6)})/(1 − e^{−1})
= (cosh(1/2) − e^{−11/2} cosh(n)) / sinh(1/2).
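The three-case closed form can be verified against a direct summation; a minimal sketch, assuming Python with numpy (for x(n) = e^{−|n|} and h(n) = u(n + 5) − u(n − 6), as above):

# y(n) = sum_{k=n-5}^{n+5} exp(-|k|), direct sum versus closed form.
import numpy as np

def y_direct(n):
    k = np.arange(n - 5, n + 6)
    return np.sum(np.exp(-np.abs(k)))

def y_closed(n):
    if n <= -5:
        return np.exp(n) * np.sinh(5.5) / np.sinh(0.5)
    if n >= 5:
        return np.exp(-n) * np.sinh(5.5) / np.sinh(0.5)
    return (np.cosh(0.5) - np.exp(-5.5) * np.cosh(n)) / np.sinh(0.5)

print(all(np.isclose(y_direct(n), y_closed(n)) for n in range(-12, 13)))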
Solution 2.6. (a) For a parallel connection of systems the output signal is given by
y(n) = y1(n) + y2(n) + y3(n)
= ∑_{k=−∞}^{∞} h1(k)x(n − k) + ∑_{k=−∞}^{∞} h2(k)x(n − k) + ∑_{k=−∞}^{∞} h3(k)x(n − k)
= ∑_{k=−∞}^{∞} [h1(k) + h2(k) + h3(k)] x(n − k).
(b) For a cascade of systems with the impulse responses h2 (n) and h3 (n), the output from
the first system is
y2(n) = ∑_{k=−∞}^{∞} h2(k)x(n − k) = ∑_{k=−∞}^{∞} h2(n − k)x(k).
The input to the second system is equal to the output of the first system, while the output of
the second system is
y3(n) = ∑_{m=−∞}^{∞} h3(m) y2(n − m) = ∑_{m=−∞}^{∞} h3(m) ∑_{k=−∞}^{∞} h2(n − m − k)x(k)
= ∑_{k=−∞}^{∞} ∑_{m=−∞}^{∞} h3(m) h2(n − m − k) x(k) = ∑_{k=−∞}^{∞} h23(n − k) x(k),
where
h23(n) = ∑_{m=−∞}^{∞} h3(m) h2(n − m) = h2(n) ∗n h3(n).
with
h2(n) ∗n h3(n) = ∑_{m=−∞}^{∞} e^{−b(n−m)} u(n − m)u(m) = u(n) ∑_{m=0}^{n} e^{−b(n−m)} = ((e^{−bn} − e^{b})/(1 − e^{b})) u(n).
Solution 2.7. Since we know the impulse response h2 (n), we can calculate
From the last relation it follows that h1(n) = 0 for n < 0, h1(0) = h(0) = 1, h1(1) = h(1) − 2h1(0) = 3, h1(2) = h(2) − 2h1(1) − h1(0) = 3, h1(3) = 2, h1(4) = 1, h1(5) = 0, and
h1 (n) = 0 for n > 5. The output signal for the input x (n) = δ(n) − δ(n − 1) can be easily
calculated as
y ( n ) = h ( n ) − h ( n − 1).
Solution 2.8. Instead of a direct convolution we will calculate the frequency response,
H (e jω ), of the discrete system. First, we will find the Fourier transform of the signal
e−n/2 u(n),
H1(e^{jω}) = ∑_{n=0}^{∞} e^{−n/2} e^{−jωn} = 1/(1 − e^{−(1/2+jω)}),
and differentiate both sides with respect to ω,
−j ∑_{n=0}^{∞} n e^{−n/2} e^{−jωn} = −j e^{−(1/2+jω)} / (1 − e^{−(1/2+jω)})².
Therefore,
H(e^{jω}) = ∑_{n=0}^{∞} n e^{−n/2} e^{−jωn} = e^{−(1/2+jω)} / (1 − e^{−(1/2+jω)})².
Solution 2.9. (a) The unit-step signal can be considered as the limit, when a → 0, a > 0, of the signal (1/2)e^{−an}u(n) + 1/2 − (1/2)e^{an}u(−n − 1), whose Fourier transform is
Xa(e^{jω}) = ∑_{n=−∞}^{∞} ( (1/2)e^{−an}u(n) + 1/2 − (1/2)e^{an}u(−n − 1) ) e^{−jωn}
= (1/2)/(1 − e^{−a−jω}) + ∑_{k=−∞}^{∞} πδ(ω + 2kπ) − (1/2) e^{−a+jω}/(1 − e^{−a+jω}).
The Fourier transform of the unit step is
X(e^{jω}) = lim_{a→0} Xa(e^{jω}) = 1/(1 − e^{−jω}) + ∑_{k=−∞}^{∞} πδ(ω + 2kπ).
The result from (2.23) is used to transform the constant signal equal to 1/2.
(b) This signal can be written in the form
Y(e^{jω}) = ∑_{k=−∞}^{∞} ∑_{n=−∞}^{∞} x(n + kN) e^{−jωn} = ∑_{k=−∞}^{∞} X(e^{jω}) e^{jωkN} = X(e^{jω}) ∑_{k=−∞}^{∞} e^{jωkN}.
Solution 2.10. For the impulse response h(n) = a[δ(n + 1) − δ(n − 1)] + b[δ(n + 2) − δ(n − 2)], the stated conditions on the frequency response reduce to
a + 2b = 1/2
a + 4b = 0,
with the solution a = 1 and b = −1/4, producing
h(n) = δ(n + 1) − δ(n − 1) − (1/4)[δ(n + 2) − δ(n − 2)].
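A quick numerical check of this differentiator design, assuming Python with numpy:

# H(e^{jw}) = 2j(sin(w) - sin(2w)/4) of the obtained h(n) approximates jw
# for small w; the error behaves as w^3/3.
import numpy as np

w = np.linspace(-0.5, 0.5, 11)
H = 2j * (np.sin(w) - 0.25 * np.sin(2 * w))
print(np.max(np.abs(H - 1j * w)))    # about |w|^3 / 3 at the interval ends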
Solution 2.11. The triangular window can be written as a convolution of two rectangular windows,
wT(n) = (1/(N + 1)) wR(n) ∗n wR(n),
where w R (n) = u(n + N/2) − u(n − N/2 − 1) is the rectangular window. Since
WR(e^{jω}) = sin(ω(N + 1)/2) / sin(ω/2),
we have
WT(e^{jω}) = (1/(N + 1)) WR(e^{jω}) WR(e^{jω}) = (1/(N + 1)) sin²(ω(N + 1)/2) / sin²(ω/2).
Solution 2.12. This integral represents the energy of a discrete-time signal whose Fourier
transform is defined by
X(e^{jω}) = sin(ω(N + 1)/2) / sin(ω/2).
This signal is a rectangular window, x (n) = u(n + N/2) − u(n − N/2 − 1). Its energy is
I = (1/(2π)) ∫_{−π}^{π} sin²((N + 1)ω/2) / sin²(ω/2) dω = ∑_{n=−N/2}^{N/2} |x(n)|² = ∑_{n=−N/2}^{N/2} 1 = N + 1.
Solution 2.13. For the Hann(ing) window
wH(n) = (1/2)[1 + cos(nπ/N)] [u(N + n) − u(n − N − 1)],
within the overlapping region the sum of two shifted windows is
w(n) = wH(n) + wH(n − N)
= (1/2)[1 + cos(nπ/N)] + (1/2)[1 + cos((n − N)π/N)]
= 1 + (1/2) cos(nπ/N) + (1/2) cos(nπ/N − π) = 1.
In the same way, w(n) = wH(n + N) + wH(n) = 1.
For
w(n) = ∑_{k=−K}^{K} wH(n + kN)
we get
w(n) = { 0, for n < −(K + 1)N + 1;
(1/2)[1 + cos((n + KN)π/N)], for −(K + 1)N + 1 ≤ n ≤ −KN − 1;
1, for −KN ≤ n ≤ KN − 1;
(1/2)[1 + cos((n − KN)π/N)], for KN ≤ n ≤ (K + 1)N − 1;
0, for n > (K + 1)N − 1 },
with
W(e^{jω}) = WH(e^{jω}) ∑_{k=−K}^{K} e^{−jωkN} = WH(e^{jω}) sin(ω(2K + 1)N/2) / sin(ωN/2).
Similar results hold for the Hamming and the triangular window. These results can be
generalized for shifts of N/2, N/4,. . .
For very large K the variations of the second term in W(e^{jω}) are much faster than the variations of WH(e^{jω}). Thus, for large K the Fourier transform W(e^{jω}) approaches the Fourier transform of a rectangular window of the width (2K + 1)N.
Solution 2.14. Based on the definition of the Fourier transform of discrete-time signals,
X(e^{j0}) = ∑_{n=−∞}^{∞} x(n) = 7,  X(e^{jπ}) = ∑_{n=−∞}^{∞} x(n)(−1)^n = 1,
∫_{−π}^{π} X(e^{jω}) dω = 2πx(0) = 4π,  ∫_{−π}^{π} |X(e^{jω})|² dω = 2π ∑_{n=−∞}^{∞} |x(n)|² = 30π.
The signal whose Fourier transform is the real part, Re{X(e^{jω})}, is
y(n) = (x(n) + x∗(−n))/2.
Solution 2.15. The Fourier transform of nx(n), for x(n) = e^{−n/4}u(n), is Y(e^{jω}) = e^{−1/4−jω}/(1 − e^{−1/4−jω})², obtained by differentiating X(e^{jω}) = 1/(1 − e^{−1/4−jω}). The center of gravity is
ng = ∑_{n=−∞}^{∞} n x(n) / ∑_{n=−∞}^{∞} x(n) = Y(e^{j0}) / X(e^{j0}) = ( e^{−1/4}/(1 − e^{−1/4})² ) / ( 1/(1 − e^{−1/4}) ) = 1/(e^{1/4} − 1) = 3.52.
Solution 2.16. (a) The impulse response of the system whose frequency response is H(e^{jω}) = 1 for |ω| ≤ π/3 and H(e^{jω}) = 0 for π/3 < |ω| < π is
h(n) = (1/(2π)) ∫_{−π/3}^{π/3} e^{jωn} dω = e^{jωn}/(2jπn)|_{−π/3}^{π/3} = sin(πn/3)/(πn).
The frequency response value at the input signal frequency ω = ±π/6 is H (e± jπ/6 ) = 1.
The output signal is given by y(n) = sin(nπ/6).
(b) The frequency response is H (e jω ) ∗ω H (e jω ), resulting in y(n) = 0.25 sin(nπ/6).
(c) The output signal is equal to y(n) = sin((n − 2)π/6) = sin(nπ/6 − π/3).
Solution 2.17. (a) For the signal x(t) = cos(20πt + π/4) + sin(90πt), sampled with the sampling interval ∆t = 1/100, the corresponding discrete-time signal is x(n) = [cos(0.2πn + π/4) + sin(0.9πn)]∆t, with the Fourier transform
X(e^{jω}) = (π/100) ∑_{k=−∞}^{∞} [δ(ω − 0.2π + 2kπ)e^{jπ/4} + δ(ω + 0.2π + 2kπ)e^{−jπ/4}]
+ (π/(j100)) ∑_{k=−∞}^{∞} [δ(ω − 0.9π + 2kπ) − δ(ω + 0.9π + 2kπ)].
Figure 2.11 Illustration of the system output with various sampling intervals (a)-(c).
(b) If the signal is sampled with the sampling interval ∆t = 1/50 the discrete-time signal is
X(e^{jω}) = (π/50) ∑_{k=−∞}^{∞} [δ(ω − 0.4π + 2kπ)e^{jπ/4} + δ(ω + 0.4π + 2kπ)e^{−jπ/4}]
+ (π/(j50)) ∑_{k=−∞}^{∞} [δ(ω − 1.8π + 2kπ) − δ(ω + 1.8π + 2kπ)].
The Fourier transform components within the basic period, −π ≤ ω < π, are
X(e^{jω}) = (π/50)[δ(ω − 0.4π)e^{jπ/4} + δ(ω + 0.4π)e^{−jπ/4}]
+ (π/(j50))[δ(ω − 1.8π + 2π) − δ(ω + 1.8π − 2π)]
= (π/50)[δ(ω − 0.4π)e^{jπ/4} + δ(ω + 0.4π)e^{−jπ/4}] + (π/(j50))[δ(ω + 0.2π) − δ(ω − 0.2π)].
The component − sin(10πt) does not correspond to any frequency in the input signal, Fig.
2.11(middle). This effect is illustrated in Fig. 2.12.
Figure 2.12 Illustration of the aliasing caused frequency change, from signal sin(90πt) to signal − sin(10πt).
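This frequency change can be confirmed directly on the signal samples; a minimal sketch, assuming Python with numpy:

# Aliasing in Solution 2.17(b): sin(90*pi*t) sampled with dt = 1/50
# produces exactly the same samples as -sin(10*pi*t).
import numpy as np

n = np.arange(20)
dt = 1 / 50
print(np.allclose(np.sin(90 * np.pi * n * dt), -np.sin(10 * np.pi * n * dt)))
# True, since 1.8*pi and 1.8*pi - 2*pi = -0.2*pi are indistinguishable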
(c) For the sampling interval ∆t = 3/100 the frequencies of the discrete-time signal are ω = 0.6π and ω = 2.7π. The Fourier transform components within the basic period, −π ≤ ω < π, are given by
X(e^{jω}) = (3π/100)[δ(ω − 0.6π)e^{jπ/4} + δ(ω + 0.6π)e^{−jπ/4}]
+ (3π/(j100))[δ(ω − 2.7π + 2π) − δ(ω + 2.7π − 2π)].
Solution 2.18. The impulse response of the Hilbert transformer, whose frequency response is H(e^{jω}) = −j sign(ω), is
h(n) = (1/(2π)) ∫_{−π}^{π} (−j) sign(ω) e^{jωn} dω = (1 − cos(πn))/(πn) = 2 sin²(πn/2)/(πn).
For n = 0 the impulse response is h(0) = 0. The discrete-time Hilbert transformer, in the
frequency and the time domain, is shown in Fig. 2.13.
Figure 2.13 Frequency and impulse response of the discrete-time Hilbert transformer.
Solution 2.19. From the definition and conditions for the sampling theorem we could
conclude that the maximum sampling interval should be related to the maximum frequency
5Ω1 as ∆t = π/(5Ω1 ), corresponding to the periodical extension of the Fourier transform
X (Ω) with period 10Ω1 . However, in this case, there is no need to use such a large period in
order to achieve that two periods do not overlap. It is sufficient to use the period equal to
2Ω1 , as shown in Fig. 2.14. In this case, we will be able to reconstruct the signal, with some
additional processing.
Figure 2.14 Problem 2.19: illustration of the Fourier transform periodic extension.
It is obvious that, after the signal sampling with ∆t = π/Ω1 (periodic extension of the Fourier transform with 2Ω1), the basic period −Ω1 < Ω < Ω1 will contain the original values of X(Ω), shifted from the band 3Ω1 < |Ω| < 5Ω1, so that the signal can be reconstructed with appropriate additional processing. The maximum possible sampling interval is therefore ∆t = π/Ω1.
Solution 2.20. For a signal whose Fourier transform is zero for frequencies |Ω| ≥ Ωm = 2π fm = π/∆t,
X(Ω) = X(Ω)H(Ω)
holds, where
H(Ω) = { 1, for |Ω| < Ωm = π/∆t;  0, for |Ω| ≥ Ωm = π/∆t }.
The impulse response corresponding to H(Ω) is
h(t) = (1/(2π)) ∫_{−π/∆t}^{π/∆t} e^{jΩt} dΩ = sin(πt/∆t)/(πt).
Therefore,
x(t) = ∫_{−∞}^{∞} x(τ)h(t − τ) dτ = ∫_{−∞}^{∞} x(τ) sin(π(t − τ)/∆t)/(π(t − τ)) dτ.  (2.52)
This relation also holds if the Fourier transform of the signal, X(Ω), is periodically extended with the period 2π/∆t ≥ 2Ωm, to produce
(1/(2π)) X(Ω) ∗Ω ∑_{k=−∞}^{∞} 2πδ(Ω − 2πk/∆t) = Xp(Ω).
Convolution in the frequency domain of two Fourier transforms corresponds to the product of signals in the time domain, that is,
x(t) ∑_{n=−∞}^{∞} δ(t + n∆t)∆t = IFT{Xp(Ω)} = xp(t).  (2.53)
From (2.52) and then (2.53) the discrete-time form follows
x(t) = xp(t) ∗t h(t) = ∫_{−∞}^{∞} x(τ) ∑_{n=−∞}^{∞} δ(τ − n∆t) h(t − τ) ∆t dτ
= ∑_{n=−∞}^{∞} x(n∆t) h(t − n∆t) ∆t = ∑_{n=−∞}^{∞} x(n∆t) sin(π(t − n∆t)/∆t) / (π(t − n∆t)/∆t).  (2.54)
Figure 2.15 Smoothed filter in the sampling theorem illustration (first two graphs) versus original sampling
theorem relation within filtering framework.
illustrated in Fig. 2.15. The sampling step should be such that Ωm + ∆Ωm/2 = π/∆t, so that the periodic extension of X(Ω)H(Ω) does not include overlapped X(Ω) values. The impulse
response h(t) can be then used in the reconstruction formula
x(t) = ∑_{n=−∞}^{∞} x(n∆t) h(t − n∆t),
with a reduction of the sampling interval to ∆t = π/(Ωm + ∆Ωm/2) with respect to ∆t = π/Ωm.
X1(Ω) = ∑_{n=−∞}^{∞} X(Ω + 2πn/∆t),
X2(Ω) = ∑_{n=−∞}^{∞} X(Ω + 2πn/∆t) e^{j(Ω+2πn/∆t)a}.
Within the basic period (considering the positive frequencies 0 ≤ Ω < Ωm ), only two periods
overlap
The second term X (Ω − 2π/∆t) in these relations is the overlapped period (aliasing) of
the Fourier transform, that should be eliminated using these two equations. The Fourier
transform X (Ω) of the original signal follows in the form
Similarly for negative frequencies, within the basic period −Ωm < Ω < 0, follows
Therefore, the signal can be reconstructed from two independent discrete-time signals, each undersampled by a factor of two. A similar result can be derived for N independently sampled, N times undersampled signals.
with
Ω0 = (1/∆t) arccos( (x(t0 + ∆t) + x(t0 − ∆t)) / (2x(t0)) ).
The condition for a unique solution is 0 ≤ Ω0∆t ≤ π, limiting the approach to small values of ∆t.
We will discuss the discrete complex-valued signal as well. For a complex sinusoid
x (n) = A exp( j2πk0 n/N + φ0 ), with available two samples x (n1 ) = A exp( jϕ(n1 )) and
x (n2 ) = A exp( jϕ(n2 )), from
x(n1)/x(n2) = exp(j2πk0(n1 − n2)/N)
follows
2πk0(n1 − n2)/N = ϕ(n1) − ϕ(n2) + 2kπ,
where k is an arbitrary integer. Then
k0 = ((ϕ(n1) − ϕ(n2))/(2π(n1 − n2))) N + (k/(n1 − n2)) N.  (2.55)
Let us analyze the role of the ambiguous term kN/(n1 − n2) in the determination of k0. For n1 − n2 = 1, this term is kN, meaning that any frequency k0 would be ambiguous with k0 + kN. Any value k0 + kN, for k ≠ 0, in this case, will be outside the basic period 0 ≤ k ≤ N − 1. Thus, we may find k0 in a unique way, within 0 ≤ k0 ≤ N − 1. However, for n1 − n2 = L > 1, the terms kN/(n1 − n2) = kN/L produce shifts within the frequency basic period. Then
several possible solutions for the frequency k0 are obtained. For example, for N = 16 and k0 = 5, if we use n1 = 1 and n2 = 5, possible solutions to (2.55) are of the form k0 = 5 + 16k/4, that is, k0 = 5, k0 = 9, k0 = 13, and k0 = 1, within 0 ≤ k0 ≤ 15, are possible solutions for the frequency of the considered discrete-time signal.
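The ambiguity in (2.55) can be illustrated numerically; a minimal sketch, assuming Python with numpy:

# For N = 16, the samples at n1 = 1 and n2 = 5 cannot distinguish the
# frequency k0 = 5 from 9, 13, or 1: the ratio x(n1)/x(n2) is the same.
import numpy as np

N, n1, n2 = 16, 1, 5

def ratio(k0):
    x = lambda n: np.exp(2j * np.pi * k0 * n / N)
    return x(n1) / x(n2)

print([k0 for k0 in range(N) if np.isclose(ratio(k0), ratio(5))])
# [1, 5, 9, 13]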
Solution 2.23. For the absolute values of the even and odd parts of the signal,
|xe(n)|² + |xo(n)|² = |(x(n) + x(−n))/2|² + |(x(n) − x(−n))/2|²
= (|x(n)|² + |x(−n)|²)/2 = A²s(n).
Obviously, |xe(n)|² ≤ A²s(n) and |xo(n)|² ≤ A²s(n). Replacing |xo(n)| = √(A²s(n) − |xe(n)|²) into |xe(n)| + |xo(n)|, we get
|xe(n)| + |xo(n)| = |xe(n)| + √(A²s(n) − |xe(n)|²).
Considered as a function of χ = |xe(n)|, 0 ≤ χ ≤ As(n), this expression has the maximum √2 As(n), at χ = As(n)/√2, while the minimum value is achieved at the interval ending points, for χ = 0 or χ = As(n), producing the final result
As(n) ≤ |xe(n)| + |xo(n)| ≤ √2 As(n).
Chapter 3
Discrete Fourier Transform
DISCRETE-TIME signals can be processed on digital computers in the time domain. For processing in the frequency domain, the discrete Fourier transform (DFT) of a signal x(n) with N samples is defined by
DFT{x(n)} = X(k) = ∑_{n=0}^{N−1} x(n) e^{−j2πkn/N}  (3.1)
for k = 0, 1, 2, . . . , N − 1.
In order to establish the relation between the DFT with the Fourier transform of discrete-
time signals, consider a discrete-time signal x (n) of limited duration. Assume that nonzero
samples of x (n) are within 0 ≤ n ≤ N0 − 1. The Fourier transform of this discrete-time
signal is given by
X(e^{jω}) = ∑_{n=0}^{N0−1} x(n) e^{−jωn}.
The DFT values can be considered as the frequency domain samples of the Fourier transform
of discrete-time signals, taken with the sampling interval ∆ω = 2π/N in the frequency
domain, where N is the number of samples of X (e jω ) within the period −π ≤ ω < π,
X(k) = X(e^{j2πk/N}) = X(e^{jω})|_{ω=k∆ω=2πk/N}.  (3.2)
In order to examine how the Fourier transform sampling in the frequency domain
influences the signal in the time domain, we will form a periodic extension of the discrete-
time signal x (n) with the period N equal to the number of the samples within the basic
frequency domain period, such that N ≥ N0 , Fig. 3.1.
Figure 3.1 Discrete-time signal x(n) = x(t)∆t|_{t=n∆t}, within 0 ≤ n ≤ N − 1 (top), and its periodic extension xp(n) (bottom).
With N greater than or equal to the signal duration N0, we will be able to reconstruct
the original signal x(n) from its periodic extension xp(n). Furthermore, we will assume that
the periodic signal x p (n) is formed from the samples of periodic continuous-time signal
x p (t) with a period T (corresponding to N signal samples within the period, T = N∆t, and
x p (n) = x p (n∆t)∆t). The Fourier series coefficients of x p (t) are defined by
Xk = (1/T) ∫_{0}^{T} xp(t) e^{−j2πkt/T} dt.
Assuming that the sampling theorem is satisfied, the integral can be replaced by a sum (in
the sense of Example 2.13)
Xk = (1/T) ∑_{n=0}^{N−1} x(n∆t) e^{−j2πkn∆t/T} ∆t,
with x p (t) = x (t) within 0 ≤ t < T. Using T/∆t = N, x (n∆t)∆t = x (n), and X (k) = TXk
this sum can be written in the form
X(k) = ∑_{n=0}^{N−1} x(n) e^{−j2πkn/N}.  (3.3)
Therefore, the relation between the DFT and the Fourier series coefficients is
X (k ) = TXk . (3.4)
The inverse DFT is obtained by multiplying both sides of the DFT definition (3.1) by
e j2πkm/N and summing over k
∑_{k=0}^{N−1} X(k) e^{j2πmk/N} = ∑_{n=0}^{N−1} x(n) ∑_{k=0}^{N−1} e^{j2πk(m−n)/N},
with
∑_{k=0}^{N−1} e^{j2πk(m−n)/N} = (1 − e^{j2π(m−n)})/(1 − e^{j2π(m−n)/N}) = Nδ(m − n),
for 0 ≤ m, n ≤ N − 1. The inverse discrete Fourier transform (IDFT) of a signal x (n) is
x(n) = (1/N) ∑_{k=0}^{N−1} X(k) e^{j2πnk/N},  (3.5)
for 0 ≤ n ≤ N − 1.
The signal calculated using the IDFT is, by definition, periodic with the period N since
x(n + N) = (1/N) ∑_{k=0}^{N−1} X(k) e^{j2π(n+N)k/N} = x(n).
Therefore the DFT of a signal x (n) calculated using the signal samples within
0 ≤ n ≤ N − 1 assumes that the signal x (n) is periodically extended with the period
N, that is
IDFT{DFT{x(n)}} = ∑_{m=−∞}^{∞} x(n + mN),
with ∑_{m=−∞}^{∞} x(n + mN) = x(n) for 0 ≤ n ≤ N − 1.
The values of this periodical extension within the basic period are equal to x (n). This is a
circular extension of the signal x(n). Other notations are also used in the literature for this kind of extension.
If the original signal x(n), used in the DFT calculation, is aperiodic, then IDFT{DFT{x(n)}} = x(n) holds only within 0 ≤ n ≤ N − 1, assuming that the initial DFT was calculated for the signal samples x(n) within 0 ≤ n ≤ N − 1.
In literature it is quite common to use the same notation for both x (n) and
IDFT{DFT{ x (n)}} having in mind that any DFT calculation with N signal samples
implicitly assumes a periodic extension of the original signal x (n) with period N. Thus, we
will use this kind of notation, except in the cases when we want to emphasize a difference in
the results when the inherent periodicity in the signal (when the DFT is used) is not properly
taken into account.
Example 3.1. For the signals x (n) = 2 cos(2πn/8) for 0 ≤ n ≤ 7 and x (n) = 2 cos(2πn/16) for
0 ≤ n ≤ 7 plot the periodic signals IDFT{DFT{ x (n)}} with N = 8 without calculating the
DFTs.
Figure 3.2 Signals x (n) = 2 cos(2πn/8) for 0 ≤ n ≤ 7 (left) and x (n) = 2 cos(2πn/16) for 0 ≤ n ≤ 7 (right)
along with their periodic extensions IDFT{DFT{ x (n)}} with N = 8.
Example 3.2. For the signal x (n), whose values are x (0) = 1, x (1) = 1/2, x (2) = −1, and
x (3) = 1/2, find the DFT with N = 4. What is the IDFT for n = −2?
⋆The DFT values are X(k) = 1 + cos(2πk/4) + (−1)^{k+1}, that is, X(0) = 1, X(1) = 2, X(2) = −1, and X(3) = 2. The IDFT is
x(n) = (1/4) ∑_{k=0}^{3} [1 + cos(2πk/4) + (−1)^{k+1}] e^{j2πnk/4},
for 0 ≤ n ≤ 3. The DFT and IDFT inherently assume the signal and its Fourier transform
periodicity. Thus, the result for n = −2 is
x(−2) = (1/4) ∑_{k=0}^{3} X(k) e^{j2π(−2)k/4} = (1/4) ∑_{k=0}^{3} X(k) e^{j2π(4−2)k/4} = x(4 − 2) = x(2) = −1.
Matrix notation and calculation complexity: The DFT can be written in a matrix form as
[ X(0); X(1); . . . ; X(N − 1) ] = [ 1, 1, . . . , 1;  1, e^{−j2π/N}, . . . , e^{−j2π(N−1)/N};  . . . ;  1, e^{−j2π(N−1)/N}, . . . , e^{−j2π(N−1)(N−1)/N} ] [ x(0); x(1); . . . ; x(N − 1) ]  (3.6)
or
X = Wx, (3.7)
where x and X are the vectors containing the signal values and its DFT values, respectively, while W is the discrete Fourier transform matrix with the coefficients
W = [ 1, 1, . . . , 1;  1, WN, . . . , WN^{N−1};  . . . ;  1, WN^{N−1}, . . . , WN^{(N−1)(N−1)} ],  (3.8)
where
WN^{k} = e^{−j2πk/N}
is used to simplify the notation.
The number of additions to calculate a DFT is N − 1 for every X (k) in (3.1). Since
there are N DFT coefficients, the total number of additions is N ( N − 1). From the matrix
in (3.6) we can see that the multiplications are not needed for calculation of X (0). There is
no need for a multiplication in the first term of every coefficient calculation as well. If we
neglect the fact that some other terms in matrix (3.6) may also take values 1, −1, j, or − j
then the number of multiplications is ( N − 1)2 . The order of the number of multiplications
and the number of additions for the DFT calculation is N 2 .
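The matrix form (3.6)-(3.8) translates directly into code; a minimal sketch, assuming Python with numpy:

# X = W x with W[k, n] = exp(-j*2*pi*k*n/N); this direct product costs an
# order of N^2 operations and matches the FFT result.
import numpy as np

N = 8
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)

x = np.random.randn(N)
print(np.allclose(W @ x, np.fft.fft(x)))     # True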
The inverse DFT in a matrix form is
x = W^{−1}X,  (3.9)
where W^{−1} = W∗/N.
Most of the DFT properties can be derived in the same way as for the Fourier transform and
the Fourier transform of discrete-time signals.
IDFT{X(k) e^{−j2πkn0/N}} = (1/N) ∑_{k=0}^{N−1} X(k) e^{−j2πkn0/N} e^{j2πkn/N}
= (1/N) ∑_{k=0}^{N−1} X(k) e^{j2πk(n−n0)/N} = x(n − n0).  (3.10)
Here x (n − n0 ) is the signal obtained when x (n) is periodically extended with N first
and then this periodic signal is shifted for n0 . The basic period of the original signal
x (n) is now within n0 ≤ n ≤ N + n0 − 1.
This kind of shift in periodic signals, used in the above relation, is also referred to as a circular shift.
For a real-valued signal x(n), the DFT satisfies
X∗(k) = X(N − k).
4. Parseval’s theorem of discrete-time periodic signals relates the energy in the time
domain and the frequency domain
∑_{n=0}^{N−1} |x(n)|² = ∑_{n=0}^{N−1} x(n)x∗(n) = ∑_{n=0}^{N−1} x(n) (1/N) ∑_{k=0}^{N−1} X∗(k) e^{−j2πnk/N}
= (1/N) ∑_{k=0}^{N−1} X∗(k) ∑_{n=0}^{N−1} x(n) e^{−j2πnk/N} = (1/N) ∑_{k=0}^{N−1} |X(k)|².
5. Convolution of two periodic signals x (n) and h(n), whose period is N, is defined
by
y(n) = ∑_{m=0}^{N−1} x(m)h(n − m).
The DFT of this signal is
Y(k) = DFT{y(n)} = ∑_{n=0}^{N−1} ∑_{m=0}^{N−1} x(m)h(n − m) e^{−j2πnk/N} = X(k)H(k).  (3.13)
Thus, the DFT of the convolution of two periodic signals is equal to the product of
the DFTs of individual signals. Since the convolution is performed on periodic signals
(the DFT inherently assumes signals periodicity), a circular shift of signals is assumed
in the calculation. This kind of convolution is called circular convolution.
Relation (3.13) indicates that we can calculate convolution of two aperiodic discrete-
time signals of a limited duration in the following way:
• Calculate the DFTs of signals x (n) and h(n) with N nonzero samples, to obtain
X (k) and H (k). At this point, inherently, we make periodic extensions of x (n)
and h(n), with a period N.
• Multiply these two DFTs to obtain the DFT of the output signal Y (k ) =
X ( k ) H ( k ).
• Calculate the inverse DFT to get the convolution (the output signal with N
samples)
y(n) = IDFT{Y (k)}.
This procedure looks computationally more complex than the direct calculation of
convolution, by definition. However, due to very efficient and fast routines for the
DFT and the IDFT calculation, this way of calculating the convolution could be more
efficient than the direct one.
In using this procedure, we have to take care of the lengths of the signals and of their DFTs, which assume periodic extensions, as in the sketch below.
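A minimal sketch of the described DFT-based convolution, assuming Python with numpy (the FFT routines calculate the DFT and the IDFT):

# Zero-padding both signals to N >= M + L - 1 makes one period of the
# circular convolution equal to the aperiodic convolution.
import numpy as np

x = np.random.randn(12)              # M = 12
h = np.random.randn(5)               # L = 5
N = len(x) + len(h) - 1              # N = M + L - 1 = 16

Y = np.fft.fft(x, N) * np.fft.fft(h, N)
y = np.fft.ifft(Y).real
print(np.allclose(y, np.convolve(x, h)))     # True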
Example 3.3. Consider the signal x(n) = u(n) − u(n − 5). Calculate the convolution x(n) ∗ x(n). Periodically extend the signal with period N = 7 and calculate the circular convolution (corresponding to the DFT-based convolution calculation with N = 7). This value of N satisfies the condition that it is larger than the signal duration. Compare the results. What value of N should be used for the period so that the direct convolution corresponds to one period of the circular convolution?
⋆The signal x (n) and its reversed version x (−n), along with some shifted signals used
in the convolution calculation, are presented in Fig. 3.3.
In the circular (DFT) calculation, for example, at n = 0, the convolution value is
xp(n) ∗n xp(n)|_{n=0} = ∑_{m=0}^{6} xp(m) xp(0 − m) = 1 + 0 + 0 + 1 + 1 + 0 + 0 = 3.
In addition to the term x(0)x(0) = 1, which exists in the aperiodic convolution, two terms for m = 3 and m = 4 appeared due to the periodic extension of the signal. They make the circular convolution value different from the convolution of the original aperiodic signals. The same situation occurs for n = 1 and n = 2. For n = 3, 4, and 5 the correct result for the aperiodic convolution is obtained using the circular convolution.
From the previous calculation, it could be concluded that if the signal periods in the
calculation of the circular convolution were separated by at least two more zero signal
samples (if the period N were N ≥ 9) this difference would not occur (overlapping
of the signal samples in the basic period with the extended period samples would be
avoided), as shown in Fig. 3.4 for N = 9. Then one period of the circular convolution, for
0 ≤ n ≤ N − 1, would correspond to the original aperiodic convolution.
Figure 3.3 Illustration of the discrete-time signal convolution and circular convolution for signals whose length is
5 and the circular convolution is calculated with N = 7.
Figure 3.4 Illustration of the discrete-time signal circular convolution for signals whose length is 5 and the circular
convolution is calculated with N = 9.
Generalization: If a signal x (n) is of the length M, then we can calculate its DFT with
any N ≥ M, and the signal will not overlap with its extended periods, implicitly added
using the DFT. If a signal h(n) is of the length L, then we can calculate its DFT with
any N ≥ L. However, if we want to use their DFTs for the convolution calculation (to
use the circular convolution), then from the previous example we see that the length of
the convolution y(n) is equal to M + L − 1. Therefore, for the DFT-based calculation of
the convolution y(n), we have to use at least
N ≥ M + L − 1.
This means that both DFTs, X(k) and H(k), whose product results in Y(k), must be calculated with a duration (period) of at least N ≥ M + L − 1. Otherwise, aliasing (overlapping of the
periods) will appear and the circular convolution calculated in this way would not
correspond (within the basic period) to the convolution of the original discrete-time
(aperiodic) signals.
The duration of an input signal x(n) may be much longer than the duration of the impulse response h(n). For example, an input signal may have tens of thousands of samples, while the duration of the impulse response of a discrete system is, for example, tens of samples, that is,
M ≫ L. The direct convolution of these two signals could be calculated (after the first L − 1 output samples) as
y(n) = ∑_{m=n−L+1}^{n} x(m)h(n − m).
For every output sample, L multiplications would be used. For a direct DFT application in
the convolution calculation we should wait until the end of the signal and then zero-pad
both the input signal and the impulse response up to M + L − 1. This kind of calculation
is not efficient. Instead of using the direct DFT calculation, the signal can be split into
nonoverlapping sequences whose duration N is of the order of the impulse response duration
L,
x(n) = ∑_{k=0}^{K−1} xk(n),
where
xk(n) = x(n)[u(n − kN) − u(n − (k + 1)N)]
and M = KN (the input signal can always be zero-padded up to the nearest KN duration,
where K is an integer). The output signal is
y(n) = ∑_{k=0}^{K−1} ( ∑_{m=n−L+1}^{n} xk(m) h(n − m) ) = ∑_{k=0}^{K−1} yk(n).  (3.14)
For the calculation of the convolutions yk (n) = xk (n) ∗n h(n), the signals xk (n) and h(n)
should be of duration N + L − 1 only. These convolutions can be calculated after every
N ≪ M input signal samples. The duration of each output sequence yk(n) is N + L − 1. Since yk(n), k = 0, 1, . . . , K − 1, are calculated with the step N in the time domain, they overlap, although the input signals xk(n) are nonoverlapping. For two successive outputs yk(n) and yk+1(n), with L ≤ N, the L − 1 samples within kN + N ≤ n < kN + N + L − 1 overlap.
This effect should be taken into account, by summing the overlapped output samples in y(n),
after the individual convolutions yk (n) = xk (n) ∗n h(n) are calculated using the DFTs, as
shown in Fig. 3.5.
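A minimal overlap-add sketch of (3.14), assuming Python with numpy; the block length N and the test signals are arbitrary:

# The input is split into blocks of N samples, each block is convolved
# with h(n) through the DFT, and the overlapped outputs are summed.
import numpy as np

def overlap_add(x, h, N=64):
    L = len(h)
    nfft = N + L - 1
    H = np.fft.fft(h, nfft)
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(x), N):
        block = x[start:start + N]
        yk = np.fft.ifft(np.fft.fft(block, nfft) * H).real
        y[start:start + len(block) + L - 1] += yk[:len(block) + L - 1]
    return y

x = np.random.randn(1000)
h = np.random.randn(16)
print(np.allclose(overlap_add(x, h), np.convolve(x, h)))   # True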
The basic period of the DFT X (k), calculated for k = 0, 1, 2, . . . , N − 1, should be considered
as having two parts:
• One part of the DFT values for 0 ≤ k ≤ N/2 − 1, which corresponds to the positive
frequencies
ω = (2π/N)k or Ω = (2π/(N∆t))k, for 0 ≤ k ≤ N/2 − 1,  (3.15)
and the
• Other part being a shifted version of the DFT corresponding to the negative frequencies
(in the original aperiodic signal)
ω = (2π/N)(k − N) or Ω = (2π/(N∆t))(k − N), for N/2 ≤ k ≤ N − 1.  (3.16)
An illustration of the correspondence between the frequency value and the frequency index in the DFT is given in Fig. 3.6.
We have seen that the DFT of a signal whose duration is limited to M samples can be
calculated using any N ≥ M. In practice, this means that we can add (use) as many zeros,
after the nonzero signal x (n) values, as we like. By doing this, we increase the calculation
complexity, but we also increase the number of samples within the same frequency range of
the Fourier transform.
If we recall that X(k) = X(e^{jω})|_{ω=2πk/N} holds in the case when the sampling theorem is satisfied, then we see that by increasing N
in the DFT calculation, the density of sampling (interpolation) in the Fourier transform of
the original signal increases. The DFT interpolation by zero padding the signal in the time
domain is illustrated in Fig. 3.7.
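Zero padding as interpolation of the DFT is easily demonstrated; a minimal sketch, assuming Python with numpy:

# The zero-padded DFT samples the same X(e^{jw}) on a denser grid: every
# fourth sample of the N = 64 DFT equals the N = 16 DFT.
import numpy as np

x = np.random.randn(16)                      # M = 16 nonzero samples
X_dense = np.fft.fft(x, 64)                  # zero-padded to N = 64
X_coarse = np.fft.fft(x, 16)
print(np.allclose(X_dense[::4], X_coarse))   # True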
The same holds for the frequency domain. If we calculate the DFT of a signal with N
samples and then add, for example, N zeros after the region corresponding to the highest
frequencies, then by the IDFT of this 2N point DFT, we will interpolate the original signal in
time. All zero values in the frequency domain should be inserted between two parts (regions)
of the original DFT corresponding to positive and negative frequencies.
Figure 3.5 Illustration of the calculation of the convolution, y(n), when the duration of the input signal, x(n), is much longer than the duration of the system impulse response, h(n).
Figure 3.6 Relation between the frequency in the continuous-time and the DFT frequency index.
Example 3.4. The Hann(ing) window for a signal truncation within − N/2 ≤ n ≤ N/2 − 1, is
w(n) = (1/2)[1 + cos(2πn/N)], for −N/2 ≤ n ≤ N/2 − 1.  (3.18)
If the original signal values are within 0 ≤ n ≤ N − 1 then the Hann(ing) window form is
w(n) = (1/2)[1 − cos(2πn/N)], for 0 ≤ n ≤ N − 1.  (3.19)
Present the zero-padded forms of the Hann(ing) windows with 2N samples.
⋆The zero-padded form of the Hann(ing) windows used for windowing data within the intervals
− N/2 ≤ n ≤ N/2 − 1 and 0 ≤ n ≤ N − 1 are shown in Fig. 3.8. The DFTs of windows (3.18)
and (3.19) are
W (k) = N [δ(k) + δ(k − 1)/2 + δ(k + 1)/2]/2
and
W (k) = N [δ(k) − δ(k − 1)/2 − δ(k + 1)/2]/2,
respectively. After the presented zero-padding, the symmetry property that guarantees a real-valued window DFT is preserved (for an even N in the case −N/2 ≤ n ≤ N/2 − 1 and for an odd N for data within 0 ≤ n ≤ N − 1).
Figure 3.7 Discrete-time signal and its DFT (top two subplots). Discrete-time signal zero-padded and its DFT
interpolated (two subplots in the middle). Zero-padding (interpolation) factor was 2. Discrete-time signal zero-
padded and its DFT interpolated (two bottom subplots). Zero-padding (interpolation) factor was 4. According to the
duality property, the same holds if X (k) were a signal in the discrete-time and x (−n) was its Fourier transform.
Figure 3.8 Zero-padding of the Hann(ing) windows used to window data within − N/2 ≤ n ≤ N/2 − 1 and
0 ≤ n ≤ N − 1.
Presentation of the DFT will be concluded with an illustration (Fig. 3.9) of the relation
among the four forms of the Fourier domain signal representations for the cases of:
x(t) = (1/(2π)) ∫_{−∞}^{∞} X(Ω) e^{jΩt} dΩ,  X(Ω) = ∫_{−∞}^{∞} x(t) e^{−jΩt} dt.
xp(t) = ∑_{n=−∞}^{∞} Xn e^{j2πnt/T},  Xn = (1/T) ∫_{−T/2}^{T/2} x(t) e^{−j2πnt/T} dt,
Xn = (1/T) X(Ω)|_{Ω=2πn/T}.
If the periodic signal is formed by a periodic extension of an aperiodic signal x (t) then
there is no signal overlapping (aliasing) in the periodic signal if the original aperiodic
signal duration is shorter than the extension period T and the previous relation holds.
x (n) = x (n∆t)∆t,
x(n) = (1/(2π)) ∫_{−π}^{π} X(e^{jω}) e^{jωn} dω,  X(e^{jω}) = ∑_{n=−∞}^{∞} x(n) e^{−jωn},
X(e^{jω}) = ∑_{m=−∞}^{∞} X(Ω + 2πm/∆t)|_{Ω=ω/∆t}.
Figure 3.9 Aperiodic continuous-time signal and its Fourier transform (first row). Discrete-time signal and its
Fourier transform (second row). Periodic continuous-time signal and its Fourier series coefficients (third row).
Periodic discrete-time signal and its discrete Fourier transform (DFT), (fourth row).
xp(n) = (1/N) ∑_{k=0}^{N−1} X(k) e^{j2πnk/N},  X(k) = ∑_{n=0}^{N−1} x(n) e^{−j2πnk/N},
All forms of the Fourier representations are related and shown in Fig. 3.9.
Algorithms that provide efficient calculation of the DFT, with a reduced number of arithmetic
operations, are called the fast Fourier transform (FFT). A unified approach to the DFT and
the inverse DFT, (3.5), is used. The only differences between the DFT and inverse DFT
calculation are in the sign of the exponent and a division of the final result by N.
Here we will present an algorithm based on splitting the signal x (n), with N samples,
into two signals x (n) for 0 ≤ n ≤ N/2 − 1 and x (n) for N/2 ≤ n ≤ N − 1, whose duration
is N/2. It is assumed that N is an even number. By definition, the DFT of a signal with N
samples is
DFTN{x(n)} = X(k) = ∑_{n=0}^{N−1} x(n) e^{−j2πnk/N}
= ∑_{n=0}^{N/2−1} x(n) e^{−j2πnk/N} + ∑_{n=N/2}^{N−1} x(n) e^{−j2πnk/N}
= ∑_{n=0}^{N/2−1} [x(n) + x(n + N/2)(−1)^k] e^{−j2πnk/N},
since
e^{−j2π(n+N/2)k/N} = e^{−j2πnk/N} e^{−jπk} = (−1)^k e^{−j2πnk/N}.
For an even number, k = 2r, we have
DFTN/2{g(n)} = X(2r) = ∑_{n=0}^{N/2−1} g(n) e^{−j2πnr/(N/2)},
with
g(n) = x (n) + x (n + N/2).
For an odd number, k = 2r + 1, follows
DFTN/2{h(n)} = X(2r + 1) = ∑_{n=0}^{N/2−1} h(n) e^{−j2πnr/(N/2)},
where
h(n) = (x(n) − x(n + N/2)) e^{−j2πn/N}.
In this way, one DFT of N elements is split into two DFTs of N/2 elements. Having in mind that the direct calculation of a DFT with N elements requires an order of N² operations, it means that we have reduced the calculation complexity, since 2(N/2)² = N²/2 < N².
An illustration of the DFT calculation, with N = 8, using two DFT with N/2 = 4 is
shown in Fig. 3.10.
We can continue in the same way and split every DFT with N/2 elements into two
DFTs with N/4, and so on. A complete calculation scheme is shown in Fig. 3.11.
This kind of the DFT calculation is referred to as the decimation-in-frequency algorithm.
We can conclude that in this FFT algorithm an order of N log2 N of operations is required.
Here, it has been assumed that log2 N = p is an integer, that is, N = 2 p .
If we want to be precise, the number of additions is exactly

N_additions = N log_2 N.

In the first stage, there are (N/2 − 1) multiplications. In the second stage, there are 2(N/4 − 1) multiplications. In the next stage there would be 4(N/8 − 1) multiplications. Finally, in the last stage there would be 2^{p−1}(N/2^p − 1) = (N/2)(N/N − 1) = 0 multiplications (N = 2^p or p = log_2 N). The total number of multiplications in this FFT algorithm is

N_multiplicat. = (N/2 − 1) + 2(N/4 − 1) + 4(N/8 − 1) + ··· + 2^{p−1}(N/2^p − 1)
= (N/2 − 1) + (N/2 − 2) + (N/2 − 4) + ··· + (N/2 − N/2)
= (N/2)p − (1 + 2 + 2^2 + ··· + 2^{p−1}) = (N/2)p − (1 − 2^p)/(1 − 2)
= (N/2) log_2 N − (N − 1) = (N/2)[log_2 N − 2] + 1.
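As a quick numerical illustration of the decimation-in-frequency splitting, the following minimal Python/NumPy sketch (our own; the function name fft_dif and the use of NumPy are assumptions, not the book's) applies the g(n) and h(n) relations recursively and checks the result against a library FFT.

    import numpy as np

    def fft_dif(x):
        # Radix-2 decimation-in-frequency FFT sketch; len(x) must be a power of 2.
        x = np.asarray(x, dtype=complex)
        N = len(x)
        if N == 1:
            return x
        g = x[:N//2] + x[N//2:]                              # g(n) = x(n) + x(n + N/2)
        h = (x[:N//2] - x[N//2:]) * np.exp(-2j*np.pi*np.arange(N//2)/N)
        X = np.empty(N, dtype=complex)
        X[0::2] = fft_dif(g)                                 # X(2r)   = DFT_{N/2}{g(n)}
        X[1::2] = fft_dif(h)                                 # X(2r+1) = DFT_{N/2}{h(n)}
        return X

    x = np.random.randn(8) + 1j*np.random.randn(8)
    print(np.allclose(fft_dif(x), np.fft.fft(x)))            # True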
Example 3.5. Consider a signal x (n) within 0 ≤ n ≤ N − 1. Assume that N is an even number.
Show that the DFT of x (n) can be calculated as two DFTs, one DFT calculated using the even
samples of x (n) and the other DFT obtained using the odd samples of x (n).
⋆ By definition, splitting the DFT sum into even- and odd-indexed samples gives

X(k) = Σ_{m=0}^{N/2−1} x_e(m) e^{−j2π(2m)k/N} + Σ_{m=0}^{N/2−1} x_o(m) e^{−j2π(2m+1)k/N},

where x_e(m) = x(2m) and x_o(m) = x(2m + 1) are the signal samples with even and odd indices, respectively. If we use the notation X_e(k) = DFT{x_e(n)} and X_o(k) = DFT{x_o(n)}, for k = 0, 1, . . . , N/2 − 1, then

X(k) = X_e(k) + e^{−j2πk/N} X_o(k)

and

X(k + N/2) = X_e(k) − e^{−j2πk/N} X_o(k),

since X_e(k) and X_o(k) are periodic with period N/2. Thus, the DFT of N elements is split into two DFTs of N/2 elements. If N/2 is an even number, we can continue and split the two DFTs of N/2 elements into four DFTs of N/4 elements, and so on. This is a decimation-in-time algorithm, Fig. 3.12.
[Figure 3.12: the decimation-in-time FFT scheme for N = 8; the input samples are taken in bit-reversed order and combined through butterflies with twiddle factors W_8^k to produce X(0), . . . , X(7).]
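A one-stage check of this decimation-in-time relation can be written as follows (a NumPy sketch, ours, with a randomly chosen test signal): the N-point DFT is assembled from the N/2-point DFTs of the even- and odd-indexed samples.

    import numpy as np

    N = 8
    x = np.random.randn(N)
    Xe = np.fft.fft(x[0::2])                  # X_e(k), DFT of even-indexed samples
    Xo = np.fft.fft(x[1::2])                  # X_o(k), DFT of odd-indexed samples
    k = np.arange(N//2)
    W = np.exp(-2j*np.pi*k/N)                 # twiddle factors e^{-j2 pi k/N}
    X = np.concatenate([Xe + W*Xo, Xe - W*Xo])
    print(np.allclose(X, np.fft.fft(x)))      # True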
Example 3.6. Consider a signal x (n) within 0 ≤ n ≤ N − 1. Assume that N = 3M. Show that the
DFT of x (n) can be calculated using three DFTs of M samples.
A periodic signal x(t), with a period T, can be reconstructed from its samples if its Fourier series has a limited number of nonzero coefficients, that is, X_k = 0 for |k| > k_m. This means that the Fourier series coefficients corresponding to frequencies greater than Ω_m = 2πk_m/T are zero-valued. The periodic signal x(t) can then be reconstructed from the samples taken with the sampling interval ∆t < π/Ω_m = 1/(2f_m). The number of samples within the period is N = T/∆t.
The reconstructed signal is

x(t) = Σ_{n=0}^{N−1} x(n∆t) e^{j(n − t/∆t)π/N} sin[(n − t/∆t)π] / (N sin[(n − t/∆t)π/N])

for an even N.
Example 3.7. Samples of a periodic signal x(t) are taken with the sampling interval ∆t = 1. The obtained discrete-time signal samples x(n) are the elements of the signal vector x = [0, 2.8284, −2, 2.8284, 0, −2.8284, 2, −2.8284]^T for 0 ≤ n ≤ N − 1, with N = 8. Assuming that the signal satisfies the sampling theorem, find its value at t = 1.5. Check the accuracy, given that the original signal was x(t) = 3 sin(3πt/4) + sin(πt/4).
⋆ Using the reconstruction formula for an even number of samples N within the period, we get

x(1.5) = Σ_{n=0}^{7} x(n) e^{j(n−1.5)π/8} sin[(n − 1.5)π] / (8 sin[(n − 1.5)π/8]) = −0.2242.

This result is equal to the original signal value. The calculation is repeated for 0 ≤ t ≤ 8 with a step of 0.01. The reconstructed values of x(t) are presented in Fig. 3.13.
[Figure 3.13: the reconstructed signal x(t) and the samples x(n), with ∆t = 1, for 0 ≤ t ≤ 8.]
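The reconstruction in Example 3.7 can be reproduced with a few lines of NumPy (our own check, not part of the book):

    import numpy as np

    x = np.array([0, 2.8284, -2, 2.8284, 0, -2.8284, 2, -2.8284])
    N, dt, t = 8, 1.0, 1.5
    n = np.arange(N)
    a = (n - t/dt) * np.pi                                  # argument (n - t/dt)*pi
    kernel = np.exp(1j*a/N) * np.sin(a) / (N * np.sin(a/N))
    print(np.sum(x * kernel).real)                          # -0.2242, as in the example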
In order to prove the sampling theorem for periodic signals, write the signal x(t) in the form of the Fourier series expansion

x(t) = Σ_{k=−k_m}^{k_m} X_k e^{j2πkt/T}. (3.21)

Using N samples of the signal x(t) within the period (assuming that N is an odd number), that is, by sampling the signal at the instants n∆t = nT/N, we get

x(n∆t) = Σ_{k=−k_m}^{k_m} X_k e^{j2πkn/N}.
Multiplying both sides by ∆t = T/N and using X_k = 0 for |k| > k_m, so that the summation limits can be extended to ±(N − 1)/2, gives

x(n∆t)∆t = (T/N) Σ_{k=−k_m}^{k_m} X_k e^{j2πkn/N} = (T/N) Σ_{k=−(N−1)/2}^{(N−1)/2} X_k e^{j2πkn/N}.

With x(n∆t)∆t = x(n) and TX_k = X(k), this form of the Fourier series reduces to the DFT and the inverse DFT,

x(n) = (1/N) Σ_{k=−(N−1)/2}^{(N−1)/2} X(k) e^{j2πkn/N},   X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N}.
Substituting the Fourier series coefficients X_k, expressed in terms of X(k) and x(n), into the signal (3.21), with k_m = (N − 1)/2, we get

x(t) = (1/T) Σ_{k=−(N−1)/2}^{(N−1)/2} ( Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N} ) e^{j2πkt/T}
= (1/N) Σ_{n=0}^{N−1} x(n∆t) Σ_{k=−(N−1)/2}^{(N−1)/2} e^{j2πk(t/T − n/N)}
= (1/N) Σ_{n=0}^{N−1} x(n∆t) e^{−j2π(t/T − n/N)(N−1)/2} (1 − e^{j2π(t/T − n/N)N}) / (1 − e^{j2π(t/T − n/N)})
= Σ_{n=0}^{N−1} x(n∆t) sin[π(t − n∆t)/∆t] / (N sin[π(t − n∆t)/(N∆t)]).
This is the reconstruction formula that can be used to calculate x (t) for any t, based on the
signal samples x (n∆t) at the instants n∆t, with ∆t < π/Ωm = 1/(2 f m ).
In a similar way, the reconstruction formula for an even number of samples N can be
obtained.
The sampling theorem reconstruction formula for aperiodic signals follows as a special case as N → ∞, since for a small argument

sin[π(t − n∆t)/(N∆t)] → π(t − n∆t)/(N∆t)

and

x(t) → Σ_{n=−∞}^{∞} x(n∆t) sin[π(t − n∆t)/∆t] / (π(t − n∆t)/∆t).
Example 3.8. For a signal x(t) whose period is T, it is known that the signal has components corresponding to the nonzero Fourier series coefficients at the indices k_1, k_2, . . . , k_K. What is the minimum number of signal samples needed to reconstruct the signal? What condition should the sampling instants and the frequencies satisfy for the reconstruction?
⋆ The signal x(t) can be reconstructed using the Fourier series (1.22). In the calculation, a finite number of K nonzero terms will be used,

x(t) = Σ_{m=1}^{K} X_{k_m} e^{j2πk_m t/T}.

Since there are K unknown values X_{k_1}, X_{k_2}, . . . , X_{k_K}, the minimum number of equations needed to calculate them is K. The equations are written for K time instants,

Σ_{m=1}^{K} X_{k_m} e^{j2πk_m t_i/T} = x(t_i), for i = 1, 2, . . . , K,

or

ΦX = y,   X = Φ^{−1} y,

where

Φ = [ e^{j2πk_1 t_1/T}  e^{j2πk_2 t_1/T}  ···  e^{j2πk_K t_1/T}
      e^{j2πk_1 t_2/T}  e^{j2πk_2 t_2/T}  ···  e^{j2πk_K t_2/T}
      ⋮                 ⋮                      ⋮
      e^{j2πk_1 t_K/T}  e^{j2πk_2 t_K/T}  ···  e^{j2πk_K t_K/T} ].

The reconstruction condition is det(Φ) ≠ 0 for the selected time instants t_i and the given frequency indices k_i.
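In a program, this reconstruction amounts to forming Φ and solving a K × K linear system. A small NumPy sketch follows; the indices, instants, and coefficient values are our own test data, not taken from the book.

    import numpy as np

    T = 1.0
    k_idx = np.array([1, 3, 7])                  # indices of nonzero coefficients (assumed)
    X_true = np.array([1.0, 0.5 - 0.2j, 0.1j])   # assumed coefficient values
    t = np.array([0.05, 0.41, 0.83])             # K sampling instants
    Phi = np.exp(2j*np.pi*np.outer(t, k_idx)/T)  # matrix of e^{j2 pi k_m t_i / T}
    y = Phi @ X_true                             # signal samples x(t_i)
    X_rec = np.linalg.solve(Phi, y)              # valid when det(Phi) != 0
    print(np.allclose(X_rec, X_true))            # True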
Analysis and estimation of the frequency and amplitude of pure sinusoidal signals is of great importance in many applications.
Consider a simple continuous-time sinusoidal signal

x(t) = A e^{jΩ_0 t},

whose Fourier transform is X(Ω) = 2πAδ(Ω − Ω_0). The whole signal energy is concentrated at just one frequency point, Ω = Ω_0. Obviously, the position of the maximum is equal
to the signal frequency. For this operation we will use the notation

Ω_0 = arg max_{−∞<Ω<∞} |X(Ω)|. (3.23)
Assume that the signal x(t) is sampled with the sampling interval ∆t. The discrete-time form of this signal is

x(n) = A e^{jω_0 n} ∆t,

where ω_0 = Ω_0∆t. In order to compute the DFT of this signal, we will assume a value of N and calculate

X(k) = Σ_{n=0}^{N−1} A e^{jω_0 n} e^{−j2πnk/N} ∆t.
In general, the DFT is of the form

X(k) = A Σ_{n=0}^{N−1} e^{jω_0 n} e^{−j2πnk/N} ∆t = A ∆t (1 − e^{jω_0 N} e^{−j2πk}) / (1 − e^{jω_0} e^{−j2πk/N}) (3.24)
= A e^{j(N−1)(ω_0 − 2πk/N)/2} ∆t sin(N(ω_0 − 2πk/N)/2) / sin((ω_0 − 2πk/N)/2). (3.25)
Two characteristic cases are possible.
1. If the signal frequency is on the frequency grid, that is, ω_0 = 2πk_0/N for an integer k_0, then

X(k) = A Σ_{n=0}^{N−1} e^{j2πk_0 n/N} e^{−j2πnk/N} ∆t = NAδ(k − k_0)∆t. (3.27)

Obviously, in this case we can find the signal frequency index from the position of the DFT maximum, k_0 = arg max_k |X(k)|, and the amplitude as

A = X(k_0) / (N∆t).
2. In reality, the signal period (or Ω_0) is not known in advance (if we knew it, this analysis would not be needed). So, it is highly unlikely to have the previous case, with the frequency on the grid, Ω_0 = 2πk_0/(N∆t), as in Fig. 3.14, top row. More common is the case illustrated in Fig. 3.14, bottom row, when the true signal frequency does not correspond to any DFT sample position. Then the simple signal of sinusoidal form produces the DFT components at all frequencies, since |X(k)| in (3.26) is not zero for any k. This effect, that a simple sinusoidal signal produces nonzero DFT values at all frequencies (Fig. 3.14, bottom row), is known as the leakage effect.
Figure 3.14 Sinusoid x (n) = cos(8πn/64) and its DFT with N = 64 (top row) and sinusoid x (n) =
cos(8.8πn/64) and its DFT absolute value, with N = 64 (bottom row).
When the frequency is not on the grid, the detected maximum position k̂_0 produces the frequency estimate 2πk̂_0/(N∆t), with the estimation error

e = Ω_0 − (2π/(N∆t)) k̂_0.

The estimation error could be up to a half of the discretization interval ∆Ω = 2π/(N∆t),

−π/(N∆t) ≤ e < π/(N∆t)   and   (2π/(N∆t))k̂_0 − π/(N∆t) ≤ Ω_0 < (2π/(N∆t))k̂_0 + π/(N∆t). (3.29)
Two ways to improve the estimation will be described here.
1. The simplest way to reduce the estimation error is to increase the number of samples and
to reduce the discretization interval in frequency ∆Ω = 2π/( N∆t). This could be achieved
by appropriate zero-padding in the time domain, before the DFT calculation (corresponding
to the interpolation in the frequency domain). This way of error reduction increases the
calculation complexity.
2. The other way is based on applying a window function in the DFT calculation,

X(k) = Σ_{n=0}^{N−1} w(n) A e^{jω_0 n} e^{−j2πnk/N} ∆t = A W(e^{j(2πk/N − ω_0)}) ∆t,

where W(e^{jω}) is the Fourier transform of the window function. Windows, like for example the Hann(ing) or Hamming window, smooth the transition and reduce the discontinuities at the ending calculation points that cause leakage. A simple realization uses, for example, the Hann(ing) window (relation (2.31) and Fig. 2.7),

w(n) = (1/2)[1 − cos(2nπ/N)] [u(n) − u(n − N − 1)].
The DFT of the windowed signal is

X_H(k) = Σ_{n=0}^{N−1} (1/2)[1 − cos(2nπ/N)] A e^{jω_0 n} e^{−j2πnk/N} ∆t
= (1/2) Σ_{n=0}^{N−1} [1 − (1/2)e^{j2nπ/N} − (1/2)e^{−j2nπ/N}] A e^{jω_0 n} e^{−j2πnk/N} ∆t
= (1/2)[X_R(k) − (1/2)X_R(k − 1) − (1/2)X_R(k + 1)],
where X_R(k) would be the DFT if the rectangular window were used; it is defined by (3.24). The DFTs of sinusoids on the grid and outside of the grid, multiplied by a Hann(ing) window, are shown in Fig. 3.15. The leakage effect is reduced. However, the DFT is spread over two additional consecutive samples even in the case when the frequency is on the DFT grid, Fig. 3.15 (top). In this case the amplitude is estimated as

A = (|X(k_0)| + |X(k_0 + 1)| + |X(k_0 − 1)|) / (N∆t).
This method is more efficient for the leakage effect reduction than for the improvement in the
frequency estimation. However, the idea of using a few neighboring samples in the parameters
estimation will be used next to define an approach for accurate frequency estimation.
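The leakage reduction by a Hann(ing) window can be verified numerically; the following short NumPy check (ours, not from the book) uses the off-grid sinusoid of Figs. 3.14 and 3.15:

    import numpy as np

    N = 64
    n = np.arange(N)
    x = np.cos(8.8*np.pi*n/64)                 # frequency between two DFT bins
    w = 0.5*(1 - np.cos(2*np.pi*n/N))          # Hann(ing) window
    X  = np.abs(np.fft.fft(x))                 # rectangular-window DFT
    XH = np.abs(np.fft.fft(x*w))               # Hann-windowed DFT
    print(X[30], XH[30])                       # far from the peak, XH is much smaller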
3.7.2 Displacement
The maximum DFT value and its relation to a few surrounding values of the windowed DFT are used to calculate a correction, the displacement bin, for the estimated frequency.
Figure 3.15 Sinusoid x (n) = cos(8πn/64) multiplied by a Hann(ing) window and its DFT with N = 64 (top
row) and sinusoid x (n) = cos(8.8πn/64) multiplied by a Hann(ing) window and its DFT absolute value, with
N = 64 (bottom row).
For a given window function it is possible to derive the exact displacement formula for the shift of the true maximum position with respect to the detected maximum position. However, instead of deriving an exact formula for every window form, we will present an approach that combines interpolation and a general fitting polynomial form. It can be used with any window.
We can always interpolate the DFT values X(k) (by appropriate zero-padding of the signal x(n)), so that there are several DFT samples within the main lobe. Then, for any symmetric window, we can approximate the Fourier transform around the maximum by a quadratic function (in the analog domain, X(Ω) = aΩ^2 + bΩ + c). Since there are three parameters, a, b, and c, in this approximation, we need three Fourier transform values to calculate them. Let us denote the largest sample of the Fourier transform, following from

k̂_0 = arg max_{0≤k≤N−1} |X(k)|,

by

X_0 = |X(k̂_0)|,

and the two neighboring Fourier transform samples by X_{−1} = |X(k̂_0 − 1)| and X_1 = |X(k̂_0 + 1)|. With these bins treated as the points −1, 0, and 1 and the transform values at these points being denoted by X_{−1}, X_0, and X_1, we have the Lagrange second-order polynomial, Fig. 3.16,

X(k̂_0 + d) = X_0 + d (X_1 − X_{−1})/2 + d^2 (X_1 − 2X_0 + X_{−1})/2.

This function reaches its maximum at ∂X(k̂_0 + d)/∂d = 0, resulting in the displacement bin for the frequency correction

d = (X_1 − X_{−1}) / (2(2X_0 − X_1 − X_{−1})),

with the frequency as in (3.32). The displacement calculation is illustrated in Fig. 3.16. Thus,
Figure 3.16 Illustration of the displacement bin correction for a true maximum position calculation based on the
three neighboring values (full range – left and zoomed graph – right) .
Ω_0 = (2π/(N∆t))(k̂_0 + d) (3.32)

for 0 ≤ k̂_0 ≤ N/2 − 1, and Ω_0 = (2π/(N∆t))((k̂_0 + d) − N) for N/2 ≤ k̂_0 ≤ N − 1.
Example 3.9. The sinusoidal signal x (t) = A exp( jΩ0 t) is sampled with a sampling interval
∆t = 1/128 and N0 = 64 samples are considered. Prior to the DFT calculation, the signal is
zero-padded four times, up to N = 256. The DFT maximum is detected at the frequency index
position k̂0 = 95. The maximum DFT value is X (95) = 0.9936. Neighboring DFT values are
X (96) = 0.9432 and X (94) = 0.8470. Calculate the displacement bin d and estimate the value
of signal frequency Ω0 .
⋆ The displacement bin is

d = (X(96) − X(94)) / (2(2X(95) − X(96) − X(94))) = 0.2442.

The total number of samples in the DFT calculation was N = 4N_0 = 256, meaning that the value k̂_0 = 95 is within the first half of the samples (corresponding to a positive frequency Ω_0). Therefore, we can use (3.32) for the frequency calculation,

Ω_0 = (2π/(N∆t))(k̂_0 + d) = 95.2442π.

The true signal used in the simulation was x(t) = exp(j95.25πt)/64, with the estimation error e = (95.25 − 95.2442)π = 0.0058π. If only the position of the maximum were used, the estimated frequency would be 95π, with an error of 0.25π.
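The displacement-bin calculation of this example takes only a few lines (our own numerical check of the values above):

    import numpy as np

    X0, X1, Xm1 = 0.9936, 0.9432, 0.8470          # |X(95)|, |X(96)|, |X(94)|
    d = (X1 - Xm1) / (2*(2*X0 - X1 - Xm1))        # displacement bin
    N, dt = 256, 1/128
    Omega0 = 2*np.pi/(N*dt) * (95 + d)
    print(d, Omega0/np.pi)                        # 0.2442  95.2442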
It is possible to derive the exact displacement formula for some specific windows, based on their Fourier transform function; such a formula exists, for example, for the Hann(ing) window.
After the displacement is calculated, the signal can also be modulated by the displacement frequency shift in order to produce a signal with the frequency on the frequency grid. This is especially important if we expect that the signal contains much smaller higher-order harmonics that were masked by strong values of the dominant harmonic. If we detected that the k_0th harmonic is dominant and displaced by d, then this harmonic should be removed from the signal modulated by the resulting estimated frequency. The DFT of the new signal is used for the analysis of the second largest harmonic, and so on.
The DFT of a signal satisfies many desirable properties. Its calculation is simple and efficient using the FFT algorithm. In the DFT calculation, the periodic extension of the signal is assumed and embedded in the discrete transform. However, this periodic extension of the signal will, in general, introduce a significant signal change (corresponding to discontinuities in continuous time) at the period ending points, Fig. 3.17 (first and second row). This change (discontinuity) will significantly worsen the convergence of the DFT coefficients and increase the number of coefficients needed in the signal reconstruction for a given accuracy. In order to reduce the influence of this effect and to improve the convergence of the signal transform coefficients, the signal could be extended in an appropriate way.
The discrete cosine transform (DCT) and the discrete sine transform (DST) are used to analyze real-valued discrete-time signals, periodically extended to produce even or odd signal forms, respectively. However, this extension is not straightforward for discrete-time signals.
Consider a discrete-time signal of duration N, where x(n) takes nonzero values for 0 ≤ n ≤ N − 1. If we try a direct extension (using all signal values) and form a periodic signal

y(n) = x(n) for 0 ≤ n ≤ N − 1,   y(n) = x(2N − n − 1) for N ≤ n ≤ 2N − 1,

the obtained signal is not even, Fig. 3.17 (third row). It is obvious that y(n) does not satisfy the condition y(n) = y(−n) = y(2N − n), required for a real-valued DFT. The same holds for an odd extension, Fig. 3.17 (fourth row),

y(n) = x(n) for 0 ≤ n ≤ N − 1,   y(n) = −x(2N − n − 1) for N ≤ n ≤ 2N − 1.
One of our goals, to have a real-valued transform after the periodic extension of a real-valued signal, is not achieved. However, from Fig. 3.17 (third and fourth row) we can see that the signals y(n) are even (or odd) with respect to the vertical line at n = −1/2. Thus, if we add zeros between every sample of y(n) and assume that the position which was at n = −1/2 in the initial signal is the new coordinate origin, n = 0, in the new signal z(n), then these signals will be even and odd, respectively, Fig. 3.17 (last two rows). This is just one of the possible extensions that make the original discrete-time signal even (or odd). Several forms of the DCT and DST are defined, based on other ways of obtaining an even (odd) signal extension.
The most commonly used form of the DCT is the so-called DCT-II, or just DCT. If no form of the DCT is referred to in its name, then it is assumed that the DCT-II form is used. It will be presented here. The signal periodic extension for this transform corresponds to the one already described in Fig. 3.17. The DCT definition is

C(k) = Σ_{n=0}^{N−1} 2x(n) cos(2π(2n + 1)k/(4N)),   0 ≤ k ≤ N − 1.
This transform will be derived and explained next. There are two main advantages of this transform over the standard DFT calculation: the DCT coefficients are real-valued for a real-valued signal, and this transform can produce a better energy concentration than the DFT. In order to understand why a better energy concentration can be obtained, we will compare the DCT to the standard DFT,

X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πnk/N},   0 ≤ k ≤ N − 1,

which assumes the periodic extension of x(n) with the period N. In the DCT calculation, the signal is first extended by its mirrored version into y(n), as described above. This extension eliminates possible signal discontinuities at the period ending points. Thus, in general, the Fourier transform of such a signal will converge faster, requiring fewer coefficients in the reconstruction.
Figure 3.17 Illustration of a signal x (n), its periodic extension corresponding to the DFT, an even and odd
discrete-time signal extension corresponding to the DCT and DST of type II.
A zero value is then inserted between every pair of samples and an even signal z(n), with the period 4N, is formed as z(2n + 1) = y(n), z(2n) = 0. Its DFT is

X_C(k) = DFT{z(n)} = Σ_{n=0}^{4N−1} z(n) e^{−j2πnk/(4N)} = Σ_{n=0}^{2N−1} z(2n + 1) e^{−j2π(2n+1)k/(4N)}
= Σ_{n=0}^{2N−1} y(n) e^{−j2π(2n+1)k/(4N)} = Σ_{n=0}^{N−1} 2x(n) cos(2π(2n + 1)k/(4N)) = C(k).
Only N terms of the transform are used and the DCT values are obtained.
Since the basis functions are orthogonal, the inverse DCT is obtained by multiplying both sides of the DCT by w_k cos(2π(2m + 1)k/(4N)) and summing over 0 ≤ k ≤ N − 1,

Σ_{n=0}^{N−1} 2x(n) Σ_{k=0}^{N−1} w_k cos(2π(2n + 1)k/(4N)) cos(2π(2m + 1)k/(4N)) = Σ_{k=0}^{N−1} w_k C(k) cos(2π(2m + 1)k/(4N)),

where w_0 = 1/2 and w_k = 1 for k ≠ 0. Using the orthogonality relation

Σ_{k=0}^{N−1} w_k cos(2π(2n + 1)k/(4N)) cos(2π(2m + 1)k/(4N)) = (N/2) δ(m − n),

we get

x(n) = (1/N) Σ_{k=0}^{N−1} w_k C(k) cos(2π(2n + 1)k/(4N)). (3.34)
A symmetric relation, with the same coefficients in the time and frequency domain, is

C(k) = v_k Σ_{n=0}^{N−1} x(n) cos(2π(2n + 1)k/(4N)),
x(n) = Σ_{k=0}^{N−1} v_k C(k) cos(2π(2n + 1)k/(4N)),

where v_0 = √(1/N) and v_k = √(2/N) for k ≠ 0.
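The DCT definition and the extension-based derivation can be cross-checked numerically. The sketch below (ours; dct2_book is an assumed name) computes C(k) directly and from the 4N-point DFT of the zero-interleaved even extension z(n):

    import numpy as np

    def dct2_book(x):
        # C(k) = sum_n 2 x(n) cos(2 pi (2n+1) k / (4N)), as defined in the text
        N = len(x)
        n, k = np.arange(N), np.arange(N).reshape(-1, 1)
        return 2*np.cos(2*np.pi*(2*n + 1)*k/(4*N)) @ x

    x = np.random.randn(6)
    y = np.concatenate([x, x[::-1]])      # even extension y(n), period 2N
    z = np.zeros(4*len(x)); z[1::2] = y   # z(2n+1) = y(n), z(2n) = 0, period 4N
    print(np.allclose(dct2_book(x), np.fft.fft(z)[:len(x)].real))   # True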
In a similar way, the discrete sine transforms are defined. The most common form is the DST of type II (DST-II), whose definition is

S(k) = Σ_{n=0}^{N−1} 2x(n) sin(2π(2n + 1)(k + 1)/(4N)).

The corresponding odd extension y(n) is interleaved with zeros in the same way,

z(2n + 1) = y(n),   z(2n) = 0.
The DFT of this signal is

X_S(k) = Σ_{n=0}^{4N−1} z(n) e^{−j2πnk/(4N)} = Σ_{n=0}^{2N−1} y(n) e^{−j2π(2n+1)k/(4N)}
= Im{ Σ_{n=0}^{N−1} 2jx(n) sin(2π(2n + 1)k/(4N)) } = S(k),

with N terms of the transform being used. The DST is the imaginary part of this DFT.
Calculate its DFT with N = 32. Plot the periodic extension of the signal x (n). Plot its even
extension y(n). Calculate the DFT (the DCT) of such a signal and discuss the results.
⋆ Signal x(n), along with its extended versions and the corresponding transforms, is presented in Fig. 3.18. The better energy concentration in the DCT is due to the introduced symmetry in y(n). The artificial discontinuity in the DFT, which causes its slow convergence, is eliminated in the DCT.

[Figure 3.18: the signal x(n) and its DFT X(k) (first row); the repeated signal [x(n) x(n)] and its DFT X_2(k) (second row); the even extension y(n) and its DCT C(k) (third row).]
By using periodic extensions in the cosine transform, the convolution property of the DFT is lost. Thus, this kind of transform may be used for signal reconstruction and compression, but not in the realization of discrete systems, unless the transform values are properly related to the corresponding DFT values (see Problem 3.10).
Calculate its DFT with N = 32. Plot the periodic extension of this signal. Plot the even and odd extensions y(n) of x(n). Calculate the DCT and DST. Comment on the results.
⋆The signal x (n), with its periodic extensions yc (n) and ys (n), corresponding to the DFT,
DCT, and DST, respectively, is presented in Fig. 3.19(left), as x (n), [ x (n) x (n)], yc (n), and
ys (n), respectively. The corresponding transforms are shown in the right panels of this figure.
Note that the convergence of the DFT and DCT is similar. Here the DST converges faster, since
its extension is "smoother".
Two discrete signal transforms that can be calculated without using multiplications will
be presented next. One of them will be used to explain the basic principle of the wavelet
transform calculation as well.
Let us consider a two-sample signal x(n), with N = 2. The corresponding two-sample DFT is

X(k) = Σ_{n=0}^{1} x(n) e^{−j2πnk/2} = x(0) + (−1)^k x(1).

It can be calculated without using multiplications, X(0) = x(0) + x(1) and X(1) = x(0) − x(1). Now we can show that it is possible to define basis functions for any signal duration in such a way that multiplications are not used in the signal transformation. These transform values will be denoted by H(k). For the two-sample signal case,

H(0) = x(0) + x(1),
H(1) = x(0) − x(1).
The whole frequency interval is represented by a low-frequency value X(0) and a high-frequency value X(1). In matrix form,

[ H(0) ]   [ 1   1 ] [ x(0) ]
[ H(1) ] = [ 1  −1 ] [ x(1) ].   (3.35)
Figure 3.19 Signal and its periodic extensions, corresponding to: the DFT (second row), the cosine transform
(third row), and the sine transform (fourth row). Positive frequencies for the DFT are shown.
Example 3.12. For the signal shown in Fig. 3.20, calculate the two-sample DFT for every pair of signal samples.
⋆The values of lowpass part, Hn (0) = y L (n), and the highpass part, Hn (1) = y H (n), are
calculated and are presented in Fig. 3.20. The signal y L (n) is a low-frequency and smoothed
version of the original signal, while the signal y H (n) contains the details that are lost in the
smoothed version y L (n).
Figure 3.20 Original signal x (n) and its two-sample lowpass part y L (n) and highpass part y H (n).
The original signal values may easily be reconstructed from H_n(0) = y_L(n) and H_n(1) = y_H(n) as

[ x(2n)     ]         [ 1   1 ] [ H_n(0) ]
[ x(2n + 1) ] = (1/2) [ 1  −1 ] [ H_n(1) ]

for 0 ≤ n ≤ N/2 − 1.
In some cases the smoothed version y_L(n), with half of the samples of the original signal (Fig. 3.20), is quite a good representative of the original signal, so there is no need to use the corrections. Note that for many instants the correction is zero as well. This result can be used as a basis for signal compression, when the signal is presented with a reduced set of samples, with no significant distortion.
There are two possibilities to continue and apply the two-point DFT scheme to a signal
with N samples:
• The first one consists in splitting further both existing (lowpass and highpass) signals,
y L (n) and y H (n), into their corresponding lowpass and highpass parts. This scheme
leads to the discrete Walsh-Hadamard transform, shown in Fig. 3.21 for the signal
x (n) from Fig. 3.20.
• In the second case, the splitting is done for the lowpass part, y L (n), only, while the
highpass correction, y H (n), is kept unchanged. This scheme leads to the Haar wavelet
transform, Fig. 3.22.
Figure 3.21 Illustration of the procedure leading to the Walsh-Hadamard transform calculation.
Let us continue the idea of splitting both (lowpass and highpass) parts of the signal and define a transformation of a four-sample signal. For this signal, form two auxiliary two-sample signals y_L(n) and y_H(n) as

y_L(n) = x(2n) + x(2n + 1),   y_H(n) = x(2n) − x(2n + 1),   for n = 0, 1.

They represent the low-frequency and high-frequency parts of the pairs x(0), x(1) and x(2), x(3) of two-sample signals. The lowpass part of the auxiliary two-sample lowpass signal y_L(n) is

H(0) = y_L(0) + y_L(1) = x(0) + x(1) + x(2) + x(3).

The highpass part of the auxiliary two-sample lowpass signal y_L(n) is

H(1) = y_L(0) − y_L(1) = x(0) + x(1) − x(2) − x(3),

and, in the same way, the lowpass and highpass parts of y_H(n) are H(2) = y_H(0) + y_H(1) and H(3) = y_H(0) − y_H(1).
Figure 3.22 Illustration of the procedure leading to the Haar wavelet transform calculation.
By replacing the values of y_L(n) and y_H(n) with the signal values x(n), we get the transformation equation

[ H(0) ]   [ 1   1   1   1 ] [ x(0) ]        [ x(0) ]
[ H(1) ]   [ 1   1  −1  −1 ] [ x(1) ]        [ x(1) ]
[ H(2) ] = [ 1  −1   1  −1 ] [ x(2) ] = T_4  [ x(2) ] ,   (3.40)
[ H(3) ]   [ 1  −1  −1   1 ] [ x(3) ]        [ x(3) ]
where ⊗ denotes the Kronecker multiplication of the two submatrices of T_2 (its rows) with T_2, defined by (3.36). The notation T_2(i, :) is used for the ith row of T_2. The transformation matrix of order N is obtained by a Kronecker product of the N/2-order transformation matrix rows and T_2,

T_N = [ T_2 ⊗ [T_{N/2}(1, :)]
        T_2 ⊗ [T_{N/2}(2, :)]
        ⋮
        T_2 ⊗ [T_{N/2}(N/2, :)] ].   (3.42)
In this way, although we started from a two-point DFT, in splitting the frequency domain, we did
not obtain the Fourier transform of a signal, but a form of the Walsh-Hadamard transform. In ordering
the coefficients (matrix rows) in our example, we followed the frequency region order from the Fourier
domain (for example, in the four-sample case, low-low, low-high, high-low, and high-high frequency
region).
Three ways of ordering transform coefficients in the Walsh-Hadamard transform (ordering of
transformation matrix rows) are used. They produce the same result with different orderings of the
coefficients and different recursive formulae for constructing the transformation matrices. The presented
way of ordering coefficients, as in (3.41), is known as the Walsh transform with dyadic ordering. It
will be used in examples and denoted as the Walsh-Hadamard transform.
The Hadamard transform would correspond to the so-called natural ordering of rows from the transformation matrix T_8,

H_8 = [ 1   1   1   1   1   1   1   1
        1  −1   1  −1   1  −1   1  −1
        1   1  −1  −1   1   1  −1  −1
        1  −1  −1   1   1  −1  −1   1
        1   1   1   1  −1  −1  −1  −1
        1  −1   1  −1  −1   1  −1   1
        1   1  −1  −1  −1  −1   1   1
        1  −1  −1   1  −1   1   1  −1 ].
It would correspond to [ H (0), H (4), H (2), H (6), H (1), H (5), H (3), H (7)] T order of coefficients
in the Walsh transform with dyadic ordering (3.41).
Recursive construction of the Hadamard transform matrix H_{2N} is easy, using the Kronecker product of T_2, defined by (3.36), and H_N:

H_{2N} = T_2 ⊗ H_N = [ H_N   H_N
                       H_N  −H_N ].
The following order, [H(0), H(1), H(3), H(2), H(6), H(7), H(5), H(4)]^T in (3.41), would correspond to a Walsh transform with sequency ordering.
Calculation of the Walsh-Hadamard transforms requires only additions. For an N-order transform
the number of additions is ( N − 1) N.
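The recursive Kronecker construction (3.42) of the dyadic-ordered transformation matrix can be sketched as follows (our own NumPy code; walsh_dyadic is an assumed name):

    import numpy as np

    def walsh_dyadic(N):
        # builds T_N row-block by row-block as in (3.42); N must be a power of 2
        T2 = np.array([[1, 1], [1, -1]])
        T = T2.copy()
        while T.shape[1] < N:
            T = np.vstack([np.kron(T2, T[i:i+1, :]) for i in range(T.shape[0])])
        return T

    print(walsh_dyadic(4))   # rows ordered low-low, low-high, high-low, high-high

A Walsh-Hadamard transform is then simply H = walsh_dyadic(N) @ x, which in principle requires only additions and subtractions.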
Consider again two pairs of signal samples, x(0), x(1) and x(2), x(3). The high-frequency parts of these pairs are calculated as y_H(n) = x(2n) − x(2n + 1), for n = 0, 1. They are used in the Haar transform without any further modification. Since they represent the highpass Haar transform coefficients, they will be denoted by W(2) = y_H(0) = x(0) − x(1) and W(3) = y_H(1) = x(2) − x(3). The lowpass coefficients of these pairs are y_L(0) = x(0) + x(1) and y_L(1) = x(2) + x(3). The highpass and lowpass parts of these signals are calculated as y_LH(0) = [x(0) + x(1)] − [x(2) + x(3)] and y_LL(0) = [x(0) + x(1)] + [x(2) + x(3)]. For a four-sample signal the transformation ends here, with W(1) = y_LH(0) and W(0) = y_LL(0). Note that the order of the coefficients is such that the lowest frequency coefficient corresponds to the transform index k = 0. The matrix form of the transform for a four-sample signal is

[ W(0) ]   [ 1   1   1   1 ] [ x(0) ]
[ W(1) ]   [ 1   1  −1  −1 ] [ x(1) ]
[ W(2) ] = [ 1  −1   0   0 ] [ x(2) ] .
[ W(3) ]   [ 0   0   1  −1 ] [ x(3) ]
For an eight-sample signal, the highpass coefficients would be kept without further modification at every step (scale), while for the lowpass parts of the signal their highpass and lowpass parts would be calculated. The transformation matrix in the case of a signal with eight samples is

[ W(0) ]   [ 1   1   1   1   1   1   1   1 ] [ x(0) ]
[ W(1) ]   [ 1   1   1   1  −1  −1  −1  −1 ] [ x(1) ]
[ W(2) ]   [ 1   1  −1  −1   0   0   0   0 ] [ x(2) ]
[ W(3) ]   [ 0   0   0   0   1   1  −1  −1 ] [ x(3) ]
[ W(4) ] = [ 1  −1   0   0   0   0   0   0 ] [ x(4) ] .   (3.43)
[ W(5) ]   [ 0   0   1  −1   0   0   0   0 ] [ x(5) ]
[ W(6) ]   [ 0   0   0   0   1  −1   0   0 ] [ x(6) ]
[ W(7) ]   [ 0   0   0   0   0   0   1  −1 ] [ x(7) ]
This is the Haar transform, or the Haar wavelet transform, of a signal with eight samples.
The Haar transform is useful in the analysis of signals where we expect a slowly varying signal with only a few details.
The Haar wavelet transform is computationally very efficient. The efficiency comes from the fact that the Haar wavelet transform almost does not transform the signal at high frequencies. It leaves it almost as it is, using a very simple two-sample transform. For lower frequencies, the number of operations is increased.
Specifically, for the highest N/2 coefficients, the Haar transform does only one addition (of two signal values) for every coefficient. For the next N/4 coefficients, the Haar wavelet uses 4 signal values with 3 additions, and so on. The total number of additions for the Haar transform is

N_additions = (N/2)(2 − 1) + (N/4)(4 − 1) + (N/8)(8 − 1) + ··· + (N/N)(N − 1).

For N of the form N = 2^m we can write

N_additions = N log_2 N − N(1/2 + 1/2^2 + 1/2^3 + ··· + 1/2^m)
= N log_2 N − N (1/2)(1 − 1/2^m)/(1 − 1/2) = N log_2 N − (N − 1) = N[log_2 N − 1] + 1.
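The fast Haar calculation described above (keep the pairwise differences, recurse on the pairwise sums) can be sketched in a few lines of NumPy (ours; the coefficient ordering follows (3.43), lowest frequency first):

    import numpy as np

    def haar_transform(x):
        x = np.asarray(x, dtype=float)
        if len(x) == 1:
            return x
        low  = x[0::2] + x[1::2]                    # y_L(n): pairwise sums
        high = x[0::2] - x[1::2]                    # y_H(n): kept as coefficients
        return np.concatenate([haar_transform(low), high])

    x = np.array([2, 2, 12, -8, 2, 2, 2, 2, -3, -3, -3, -3, 3, -9, -3, -3])
    print(haar_transform(x))                        # most coefficients are zero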
Consider the signal

x(n) = [2, 2, 12, −8, 2, 2, 2, 2, −3, −3, −3, −3, 3, −9, −3, −3].

Calculate its Haar and Walsh-Hadamard transforms with N = 16. Discuss the results.
Figure 3.23 Signal x (n) and its discrete Haar transform H (k ). Reconstructed signals: using H (0) presented
by x0 (n), using two coefficients H (0) and H (1) denoted by x0−1 (n), using H (0), H (1), and H (9) denoted by
x0−1,9 (n), and using H (0), H (1), H (9), and H (14) denoted by x0−1,9,14 (n). Vertical axes scales for the signal
and transform are different.
⋆ In full analogy with (3.43), the Haar transformation matrix of order N = 16 is formed; the higher coefficients are just two-sample signal transforms.
Although there are some short-duration pulses (x(2), x(3), x(13)), the Haar transform coefficients W(2), W(3), . . . , W(8), W(10), W(11), W(12), W(13), W(15) are zero-valued, Fig. 3.23. This is the result of the Haar transform property to decompose the high-frequency signal region into short-duration (two-sample) basis functions. A short-duration pulse is contained in the high-frequency part of only one Haar coefficient. This is not the case in the Fourier transform (or the Walsh-Hadamard transform), where a single delta pulse will cause all coefficients to be nonzero, Fig. 3.24. The transformation matrix T_16 for the Walsh-Hadamard transform is obtained from T_8 using (3.42).
The property that the high-frequency coefficients are well localized in the time domain and that they represent short-duration signal components is used in image compression, where adding high-frequency coefficients adds details to an image, with the important property that one detail in the image corresponds to one (or a few) nonzero coefficients. Reconstruction of the signal from the Haar transform, using various numbers of coefficients, is presented in Fig. 3.23. As explained, it can be considered as "zooming" into the signal toward the details as the higher frequency coefficients are added. Since half of the coefficients are zero-valued, a significant compression ratio can be achieved by storing or transmitting the nonzero coefficients only. This is the basic idea behind multiresolution wavelet-based image representations and compression.
[Figure 3.24: a signal x(n) containing a short pulse and its transform H(k), with all coefficients nonzero.]
⋆ The Haar wavelet transform and the Walsh-Hadamard transform are shown in Fig. 3.25. We can see that, for a signal of long duration with high frequencies, the number of nonzero coefficients in the Haar wavelet transform is large. Just one such component in the Walsh-Hadamard transform can require half of the available coefficients in the Haar wavelet transform, Fig. 3.25 (left). In addition, the fact that a much smaller number of coefficients is used for the Walsh-Hadamard transform-based reconstruction, as compared to a very large number of coefficients in the Haar wavelet transform reconstruction, may annul the Haar transform's computational complexity advantage in this case.
Figure 3.25 The Haar wavelet transform (second row) and the Walsh-Hadamard transform (third row) for high
frequency long duration signals (first row). Vertical axes scales for the signal and transform are different.
3.10 PROBLEMS
Problem 3.2. If the signals g(n) and f(n) are real-valued, show that their DFTs, G(k) and F(k), can be obtained from the DFT Y(k) of the signal y(n) defined as y(n) = g(n) + j f(n).
Problem 3.3. The frequency of a continuous-time signal is related to the DFT index according to

Ω = 2πk/(N∆t) for 0 ≤ k ≤ N/2 − 1,
Ω = 2π(k − N)/(N∆t) for N/2 ≤ k ≤ N − 1.

This mapping is achieved in programs using shift functions. Show that the shift will not be necessary if we use the signal x(n)(−1)^n. The DFT values of this signal will be ordered from the one corresponding to the lowest negative frequency toward the highest positive frequency.
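A short NumPy check of this statement (ours): multiplying by (−1)^n reorders the DFT exactly as the usual shift function does.

    import numpy as np

    N = 16
    x = np.random.randn(N)
    X1 = np.fft.fft(x * (-1.0)**np.arange(N))                # DFT of x(n)(-1)^n
    print(np.allclose(X1, np.fft.fftshift(np.fft.fft(x))))   # True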
Problem 3.4. If the DFT of a signal x(n), with the period N, is X(k), find the DFTs of the signals

y(n) = x(n) for n = 2m,   y(n) = 0 for n = 2m + 1,

and

z(n) = 0 for n = 2m,   z(n) = x(n) for n = 2m + 1.
Problem 3.5. Find the convolution of the signals x(n) and h(n), whose nonzero values are x(0) = 1, x(1) = −1 and h(0) = 2, h(1) = −1, h(2) = 2, using their DFTs and the inverse DFT of the resulting product, that is, x(n) ∗_n h(n) = IDFT{DFT{x(n)} DFT{h(n)}}.
Problem 3.6. Find the circular convolution of the signals x (n) = e j4πn/N + sin(2πn/N ) and
h(n) = cos(4πn/N ) + e j2πn/N within the common period for both signals.
Problem 3.7. Find the signal whose DFT is Y (k) = | X (k)|2 and X (k) is the DFT of the signal
x (n) = u(n) − u(n − 3), calculated with the period N = 10.
Problem 3.8. What is the relation between the discrete Hartley transform (DHT) of a real-valued signal x(n), defined by

H(k) = Σ_{n=0}^{N−1} x(n) [cos(2πnk/N) + sin(2πnk/N)],

and the DFT of the same signal? Express the DHT in terms of the DFT and the DFT in terms of the DHT.
Problem 3.9. Show that the DCT of a signal x(n) with N samples, defined by

C(k) = Σ_{n=0}^{N−1} 2x(n) cos((2πk/(2N))(n + 1/2)),

can be calculated using the DFT of a signal y(n) of the same length N, obtained by reordering the samples of x(n).
Problem 3.10. A real-valued signal x(n) of a duration shorter than N, defined for 0 ≤ n ≤ N − 1, has the DFT X(k). The signal y(n) is formed as

y(n) = 2x(n) for 0 ≤ n ≤ N − 1,   y(n) = 0 for N ≤ n ≤ 2N − 1, (3.44)

with the DFT denoted by Y(k). The signal z(n) is formed using y(n) as

z(2n + 1) = y(n),   z(2n) = 0.
(a) What are the real and imaginary parts of Z(k) = DFT{z(n)}? How are they related to the DCT and the DST of the signal x(n)? (b) The signal x(n) is applied as an input to a linear impulse-invariant system with the impulse response h(n), such that h(n) is of a duration shorter than N, defined within 0 ≤ n ≤ N − 1, and x(n) ∗_n h(n) is also within the same interval, 0 ≤ n ≤ N − 1. The DCT of the output signal is calculated. How is the DCT of the output signal related to the DCT and DST of the input signal x(n)?
Problem 3.11. Consider a signal x(n) whose duration is N, with nonzero values within the interval 0 ≤ n ≤ N − 1. Define the system with the output

y_k(n + (N − 1)) = Σ_{m=0}^{N−1} x(n + m) e^{−j2πmk/N},

so that its value y_k(N − 1), at the last instant of the signal duration, is equal to the DFT of the signal for a given k,

y_k(N − 1) = Σ_{m=0}^{N−1} x(m) e^{−j2πmk/N} = DFT{x(n)} = X(k).

Note that the system is causal, since y_k(n) uses only x(n) at the instant n and previous instants. Show that the output signal y_k(n) is related to the previous output value y_k(n − 1) by the equation

y_k(n) = e^{j2πk/N} [y_k(n − 1) + x(n) − x(n − N)].
Problem 3.12. Show that the discrete Hartley transform (DHT) coefficients of a signal x (n) with an
even number of samples N can be calculated, for an even frequency index k = 2r, using the DHTs with
N/2 samples (fast DHT calculation).
Problem 3.13. Find the DFT of the signal x(n) = exp(j4π√3 n/N), for n = 0, 1, . . . , N − 1, with N = 16. If the DFT is interpolated four times (the signal zero-padded), find the displacement bin, estimate the signal frequency, and compare it with the true frequency value. What is the displacement bin if the general formula is applied without the interpolation?
3.11 EXERCISE
Exercise 3.1. Find the DFT of the signal x (n) = δ(n) − δ(n − 3) with the assumed periods N = 4
and N = 8.
Exercise 3.2. Calculate the DFT of the signal x (n) = sin(nπ/4) for 0 ≤ n < N with N = 8 and
N = 16.
Exercise 3.3. For a real-valued signal x (n), the DFT is calculated with N = 8 and the following DFT
values are known: X (0) = 1, X (2) = 2 − j, X (4) = 2, X (5) = j, X (7) = 3. Find the remaining DFT
values. What are the values of x (0) and ∑7n=0 x (n)?
Exercise 3.4. Signal x (n) is presented in Fig. 3.26. Find X (0), X (4), and X (8), where X (k) is the
DFT of the signal x (n) calculated with the period N = 16.
Exercise 3.5. Prove that the DFT value X ( N/2) is real-valued for an arbitrary real-valued signal
x (n), defined for 0 ≤ n < N, where N is an even integer.
Exercise 3.6. Consider the signal x (n) whose DFT values X (k), calculated with N = 16, are presented
in Fig. 3.27.
[Figure 3.26: the signal x(n), 0 ≤ n ≤ 15.]
[Figure 3.27: the DFT values X(k), 0 ≤ k ≤ 15.]
Exercise 3.7. Prove that if |x(n)| ≤ A for 0 ≤ n < N, then |X(k)| ≤ NA for any k, where X(k) is the DFT of x(n) calculated with N points.
Exercise 3.8. Prove that if Σ_{n=0}^{N−1} |x(n)| ≤ A, then Σ_{k=0}^{N−1} |X(k)| ≤ NA, where X(k) is the DFT of x(n) calculated with N samples.
3.12 SOLUTIONS
Solution 3.1. The DFT assumes that the signals are periodic. In order to calculate the DFT, we first have to assume a period for the considered aperiodic signals. The period N should be greater than or equal to the signal duration, so that the signal values do not overlap after their periodic extension. Larger values of N will increase the density of the frequency domain samples, but they will also increase the computation time.
a) For this signal, any N ≥ 1 is acceptable, producing

X(k) = 1, k = 0, 1, . . . , N − 1.
Solution 3.2. From the signal y(n) = g(n) + j f(n), its real and imaginary parts g(n) and f(n) can be obtained as

g(n) = (y(n) + y*(n))/2   and   f(n) = (y(n) − y*(n))/(2j).

Then the DFTs of the signals g(n) and f(n) are obtained from

G(k) = (Y(k) + Y*(N − k))/2   and   F(k) = (Y(k) − Y*(N − k))/(2j).
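This trick computes two real-signal DFTs with a single complex FFT. A minimal NumPy check (ours):

    import numpy as np

    N = 16
    g, f = np.random.randn(N), np.random.randn(N)
    Y = np.fft.fft(g + 1j*f)
    Yc = np.conj(Y[(-np.arange(N)) % N])      # Y*(N-k), with the index taken mod N
    G = (Y + Yc)/2
    F = (Y - Yc)/(2j)
    print(np.allclose(G, np.fft.fft(g)), np.allclose(F, np.fft.fft(f)))  # True True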
Solution 3.3. The DFT of the signal x(n)(−1)^n is

X_1(k) = Σ_{n=0}^{N−1} x(n)(−1)^n e^{−j2πnk/N} = Σ_{n=0}^{N−1} x(n) e^{−j2πn(k−N/2)/N} = X(k + N/2),

since (−1)^n = e^{jπn} and X(k) is periodic with the period N. The DFT values are therefore shifted by N/2, as required.
Solution 3.4. For the signal y(n) we may write

Y(k) = Σ_{n=0}^{N−1} y(n) e^{−j2πnk/N} = (1/2) Σ_{n=0}^{N−1} [x(n) + (−1)^n x(n)] e^{−j2πnk/N}
= (1/2) Σ_{n=0}^{N−1} [x(n) + x(n) e^{−jπn}] e^{−j2πnk/N} = (1/2)[X(k) + X(k + N/2)].

In the same way, Z(k) = (1/2)[X(k) − X(k + N/2)]. It is obvious that the DFT of the signal x(n) is equal to the sum of the DFTs of the signals y(n) and z(n),

Y(k) + Z(k) = X(k).
Solution 3.5. For the convolution calculation using the DFTs of signals, the minimum number for the period N is N = K + L − 1 = 4, where K = 2 is the duration of the signal x(n) and L = 3 is the duration of the impulse response h(n). With N = 4, we get

X(k) = 1 − e^{−j2πk/4},
H(k) = 2 − e^{−j2πk/4} + 2e^{−j4πk/4},
Y(k) = X(k)H(k) = 2 − 3e^{−j2πk/4} + 3e^{−j4πk/4} − 2e^{−j6πk/4},

so that the inverse DFT produces y(n) = {2, −3, 3, −2} for n = 0, 1, 2, 3.
Solution 3.6. The DFT of the circular convolution y(n) of the signals x(n) and h(n), y(n) = x(n) ∗ h(n), is equal to the product of the corresponding DFTs, Y(k) = X(k)H(k), with

X(k) = Σ_{n=0}^{N−1} [e^{j4πn/N} + (1/(2j))e^{j2πn/N} − (1/(2j))e^{−j2πn/N}] e^{−j2πnk/N}
= Nδ(k − 2) + (N/(2j))δ(k − 1) − (N/(2j))δ(k + 1)

and

H(k) = Σ_{n=0}^{N−1} [(1/2)e^{j4πn/N} + (1/2)e^{−j4πn/N} + e^{j2πn/N}] e^{−j2πnk/N}
= (N/2)δ(k − 2) + (N/2)δ(k + 2) + Nδ(k − 1).

The value of Y(k) is

Y(k) = (N^2/2)δ(k − 2) + (N^2/(2j))δ(k − 1).

The circular convolution is obtained as the inverse DFT of Y(k),

y(n) = (N/2)e^{j4πn/N} + (N/(2j))e^{j2πn/N}.
Solution 3.7. The DFT of the signal y(n), equal to Y(k) = |X(k)|^2, can be written as Y(k) = X(k)X*(k). The inverse DFT of this product is equal to the circular convolution of the individual inverse DFTs, that is,

y(n) = x(n) ∗_n IDFT{X*(k)}.

Since

IDFT{X*(k)} = (1/N) Σ_{k=0}^{N−1} X*(k) e^{j2πnk/N} = ( (1/N) Σ_{k=0}^{N−1} X(k) e^{−j2πnk/N} )*
= ( (1/N) Σ_{k=0}^{N−1} X(k) e^{j2πk(N−n)/N} )* = x*(N − n),

we get

y(n) = (x(n))_{10} ∗_n (x*(10 − n))_{10} = (u(n) − u(n − 3))_{10} ∗_n (u(10 − n) − u(7 − n))_{10}
= (δ(n + 2) + 2δ(n + 1) + 3δ(n) + 2δ(n − 1) + δ(n − 2))_{10},

where (x(n))_N indicates that the signal x(n) is periodically extended with the period N.
Solution 3.8. The DFT of the real-valued signal x(n) can be written as

X(k) = Σ_{n=0}^{N−1} [x(n) cos(2πnk/N) − jx(n) sin(2πnk/N)],
X(N − k) = Σ_{n=0}^{N−1} [x(n) cos(2πnk/N) + jx(n) sin(2πnk/N)].

From the previous equations, we can easily conclude that the following relations hold:

Σ_{n=0}^{N−1} x(n) cos(2πnk/N) = (X(k) + X(N − k))/2 = (H(k) + H(N − k))/2,
Σ_{n=0}^{N−1} x(n) sin(2πnk/N) = (X(N − k) − X(k))/(2j) = (H(k) − H(N − k))/2.
Solution 3.9. We can split the summation in the DCT into an even and an odd part,

C(k) = Σ_{n=0}^{N−1} 2x(n) cos((2πk/(2N))(n + 1/2))
= Σ_{n=0}^{N/2−1} 2x(2n) cos((2πk/(2N))(2n + 1/2)) + Σ_{n=0}^{N/2−1} 2x(2n + 1) cos((2πk/(2N))(2n + 1 + 1/2)).

By reverting the summation index in the second sum, using n = N/2 − 1 − m, the summation over m goes from m = N/2 − 1, for n = 0, down to m = 0, for n = N/2 − 1. Then

Σ_{n=0}^{N/2−1} 2x(2n + 1) cos((2πk/(2N))(2n + 1 + 1/2))
= Σ_{m=0}^{N/2−1} 2x(N − 2m − 1) cos((2πk/(2N))(N − 2m − 1 + 1/2)).

The summation index in this sum can be shifted by N/2 + m = n to get

Σ_{m=0}^{N/2−1} 2x(N − 2m − 1) cos((2πk/(2N))(N − 2m − 1 + 1/2))
= Σ_{n=N/2}^{N−1} 2x(2N − 2n − 1) cos((2πk/(2N))(2N − 2n − 1/2)).

Now we can go back to the DCT and replace the second sum, to get

C(k) = Σ_{n=0}^{N/2−1} 2x(2n) cos((2πk/(2N))(2n + 1/2))
+ Σ_{n=N/2}^{N−1} 2x(2N − 2n − 1) cos((2πk/(2N))(2n + 1/2)) = Σ_{n=0}^{N−1} y(n) cos((2πk/(2N))(2n + 1/2)),

where y(n) = 2x(2n) for 0 ≤ n ≤ N/2 − 1 and y(n) = 2x(2N − 2n − 1) for N/2 ≤ n ≤ N − 1. We can conclude that the relation between the DFT and the DCT is given by

C(k) = Re{ Σ_{n=0}^{N−1} y(n) e^{−j(2πk/(2N))(2n + 1/2)} } = Re{ e^{−jπk/(2N)} DFT{y(n)} }.
Solution 3.10. (a) The DFT of the signal z(n) is

DFT{z(n)} = Σ_{n=0}^{4N−1} z(n) e^{−j2πnk/(4N)} = Σ_{n=0}^{2N−1} z(2n + 1) e^{−j2π(2n+1)k/(4N)}
= Σ_{n=0}^{2N−1} y(n) e^{−j2π(2n+1)k/(4N)} = Σ_{n=0}^{N−1} 2x(n) e^{−j2π(2n+1)k/(4N)},

that is,

Z(k) = DFT{z(n)} = e^{−j2πk/(4N)} Σ_{n=0}^{N−1} 2x(n) e^{−j2πnk/(2N)} = e^{−jπk/(2N)} 2X(k/2).

Since e^{−j2π(2n+1)k/(4N)} = cos(2π(2n + 1)k/(4N)) − j sin(2π(2n + 1)k/(4N)), the real part of Z(k) is the DCT of x(n) and the imaginary part is related to the DST of x(n). Note that X(k/2) is just a notation; 2X(k/2) = Y(k), where Y(k) = DFT{y(n)} and y(n) is the zero-padded version of 2x(n), defined by (3.44).
(b) If the signal x(n) is used as an input to a system, then the DCT is calculated for the output signal x_h(n) = x(n) ∗_n h(n). It has been assumed that all signals, x(n), h(n), and x(n) ∗_n h(n), are zero-valued outside the interval 0 ≤ n ≤ N − 1 (it means that the durations of x(n) and h(n) should be such that their convolution is within 0 ≤ n ≤ N − 1). Then, for the signal z_h(n), related to x_h(n) = x(n) ∗_n h(n) in the same way as the signal z(n) is related to x(n) in (a), we can write

DFT{z_h(n)} e^{jπk/(2N)} = 2X_h(k/2) = 2X(k/2)H(k/2) = Y(k)H(k/2).

Then

C_h(k) = DCT{x_h(n)} = Re{Y(k)H(k/2) e^{−jπk/(2N)}}
= Re{Y(k) e^{−jπk/(2N)}} Re{H(k/2)} − Im{Y(k) e^{−jπk/(2N)}} Im{H(k/2)}
= C(k) Re{H(k/2)} + S(k) Im{H(k/2)}.

The system output is then y(n) = x(n) ∗_n h(n) = IDCT{C_h(k)}, with IDCT{C_h(k)} defined by (3.34). The transform H(k/2) is the DFT of the signal h(n) zero-padded by a factor of 2; only the first half of the DFT samples is then used in the calculation.
Solution 3.11. The output of the system can be written as

y_k(n) = Σ_{m=0}^{N−1} x(n − N + 1 + m) e^{−j2πmk/N}.

Comparing y_k(n) and y_k(n − 1) term by term, every sample of x(n) present in both sums changes its weighting coefficient by the factor e^{j2πk/N}, the new sample x(n) enters the sum, and the sample x(n − N) leaves it, producing

y_k(n) = e^{j2πk/N} [y_k(n − 1) + x(n) − x(n − N)].
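This recursive (sliding-DFT) relation is easily verified numerically; the sketch below (ours, with an assumed helper xs for the zero-extended signal) runs the recursion up to n = N − 1 and compares with X(k):

    import numpy as np

    N, k = 8, 3
    x = np.random.randn(N)
    def xs(m):                                   # x(m), zero outside 0 <= m <= N-1
        return x[m] if 0 <= m < N else 0.0
    y = 0j
    for n in range(N):                           # y_k(-1) = 0 for a causal x(n)
        y = np.exp(2j*np.pi*k/N) * (y + xs(n) - xs(n - N))
    print(np.allclose(y, np.fft.fft(x)[k]))      # True: y_k(N-1) = X(k)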
Solution 3.12. For an even frequency index, k = 2r, the DHT can be written as

H(2r) = Σ_{n=0}^{N/2−1} x(n) [cos(2πrn/(N/2)) + sin(2πrn/(N/2))] + Σ_{n=N/2}^{N−1} x(n) [cos(2πrn/(N/2)) + sin(2πrn/(N/2))]
= Σ_{n=0}^{N/2−1} (x(n) + x(n + N/2)) [cos(2πrn/(N/2)) + sin(2πrn/(N/2))],

where g(n) = x(n) + x(n + N/2). This is the DHT of g(n) with N/2 samples.
Note: For odd frequency indices, k = 2r + 1, we can write

H(2r + 1) = Σ_{n=0}^{N−1} x(n) [cos(2π(2r + 1)n/N) + sin(2π(2r + 1)n/N)].

After rearranging the terms, this sum reduces to

H(2r + 1) = Σ_{n=0}^{N/2−1} f(n) [cos(2πrn/(N/2)) + sin(2πrn/(N/2))],
where

f(n) = [x(n) − x(n + N/2)] cos(2πn/N) + [x(N/2 − n) − x(N − n)] sin(2πn/N).

This is again the DHT of a signal, f(n), with N/2 samples. In this way, the DHT of the signal with N samples is split into two DHTs with N/2 samples.
Solution 3.13. The absolute value of the DFT of the signal x(n) = exp(j4π√3 n/N), for n = 0, 1, . . . , N − 1 with N = 16, is

|X(k)| = | Σ_{n=0}^{15} e^{j2π(2√3−k)n/16} | = | sin(π(2√3 − k)) / sin(π(2√3 − k)/16) |. (3.46)

The elements |X(k)|, k = 0, 1, . . . , 15, form the vector |X|, whose maximum is achieved at k = 3. This means that the frequency estimation, without the displacement bin, would be

(2π · 3)/16 = 1.1781,

while the true frequency is (2π · 2√3)/16 = 1.3603. The error is 13.4%.
For the zero-padded signal (interpolated DFT), with a factor of 4,

|X(k)| = | Σ_{n=0}^{15} e^{j4π√3 n/16} e^{−j2πnk/64} | = | Σ_{n=0}^{15} e^{j2π(8√3−k)n/64} |
= | sin(π(8√3 − k)/4) / sin(π(8√3 − k)/64) |.
The maximum value is obtained for k = [8√3] = 14, where [·] denotes the nearest integer value. The maximum absolute DFT value at k = 14, along with the absolute values of its neighbors, is

|X(14)| = | sin(π(8√3 − 14)/4) / sin(π(8√3 − 14)/64) | = 15.9662,
|X(15)| = | sin(π(8√3 − 15)/4) / sin(π(8√3 − 15)/64) | = 13.9412,
|X(13)| = | sin(π(8√3 − 13)/4) / sin(π(8√3 − 13)/64) | = 14.8249.
The displacement bin is

d = (1/2) (|X(15)| − |X(13)|) / (2|X(14)| − |X(15)| − |X(13)|) = −0.1395.
The true frequency index would be 8√3 = 13.8564, with the true frequency 2π · 13.8564/64 = 1.3603. The correct value of the frequency index is shifted from the nearest integer k = 14 (on the frequency grid) by 13.8564 − 14 = −0.1436 when the interpolation is done. Thus, the obtained displacement bin value −0.1395 is close to the true shift value −0.1436. The estimated frequency, using the displacement bin, is 1.3608. As compared to the true frequency, the error is 0.03%.
If the displacement formula is applied to the DFT values without interpolation, we would get d = 0.3356, while 2√3 = 3.4641 is displaced from the nearest integer by 0.4641.
Chapter 4
z-Transform
The Fourier transform of discrete-time signals and the DFT are used for direct signal processing and calculations. A transform that generalizes these transforms, in the same way as the Laplace transform generalizes the Fourier transform of continuous-time signals, is the z-transform. This transform provides an efficient tool for the qualitative analysis and design of discrete systems.
The Fourier transform of a discrete-time signal x(n) can be considered as a special case of the z-transform, defined by

X(z) = Σ_{n=−∞}^{∞} x(n) z^{−n}, (4.1)

where z = r exp(jω) is a complex number. The value of the z-transform along the unit circle, |z| = 1 or z = exp(jω), is equal to the Fourier transform X(e^{jω}) of the discrete-time signal.
The z-transform, in general, converges only for some values of the complex argument z. The
region of z where X (z) is finite is the region of convergence (ROC) of the z-transform.
Consider the signal

x(n) = a^n u(n) + b^n u(n),

where a and b are complex numbers, |a| < |b|. Find the z-transform of this signal and its region of convergence.
not cancel out to produce a finite value). Since | a| < |b|, the region of convergence for X (z) is
|z| > |b|, as shown in Fig. 4.1.
Figure 4.1 Regions of convergence (gray area) for the signal x (n) = an u(n) + bn u(n).
Consider the signal x(n) = a^n u(n − 1) − b^n u(−n − 1), where a and b are complex numbers, |b| > |a|. Find the z-transform of x(n) and its region of convergence.
⋆ The z-transform is

X(z) = Σ_{n=1}^{∞} a^n z^{−n} − Σ_{n=−∞}^{−1} b^n z^{−n} = Σ_{n=1}^{∞} a^n z^{−n} − Σ_{n=1}^{∞} b^{−n} z^{n}
= (a/z)/(1 − a/z) − (z/b)/(1 − z/b) = a/(z − a) + z/(z − b).

The infinite geometric series with the progression coefficient (a/z) converges for |a/z| < 1. The other series converges for |z/b| < 1. Since |b| > |a|, the region of convergence is |a| < |z| < |b|, Fig. 4.2.
Note that in this example and the previous one, two different signals, b^n u(n) and −b^n u(−n − 1), produced the same z-transform X_b(z) = z/(z − b), but with different regions of convergence. For the signal b^n u(n) the region of convergence was |b/z| < 1, while for −b^n u(−n − 1) the region of convergence was |z/b| < 1.
Figure 4.2 Regions of convergence (gray area) for the signal x (n) = an u(n − 1) − bn u(−n − 1).
4.2.1 Linearity
The z-transform is linear,

Z{ax(n) + by(n)} = aX(z) + bY(z),

with the region of convergence being at least the intersection of the regions of convergence of X(z) and Y(z). In special cases the region can be larger than the intersection of the regions of convergence of X(z) and Y(z), if some poles defining the region of convergence cancel out in the linear combination of the transforms.
4.2.2 Time-Shift
For a shifted signal,

Z{x(n − n_0)} = z^{−n_0} X(z).

An additional pole at z = 0 is introduced for n_0 > 0. The region of convergence is the same, except for z = 0 or z → ∞, depending on the value of n_0.
⋆ The z-transform of this equation is obtained using the linearity and the shift property
Example 4.4. For a causal signal x(n) = x(n)u(n), find the z-transform of x(n + n_0)u(n), for n_0 ≥ 0.
⋆ By definition,

Z{x(n + n_0)u(n)} = Σ_{n=0}^{∞} x(n + n_0) z^{−n} = z^{n_0} Σ_{n=n_0}^{∞} x(n) z^{−n}
= z^{n_0} [ Σ_{n=0}^{∞} x(n) z^{−n} − x(0) − x(1)z^{−1} − ··· − x(n_0 − 1)z^{−n_0+1} ]
= z^{n_0} [ X(z) − x(0) − x(1)z^{−1} − ··· − x(n_0 − 1)z^{−n_0+1} ].

For n_0 = 1 follows

Z{x(n + 1)u(n)} = z(X(z) − x(0)) = zX(z) − zx(0). (4.2)

Note that for this signal x(n + n_0)u(n) ≠ x(n + n_0)u(n + n_0).
4.2.3 Modulation
For a modulated signal,

Z{a^n x(n)} = X(z/a),

with the region of convergence being scaled by |a|. In the special case, when a = e^{jω_0}, the z-transform plane is just rotated in the complex plane,

Z{e^{jω_0 n} x(n)} = Σ_{n=−∞}^{∞} x(n) e^{jω_0 n} z^{−n} = X(z e^{−jω_0}).
4.2.4 Differentiation

Z{n(n + 1) ··· (n + N − 1) x(n) u(n)} = (−1)^N z^N d^N X(z)/dz^N.
4.2.5 Convolution
The z-transform of a convolution of two signals is equal to the product of their z-transforms,

Z{x(n) ∗_n h(n)} = X(z)H(z),

with the region of convergence being at least the intersection of the regions of convergence of X(z) and H(z). In the case of a product of two z-transforms, it may happen that some poles are canceled out, causing the resulting region of convergence to be larger than the intersection of the individual regions of convergence.
⋆ This signal can be written as the convolution of the signals x(n) and u(n), that is,

Σ_{m=−∞}^{n} x(m) = x(n) ∗_n u(n) = Σ_{m=−∞}^{n} x(m) u(n − m).

The z-transform of the convolution of two signals is equal to the product of their corresponding z-transforms,

Z{x(n) ∗_n u(n)} = X(z) z/(z − 1). (4.3)
The initial value of a causal signal can be obtained from the z-transform as x(0) = lim_{z→∞} X(z). According to the z-transform definition, all terms with z^{−n}, n > 0, vanish as z → ∞. The term which does not depend on z is obtained as the result of this limit; it is equal to x(0).
The stationary state value of a causal signal x(n) is

lim_{n→∞} x(n) = lim_{z→1} (z − 1)X(z). (4.5)

To show this, note that the z-transform of x(n + 1)u(n) − x(n)u(n) is zX(z) − zx(0) − X(z), while the sum Σ_{n=0}^{N} [x(n + 1) − x(n)] telescopes to

lim_{N→∞} [x(N + 1) − x(0)].

Thus,

lim_{N→∞} [x(N + 1) − x(0)] = lim_{z→1} [zX(z) − zx(0) − X(z)],

which produces the stationary state value (4.5).
The most common approach to the z-transform inversion is based on a direct expansion of the given transform into a power series with the terms z^{−n}, within the region of convergence. After the z-transform is expanded into such a series,

X(z) = Σ_{n=−∞}^{∞} X_n z^{−n},

the signal is identified as x(n) = X_n for −∞ < n < ∞.
In general, various techniques may be used to expand a function into a power series. Most of the cases in signal processing, after some transformations, reduce to a simple form of an infinite geometric series,

1/(1 − q) = 1 + q + q^2 + ··· = Σ_{n=0}^{∞} q^n,

for |q| < 1.
For the z-transform

X(z) = 1/(1 − 1/(2z)) + 1/(1 − 3z),

identify the possible regions of convergence and find the inverse z-transform for each of them.
⋆ Obviously, the z-transform has the poles z_1 = 1/2 and z_2 = 1/3. Since there are no poles in the region of convergence, there are three possibilities for defining the region of convergence: (1) |z| > 1/2, (2) 1/3 < |z| < 1/2, and (3) |z| < 1/3. The signals are obtained using a power series expansion for every case.
expansion for every case.
(1) For the region of convergence |z| > 1/2, the term 12 z−1 satisfies the condition that
1 −1
| 2 z | < 1. It can be expanded into geometric series as
∞
1 1 n ∞
1 −n 1 1
1
= ∑ = ∑ n z for < 1 or |z| > .
1 − 2z n =0
2z n =0
2 2z 2
However, the term 3z does not satisfy the condition |3z| < 1 for |z| > 1/2. This part of X (z)
should be modified so that it can also be expanded into geometric series for |z| > 1/2. This is
achieved if X (z) is rewritten as
1 1
X (z) = 1
+ 1
.
1− 2z −3z(1 − 3z )
Now, the second part of X (z) can be expanded using the following geometric series
∞
1 1 n ∞
1 −n 1
< 1 or |z| > 1 .
1
= ∑ = ∑ n
z for 3z
1 − 3z n =0
3z n =0
3 3
Both of these sums converge for |z| > 1/2. The resulting power series expansion of X (z) is
∞
1 −n 1 ∞ 1 −n
X (z) = ∑ 2 n
z −
3z n∑ n
z
n =0 =0 3
∞ ∞
1 1
= ∑ n z−n − ∑ n z−n .
n =0
2 n =1
3
The inverse z-transform, for this region of convergence |z| > 1/2, is
1 1
x (n) = u ( n ) − n u ( n − 1).
2n 3
(2) For the region of convergence defined by 1/3 < |z| < 1/2, the z-transform should be written in the form

X(z) = −2z/(1 − 2z) + 1/(−3z(1 − 1/(3z))).

The corresponding geometric series are

1/(1 − 2z) = Σ_{n=0}^{∞} (2z)^n = Σ_{n=−∞}^{0} 2^{−n} z^{−n}, for |2z| < 1 or |z| < 1/2,
1/(1 − 1/(3z)) = Σ_{n=0}^{∞} (1/(3z))^n = Σ_{n=0}^{∞} (1/3^n) z^{−n}, for |1/(3z)| < 1 or |z| > 1/3.

They converge for 1/3 < |z| < 1/2. The resulting power series expansion is

X(z) = −2z Σ_{n=−∞}^{0} 2^{−n} z^{−n} − (1/(3z)) Σ_{n=0}^{∞} (1/3^n) z^{−n}
= − Σ_{n=−∞}^{−1} (1/2^n) z^{−n} − Σ_{n=1}^{∞} (1/3^n) z^{−n},

with the corresponding signal

x(n) = −(1/2^n) u(−n − 1) − (1/3^n) u(n − 1).

(3) For the region of convergence |z| < 1/3, the z-transform is written as
X(z) = −2z/(1 − 2z) + 1/(1 − 3z).

The corresponding geometric series are

1/(1 − 2z) = Σ_{n=0}^{∞} (2z)^n = Σ_{n=−∞}^{0} 2^{−n} z^{−n}, for |2z| < 1 or |z| < 1/2,
1/(1 − 3z) = Σ_{n=0}^{∞} (3z)^n = Σ_{n=−∞}^{0} 3^{−n} z^{−n}, for |3z| < 1 or |z| < 1/3.

Both series converge for |z| < 1/3. The power series expansion is

X(z) = −2z Σ_{n=−∞}^{0} 2^{−n} z^{−n} + Σ_{n=−∞}^{0} 3^{−n} z^{−n}
= − Σ_{n=−∞}^{−1} (1/2^n) z^{−n} + Σ_{n=−∞}^{0} (1/3^n) z^{−n},

with the corresponding signal

x(n) = −(1/2^n) u(−n − 1) + (1/3^n) u(−n).
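Each expansion can be checked numerically by summing the series at a test point inside its region of convergence; for case (1), for example (our own check, not from the book):

    import numpy as np

    def X(z):
        return 1/(1 - 1/(2*z)) + 1/(1 - 3*z)

    z = 1.7 + 0.3j                        # a point with |z| > 1/2
    n = np.arange(200)                    # truncated series (terms decay fast here)
    x = 0.5**n - np.where(n >= 1, 3.0**(-n), 0.0)
    print(np.sum(x * z**(-n)), X(z))      # the two values agree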
For the z-transform

X(z) = (z^2 + 1) / ((z − 1/2)(z^2 − 3z/4 + 1/8)),

find the signal x(n) if the region of convergence is |z| > 1/2.
⋆ The z-transform can be written as

X(z) = (z^2 + 1) / ((z − 1/2)(z − z_1)(z − z_2)) = (z^2 + 1) / ((z − 1/2)^2 (z − 1/4)),

where z_1 = 1/2 and z_2 = 1/4. Writing X(z) in the form of partial fractions,

X(z) = A/(z − 1/2)^2 + B/(z − 1/2) + C/(z − 1/4),
or from

(z^2 + 1) = A(z − 1/4) + B(z − 1/2)(z − 1/4) + C(z − 1/2)^2. (4.6)

For z = 1/4 we get 17/16 = C/16, or C = 17. Using the value z = 1/2 gives

(1/4 + 1) = A(1/2 − 1/4)
and A = 5 is obtained. Finally, if the highest-order coefficients (with z^2) in relation (4.6) are equated,

z^2 = Bz^2 + Cz^2,

we get 1 = B + C, producing B = −16. The z-transform is

X(z) = 5/(z − 1/2)^2 − 16/(z − 1/2) + 17/(z − 1/4).
For the region of convergence |z| > 1/2 and for a parameter |a| ≤ 1/2,

1/(z − a) = 1/(z(1 − a/z)) = z^{−1}(1 + az^{−1} + a^2 z^{−2} + ···) = Σ_{n=1}^{∞} a^{n−1} z^{−n}.

The resulting signal is

x(n) = 5 ((n − 1)/2^{n−2}) u(n − 2) − 16 (1/2^{n−1}) u(n − 1) + 17 (1/4^{n−1}) u(n − 1).
Note: In general, the relation

1/(z − a)^{m+1} = (1/m!) d^m/da^m (1/(z − a)) = (1/m!) d^m/da^m ( Σ_{n=1}^{∞} a^{n−1} z^{−n} )
= Σ_{n=1}^{∞} ((n − 1)(n − 2) ··· (n − m)/m!) a^{n−m−1} z^{−n}

holds and can be used for the inversion of higher-order poles.
In general, the inversion is calculated using the Cauchy relation from complex analysis,

(1/(2πj)) ∮_C z^{m−1} dz = δ(m),

where C is any closed contour line within the region of convergence. The complex plane origin is within the contour. By multiplying both sides of X(z) by z^{m−1}, after integration along the closed contour the signal values are obtained as a sum of residues,

x(n) = Σ_{z_i} Res{z^{n−1} X(z)}|_{z=z_i} = Σ_{z_i} (1/(k − 1)!) d^{k−1}/dz^{k−1} [z^{n−1} X(z)(z − z_i)^k]|_{z=z_i},

where z_i are the poles of z^{n−1}X(z) within the integration contour C, which is within the region of convergence, and k is the pole order. If the signal is causal, n ≥ 0, and all the poles of z^{n−1}X(z) within the contour C are simple (first-order poles, with k = 1), then, for a given instant n,

x(n) = Σ_{z_i} [z^{n−1} X(z)(z − z_i)]|_{z=z_i}. (4.7)
find the signal x(n) for n ≥ 0 if the region of convergence is |z| > 1/2.
Hint: Since for each n < 0 there is a pole at z = 0 of the order −n + 1, to avoid different derivatives for each n we can make a substitution of variables, z = 1/p, with dz = −dp/p^2. The new region of convergence in the complex plane p will be |1/p| > 1/2, or |p| < 2. All the poles are now outside this region and outside the integration contour, producing a zero-valued integral.
For a linear time-invariant discrete system with the impulse response h(n), the z-transform of the output signal y(n) = x(n) ∗_n h(n) is

Y(z) = X(z)H(z).

The z-transform of the output signal is obtained by multiplying the input signal z-transform by the transfer function

H(z) = Σ_{n=−∞}^{∞} h(n) z^{−n}.

It is possible to relate two important properties of linear time-invariant systems to the transfer function properties.
The system is stable if
∞
∑ |h(n)| < ∞.
n=−∞
It means that the z-transform exists at |z| = 1, that is, that the circle
|z| = 1
a a a
Im{z}
Im{z}
Im{z}
1 1 1
c
b b
Re{z} Re{z} Re{z}
2 4
60 h (n) 1.5 h2(n) 3 h3(n)
1
40 1 2
20 0.5 1
0 0 0
Figure 4.3 Regions of convergence (gray) with corresponding signals. Poles are marked by "x".
plot the regions of convergence and discuss the stability and causality. Find and plot the impulse
response for every case.
⋆ The regions of convergence are shown in Fig. 4.3. The system described by H1 (z) is causal
but not stable. The system H2 (z) is stable but not causal, while the system H3 (z) is both stable
and causal. Their impulse responses are shown in Fig. 4.3 as well.
Amplitude of the frequency response (gain) of a discrete system is related to the transfer function
as
| H (e jω )| = | H (z)||z=e jω .
Consider a discrete system whose transfer function assumes the form of a ratio of two polynomials
where z0i are the zeros and z pi are the poles of the transfer function. For the amplitude of the frequency
response we my write
B TO1 TO2 . . . TO M
| H (e jω )| = 0 ,
A0 TP1 TP2 . . . TPN
where TOi are the distances from the point T at the given frequency z = e jω to zero Oi at z0i . Distances
from the point T to the poles Pi at z pi are denoted by TPi .
Ljubiša Stanković Digital Signal Processing 171
Example 4.11. Plot the frequency response of a causal notch filter with the transfer function
z − e jπ/3
H (z) = .
z − 0.95e jπ/3
|e jω − e jπ/3 | TO1
| H (e jω )| = =
|e jω − 0.95e jπ/3 | TP1
where the zero O1 is positioned at z01 = e jπ/3 and the pole P1 is at z p1 = 0.95e jπ/3 . For any
point T at z = e jω , ω 6= π/3, the distances TO1 and TP1 from T to O1 and from T to P1 are
almost the same, TO1 ∼ = TP1 . Then | H (z)||z=e jω ∼
= 1 except at ω = π/3, when TO1 = 0 and
TP1 6= 0 resulting in | H (z)||z=e jπ/3 = 0. The frequency response | H (e jω )| is shown in Fig. 4.4.
O
1
1.5
T P
1
ω
|H(ejω)|
π/3
Im{z}
0.5
0
Re{z} −2 0 π/3 2 ω
Figure 4.4 Poles and zeros of a first-order notch filter (left). The frequency response of this notch filter (right).
An important class of discrete systems can be described by difference equations. They are obtained
by converting corresponding differential equations or by describing an intrinsically discrete system
relating the input and output signal in a recursive way. A general form of a linear difference equation
with constant coefficients, that relates the output signal y(n), at an instant n, with the input signal x (n)
and the previous input and output samples, is
The z-transform of the linear difference equation, assuming zero-valued initial conditions, is
since Z { x (n − i )} = X (z)z−i and Z {y(n − k)} = Y (z)z−k . The solution y(n) of the difference
equation is obtained as an inverse z-transform of
B0 + B1 z−1 + · · · + B M z− M
Y (z) = X ( z ).
1 + A1 z −1 + · · · + A N z − N
⋆The z-transform domain form of the system is Y (z) − 56 z−1 Y (z) + 16 z−2 Y (z) = X (z),
producing
1
Y (z) = X ( z ).
1 − 65 z−1 + 61 z−2
The z-transform of the input signal is X (z) = 1/(1 − 41 z−1 ) for |z| > 1/4. The z-transform of
the output signal y(n) is
z3
Y (z) = .
(z − 21 )(z − 31 )(z − 41 )
For a causal system the region of convergence is |z| > 1/2. The output signal is the inverse
z-transform of Y (z). For n > 0 it is equal to
n o
y(n) = ∑ [zn−1 Y (z)(z − zi )]|z=zi
zi =1/2,1/3,1/4
z n +2 z n +2 z n +2
= 1 1
+ 1 1
+
(z − 3 )( z − 4 ) |z=1/2 (z − 2 )( z − 4 ) |z=1/3 (z − 21 )(z − 31 ) |z=1/4
1 8 3
=6 − n + n.
2n 3 4
For n = 0 there is no pole at z = 0. Thus, the above expressions hold for n = 0 as well. The
output signal is given by
6 8 3
y ( n ) = n − n + n u ( n ).
2 3 4
Note: This kind of solution assumes the initial conditions from the system causality and x (n) in
the form: y(0) = x (0) = 1 and y(1) − 5y(0)/6 = x (1), that is, y(1) = 13/12.
Ljubiša Stanković Digital Signal Processing 173
Example 4.13. A first-order causal discrete system is described by the following difference equation
Find its impulse response and discuss its behavior in terms of the system coefficient A1 .
⋆For the impulse response calculation the input signal is defined by x (n) = δ(n) with X (z) = 1.
Then we have
with
y(n) = B0 δ(n) + (− A1 )n−1 (− A1 B0 + B1 )u(n − 1).
We can conclude that, in general, the impulse response has an infinite duration for any A1 6= 0. It
is a result of the recursive relation between the output y(n) and its previous value(s) y(n − 1).
This kind of systems is referred to as the infinite impulse response (IIR) systems or recursive
systems. If the value of coefficient A1 is zero-valued, that is A1 = 0, then there is no recursion
and
y(n) = B0 δ(n) + B1 δ(n − 1).
This is the system with a finite impulse response (FIR). This kind of system produces an output
to the signal x (n) in the form
and is referred to as the moving average (MA) system. Systems without recursion are always
stable since a finite sum of the finite signal values is always finite.
Systems that would contain only x (n) and the output signal recursions,
are auto-regressive (AR) systems or all pole systems. This kind of systems could be unstable, due
to the output signal recursion. In our case, the system is obviously unstable if | A1 | > 1. Systems
defined by (4.9) are in general auto-regressive moving average (ARMA) systems.
174 z-Transform
If the region of convergence were |z| < | A1 |, then the function Y (z) would be expanded
into series with q = z/A1 as
∞
B0 + B1 z−1 B0 B
Y (z) = = z+ 1 ∑ (− A1−1 z)n
A1 z−1 (z/A1 + 1) A1 A1 n =0
0 0
B
= B0 ∑ (− A1 )n−1 z−(n−1) + 1 ∑ (− A1 )n z−n
n=−∞ A1 n=−∞
−1 B1 0
= B0 ∑ (− A1 )n z−n + ∑ (− A1 )n z−n
n=−∞ A1 n=− ∞
with
B1
y(n) = B0 (− A1 )n u(−n − 1) + (− A1 )n u(−n).
A1
This system would be stable if |1/A1 | < 1 and unstable if |1/A1 | > 1, having in mind that
y(n) is nonzero for n < 0. This is an anticausal system since it has impulse response satisfying
h(n) = 0 for n ≥ 1.
Here, we have just introduced the notions. These systems will be considered in Chapter 5.
A direct way to solve a linear difference equation with constant coefficients of the form
y ( n ) + A1 y ( n − 1) + · · · + A N y ( n − N ) = x ( n ) (4.10)
yi (n) = Ci λin ,
where Ci and λi are constants. Replacing yi (n) into (4.11), the characteristic polynomial equation
follows
This is a polynomial of the Nth order. In general, it has N solutions λi , i = 1, 2, . . . , N. All functions
yi (n) = λin , i = 1, 2, . . . , N are the solutions of equation (4.11). Since the equation is linear, a linear
combination of these solutions,
N
yh (n) = ∑ Ci λin
i =1
is also a solution of the homogeneous equation (4.11). This solution is called homogeneous part of the
solution to (4.10).
Ljubiša Stanković Digital Signal Processing 175
Next a particular solution y p (n), corresponding to the form of the input signal x (n), should be
found. The solution to equation (4.10) is then
y ( n ) = y h ( n ) + y p ( n ).
The constants Ci , i = 1, 2, . . . , N are calculated based on the initial conditions y(i − 1), i = 1, 2, . . . , N.
5 1
y p (n) − y p (n − 1) + y p (n − 2) = n + 11/6
6 6
5 1
An + B − ( An − A + B) + ( An − 2A + B) = n + 11/6,
6 6
and A = 3, B = 1 follow. The solution to (4.12) is a sum of the homogeneous and the particular
solution,
1 1
y(n) = yh (n) + y p (n) = C1 n + C2 n + 3n + 1.
2 3
Using the initial conditions
y(0) = C1 + C2 + 1 = 1
C1 C
y (1) = + 2 +4=5
2 3
the constants C1 = 6 and C2 = −6 follow. The final solution is
6 6
y(n) = n − n + 3n + 1 u(n).
2 3
176 z-Transform
Note: The z-transform based solution would assume y(0) = x (0) = 11/6 and y(1) =
5y(0)/6 + x (1) = 157/36. The solution with the initial conditions y(0) = 1 and y(1) = 5 could
be obtained from this solution with appropriate changes of the first two samples of the input signal
in order to take into account the previous system state and to produce the given initial conditions
y(0) = 1 and y(1) = 5 .
If multiple polynomial roots are obtained, for example λi = λi+1 , then yi (n) = λin and
yi+1 (n) = nλin .
Consider a periodic signal x (n) with a period N and its DFT values X (k),
1 N −1
x (n) = ∑ X (k)e j2πnk/N . (4.14)
N k =0
If the signal within one of its periods, 0 ≤ n ≤ N − 1, is applied as the input to the system
described by difference equation (4.13) show that the output signal at n = N − 1 is equal to the
DFT of the signal x (n) at frequency index k = k0 , that is
y ( N − 1) = X ( k 0 ).
Consider now the case when the input signal x (n) is applied to the system. Since the system is
linear, consider one component of the input signal (4.14) in the form
1
xk (n) = X (k)e j2πnk/N ,
N
for an arbitrary 0 ≤ k ≤ N − 1. Then the difference equation (and the z-transform relation) for
this input signal reads
1
X (k)e j2πnk/N }
Z { xk (n)} = Z { (4.16)
N
1 N −1 1 1 − e j2πk z− N
= X (k) ∑ e j2πnk/N z−n = X (k) .
N n =0 N 1 − e j2πk/N z−1
or
z0 = e j2π (k+l )/N , l = 0, 1, 2, . . . , N − 1.
Note that the zero
z0 = e j2πk/N , obtained for l = 0
is canceled with the pole z p = e j2πkn/N in (4.16). Therefore the remaining zeros are at
z p = e j2πk0 /N .
- If k 6= k0 then one of zeros z0 = e j2π (k+l )/N , l = 1, 2,. . . , N − 1 will coincide with the pole
z p = e j2πk0 /N and will cancel it out. Thus, for k 6= k0 , the function Yk (z) will not have any pole.
Then, I
1
y k ( N − 1) = z N −2 Yk (z)dz = 0 (4.17)
2πj
C
since there are no poles within C, Fig. 4.5.
z=ej2πk/N z=ej2πk/N
k0=k
Im{z}
Im{z}
Im{z}
z=ej2πk0/N
k0≠ k
Re{z} Re{z} Re{z}
Figure 4.5 Zeros and the pole in Z { xk (n)} (left), the pole in 1/(1 − e j2πk0 n/N z−1 ) for k 6= k0 (middle), and
the pole in 1/(1 − e j2πk0 n/N z−1 ) for k = k0 (right). Illustration is for N = 16.
1 1−e j2πk 0 z − N
= z N −1 X ( k 0 )
N 1 − e j2πk0 /N z−1 j2πk /N
z=e 0
1 z N − e j2πk0
= X (k0 ) lim j2πk0 /N
= X ( k 0 ).
N z→e j2πk0 /N z − e
y k ( N − 1) = X ( k ) δ ( k − k 0 ).
is called the Goertzel algorithm for the DFT calculation at a given single frequency k0 .
It is interesting to note that the computation of (4.19) is more efficient than the computation
of (4.18). For the calculation of (4.18), for one k0 , we need one complex multiplication (4 real
multiplications) and one complex addition (2 real additions). For N instants and one k0 we need
4N real multiplications and 2N real additions. For the calculation of (4.19) we can use linear
property and calculate only
at every instant. It requires a multiplication of complex signal with a real coefficient. It means 2
real multiplications for every instant or 2N in total for N instants. The resulting output, at the
instant N − 1, is
It requires just one additional complex multiplication for the last instant and for one frequency.
The total number of multiplications is 2N + 4. It is reduced with respect to the previously needed
4N real multiplications. The total number of additions is 4N + 2. It is increased. However the
time needed for a multiplication is much longer than the time needed for an addition. Thus, the
Ljubiša Stanković Digital Signal Processing 179
overall efficiency is improved. The efficiency is even more improved having in mind that (4.20) is
the same for calculation of X (k0 ) and X (−k0 ) = X ( N − k0 ).
The z-transform will be related to the all other transform, presented so far. Consider a continuous-time
signal x (t) and its the Laplace transform X (s). If the integral in the Laplace transform is approximated
by the corresponding sum, we have
Z∞ ∞ ∞
X (s) = x (t)e−st dt ∼
= ∑ x (n∆t)e−sn∆t ∆t = ∑ x (n)e−sn∆t ,
−∞ n=−∞ n=−∞
with x (n) = x (n∆t)∆t. When this relation is compared to the z-transform definition we can conclude
that the Laplace transform of x (t) can be approximated by the z-transform of its samples with
z = exp(s∆t),
that is,
X (s) ↔ X (z)|z=exp(s∆t) . (4.21)
A point in the complex Laplace domain, s = σ + jΩ, maps to the point z = re jω
with r = eσ∆t
and ω = Ω∆t. Points from the left half-plane in the s domain, σ < 0, map to the interior of the unit
circle in the z domain, r < 1.
In applied mathematics, the transform X (z) at z = exp(s∆t) is called the starred or star
transform. It can be obtained as the Laplace transform of the sampled signal, denoted in the continuous-
time domain as
∞
x ∗ (t) = x (t) ∑ δ(t − n∆t).
n=−∞
The starred transform is derived by
Z∞ ∞ Z∞ ∞
X (s) = x ∗ (t)e−st dt = ∑ x (n∆t)e−sn∆t δ(t − n∆t)dt = ∑ x (n)e−sn∆t . (4.22)
−∞ n=−∞ −∞ n=−∞
According to the sampling theorem, for the Laplace transform of discrete-time signal holds
X (s)|σ=0 = X ( jΩ) = X ( j(Ω + 2kπ/∆t)).
The Fourier transform of a discrete-time signal is
∞
X (e jω ) = X (z)|z=e jω = ∑ x (n)z− n
|z=e jω
.
n=−∞
Example 4.16. The Fourier transform of a causal discrete-time signal x (n) is X (e jω ). Write its
z-transform in terms of the Fourier transform of the discrete-time signal, that is, write the z-
transform values in the complex plane based on their values on the unit circle.
180 z-Transform
∞ Zπ ∞ Zπ
−n 1 jω jωn −n 1 X (e jω )
X (z) = ∑ x (n)z =
2π
X (e ) ∑e z dω =
2π 1 − e jω z−1
dω,
n =0 −π n =0 −π
for |z| > 1. In this way, the relation between X (z), for any z, and X (e jω ) is established.
N=16
jω j2π k/16
z=e z=e
π/∆t
Im{s}=Ω
Im{z}
Im{z}
−π/∆t
Figure 4.6 Illustration of the z-transform relation with the Laplace transform (left), the Fourier transform of
discrete signals (middle), and the DFT (right).
Example 4.17. Consider a discrete-time signal x (n) with N samples different from zero within
0 ≤ n ≤ N − 1. Show that all values of X (z), for any z, can be calculated based on its N samples
on the unit circle in the z-plane.
⋆If the signal has N nonzero samples, then it can be expressed in term of its DFT as
N −1
1 N −1
∑ x (n)e− j2πnk/N and x (n) = X (k)e j2πnk/N .
N k∑
X (k) =
n =0 =0
Thus, the z-transform of this x (n) can be expressed in terms of the IDFT within 0 ≤ n ≤ N − 1,
in the form
N −1
1 N −1 N −1
1 N −1 1 − z− N e j2πk
∑ x (n)z−n = ∑ X (k) ∑ e j2πnk/N z−n =
N k∑
X (z) = −1 j2πk/N
X (k)
n =0
N k =0 n =0 =0 1 − z e
Ljubiša Stanković Digital Signal Processing 181
1 N −1 ∞ 1 N −1 1
∑ ∑ X (k)e j2πnk/N z−n =
N k∑
X (z) = −1 e j2πk/N
X ( k ).
N k =0 n =0 =0 1 − z
4.7 PROBLEMS
Problem 4.1. Find the z-transform and the region of convergence for the following signals:
(a) x (n) = δ(n − 2),
(b) x (n) = a|n| ,
(c) x (n) = 21n u(n) + 31n u(n)
Problem 4.2. Find the z-transform and the region of convergence for the signals:
(a) x (n) = δ(n + 1) + δ(n) + δ(n − 1),
(b) x (n) = 21n [u(n) − u(n − 10)].
dX (z)
Y (z) = −z
dz
corresponds to
y(n) = nx (n)u(n)
in the discrete-time domain, with the same region of convergence for X (z) and Y (z), find a causal
signal whose z-transform is
(a) X (z) = e a/z , |z| > 0.
(b) X (z) = ln(1 + az−1 ), |z| > | a|.
Problem 4.4. (a) How the z-transform of x (−n) is related to the z-transform of x (n)?
(b) If the signal x (n) is real-valued show that its z-transform satisfies X (z) = X ∗ (z∗ ).
Problem 4.5. If X (z) is the z-transform of a signal x (n) find the z-transform of
∞
y(n) = ∑ x ( k ) x ( n + k ).
k =−∞
z+1
X (z) = .
(2z − 1)(3z + 2)
3 − 65 z−1
H (z) = 1 −1 1 −1
.
(1 − 4 z )(1 − 3 z )
identify possible regions of convergence. For every case, comment the stability and causality of
the system whose transfer function is H (z). What is the output of the stable system to the input
x (n) = 2 cos(nπ/2)?
Problem 4.10. Find the impulse response of a causal system whose transfer function is
z+2
H (z) = .
( z − 2) z2
Problem 4.11. Find the inverse z-transform of
z2
X (z) = .
z2 +1
Problem 4.12. The system is described by the following difference equation
5 1 5 3
y ( n ) − y ( n − 1) + y(n − 2) − y(n − 3) = 3x (n) − x (n − 1) + x ( n − 2).
16 32 4 16
Find the impulse response of a causal system.
Problem 4.13. Show that the system defined by
3 1
y(n) = x (n) − x ( n − 1) + x ( n − 2)
4 8
has a finite output duration for an infinite duration input x (n) = 1/4n u(n) .
Problem 4.14. A linear time-invariant system is characterized by the impulse response
Using the z-transform find the output of the system to the input signal x (n) = u(n) − u(n − 6) .
Problem 4.15. Find the output of the causal discrete system
11 1 3
y(n) − y(n − 1) + y(n − 2) = 2x (n) − x (n − 1)
6 2 2
if the input signal is x (n) = δ(n) − 23 δ(n − 1).
Problem 4.16. Solve the difference equation
x (n + 2) + 3x (n + 1) + 2x (n) = 0
Ljubiša Stanković Digital Signal Processing 183
using the z-transform. The initial conditions are x (0) = 0 and x (1) = 1. The signal x (n) is causal.
x ( n + 1) = x ( n ) + a n u ( n )
∇ x ( n ) = x ( n ) − x ( n − 1),
∇ m x ( n ) = ∇ m −1 x ( n ) − ∇ m −1 x ( n − 1).
∆ m x ( n ) = ∆ m −1 x ( n + 1) − ∆ m −1 x ( n ).
Problem 4.20. Based on the pole-zero geometry plot the amplitude of the frequency response of the
system described by the difference equation
√ √
y(n) = x (n) − 2x (n − 1) + x (n − 2) + r 2y(n − 1) − r2 y(n − 2)
for r = 0.99. Based on the frequency response, find approximative values of the output signal if the
input is the continuous-time signal
Problem 4.21. Plot the frequency response of the discrete system (comb filter)
1 − z− N
H (z) =
1 − rz− N
4.8 EXERCISE
Exercise 4.1. Find the z-transform and the region of convergence for the following signals:
(a) x (n) = δ(n − 3) − δ(n + 3),
(b) x (n) = u(n) − u(n − 20) + 3δ(n),
(c) x (n) = 1/3|n| + 1/2n u(n),
(d) x (n) = 3n u(−n) + 2−n u(n),
(e) x (n) = n(1/3)n u(n).
(f) x (n) = cos(n π2 ).
Exercise 4.2. Find the z-transform and the region of convergence for the signals:
(a) x (n) = 3n u(n) − (−2)n u(n) + n2 u(n).
(b) x (n) = ∑nk=0 2k 3n−k ,
(c) x (n) = ∑nk=0 3k .
Exercise 4.3. Find the inverse z-transform of:
−8
(a) X (z) = 1z−z + 3, if X (z) is the z-transform of a causal signal x (n).
(b) X (z) = (zz−+22)z2 , if X (z) is the z-transform of a causal signal x (n).
2
(c) X (z) = 6z +3z−2 , if X ( z ) is the z-transform of an infinite-duration signal x ( n ).
6z2 −5z+1
∞
Find ∑n=−∞ x (n) in this case.
Exercise 4.4. Find the inverse z-transforms of:
z−5 (5z−3)
(a) X (z) = (3z−1)(2z−4) , if x (n) is causal,
(b) Y (z) = X ( 2z ), for a causal signal y(n),
(c) Y (z) = z−2 X (z), for a causal signal y(n).
Exercise 4.5. Find the inverse z-transforms of X (z) = cosh( az) and X (z) = sinh( az).
Exercise 4.6. If X (z) is the z-transform of a signal x (n), with the region of convergence |z| > 21 , find
the z-transforms of the following signals:
(a) y(n) = x (n) − x (n − 1),
∞
(b) y(n) = ∑ x (n − kN ), where N is an integer,
k =−∞
(c) y(n) = x (n) ∗n x (−n), where ∗n denotes the convolution.
d
(d) Find the signal whose z-transform is Y (z) = dz X ( z ).
Exercise 4.7. If X (z) is the z-transform of a signal x (n), find the z-transform of
∞
y(n) = ∑ x ∗ ( n − k ) x ( n + k ).
k =−∞
Exercise 4.12. Using the basic trigonometric transformations show that the real-valued signal
y(n) = cos(2πk0 n/N + ϕ) is a solution to the homogeneous difference equation
and r = 0.9999 plot the amplitude of the frequency response and find the output to the signal
4.9 SOLUTIONS
for any z 6= 0.
(b) For the signal x (n) = a|n| , the z-transform is given by
∞ −1 ∞
(1 − a2 ) z
X (z) = ∑ a|n| z−n = ∑ a−n z−n + ∑ an z−n = (1 − az)(z − a)
n=−∞ n=−∞ n =0
for |z| < 1/a and |z| > a. If | a| < 1 then the region of convergence is a < |z| < 1/a.
(c) In this case, when x (n) = 21n u(n) + 31n u(n), the z-transform is
∞ ∞
1 −n 1 1 1
X (z) = ∑ n
z + ∑ n z−n = 1 −1
+ 1 −1
n =0
2 n =0
3 1 − 2 z 1 − 3z
2 − 65 z−1 z(2z − 65 )
X (z) = =
(1 − 12 z−1 )(1 − 31 z−1 ) (z − 21 )(z − 31 )
for |z| > 1/2 and |z| > 1/3. The region of convergence is |z| > 1/2.
Solution 4.2. (a) The z-transform of signal x (n) = δ(n + 1) + δ(n) + δ(n − 1) is
∞
X (z) = ∑ (δ(n + 1) + δ(n) + δ(n − 1)) z−n =
n=−∞
1
= z + 1 + z −1 = z + 1 + .
z
The region of convergence excludes z = 0 and z −→ ∞.
(b) For x (n) = 21n [u(n) − u(n − 10)] we know that
1, n = 0, 1, . . . , 9
u(n) − u(n − 10) =
0, elsewhere.
The expression for X (z) is written in this way in order to find the region of convergence, observing
the zero-pole locations in the z-plane, Fig. 4.7. Poles are at z p1 = 0 and z p2 = 1/2. Zeros are
z0i = e j2iπ/10 /2, Fig. 4.7. Since the z-transform has a zero at z0 = 1/2, it will cancel out the pole
z p2 = 1/2. The resulting region of convergence will include the whole z plane, except the point at
z = 0.
Ljubiša Stanković Digital Signal Processing 187
j2π/10
z=e /2
Im{z}
z=1/2
Re{z}
dX (z) a a
−z = z 2 e a/z = X (z)
dz z z
The inverse z-transform of left and right side of this equation is
nx (n)u(n) = ax (n − 1)u(n)
dX (z)
since Z [nx (n)] = −z dz and z−1 X (z) = Z [ x (n − 1)]. This means that
a
x (n) = x ( n − 1)
n
for n > 0. According to the initial value theorem, we the value of x (0) is equal to
a2 a3
x (1) = a, x (2) = , x (3) = ,...
2 2·3
or
an
x (n) = u ( n ).
n!
(b) For X (z) = ln(1 + az−1 )
follows. The region of convergence is complementary to the one holding for the original signal. If the
region of convergence for x (n) is |z| > a, then the region of convergence for x (−n) is |z| < a .
(b) For a real-valued signal holds x ∗ (n) = x (n). Then we can write X ∗ (z∗ ) as
∞ ∗ ∞ ∗
X ∗ (z∗ ) = ∑ x (n)(z∗ )−n = ∑ x ∗ (n) (z∗ )−n .
n=−∞ n=−∞
∗ ∗
Since (z∗ )−n = (z−n )∗ = z−n , we get
∞ ∞
X ∗ (z∗ ) = ∑ x ∗ (n)z−n = ∑ x ( n ) z − n = X ( z ),
n=−∞ n=−∞
Solution 4.6. A direct expansion of the given transform into power series within the region of
convergence is used. In order to find the signal x (n), whose z-transform is X (z) = 2−13z , the z-
transformshould
be written in the form of the power series of X (z) in powers of z−1 . Since the
3z
condition 2 < 1 does not correspond to the region of convergence given in the problem formulation
we have to rewrite X (z) as
1 1
X (z) = − 2
.
3z 1 − 3z
Ljubiša Stanković Digital Signal Processing 189
2
Now, the condition 3z < 1, that is |z| > 32 , corresponds to the given region of convergence. In order
to obtain the inverse z-transform, we can write
1 1 1
X (z) = − 2
= − X1 ( z ) ,
3z 1 − 3z 3z
where
1
X1 ( z ) = 2
.
1 − 3z
The power series expansion of X1 (z) gives
∞ n ∞ n
2 2
X1 ( z ) = ∑ = ∑ z−n .
n =0
3z n =0
3
Finally, comparing this result to (4.26) we get the signal x (n) in the form
n −1
1 2
x (n) = − u ( n − 1).
3 3
Solution 4.7. Since the signal is causal, the region of convergence is outside the pole with the largest
radius (outside the circle passing through this pole). The poles of the z-transform X (z) are
1 2
z p1 = and z p2 = − .
2 3
The region of convergence is |z| > 32 . The z-transform can be written as
z+1 A B
X (z) = = +
(2z − 1)(3z + 2) 2z − 1 3z + 2
3 1
A= , B=− .
7 7
190 z-Transform
The terms in X (z) should be expressed in such a way that they represent sums of geometric series for
the given region of convergence. Based on the solution to the previous problem, we conclude that
A 1 B 1
X (z) = 1
+ 2
.
2z 1 − 2z 3z 1 + 3z
and
B 1 B ∞ 2 n −n B ∞ 2 n − n −1 2
3z n∑ ∑ 3 z
2
= − z = − , |z| > .
3z 1 + 3z =0 3 3 n =0
3
The z-transform, with m = n + 1, is of the form
m −1
A ∞ 1 −m B ∞ 2 m −1 − m
2 m∑ 3 m∑
X (z) = z + − z .
=1 2 =1 3
The signal x (n) is obtained by comparing this transform to the z-transform definition,
n
3 1 1 2 n
x (n) = + − u ( n − 1).
7 2 14 3
3 − 56 z−1 A B
H (z) = 1 −1 1 −1
= + ,
(1 − 4 z )(1 − 3 z ) 1 − 14 z−1 1 − 13 z−1
with A = 1 and B = 2.
(a) The region of convergence must contain |z| = 1, for a stable system. This region of convergence
is |z| > 13 . From
n ∞ ∞ n
1 2 1 −n 1 1 1
H (z) = + = ∑ z +2 ∑ z−n , |z| > and |z| >
1 − 41 z−1 1 − 13 z−1 n=0 4 n =0
3 3 4
h ( n ) = (4− n + 2 × 3− n ) u ( n ).
(b) The region of convergence is 14 < |z| < 31 . The first term in H (z) is the same as in (a), since it
converges for |z| > 14 . This term corresponds to the signal 4−n u(n). The second term must be rewritten
in such a way that its geometric series converges for |z| < 13 . Then,
2 3z ∞ −1 1
1 −1
= −2 = −2 ∑ (3z)n = −2 ∑ (3z)−m with |z| < .
1− 3z
1 − 3z n =1
m=−n m=−∞ 3
Ljubiša Stanković Digital Signal Processing 191
The signal corresponding to this z-transform is −2 × 3−n u(−n − 1). Then, the impulse response of
the system with the region of convergence 14 < |z| < 31 is obtained in the form
c) For an anticausal system, the region of convergence is |z| < 14 . Now, the second term in H (z) is the
same as in (b). For |z| < 41 , the first term in H (z) should be written as
1 4z ∞ −1 1
=− = − ∑ (4z)n = − ∑ (4z)−m with |z| < .
1 − 14 z−1 1 − 4z n =1
m=−n m=−∞ 4
The signal corresponding to this term is −4−n u(−n − 1). The impulse response of the anticausal
discrete system, with the given transfer function H (z), is
1 1
H (z) = √ = √ √
3 3 3
(1 − 4z)( 14 − 2
2 z+z ) (1 − 4z)(z − 4 + j 14 )(z − 4 − j 41 )
√ √
with the poles z1 = 1/4, z2 = 43 − j 14 , and z3 = 43 + j 14 . Since |z2 | = |z3 | = 1/2, the possible
regions of convergence are:
(1) |z| < 1/4,
(2) 1/4 < |z| < 1/2, and
(3) |z| > 1/2.
In the first two cases, the system is neither causal nor stable, while in the third case, the system is
causal and stable since |z| = 1 and |z| → ∞ belong to the region of convergence.
The output to the input signal
is
z+2 A B C
= + + 2.
z2 ( z − 2) z−2 z z
Az2 + Bz(z − 2) + C (z − 2) = z + 2
( A + B)z2 + (−2B + C ) − 2C = z + 2.
A + B = 0, −2B + C = 1, and − 2C = 2
192 z-Transform
z −1 1 1
H (z) = − 2− .
1 − 2z−1 z z
The region of convergence of a causal system is |z| > 2. The inverse z-transform of H (z) for causal
system is the system impulse response equal to
h ( n ) = 2n −1 u ( n − 1) − δ ( n − 2) − δ ( n − 1) = δ ( n − 2) + 2n −1 u ( n − 3).
For the region of convergence defined by |z| > 1, the signal is causal and
1 1
x (n) = [1 + (−1)n ] jn u(n) = [1 + (−1)n ]e jπn/2 u(n).
2 2
For n = 4k, where k ≥ 0 is an integer, x (n) = 1 , while for n = 4k + 2 the signal values are x (n) = −1.
For other n the signal value is x (n) = 0.
For |z| < 1 the inverse z-transform is
1
x (n) = − [1 + (−1)n ] jn u(−n − 1).
2
3 − 45 z−1 + 3 −2
16 z 3 − 54 z−1 + 3 −2
16 z
H (z) = 5 −2 1 −3
= 1 −1 1 −2 1 −1
1 − z −1 + 16 z − 32 z (1 − 2z + 16 z )(1 − 2 z )
1 1 5
=− − 2 + .
1 − 41 z−1 1− 1 −1 (1 − 12 z−1 )
4z
For a causal system, the region of convergence is outside of the pole z = 1/2, that is |z| > 1/2. Since
1 d z
2 =
da 1 − az −1
1 −1 a=1/4
1 − 4z
d ∞ n −(n−1) ∞
∞
1
= ∑ a z = ∑ nan−1 z−(n−1) = ∑ ( n + 1) n z − n ,
da n=0 n =1
n =0
4
a=1/4 a=1/4
Solution 4.13. The transfer function of the system defined by difference equation
3 1
y(n) = x (n) − x ( n − 1) + x ( n − 2)
4 8
is
3 1
H ( z ) = 1 − z −1 + z −2 .
4 8
The z-transform of the input signal x (n) = 1/4n u(n) is equal to
1
X (z) = ,
1 − 14 z−1
with the region of convergence |z| > 1/4. The z-transform of the output signal is
1
H (z) =
1 − 13 z−1
1 − z −6
X ( z ) = 1 + z −1 + z −2 + z −3 + z −4 + z −5 = .
1 − z −1
The z-transform of the output signal is
1 − z −6
Y (z) = = Y1 (z) − Y1 (z)z−6
(1 − z−1 )(1 − 1/3z−1 )
with
1 3/2 1/2
Y1 (z) = = − .
(1 − z−1 )(1 − 1/3z−1 ) 1 − z−1 1 − 31 z−1
Its inverse z-transform is n
3 1 1
y1 ( n ) = − u ( n ).
2 2 3
The system output is obtained in the form
n " #
3 1 1 3 1 1 n −6
y(n) = − u(n) − − u ( n − 6).
2 2 3 2 2 3
11 −1 1 −2 3
Y (z)(1 − z + z ) = X (z)(2 − z−1 )
6 2 2
194 z-Transform
as
2 − 23 z−1
H (z) = 11 −1
.
1− 6 z + 21 z−2
The poles are at z p1 = 1/3 and z p2 = 3/2 with the region of convergence |z| > 3/2. This means that
the system is not stable, Fig. 4.8.
Im{z}
Im{z}
Im{z}
1/3 1/3
3/2 3/2 3/2
Figure 4.8 Poles and zeros of the system (left), the input signal z-transform (middle), and the z-transform of the
output signal (right).
The output signal transform does not have a pole at z = 3/2, since this pole is canceled out. The output
signal is
2 3 1
y(n) = n u(n) − u ( n − 1).
3 2 3n −1
with
z 1 1
X (z) = = −
− .
z2 + 3z + 2 1 + z 1 1 + 2z−1
The inverse z-transform of X (z) is
Solution 4.17. The z-transforms of the left and right side of this equation are
z
zX (z) − zx (0) = X (z) +
z−a
z 1 1 a
X (z) = = − .
(z − a)(z − 1) 1 − a z − 1 z − a
The inverse z-transform is
1 1 − an
x (n) = [u(n − 1) − an u(n − 1)] = u ( n − 1)
1−a 1−a
or
n −1
x (n) = ∑ ak , n > 0.
k =0
Solution 4.18. For a direct solution in the discrete-time domain we assume a solution to the
homogeneous part of the equation
√
2 1
y(n) − y ( n − 1) + y ( n − 2) = 0 (4.27)
2 4
in the form yi (n) = Ci λin . The characteristic polynomial is
√
2 1
λ2 − λ+ =0
2 4
√ √
2 2
with λ1,2 = 4 ±j 4 .
The homogeneous solution to the difference equation is
√ √ √ √
2 2 n 2 2 n 1 1
yh (n) = C1 ( +j ) + C2 ( −j ) = C1 n e jnπ/4 + C2 n e− jnπ/4 .
4 4 4 4 2 2
1
A particular solution is assumed in the form of the input signal x (n) = 3n u ( n ) , that is
1
y p (n) = A u ( n ).
3n
The constant A is obtained by replacing this signal into (4.23)
√
1 2 1 1 1 1
A n − A n −1 + A n −2 = n
3 2 3 4 3 3
√
3 2 9
A (1 − + ) = 1.
2 4
Its value is A = 0.886. The general solution to the considered difference equation is equal to the sum
of the homogeneous solution and the particular solution
1 jnπ/4 1 1
y(n) = yh (n) + y p (n) = C1 e + C2 n e− jnπ/4 + 0.886 n .
2n 2 3
Since the system is causal with y(n) = 0 for n < 0, the constants C1 and C2 can be obtained from the
initial conditions following from
√
2 1
y(n) − y ( n − 1) + y ( n − 2) = x ( n )
2 4
196 z-Transform
as
y (0) = x (0) = 1
and √ √
2 2 1
y (1) = y (0) + x (1) = +
2 2 3
. With this initial conditions, we get
C1 + C2 + 0.886 = 1 (4.28)
√ √ √ √ √
2 2 2 2 1 2 1
C1 ( +j )/2 + C2 ( −j )/2 + 0.886 = + ,
2 2 2 2 3 2 3
as C1 = 0.057 − j0.9967 = 0.9984 exp(− j1.5137) = C2∗ . The final solution is
1 1
y(n) = 2 × 0.9984 cos(nπ/4 − 1.5137) + 0.886 n .
2n 3
For the z-domain, we write
√
2 1
Y (z) − Y ( z ) z −1 + Y ( z ) z −2 = X ( z )
2 4
with
1 1
Y (z) = √
1− 2 −1
+ 1 −2 1 − 31 z−1
2 z 4z
or
z3
Y (z) = √ √ √ √ .
2 2 2 2 1
(z − ( 4 +j 4 ))( z − ( 4 −j 4 ))( z − 3 )
Using the residual value based inversion of the z-transform, we can get the signal in the form
n o
y(n) = ∑ [zn−1 Y (z)(z − zi )]|z=zi .
√ √
2 2
z1,2,3 = 4 ± j 4 ,1/3
With the residual value definition for the simple poles z1 , z2 , and z3 we get
n +2 1 1
y(n) = z √ √ + z n + 2 √ √
2− j 2
1 √ √ 2+ j 2 1
√ √
(z − 4 )( z − 3 ) 2+ j 2 ( z − 4 )( z − 3 ) z= 4
2− j 2
4
1
+ z n +2 √ √ √ √
2+ j 2 2− j 2
(z − 4 )(z − 4 ) z=1/3
√ !
√ n +2 √ !
√ n +2
1 2+j 2 1 1 2−j 2 1 1 1
= √ √ √ − √ √ √ + n +2 √
j 2 4 2+ j 2 1 j 2 4 2− j 2 1 3 ( 1
− 1 2 1
2 4 −3 2 4 −3 9 3 2 + 4)
√ √
1 −j 2 1 j 2 1
= n+2 e j(n+2)π/4 √ √ + n+2 e− j(n+2)π/4 √ √ + 0.886 n
2 2+ j 2
−31 2 2− j 2
−3 1 3
4 4
√ √
1 2 1 2 1
= n e jnπ/4 √ √ + n e− jnπ/4 √ √ + 0.886 n
2 2 + j 2 − 43 2 2 − j 2 − 34 3
1 1
= 2 × 0.9984 cos(nπ/4 − 1.5137) + 0.886 n ,
2n 3
Ljubiša Stanković Digital Signal Processing 197
for n ≥ 1. For n = 0, there is no additional pole at z = 0. The previous result holds for n ≥ 0.
Its z-transform is
Z [∇2 x (n)] = (1 − z−1 )2 X (z).
In the same way we get
Z [∇m x (n)] = (1 − z−1 )m X (z).
The z-transform of the first forward difference is
Solution 4.20. The transfer function of this system can be written in the form
√ √ √ √ √
1 − 2z−1 + z−2 [1 − ( 22 + j 22 )z−1 ][1 − ( 22 − j 22 )z−1 ]
H (z) = √ = √ √ √ √
1 − r 2z−1 + r2 z−2 [1 − r ( 22 + j 22 )z−1 ][1 − r ( 22 − j 22 )z−1 ]
√ √ √ √
2
[z − ( 2 + j 22 )][z − ( 22 − j 22 )]
= √ √ √ √ .
[z − r ( 22 + j 22 )][z − r ( 22 − j 22 )]
√ √ √ √
The zeros and poles are z01,02 = 22 ± j 22 and z p1,p2 = r 22 ± jr 22 . Their locations are shown in
Fig. 4.9.
The amplitude of the frequency response is
B TO1 TO2 TO1 TO2
| H (e jω )| = 0 = .
A0 TP1 TP2 TP1 TP2
The values of TP1 and TO1 , and TP2 and TO2 , are almost the same for any ω, except
ω = ±π/4, where the distance to the corresponding zeros of the transfer function is 0, while the
198 z-Transform
distance to the corresponding pole is small but finite. Based on this analysis, the amplitude of the
frequency response is shown in Fig. 4.9.
O1
1.5
T
P1
|H(ejω)|
Im{z}
P2
0.5
O2
0
Re{z} −2 −π/4 0 π/4 2 ω
This system will filter out signal the components at ω = ±π/4. The output discrete-time signal is
z−
o
N
= 1 = e− j2πm
zom = e j2πm/N , m = 0, 1, . . . , N − 1.
Similarly, the poles are equal to zmp = r1/N e j2πm/N , m = 0, 1, . . . , N − 1. The frequency response of
the comb filter is
N −1 N −1
z − zom z − e j2πm/N
H (z) = ∏ = ∏ .
m =0
z − z pm m =0 z − r
1/N e j2πm/N
| H (e jω )| ∼
= 1 for z 6= e j2πm/N
| H (e jω )| = 0 for z = e j2πm/N .
T high importance. Some discrete-time systems are designed and realized in order to replace
or perform as equivalents of the continuous-time systems. It is quite common to design a
continuous-time system with desired properties, since the designing procedures in this domain are
simpler and well developed. In the next step the obtained continuous-time system is then transformed
into the corresponding discrete-time system.
Consider an Nth order linear continuous-time system described by a differential equation with
constant coefficients
d N y(t) dy(t) d M x (t) dx (t)
aN + · · · + a 1 + a 0 y ( t ) = b M + · · · + b1 + b0 x (t).
dt N dt dt M dt
The Laplace transform domain equation for this system is
There are several approaches to establish the relation between the continuous-time system in (5.1) and
the discrete-time system in (5.1), represented by their corresponding impulse responses or transfer
functions.
A natural approach to transform a continuous-time system into the corresponding discrete-time system
is based on the relation between the impulse responses of these two systems. Assume that the impulse
response of the continuous-time system is hc (t). The impulse response h(n) of the corresponding
discrete-time system, according to this approach, is equal to the samples of hc (t),
h(n) = hc (n∆t)∆t.
200
Ljubiša Stanković Digital Signal Processing 201
Obviously this relation can be used only if the sampling theorem is satisfied for the sampling interval
Figure 5.1 Sampling of the impulse response for the impulse invariance method.
∆t. It means that the frequency response of the continuous-time system must satisfy the sampling
theorem condition
and ∆t < π/Ωm . Otherwise the discrete-time version will not correspond to the continuous-time
version of the frequency response. Here, the frequency response of the discrete-time system is related
to a periodically extended form of the continuous-time system frequency response H (Ω) as
∞
∑ H (Ω + 2kπ/∆t) = H (e jω ), Ω = ω/∆t.
k=−∞
The transfer function of the continuous-time system in (5.1) may be decomposed using the partial
fractions as
b M s M + · · · + b1 s + b0 k1 k2 kN
H (s) = N
= + + ··· + , (5.3)
a N s + · · · + a1 s + a0 s − s 1 s − s 2 s − sN
where only simple poles, s1 , s2 , . . . , s N , of the transfer function are used. The case of multiple poles
will be discussed later. The inverse Laplace transform of a causal system, described by the previous
transfer function, is
h c ( t ) = k 1 e s1 t u ( t ) + k 2 e s2 t u ( t ) + · · · + k N e s N t u ( t ).
The impulse response of the corresponding discrete-time system is equal to the the samples of hc (t),
h(n) = hc (n∆t)∆t = [k1 ∆tes1 n∆t u(n) + k2 ∆tes2 n∆t u(n) + · · · + k N ∆tes N n∆t u(n)],
since u(n∆t) = u(n). The z-transform of the impulse response h(n) of the discrete-time system is
k1 ∆t k2 ∆t k N ∆t
H (z) = −
+ −
+ ··· + . (5.4)
1−e s 1 ∆t z 1 1−e s 2 ∆t z 1 1 − es N ∆t z−1
Comparing (5.3) to (5.4) it can be concluded that the terms in the transfer functions are transformed
from the continuous-time to the discrete-time case as
ki k i ∆t
→ . (5.5)
s − si 1 − esi ∆t z−1
202 From Continuous to Discrete Systems
If a multiple pole si of the (m + 1)th order exists in the continuous-time system transfer function,
then this term can be written as
ki 1 dm k i
= .
( s − si ) m + 1 m! dsim s − si
The term in the discrete-time system, corresponding to this continuous-time system term, is
1 dm ki 1 dm k i ∆t
→ . (5.6)
m! dsim s − si m! dsim 1 − esi ∆t z−1
s=jΩ
j2π/∆t
jπ/∆t z=ejω
Im{z}
Im{s}
−jπ/∆t
−j2π/∆t
Re{s} Re{z}
si → esi ∆t .
This mapping relation does not hold for zeros, Fig. 5.2.
Impulse response with an initial instant discontinuity. In the case when the continuous-time impulse
response hc (t) has a discontinuity at t = 0, that is, when
hc (t)|t=−0 6= hc (t)|t=+0 ,
then the previous forms assume that the discrete-time impulse response h(0) = hc (t)|t=+0 . Recall that
the theory of Fourier transforms in this case states that the inverse Fourier transform IFT{ H ( jΩ)} =
hc (t) where the signal hc (t) is continuous and
IFT{ H ( jΩ)} = hc (t)|t=−0 + hc (t)|t=+0 /2
at the discontinuity points (in this case at t = 0). The special case of discontinuity at t = 0 can be easily
detected for a causal system by mapping H (s) onto H (z) and by checking is the following relation
satisfied
0 = hc (t)|t=−0 = hc (t)|t=+0 = h(n)|n=0 = lim H (z).
z→∞
If limz→∞ H (z) 6= 0 then a discontinuity exists and we should use
since hc (t)|t=−0 = 0 and hc (t)|t=+0 ∆t = limz→∞ H (z). The resulting frequency response is
What is the corresponding discrete-time system according to the impulse invariance method with
∆t = 1?
with
k1 = H (s)(s + 1)|s=−1 = −1,
1
k2 = H (s)(s + ) = 2.
2 s=−1/2
Thus, we get
−1 2
H (s) = + .
s+1 s + 12
According to (5.5) the discrete-time system is
−1 2
H (z) = − −
+ .
1−e z1 1 1 − e 1/2 z−1
−
Since limz→∞ H (z) = 1, obviously there is a discontinuity in the impulse response and the
resulting transfer function should be corrected as
−1 2
H (z) = + − 1/2.
1 − e −1 z −1 1 − e−1/2 z−1
The impulse response and the frequency response of the discrete-time systems with uncorrected
and corrected discontinuity effect are presented in Fig. 5.3.
(1 − 3s/2)
H (s) = .
(6s2 + 5s + 1)(s + 1)2
What is the corresponding discrete-time system according to the impulse invariance method with
∆t = 1?
204 From Continuous to Discrete Systems
0.5 0.5
0 0
−5 0 5 10 15 −5 0 5 10 15
4 4
|H(ejω)| |H(ejω)|
3 |H(jΩ)| 3 |H(jΩ)|
2 2
1 1
0 0
−2 0 2 −2 0 2
Figure 5.3 Impulse responses of systems in continuous and discrete-time domains (top). Amplitude of the
frequency response of systems in continuous and discrete-time domains (bottom). System without discontinuity
correction (left) and system with discontinuity correction (right).
with
k1 = H (s)(s + 1/2)|s=−1/2 = −7, k2 = 27/8, and k3 = H (s)(s + 1)2 = 5/4.
s=−1
Since h(0) = limz→∞ H (z) = 0 there no need to consider possible impulse response correction
due to discontinuity.
Ljubiša Stanković Digital Signal Processing 205
we can easily see that the poles are mapped according to s pi → es pi ∆t , Fig. 5.4, while there is
no direct correspondence among zeros of the transfer functions. The impulse responses of the
continuous-time system and the discrete-time system are shown in Fig. 5.5.
s=jΩ
z=ejω
Im{z}
Im{s}
1
−1 2/32/3 1.9894
Re{s} Re{z}
Figure 5.4 Pole-zero locations in the s-domain and the z-domain using the impulse invariance method.
The matched z-transform method is based on a discrete-time approximation of the Laplace transform
derived in the previous chapter as the starred transform (4.22)
Z∞ ∞
X (s) = x (t)e−st dt ∼
= ∑ x (n)e−sn∆t = X (z)|z=es∆t .
−∞ n=−∞
This approximation leads to a relation between the Laplace domain and the z-domain in the form of
z = es∆t .
If we use this relation to map all zeros and poles of a continuous-time system transfer function
z0i = es0i ∆t
z pi = es pi ∆t ,
206 From Continuous to Discrete Systems
0.3
hc(t), h(n)
0.2
0.1
−0.1
0 5 10 15 20 25 30 35 40
|H(jΩ)|, |H(ejω)|
1
0.5
0
−3 −2 −1 0 1 2 3
1
10
20log|H(jΩ)|
0 20log|H(ejω)|
10
−1
10
−2
10
−3 −2 −1 0 1 2 3
Figure 5.5 Impulse responses of systems in continuous and discrete-time domains (top). Amplitude of the
frequency response of systems in continuous and discrete-time domains (middle). Amplitude of the frequency
response of systems in continuous and discrete-time domains in logarithmic scale (bottom).
the matched z-transform method of the system follows. The discrete-time system transfer function is
Example 5.3. For the continuous-time system with the transfer function
1−s
H (s) =
8s2 + 6s + 1
find the corresponding discrete-time system according to the matched z-transform method and
∆t = 1?
Ljubiša Stanković Digital Signal Processing 207
s=jΩ
j2π/∆t
jπ/∆t z=ejω
Im{z}
Im{s}
1
−jπ/∆t
−j2π/∆t
Re{s} Re{z}
Figure 5.6 Illustration of the zeros and poles mapping in the matched z−transform method.
z−e
H (z) = k .
8(z − e−1/2 )(z − e−1/4 )
The first-order backward difference is a common method to approximate the first-order derivative of a
continuous-time signal
dx (t)
y(t) =
dt
∼ n∆t) − x ((n − 1)∆t) .
y(n∆t) =
x (
∆t
The Laplace transform domain of the continuous-time first derivative is
In the discrete-time domain, with y(n) = y(n∆t)∆t and x (n) = x (n∆t)∆t, the approximation of this
derivative results in the first-order linear difference equation
x ( n ) − x ( n − 1)
y(n) = .
∆t
208 From Continuous to Discrete Systems
1 − z −1
Y (z) = X ( z ). (5.9)
∆t
Based on (5.8) and (5.9) we can conclude that the mapping of the corresponding differentiation
operators from the continuous-time to the discrete-time domain is
1 − z −1
s= . (5.10)
∆t
With a normalized discretization step ∆t = 1 this mapping is of the form
s = 1 − z −1 .
The same result could be obtained by considering a rectangular rule approximation of a continuous-
time integral, at an instant t = n∆t,
n∆t
Z Z −∆t
n∆t
y(n∆t) = x (t)dt ∼
= x (t)dt + x (n∆t)∆t.
−∞ −∞
y(n∆t) ∼
= y(n∆t − ∆t) + x (n∆t)∆t.
The Laplace and the z-transform domain forms of the previous integral equations are
1
Y (s) = X (s)
s
∆t
Y (z) = X ( z ).
1 − z −1
The same mapping of the z-plane to the s-plane as in (5.10) follows.
Consider the imaginary axis from the s-plane (the Fourier transform line). According to (5.10)
the mapping, with ∆t = 1, is defined by
1 − s → z −1 . (5.11)
Now we will consider the region which corresponds to the imaginary axis and the left semi-plane of
the s-domain (containing poles of a stable system), Fig. 5.7(left). The aim is to find the corresponding
region in the z-domain.
If we start from the s-domain and the region in Fig. 5.7(left), the first mapping is to reverse the
s-domain to −s and shift it for +1, as
1 − s → p.
The corresponding domain, after this mapping, is shown in Fig. 5.7(middle).
The next step is to map the region from p-domain into the z-domain, according to (5.11), as
p → z −1 .
Ljubiša Stanković Digital Signal Processing 209
By denoting Re{z} = x and Im{z} = y we get that the line Re{ p} = 1 in the p−domain,
corresponding to the imaginary axis in the s-plane, is transformed into the z-domain according to
1
Re{ p} = Re{ }
z
1
1 = Re{ }
x + jy
1 x − jy
1 = Re{ }
x + jy x − jy
resulting in
x
1=
x 2 + y2
or in 2
1 1
( x − )2 + y2 = . (5.12)
2 2
Therefore, the imaginary axis in the s-plane is mapped onto a circle defined by (5.12), Fig. 5.7(right) in
the z-plane. From the mapping relation 1 − s → z−1 it is easy to conclude that the origin s = 0 + j0
maps into z = 1 and that s = 0 ± j∞ maps into z = ±0, according to 1/(1 − s) → z.
s=0+jΩ p=1
z=ejω
Im{p}
Im{z}
Im{s}
1−s→ p p→ z−1
Figure 5.7 Illustration of the differentiation based mapping of the left s−semi-plane with the imaginary axis (left),
translated and reversed p−domain (middle), and the z−domain (right).
Mapping of the imaginary axis into z-domain can also be analyzed from
with
r −1 tan ω
Ω= sin ω = .
∆t ∆t
Obviously, ω = 0 maps to Ω = 0 (with Ω ∼ = ω/∆t for small ω), and ω = ±π/2 maps into Ω → ±∞.
Thus, the whole imaginary axis maps onto −π/2 ≤ ω ≤ π/2. These values of ω could be used within
the basic
p period. Relation (5.13),pwith −π/2 ≤ ω ≤ π/2, is a circle defined by (5.12) if we replace
r = x2 + y2 and cos ω = x/ x2 + y2 with σ < 0 (semi-plane with negative real values) being
mapped into r < cos ω (interior of unit circle).
210 From Continuous to Discrete Systems
What is the corresponding transfer function of the discrete-time system using the first-order
backward difference approximation with ∆t = 1/2? What is the solution to the differential
equation for x (t) = u(t). Compare it with the solution to difference equation y(n) with ∆t = 1/8.
⋆The discrete-time system transfer function is obtained using s = 1 − z−1 /∆t in H (s) as
1
H (z) = 2
1 − z −1 3 1 − z −1 1
∆t + 4 ∆t + 8
(∆t)2
= 3 1 2
1+ 4 ∆t + 8 (∆t) − [2 + 34 ∆t]z−1 + z−2
with
where ∆t = 1/2. For x (t) = u(t), the continuous-time output signal is obtained from
1
Y (s) = H (s) X (s) =
s(s2 + 34 s + 81 )
8 8 16
= + 1
− 1
s s+ 2 s+ 4
as
y(t) = [8 + 8e−t/2 − 16e−t/4 ]u(t).
The results of the difference equation for y(n) are compared with the exact solution y(t) in Fig.
5.8. The agreement is high. It could be additionally improved by reducing the sampling interval,
for example, to ∆t = 1/8.
Ljubiša Stanković Digital Signal Processing 211
10
y(t), y(n)
0 5 10 15
Figure 5.8 The exact solution to the differential equation for y(t), in solid line, and the discrete-time system
output y(n), in large dots for ∆t = 1/2 and in small dots for ∆t = 1/8.
In the case of a differentiator based mapping, the imaginary axis in the s−domain, corresponding to
the Fourier transform values, has been mapped onto a circle with radius 1/2 and the center at z = 1/2
in the z−domain, as shown in Fig. 5.7. This mapping does not correspond to the Fourier transform of
discrete-time signals position in the z−plane, which is along the circle line |z| = 1. A transformation
that will map the imaginary axis from the s−domain onto the unit circle in the z−domain is presented
next.
Consider numerical integration in the case of the first-order system (for example, the charge on a
capacitor), using the trapezoid rule
n∆t
Z Z −∆t
n∆t
x (n∆t) + x ((n − 1)∆t)
y(n∆t) = x (t)dt ∼
= x (t)dt + ∆t
2
−∞ −∞
x ( n ) + x ( n − 1)
y ( n ) = y ( n − 1) + ∆t.
2
In the Laplace and the z-transform domain, these relations have the following forms
1
Y (s) = X (s)
s
∆t 1 + z−1
Y (z) = X ( z ).
2 1 − z −1
The mapping from the s−domain to the z−domain is defined here by
2 1 − z −1
s→ . (5.14)
∆t 1 + z−1
In complex analysis this mapping is known as the bilinear transform.
Now we can repeat the transformation of the continuous-time system, H (s), from Example 5.1 to
get the discrete-time system, H (z), by replacing s with 2(1 − z−1 )/(1 + z−1 ) in (5.7).
212 From Continuous to Discrete Systems
Within the derivatives framework, the bilinear transform can be understood as the following
derivative approximation. Consider the first-order backward derivative approximation
y ( n ) = x ( n ) − x ( n − 1).
The same signal samples can used for the first-order forward derivative approximation
y ( n − 1) = x ( n ) − x ( n − 1).
If we assume that the difference x (n) − x (n − 1) fits better to the mean of y(n) and y(n − 1) than to
any single one of them, then the derivative approximation using the difference equation
y ( n ) + y ( n − 1)
= x ( n ) − x ( n − 1),
2
produces the bilinear transform.
In order to prove that the unit circle in the z−domain maps onto the imaginary axis in the
s−domain we may simply replace z = e jω into (5.14) and obtain
1 − e− jω e jω/2 − e− jω/2 ω
2 −
= 2 jω/2 = 2j tan( ) → s∆t.
1+e jω e + e− jω/2 2
For s = σ + jΩ, follows
σ=0
2 ω
Ω= tan( ).
∆t 2
Therefore, the unit circle z = e jω maps onto the imaginary axis σ = 0. The frequency points ω = 0
and ω = ±π map into Ω = 0 and Ω → ±∞, respectively.
The linearity of the frequency mapping Ω → ω is lost. It holds for small values of ω only
2 ω ω
Ω= tan( ) ∼
= , for |ω | ≪ 1.
∆t 2 ∆t
From
q
1+ s∆t (1 + σ∆t 2
2 ) + ( Ω∆t
2 )
2
2
z= s∆t
and |z| = q
1− (1 − σ∆t 2
+ ( Ω∆t 2
2 2 ) 2 )
it may easily be concluded that σ < 0 maps into |z| < 1, since 1 + σ∆t σ∆t
2 < 1 − 2 for σ < 0.
The bilinear transform mapping can be derived using a series of complex plane mappings. Since
s∆t
1+ 2 2
z= s∆t
= s∆t
− 1,
1− 2 1− 2
we can write
s∆t 1
1− → p1 , → p2 , and 2p2 − 1 → z.
2 p1
This series of mappings from the s-domain to the z-domain is illustrated in Fig. 5.9, with ∆t = 1. The
2
fact that p11 → p2 maps the line Re{ p1 } = 1 onto the circle ( x − 21 )2 + y2 = 21 in p2 -domain is
proven in the previous section.
Ljubiša Stanković Digital Signal Processing 213
s=0+jΩ p=1
Im{p }
Im{s}
1
1− s p−1→ p
Re{s} 2
→ p1 Re{p1} 1 2
z=ejω
Im{p2}
Im{z}
1 1
Re{p } 2p −1→ z
2 Re{z}
2
Figure 5.9 Bilinear mapping illustration trough a series of elementary complex plane mappings.
Since the bilinear transform introduces a nonlinear transformation of the frequency axis
from the continuous-time domain to the discrete-time domain, Ω = ∆t 2
tan( ω2 ), this nonlinearity
must be compensated during the system design. Usually it is done by pre-modifying the desired
important frequency values Ωc from the analog domain using Ωd = ∆t 2
tan( ω2c ), and ωc = Ωc ∆t.
The new continuous-time domain frequencies Ωd will be returned back to the desired values ωc and
Ωc = ωc /∆t after the bilinear transformation.
and to stop all other possible signal components. The parameters are Q = 0.01, Ω1 = π/4, and
Ω2 = 3π/5. The signal is sampled with ∆t = 1 and the discrete-time signal x (n) is formed. Using
214 From Continuous to Discrete Systems
the bilinear transform, design the discrete-time system that corresponds to the continuous-time
system with the transfer function H (s).
1 − z −1
s→2 (5.15)
1 + z −1
and map H (s) to HB (z) without any pre-modification. The result is presented in the first two
subplots of Fig. 5.10. The discrete frequencies are shifted since the bilinear transform (5.15) made
a nonlinear frequency mapping from the continuous-time to discrete-time domain, according to
ω
Ω = 2 tan( ).
2
Thus, obviously, the system HB (z) is not a system that will filter the corresponding frequencies
in x (n) in the same way as H (s) filters x (t).
In order to correct the shift introduced by the bilinear transform mapping, the continuous-
time system should be pre-modified as
2QΩ1d 2QΩ2d
Hd (s) = + 2
s2 + 2Ω1d Qs + Ω21d + Q2 s + 2Ω2d Qs + Ω22d + Q2
with
2 Ω ∆t
Ω1d = tan( 1 ) = 0.8284 = 0.2637π
∆t 2
2 Ω2 ∆t
Ω2d = tan( ) = 2.7528 = 0.8762π.
∆t 2
We see that the shift of Ω1 = 0.25π to Ω1d = 0.2637π is small since the bilinear transform
frequency mapping for small frequency values is almost linear. However, for Ω2 = 0.6π, the
shift to Ω2d = 0.8762π is significant due to a high nonlinearity of mapping in that region. The
modified system Hd (s) is presented in subplot 3 of Fig. 5.10. Next, using the bilinear transform
z −1
mapping s → 2 11−+ z −1
the modified frequencies will map to the desired ones ω1 = Ω1 ∆t and
ω2 = Ω2 ∆t. The obtained discrete-time system transfer function is of the form
2QΩ1d 2QΩ2d
H (z) = 2 + 2 .
1 − z −1 1 − z −1 1 − z −1 −1
2 1 + z −1
+ 4Ω1d Q 1 + z −1
+ Ω21d + Q2 2 1 + z −1
+ 4Ω2d Q 11− z
+ z −1
+ Ω22d + Q2
When the expression for H (z) is appropriately rearranged, its final form is given by
The frequency response of this system is shown in panel 4 of Fig. 5.10. This is the desired
discrete-time system corresponding to the continuous-time system in panel 1 of this figure. In
calculations the coefficients are rounded to four decimal places.
0.5
0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1
1
0.5
0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1
0.5
0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1
1
0.5
0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1
Figure 5.10 Amplitude of the continuous-time system with the transfer function H (s) and the amplitude of
the transfer function HB (z) of the discrete-time system obtained by the bilinear transform (first two panels). A
premodified system to take into account the nonlinearity of the frequency mapping in the bilinear transform, Hd (s),
and the amplitude of the transfer function H (z) of the discrete-time system obtained by the bilinear transform of
Hd (s) (last two panels).
Comparison of the mapping methods presented in this section is summarized in the next table.
Sampling
Fourier transform
Method theorem
H (s)|s= jΩ → H (z)|z=e jω
condition
Impulse Invariance Yes, Ω = ω/∆t Yes
Matched z-transform No No
First-oder difference No No
tan(ω/2)
Bilinear transform Yes, Ω = ∆t/2 No
216 From Continuous to Discrete Systems
The digital filter design will be explained here. The lowpass filter is assumed as the basic filter form,
while the other filters (highpass and bandpass) are designed by modifying the system that corresponds
to the discrete-time lowpass filter. In the examples, the lowpass Butherworth filters will be used.
is noncausal.
There are several methods to approximate the ideal lowpass filter frequency response. One of
them is the Butterworth approximation. Some of commonly used approximations are Chebyshev and
elliptic forms as well.
A lowpass filter of the Butterworth type is shown in Fig. 5.11, along with the ideal one.
Figure 5.11 Lowpass filter frequency response: ideal case (left) and Butterworth type (right).
Example 5.6. Implement the Butterworth discrete-time filter of the order N = 4 with a critical
frequency corresponding to the continuous-time domain filter with the critical frequency
f c = 4[kHz] and the sampling interval ∆t = 31.25[µ sec], using:
(a) The impulse invariance method and
(b) the bilinear transform.
The poles of the fourth-order Butterworth filter in the continuous-time domain (Chapter I,
Subsection 1.6) are
h π π π π i
s0 = Ωc cos( + ) + j sin( + ) = Ωc (−0.3827 + j0.9239)
2 8 2 8
π 3π π 3π
s1 = Ωc cos( + ) + j sin( + ) = Ωc (−0.9239 + j0.3827)
2 8 2 8
π 5π π 5π
s2 = Ωc cos( + ) + j sin( + ) = Ωc (−0.9239 − j0.3827)
2 8 2 8
π 7π π 7π
s3 = Ωc cos( + ) + j sin( + ) = Ωc (−0.3827 − j0.9239).
2 8 2 8
The transfer function of the filter in the Laplace domain is
Ω4c
H (s) = 2
. (5.16)
( s2 + 0.7654Ωc s + Ωc )(s2 + 1.8478Ωc s + Ω2c )
(a) For the impulse invariance method the transfer function (5.16) should be expanded into partial
fractions,
k0 k1 k2 k3
H (s) = + + + ,
s − s0 s − s1 s − s2 s − s3
with the constants k i calculated based on k i = H (s)(s − si )|s=si as
k0 = (−0.3628 + j0.1503)/∆t,
k1 = (0.3628 − j0.8758)/∆t,
k2 = (0.3628 + j0.8758)/∆t,
k3 = (−0.3628 − j0.1503)/∆t.
Using the impulse invariance method we get the transfer function of the Butterworth filter
k0 ∆t k1 ∆t k2 ∆t k3 ∆t
H (z) = + + +
1 − es0 ∆t z−1 1 − es1 ∆t z−1 1 − es2 ∆t z−1 1 − es3 ∆t z−1
−0.3628 + j0.1503 0.3628 − j0.8758
= +
1 − eωc (−0.3827+ j0.9239) z−1 1 − eωc (−0.9239+ j0.3827) z−1
0.3628 + j0.8758 −0.3628 − j0.1503
+ + .
1 − eωc (−0.9239− j0.3827) z−1 1 − eωc (−0.3827− j0.9239) z−1
It can be seen that the discrete-time filter is a function of ωc . Thus, for a given continuous domain
frequencies and the sampling interval ∆t, it is possible to calculate the corresponding discrete-time
frequency ωc = Ωc ∆t and to use this frequency in the filter design with the normalized ∆t = 1.
Using the value of the critical frequency, ωc = π/4, we get
−0.3628 + j0.1503 −0.3628 − j0.1503
H (z) = +
1 − (0.5539 + j0.4913)z−1 1 − (0.5539 − j0.4913)z−1
0.3628 − j0.8758 0.3628 + j0.8758
+ + .
1 − (0.4623 + j0.1433)z−1 1 − (0.4623 − j0.1433)z−1
A system form with the real-valued coefficients is obtained by grouping the complex-conjugate
terms,
(b) For the bilinear transform, the critical frequency ωc has to be pre-modified according to
2 ωc 0.8284
Ωd = tan( ) = .
∆t 2 ∆t
Then, the frequency Ωd is used for the design in (5.16), instead of Ωc . The frequency Ωd will be
transformed back to Ωc = ωc /∆t, after the bilinear transform is used. Using the substitutions
2 1 − z −1
s → ∆t 1 + z −1
and
ωd = Ωd ∆t = 0.8284
in (5.16), the filter transfer function follows as
ωd 4
H (z) =
z −1 2 − z −1 −1 −1
[4( 11−
+ z −1
) + 2ωd 0.7654 11+ z −1
+ ωd2 ][4( 11− z
+ z −1
−z + ω 2 ]
)2 + 2ωd 1.8478 11+ z −1 d
0.4710
= −1 −1 −1
−z )2 + 1.2626 1−z + 0.6863][4( 1−z )2 + 3.0481 1−z + 0.6863] −1
[4( 11+ z −1 1 + z −1 1 + z −1 1 + z −1
− 1
4
0.4710 1 + z
=
3.4237z−2 − 6.6274z−1 + 5.9484 1.6382z−2 − 6.6274z−1 + 7.7704
4
0.084 1 + z−1
= −2
z − 1.9357z−1 + 1.7343 z−2 − 4.0455z−1 + 4.7433
0.084z−4 + 0.336z−3 + 0.504z−2 + 0.336z−1 + 0.084
= .
z−4 − 5.9810z−3 + 14.3z−2 − 16.1977z−1 + 8.2263
The transfer function (amplitude and phase) of the continuous-time filter and the discrete-
time filters, obtained using the impulse invariance method and the bilinear transform, are presented
in Fig. 5.12, within one period in frequency. The agreement between the amplitude and the phase
functions is high. The difference equation describing this Butterworth filter is
1.5
0.5
0
/2
Figure 5.12 Amplitude and phase of the fourth-order Butterworth filter frequency response, obtained using the
impulse invariance method and bilinear transform.
In calculations, the coefficients are normalized by 8.2263 and rounded to four decimal places.
Rounding may cause small quantization errors (that will be discussed within the next chapter).
Ljubiša Stanković Digital Signal Processing 219
⋆The maximum attenuation in the passband and the minimum attenuation in the stopband are defined by a_p = 20 log(A_p) and a_s = 20 log(A_s), with
A_p = 10^{a_p/20} = 0.7943
A_s = 10^{a_s/20} = 0.1679.
The relations for the filter order N and the critical frequency Ω_c are (Chapter I, Subsection 1.6)
1/(1 + (Ω_p/Ω_c)^{2N}) ≥ A_p^2,  1/(1 + (Ω_s/Ω_c)^{2N}) ≤ A_s^2. (5.17)
Using the equality in both of these relations, the value of N follows from
N = ln[(1/A_p^2 − 1)/(1/A_s^2 − 1)] / (2 ln(Ω_p/Ω_s)) = 2.9407. (5.18)
The adopted filter order is N = 3.
We can use any of the relations in (5.17) with the equality sign in order to calculate Ω_c. For the first relation, the value of Ω_c will be such that |H(jΩ_p)|^2 = A_p^2 is satisfied. Then,
Ω_c = Ω_p / (1/A_p^2 − 1)^{1/(2N)} = 2π × 3.2805 kHz,
ω_c = Ω_c ∆t = 1.0306.
The designed continuous-time filter is
H_d(s) = (2π 3.5470 × 10^3)^3 / [(s + 2π 3.5470 × 10^3)(s^2 + 2π 3.5470 × 10^3 s + (2π 3.5470 × 10^3)^2)].
The discrete-time Butterworth filter transfer function H(z) follows when the substitution
s = (2/∆t)(1 − z^{−1})/(1 + z^{−1})
is performed. This filter is of the form
H(z) = 1.1143^3 / {[2(1 − z^{−1})/(1 + z^{−1}) + 1.1143][(2(1 − z^{−1})/(1 + z^{−1}))^2 + 1.1143 · 2(1 − z^{−1})/(1 + z^{−1}) + 1.1143^2]}
= 0.0595(1 + z^{−1})^3 / (1 − 1.0229z^{−1} + 0.6133z^{−2} − 0.1147z^{−3}).
The corresponding difference equation of this filter is
y(n) = 1.0229y(n − 1) − 0.6133y(n − 2) + 0.1147y(n − 3) + 0.0595[x(n) + 3x(n − 1) + 3x(n − 2) + x(n − 3)].
In addition to the last two components that have frequencies corresponding to the analog signal, there is the first component
2π[δ(ω − 11π/6 + 12π/6) + δ(ω + 11π/6 − 12π/6)] = 2π[δ(ω + π/6) + δ(ω − π/6)]
that corresponds to
x_1(n) = 2 cos(πn/6).
The lowpass filter output is
y(n) = 2 cos(πn/6) + sin(πn/4).
It corresponds to the continuous-time signal
y(t) = 8 cos(πt/6) + 4 sin(πt).
One component, at the frequency ω = 2π/3 > π/3, is filtered out. The component at ω = π/4 is unchanged. One more component has appeared at the frequency ω = π/6, due to the periodic extension of the Fourier transform of the discrete-time signal.
In general, a signal component x(t) = exp(jΩ_0 t), Ω_0 > 0, with a sampling interval ∆t such that
Kπ ≤ Ω_0 ∆t < (K + 1)π
will, after sampling, result in a component within the basic period of the Fourier transform of the discrete-time signal, corresponding to the continuous signal exp(j(Ω_0 t − (K/∆t)πt)). This effect is known as aliasing. The most obvious visual effect is when a wheel rotating with f_0 = 25 [Hz], Ω_0 = 50π, is sampled in a video sequence at ∆t = 1/50 [sec]. Then Ω_0 ∆t = π corresponds to exp(j(Ω_0 t − 50πt)) = e^{j0}, that is, the wheel looks as if it were a static (nonmoving) object.
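The wheel example can be checked with a few lines of code (a sketch assuming numpy; the numbers are those from the text):

    import numpy as np

    f0, dt = 25.0, 1 / 50             # Omega_0 * dt = 2*pi*25/50 = pi
    n = np.arange(8)
    samples = np.exp(1j * 2 * np.pi * f0 * n * dt)
    print(np.round(samples.real, 3))  # (-1)^n: the wheel advances half a turn
                                      # per frame, so a symmetric wheel appears static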
Highpass filters can be obtained by transforming the corresponding continuous-time filters into the discrete-time domain. For example, if a lowpass filter H(s), with cutoff frequency Ω_c, is transformed using H_H(s) = H(1/s), then the resulting filter H_H(s) is of the highpass type, with the cutoff frequency 1/Ω_c.
In the discrete-time domain, a highpass filter frequency response, H_H(e^{jω}), is obtained by shifting the corresponding lowpass filter response, H(e^{jω}), by π in frequency, Fig. 5.13, that is,
H_H(e^{jω}) = H(e^{j(ω−π)}).
Figure 5.13 Ideal highpass filter, H_H(e^{jω}), as a shifted version of the ideal lowpass filter, H(e^{jω}).
Thus, if we have an impulse response h(n) of a lowpass filter, the corresponding highpass filter impulse response, h_H(n), is obtained by multiplying the impulse response values h(n) by (−1)^n. The output of the highpass filter to any input signal x(n) is given by
y(n) = x(n) ∗_n h_H(n) = Σ_{m=−∞}^{∞} x(m)(−1)^{n−m} h(n − m)
= (−1)^n Σ_{m=−∞}^{∞} (−1)^m x(m)h(n − m) = (−1)^n [((−1)^n x(n)) ∗_n h(n)]. (5.19)
Figure 5.14 Highpass filter realization using the corresponding lowpass filter.
This relation means that the highpass filter can be implemented using the corresponding lowpass filter, within the scheme shown in Fig. 5.14, as illustrated by the sketch below.
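A minimal sketch of this scheme, assuming scipy (the lowpass h(n) below is an arbitrary example, not a filter from the text):

    import numpy as np
    from scipy import signal

    h = signal.firwin(31, 0.25)           # an example lowpass impulse response h(n)
    x = np.random.randn(256)              # any input signal
    n = np.arange(len(x))
    # Fig. 5.14: modulate by (-1)^n, filter with h(n), modulate by (-1)^n
    y = (-1.0) ** n * signal.lfilter(h, 1, (-1.0) ** n * x)
    hH = (-1.0) ** np.arange(len(h)) * h  # direct highpass h_H(n) = (-1)^n h(n)
    print(np.allclose(y, signal.lfilter(hH, 1, x)))  # True, confirming (5.19)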
Figure 5.15 Amplitude of the frequency response of a lowpass Butterworth filter (left) and the filter obtained from the lowpass Butterworth filter when z is replaced by −z (right).
⋆The impulse response is obtained by changing the sign of every other sample in h(n). In the z-transform definition, that means using (−z)^{−n} instead of z^{−n}. The resulting highpass transfer function is
H_H(z) = 0.1236(1 − z^{−1})^4 / [(z^{−2} + 1.9389z^{−1} + 1.7420)(z^{−2} + 4.0790z^{−1} + 4.7686)].
A bandpass filter is obtained from the corresponding lowpass filter by shifting its frequency response by ω_0 and −ω_0, as shown in Fig. 5.16. The frequency response of the bandpass filter is
H_B(e^{jω}) = H(e^{j(ω−ω_0)}) + H(e^{j(ω+ω_0)}),
with the corresponding impulse response h_B(n) = 2h(n) cos(ω_0 n). The last relation indicates that we may write the output of a bandpass filter as a function of the lowpass impulse response. This relation leads to the realization of the bandpass filter using the corresponding lowpass filter and signal modulation, as shown in Fig. 5.17.
Figure 5.17 Bandpass system realization using the corresponding lowpass filter and signal modulation.
A system (filter) with unit (constant) amplitude of the frequency response (an allpass system) is defined by
H_A(z) = (z^{−1} − ae^{−jθ})/(1 − ae^{jθ}z^{−1}),
where 0 < a < 1 is real-valued and θ is an arbitrary phase. For this system
|H_A(e^{jω})| = 1.
Consider the system H(z) = (z + 2)/[(z − 1/2)(z − 1/3)(z − 2)]. This system cannot be causal and stable since there is a pole at z = 2. Define an allpass system to be connected to H(z) in cascade such that the resulting system is causal and stable, with the same amplitude of the frequency response as H(z).
⋆When an allpass system, H_A(z), is added in cascade with the given system, H(z), the overall system transfer function, H_s(z), is of the form
H_s(z) = H(z)H_A(z) = (z + 2)/[(z − 1/2)(z − 1/3)(z − 2)] · (z − (1/a)e^{jθ})/(1 − (1/a)e^{−jθ}z) · e^{−j2θ}.
The values of a and θ are chosen in such a way that the undesirable pole at z = 2 is canceled out, that is, a = 1/2 and θ = 0. With these values of a and θ we get
H_s(z) = (z + 2)/[(z − 1/2)(z − 1/3)(z − 2)] · (z − 2)/(1 − 2z) = −(z + 2)/[2(z − 1/2)^2(z − 1/3)].
This system has the same amplitude of the frequency response as the initial system H(z), since
|H_s(e^{jω})| = |H(e^{jω})H_A(e^{jω})| = |H(e^{jω})| |H_A(e^{jω})| = |H(e^{jω})|.
This system can be used for multiple poles cancellation and phase correction.
The inverse system to a given system H(z) is defined by
H_i(z) = 1/H(z).
It is obvious that
H(z)H_i(z) = 1,
h(n) ∗ h_i(n) = δ(n).
The inverse system can be used to reverse the signal distortion. For example, assume that the Fourier transform of a signal x(n) is distorted during the transmission by a transfer function H(z), that is, the received signal z-transform is R(z) = H(z)X(z). In this case, the distortion can be compensated by processing the received signal using the inverse system. The output signal is obtained as
Y(z) = R(z)/H(z) = X(z).
The system H_i(z) = 1/H(z) should be stable as well. It means that the poles of the inverse system should be within the unit circle. The poles of the inverse system are equal to the zeros of H(z). A system whose poles and zeros are both within the unit circle is called a minimum phase system.
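The minimum phase property is straightforward to test numerically. A small helper, assuming numpy (the function below is an illustration, not a routine from the text):

    import numpy as np

    def is_minimum_phase(num, den):
        # num, den: polynomial coefficients in z, descending powers
        return (np.all(np.abs(np.roots(num)) < 1)
                and np.all(np.abs(np.roots(den)) < 1))

    # H2(z) = (z - 1/4)(z - 3/4) / ((z + 1/4)(z + 3/4)) from the example below
    print(is_minimum_phase(np.poly([1/4, 3/4]), np.poly([-1/4, -3/4])))  # True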
(a) Consider the systems
H_1(z) = (z − 1/4)(z + 5/4) / [(z + 1/4)(z + 3/4)],
H_2(z) = (z − 1/4)(z − 3/4) / [(z + 1/4)(z + 3/4)].
The first system is causal and stable for the region of convergence |z| > 3/4. However, one of its zeros is at |z| = 5/4 > 1 and the system is not a minimum phase system, since its causal inverse form is not stable. The second system is causal and stable. The same holds for its inverse, since all poles of the inverse system are within |z| < 1. Thus, the system H_2(z) is a minimum phase system.
(b) In this case
R(z) = (z^2 + z − 5/16)/(z^2 + z + 3/16) X(z) = [(z − 1/4)(z + 5/4)] / [(z + 1/4)(z + 3/4)] X(z).
An inverse system to H_1(z) cannot be used since it would not be stable. However, the inverse system can be stabilized with an allpass system H_A(z), so that the amplitude is not changed,
Y(z) = R(z) (1/H_1(z)) H_A(z) = H_1(z)X(z) (1/H_1(z)) H_A(z),
where
H_A(z) = (z + 5/4)/(1 + (5/4)z)
and
H_D(z) = (1/H_1(z)) H_A(z) = [(z + 1/4)(z + 3/4)]/[(z − 1/4)(z + 5/4)] · (z + 5/4)/(1 + (5/4)z) = [(z + 1/4)(z + 3/4)] / [(z − 1/4)(1 + (5/4)z)].
This system is stable and causal and will produce |Y(e^{jω})| = |X(e^{jω})|.
If a system is the minimum phase system (with all poles and zeros within |z| < 1), then this system has the minimum group delay out of all systems with the same amplitude of the frequency response. Thus, any nonminimum phase system will have a more negative phase compared to the minimum phase system. The negative part of the phase is called the phase-lag function. The name minimum phase system comes from the minimum phase-lag function.
In order to prove this statement, consider a nonminimum phase system H(z) with the same amplitude of the frequency response as a minimum phase system H_min(z). Its frequency response can, therefore, be written as
H(z) = H_min(z)H_A(z) = H_min(z) (z^{−1} − ae^{−jθ})/(1 − ae^{jθ}z^{−1}).
Here, we assumed the first-order allpass system, without any loss of generality, since the same proof can be used for any number of allpass systems that multiply H_min(z). Since 0 < a < 1 and the system H_min(z) is stable, the system H(z) has a zero at |z| = 1/a > 1.
The phases of the systems in the previous equation are related as
arg{H_A(e^{jω})} = arg{(e^{−jω} − ae^{−jθ})/(1 − ae^{jθ}e^{−jω})} = arg{e^{−jω}} + arg{(1 − ae^{−jθ}e^{jω})/(1 − ae^{jθ}e^{−jω})}
= −ω + arg{1 − ae^{−jθ}e^{jω}} − arg{1 − ae^{jθ}e^{−jω}} = −ω − 2 arctan[a sin(ω − θ)/(1 − a cos(ω − θ))].
The group delays are therefore related as
τ_g(ω) = τ_g min(ω) + τ_gA(ω)
and, since the group delay of the allpass part is nonnegative,
τ_g(ω) ≥ τ_g min(ω),
with τ_g(ω) and τ_g min(ω) being the phase derivatives (group delays) of the systems H(z) and H_min(z), respectively.
The phase behavior of the allpass system is
arg{H_A(e^{j0})} = arg{(1 − ae^{−jθ})/(1 − ae^{jθ})} = 0 (5.20)
arg{H_A(e^{jω})} = −∫_0^ω τ_gA(ω)dω ≤ 0 (5.21)
since τ_gA(ω) > 0 for 0 ≤ ω < π.
We can conclude that the minimum phase systems satisfy the following conditions.
1. A minimum phase system is the system of minimum group delay out of the systems with the same amplitude of the frequency response. A system containing one or more allpass parts with uncompensated zeros outside of the unit circle will have a larger delay than the system which does not contain zeros outside the unit circle.
2. The phase of any system with the same amplitude of the frequency response is lower than the phase of the minimum phase system since, according to (5.21),
arg{H(e^{jω})} = arg{H_min(e^{jω})} + arg{H_A(e^{jω})} ≤ arg{H_min(e^{jω})}.
This proves the fact that the phase of any system arg{H(e^{jω})} is always lower than the phase of the minimum phase system arg{H_min(e^{jω})} having the same amplitude of the frequency response.
3. Since the group delay is minimum, we can conclude that
Σ_{m=0}^{n} |h_min(m)|^2 ≥ Σ_{m=0}^{n} |h(m)|^2.
This relation may be proven in a similar way to the minimum phase property, by considering the outputs of a minimum phase system and a system H(z) = H_min(z)H_A(z).
Example 5.12. A system has the squared amplitude of the frequency response equal to
|H(e^{jω})|^2 = (2 cos(ω) + 5/2)^2 / [(12 cos(ω) + 13)(24 cos(ω) + 25)].
Find the corresponding minimum phase system.
⋆In the z-domain, the system with this amplitude of the frequency response (with real-valued coefficients) satisfies
H(z)H^*(1/z^*)|_{z=e^{jω}} = H(z)H(1/z)|_{z=e^{jω}} = |H(e^{jω})|^2 = H(e^{jω})H(e^{−jω}).
In this sense,
|H(e^{jω})|^2 = (e^{jω} + e^{−jω} + 5/2)^2 / [(6e^{jω} + 6e^{−jω} + 13)(12e^{jω} + 12e^{−jω} + 25)]
and
H(z)H(1/z) = (z + 5/2 + z^{−1})^2 / [(6z + 13 + 6z^{−1})(12z + 25 + 12z^{−1})]
= (z^2 + (5/2)z + 1)^2 / [(6z^2 + 13z + 6)(12z^2 + 25z + 12)]
= (1/72) (z + 2)^2 (z + 1/2)^2 / [(z + 2/3)(z + 3/2)(z + 3/4)(z + 4/3)]
= (2/72) (1/z + 1/2)^2 (z + 1/2)^2 / [(z + 2/3)(1/z + 2/3)(z + 3/4)(1/z + 3/4)].
The minimum phase system, with the desired amplitude of the frequency response, is the part of H(z)H^*(1/z^*) = H(z)H(1/z) with the zeros and poles inside the unit circle,
H(z) = (√2/12) (z + 1/2)^2 / [(z + 2/3)(z + 3/4)].
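The spectral factorization in this example can be mimicked numerically, assuming numpy: factor the numerator and denominator of H(z)H(1/z) and keep the roots inside the unit circle.

    import numpy as np

    zeros = np.roots([1, 5/2, 1])                    # z^2 + (5/2)z + 1: -2 and -1/2
    poles = np.concatenate([np.roots([6, 13, 6]),    # -3/2 and -2/3
                            np.roots([12, 25, 12])]) # -4/3 and -3/4
    print(np.sort(zeros[np.abs(zeros) < 1]))         # -1/2 (the squared numerator doubles it)
    print(np.sort(poles[np.abs(poles) < 1]))         # -2/3 and -3/4, as in H(z) above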
5.6 PROBLEMS
Problem 5.1. A continuous-time system is given with R/L = 8 and 1/(LC) = 25. Find the difference equation describing the corresponding discrete-time system obtained by the impulse invariance method. What is the impulse response of the discrete-time system? Use the sampling interval ∆t = 1.
Problem 5.2. Could the method of impulse invariance be used to map the system
H(s) = (s^2 − 3s + 3)/(s^2 + 3s + 3)
to the discrete-time domain? What is the corresponding discrete-time system obtained by the bilinear transform with ∆t = 1?
Problem 5.3. A continuous-time system is described by the differential equation
y''(t) + (3/2)y'(t) + (1/2)y(t) = x(t)
with zero initial conditions. What is the corresponding transfer function of the discrete-time system using the first-order backward difference approximation with ∆t = 1/10? Write the difference equation of the system whose output approximates the output of the continuous-time system.
Problem 5.4. The transfer function of a continuous-time system is
H(s) = −2s/(s^2 + 2s + 2).
What is the corresponding discrete-time system using the impulse invariance method and the bilinear transform for the system mapping with the sampling interval ∆t = 1?
Problem 5.5. A continuous-time system is described by the transfer function of the form
H(s) = (1 + 4s) / [(s + 1/2)(s + 1)^3].
What is the corresponding discrete-time system according to:
(a) the impulse invariance method,
(b) the bilinear transform,
(c) the matched z-transform?
Use ∆t = 1.
Problem 5.6. The continuous-time system
H(s) = 2QΩ_1 / (s^2 + 2Ω_1 Qs + Ω_1^2 + Q^2)
should be transformed into a discrete-time system using the bilinear transform with ∆t = 1, preserving the critical frequency Ω_1 = 0.5π.
Problem 5.7. (a) By using the bilinear transform find the transfer function of the second-order
Butterworth filter with f ac = 4kHz. The sampling interval is ∆t = 50µ sec.
(b) Translate the discrete-time transfer function to obtain a highpass filter. Find its corresponding
critical frequency in the continuous-time domain.
Problem 5.8. Design a discrete-time lowpass Butterworth filter for the sampling frequency 1/∆t = 10
kHz. The passband should be from 0 to 1 kHz, the maximum attenuation in the passband should be 3
dB (a p ≥ −3dB) and the attenuation should be more than 10 dB (as < −10dB) for frequencies above
2 kHz.
Problem 5.9. Using the impulse invariance method design a Butterworth filter with the passband
frequency ω p = 0.1π and the stopband frequency ωs = 0.3π in the discrete-time domain. The
maximum attenuation in the passband region should be less than 2dB, and the minimum attenuation in
the stopband should be 20dB.
Problem 5.10. A highpass filter can be obtained from a lowpass filter using H_H(s) = H(1/s). With the bilinear transform with ∆t = 2, we can transform a continuous-time domain function into the discrete domain using the relation s = (z − 1)/(z + 1). If we have a design of a lowpass filter, how should its coefficients be changed in order to get a highpass filter?
Problem 5.11. For filtering of a continuous-time signal, a discrete-time filter is used. Find the
corresponding continuous-time filter frequencies if the discrete-time filter is: a) lowpass with
ω p = 0.15π, b) bandpass within 0.2π ≤ ω ≤ 0.25π, c) highpass with ω p = 0.35. Consider the
cases when ∆t = 0.001s and ∆t = 0.1s.
What should be the frequencies to design these systems in the continuous-time domain if the
impulse invariance method is used and what are the design frequencies if the bilinear transform is used?
Problem 5.12. The transfer function of a first-order lowpass system is
H(z) = (1 − α)/(1 − αz^{−1}).
Find the corresponding bandpass system transfer function with frequency shifts for ±ω_c.
Problem 5.13. Using an appropriate allpass system, find the stable systems with the same amplitude of the frequency response as the systems:
(a)
H_1(z) = (2 − 3z^{−1} + 2z^{−2}) / (1 − 4z^{−1} + 4z^{−2}),
(b)
H_2(z) = z / [(4 − z)(1/3 − z)].
Problem 5.14. The z-transform
5.7 EXERCISE
Exercise 5.1. The transfer function of a continuous-time system is
H(s) = (s + 2)/(4s^2 + s + 1).
What is the corresponding discrete-time system obtained with ∆t = 1 by using the impulse invariance method and the bilinear transform?
x (t) = A cos(Ω0 t + ϕ)
Exercise 5.4. (a) By using the bilinear transform find the transfer function of the third-order
Butterworth filter with the cutoff frequency f c = 3.4 kHz. The sampling step is ∆t = 40 µ sec.
(b) Translate the discrete transfer function to obtain a bandpass system with the corresponding
central frequency f 0 = 12.5 kHz in the continuous-time domain.
Exercise 5.6. Using an allpass system, find a stable and causal system with the same amplitude of the frequency response as the systems:
H_1(z) = (2 − 5z^{−1} + 2z^{−2}) / (1 − 4z^{−1} + z^{−2}),
H_2(z) = z^{−1} / [(2 − z)(1/4 − z)].
Exercise 5.7. The z-transform
R(z) = [(z − 1/3)(z^{−1} − 1/3)] / [(z + 1/2)(z^{−1} + 1/2)]
can be written as
R(z) = H(z)H^*(1/z^*).
Find H(z) for the minimum phase system. If h(n) is the impulse response of H(z) and h_1(n) is the impulse response of
H_1(z) = H(z) (z^{−1} − a_1 e^{−jθ_1})/(1 − a_1 e^{jθ_1} z^{−1}),
show that |h(0)| ≤ |h_1(0)| for any θ_1 and |a_1| < 1. All systems are causal.
Exercise 5.8. A signal x(n) has passed through a medium whose influence can be described by the transfer function
H(z) = (1 − z/3)(1 − 5z)(z^2 − z + 4/3) / (z^2 − 2/3)
and the signal r(n) = x(n) ∗ h(n) is obtained. Find a causal and stable system to process r(n) in order to obtain |Y(e^{jω})| = |X(e^{jω})|.
5.8 SOLUTIONS
Solution 5.1. Using the impulse invariance method, the transfer function of the discrete-time system is
H(z) = (j25/6)/(1 − e^{−(4+j3)}z^{−1}) − (j25/6)/(1 − e^{−(4−j3)}z^{−1})
= (25/3) e^{−4} z^{−1} sin(3) / (1 − 2e^{−4} cos(3) z^{−1} + e^{−8}z^{−2}),
with the corresponding difference equation
y(n) = (25/3) e^{−4} sin(3) x(n − 1) + 2e^{−4} cos(3) y(n − 1) − e^{−8} y(n − 2).
The output signal values can be calculated for any input signal using this difference equation. For x(n) = δ(n) the impulse response follows. The impulse response can be obtained in a closed form from
H(z) = j(25/6) Σ_{n=0}^{∞} e^{−(4+j3)n} z^{−n} − j(25/6) Σ_{n=0}^{∞} e^{−(4−j3)n} z^{−n}
as
h(n) = (25/6) e^{−4n} (je^{−j3n} − je^{j3n}) u(n) = (25/3) e^{−4n} sin(3n) u(n).
There is no correction term since lim_{z→∞} H(z) = 0.
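A short numerical check of this solution, assuming numpy: the recursion and the closed-form impulse response agree.

    import numpy as np

    N = 20
    h = np.zeros(N)
    x = np.zeros(N); x[0] = 1.0                      # x(n) = delta(n)
    for n in range(N):
        h[n] = ((25/3) * np.exp(-4) * np.sin(3) * (x[n-1] if n >= 1 else 0)
                + 2 * np.exp(-4) * np.cos(3) * (h[n-1] if n >= 1 else 0)
                - np.exp(-8) * (h[n-2] if n >= 2 else 0))
    n = np.arange(N)
    print(np.allclose(h, (25/3) * np.exp(-4*n) * np.sin(3*n)))   # True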
Solution 5.2. The system is not of lowpass type. For s → ∞ we get H(s) → 1. Thus, the impulse invariance method cannot be used. The bilinear transform can be used. It produces
H(z) = [4(1 − z^{−1})^2/(1 + z^{−1})^2 − 6(1 − z^{−1})/(1 + z^{−1}) + 3] / [4(1 − z^{−1})^2/(1 + z^{−1})^2 + 6(1 − z^{−1})/(1 + z^{−1}) + 3] = (13z^{−2} − 2z^{−1} + 1)/(z^{−2} − 2z^{−1} + 13).
Solution 5.3. For the system described by
y''(t) + (3/2)y'(t) + (1/2)y(t) = x(t),
the transfer function is
H(s) = 1/(s^2 + (3/2)s + 1/2).
The transfer function of the corresponding discrete-time system follows, with the backward difference approximation
s → (1 − z^{−1})/∆t = 10(1 − z^{−1}),
as
H(z) = 1/[100(1 − z^{−1})^2 + 15(1 − z^{−1}) + 1/2] = 1/(100z^{−2} − 215z^{−1} + 231/2).
Solution 5.4. The transfer function is expanded into partial fractions as
H(s) = −(1 + j)/(s + 1 − j) − (1 − j)/(s + 1 + j).
Using the bilinear transform, the transfer function of the discrete-time system follows as
H(z) = −2(1 − z^{−2})/(5 − 2z^{−1} + z^{−2}).
Solution 5.5. The transfer function
H(s) = (1 + 4s)/[(s + 1/2)(s + 1)^3]
is expanded into partial fractions, appropriate for the impulse invariance method, as
H(s) = k_1/(s + 1/2) + k_2/(s + 1) + k_3/(s + 1)^2 + k_4/(s + 1)^3,
with k_1 = H(s)(s + 1/2)|_{s=−1/2} = −8 and k_4 = H(s)(s + 1)^3|_{s=−1} = 6. By equating the coefficients with s^3 to 0, we get the relation k_1 + k_2 = 0. A similar relation follows for the coefficients with s^2, in the form 3k_1 + 5k_2/2 + k_3 = 0, or k_1/2 + k_3 = 0. Then, k_2 = 8 and k_3 = 4. With
k_i/(s − s_i) → k_i/(1 − e^{s_i}z^{−1})
and
(1/m!) d^m/ds_i^m [k_i/(s − s_i)] → (1/m!) d^m/ds_i^m [k_i/(1 − e^{s_i}z^{−1})],
the discrete-time system follows.
Solution 5.6. Since the bilinear transform is used, we have to pre-modify the system according to
Ω_d = (2/∆t) tan(Ω_1∆t/2) = 2.0 = 0.6366π.
The frequency value is shifted from Ω_1 = 0.5π to Ω_d = 0.6366π. The modified system is
H_d(s) = 2QΩ_d / (s^2 + 2Ω_d Qs + Ω_d^2 + Q^2).
Now, using s = 2(1 − z^{−1})/(1 + z^{−1}), the corresponding discrete-time system is obtained,
H(z) = 2QΩ_d / {[2(1 − z^{−1})/(1 + z^{−1})]^2 + 2Ω_d Q · 2(1 − z^{−1})/(1 + z^{−1}) + Ω_d^2 + Q^2}.
The bilinear transform returns the pre-modified frequency to the desired one.
Solution 5.7. The poles of H(s)H(−s) for a continuous-time second-order (N = 2) Butterworth filter are
s_k = 2πf_c e^{j(2k+1)π/4},
where
f_c = (2/∆t) tan(2πf_ac ∆t/2)/(2π) = 4.6253 kHz.
With k = 0, 1, 2, 3 follows
s_k = 2πf_c (±√2/2 ± j√2/2).
For a stable system, the poles satisfy Re{s_p} < 0, thus
s_{0,1} = 2πf_c (−√2/2 ± j√2/2)
and
H_a(s) = s_0 s_1 / [(s − s_0)(s − s_1)] = 4π^2 f_c^2 / (s^2 + 2πf_c √2 s + 4π^2 f_c^2).
Using the bilinear transform with ∆t = 50 · 10^{−6}, we get the corresponding discrete-time system transfer function,
H(z) = 1.0548(1 + z^{−1})^2 / (5.1066 − 1.8874z^{−1} + z^{−2}).
This filter has −3 dB attenuation at ω = 0.4π, corresponding to Ω = 0.4π/∆t = 2π × 4 × 10^3.
(b) The discrete highpass filter is obtained by the shift corresponding to
H_h(e^{jω}) = H(e^{j(ω+π)}).
This shift corresponds to the impulse response modulation h_h(n) = (−1)^n h(n), or to the substitution of z by −z in the transfer function,
H_H(z) = 1.0548(1 − z^{−1})^2 / (5.1066 + 1.8874z^{−1} + z^{−2}).
The critical frequency of the highpass filter, H_H(z), is ω_c = 0.6π or f_ac = 6 kHz.
Solution 5.8. For the continuous-time system the design frequencies are
f_p = 1 kHz
f_s = 2 kHz.
They correspond to
Ω_p = 2π 10^3 rad/s
Ω_s = 4π 10^3 rad/s,
that is, to the discrete-time frequencies
ω_p = 0.2π
ω_s = 0.4π.
The frequencies for the filter design, which will be mapped to ω_p and ω_s after the bilinear transform is used, are
Ω_pd = (2/∆t) tan(0.2π/2) = 0.6498/∆t
Ω_sd = (2/∆t) tan(0.4π/2) = 1.4531/∆t.
The filter order follows from
N = (1/2) log[(10^{−0.1a_p} − 1)/(10^{−0.1a_s} − 1)] / log(Ω_pd/Ω_sd) = 1.368.
We assume N = 2.
Since the frequency for −3 dB attenuation is given, the design cutoff frequency is
Ω_cd = Ω_pd = 0.6498/∆t.
The poles of the filter transfer function, for N = 2 and Ω_cd, are
s_{0,1} = (0.6498/∆t)(−√2/2 ± j√2/2),
with the transfer function
H(s) = s_0 s_1 / [(s − s_0)(s − s_1)] = (0.4223/∆t^2) / (s^2 + 0.919s/∆t + 0.4223/∆t^2).
Mapping of this system into the discrete-time domain using the bilinear transform,
s = (2/∆t)(1 − z^{−1})/(1 + z^{−1}),
produces the second-order Butterworth filter
H(z) = 0.067569(1 + z^{−1})^2 / (1 − 1.14216z^{−1} + 0.412441z^{−2}).
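This design is easy to confirm numerically (a sketch assuming scipy):

    import numpy as np
    from scipy import signal

    dt = 1e-4                                          # 1/dt = 10 kHz
    b, a = signal.butter(2, 0.6498 / dt, analog=True)  # N = 2 at Omega_cd
    bz, az = signal.bilinear(b, a, fs=1 / dt)
    print(np.round(bz, 6), np.round(az, 6))            # matches H(z) above
    for w in (0.2 * np.pi, 0.4 * np.pi):               # passband and stopband edges
        Hw = signal.freqz(bz, az, worN=[w])[1][0]
        print(20 * np.log10(abs(Hw)))                  # about -3 dB and about -14 dB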
Solution 5.9. The filter order follows from
N = (1/2) log[(10^{−0.1a_p} − 1)/(10^{−0.1a_s} − 1)] / log(Ω_p/Ω_s) = 2.335,
so N = 3 is adopted. The partial fraction constants are
k_i = H(s)(s − s_i)|_{s=s_i}.
Using the impulse invariance method, mapping from the continuous-time domain to the discrete-time domain is done according to
k_i/(s − s_i) → ∆t k_i/(1 − e^{s_i∆t}z^{−1}).
The discrete-time filter transfer function is
H(z) = (−0.0253z^{−2} − 0.0318z^{−1}) / (−1.98774 + 4.61093z^{−1} − 3.68033z^{−2} + z^{−3}).
Solution 5.10. A highpass filter is obtained as
H_H(s) = H(1/s),
with s = (2/∆t)(1 − z^{−1})/(1 + z^{−1}) = (2/∆t)(z − 1)/(z + 1) and ∆t = 2. The corresponding lowpass filter would be
H_L(z) = H(s)|_{s=(z−1)/(z+1)} = H((z − 1)/(z + 1)).
The discrete-time highpass filter is
H_H(z) = H_H(s)|_{s=(z−1)/(z+1)} = H(1/s)|_{s=(z−1)/(z+1)} = H((z + 1)/(z − 1)).
Obviously, H_H(z) = H_L(−z). It means that a discrete highpass system can be realized by replacing z with −z in the transfer function. For ∆t ≠ 2 a scaling is present as well.
Solution 5.11. a) The frequency relation with ∆t = 0.001 s produces a lowpass filter with Ω p =
ω p /∆t = 150 π rad/s. For ∆t = 0.1 s the frequency is Ω p = ω p /∆t = 1.5 π rad/s.
b) For ∆t = 0.001 s a bandpass filter is obtained for the range 200π rad/s ≤ Ω ≤ 250π rad/s,
while ∆t = 0.1 s produces a bandpass filter with 2π rad/s ≤ Ω ≤ 2.5π rad/s.
c) For ∆t = 0.001 s a highpass filter has the frequency Ω p = 350 rad/s, while for ∆t = 0.1 s
the highpass filter has critical frequency Ω p = 3.5 rad/s.
For the impulse invariance method, the starting design frequencies should be equal to the calculated analog frequencies. If the bilinear transform is used, the calculated analog frequencies Ω_p should be pre-modified to Ω_m according to Ω_m = (2/∆t) tan(Ω_p∆t/2).
Solution 5.12. The impulse response of the bandpass filter is h_B(n) = 2h(n) cos(ω_c n). The z-transform of the impulse response is
H_B(z) = Σ_{n=−∞}^{∞} 2h(n) cos(ω_c n)z^{−n} = Σ_{n=−∞}^{∞} h(n)(e^{−jω_c}z)^{−n} + Σ_{n=−∞}^{∞} h(n)(e^{jω_c}z)^{−n} = H(e^{−jω_c}z) + H(e^{jω_c}z).
Solution 5.13. (a) The system
H_1(z) = (2 − 3z^{−1} + 2z^{−2}) / (1 − 2z^{−1})^2
is not stable since it has a second-order pole at z = 2. This system may be stabilized, keeping the same amplitude of the frequency response, using a second-order allpass system with a zero at z = 2,
H_A(z) = [(z^{−1} − 1/2)/(1 − (1/2)z^{−1})]^2.
The resulting stable system is
H_1(z)H_A(z) = (2 − 3z^{−1} + 2z^{−2}) / (z^{−1} − 2)^2.
(b) The causal system H_2(z) has the pole at z = 4. It can be stabilized using the allpass system
H_A(z) = (z^{−1} − 1/4)/(1 − (1/4)z^{−1}) = (4 − z)/(4z − 1).
Solution 5.14. Since the z-transform R(z) can be written as
R(z) = H(z)H^*(1/z^*),
the minimum phase system is the part of R(z) whose zeros and poles are all located inside the unit circle, meaning that the system H(z) and its inverse system 1/H(z) can be causal and stable. Therefore,
H(z) = (z − 1/4)(z + 1/2) / [(z + 4/5)(z − 3/7)].
It is easy to check that H^*(1/z^*) is equal to the remaining terms in R(z).
Solution 5.15. The received signal should be processed by the inverse system
H_i(z) = 1/H(z) = (z − 1/2) / [(4 − z)(1/3 − z)(z^2 − √2 z + 1/4)].
However, this system has two poles outside the unit circle since
H_i(z) = (z − 1/2) / [(4 − z)(1/3 − z)(z − 1.2071)(z − 0.2071)].
These poles have to be compensated, keeping the same amplitude, using two first-order allpass systems. The resulting system transfer function is
H_i(z) = (z − 4)/(1 − 4z) · (z − 1.2071)/(1 − 1.2071z) · (z − 1/2)/[(4 − z)(1/3 − z)(z − 1.2071)(z − 0.2071)] = −(z − 1/2) / [(1 − 4z)(1 − 1.2071z)(1/3 − z)(z − 0.2071)].
Chapter 6
Realization of Discrete Systems
Linear discrete-time systems may, in general, be described by a difference equation relating the output signal with the input signal at the considered instant and the previous values of the output and input signals. The transfer function can be written in various forms, producing different system realizations. Some of them will be presented next. Symbols that are used in the realizations are presented in Fig. 6.1.
Figure 6.1 Symbols representing particular digital systems and their functions in the realization of discrete-time systems.
A system that includes recursions of the output signal values results in an infinite impulse response (IIR). These systems will be presented first.
The second-order system, as a special case of the system in (6.1), will be presented first. Its implementation is shown in Fig. 6.2. A general system described by (6.1) can be implemented as in Fig. 6.3. This form is a direct realization I of a discrete-time system.
Direct realization I consists of two system blocks connected in cascade. The first block implements the non-recursive part of the difference equation,
y_1(n) = B_0 x(n) + B_1 x(n − 1) + ··· + B_M x(n − M),
and the second block implements the recursive part,
y(n) = A_1 y(n − 1) + ··· + A_N y(n − N) + y_1(n),
with the output from the first block, y_1(n), being the input signal to the second block. The cascade of these two blocks is shown in Fig. 6.3. The transfer functions of these blocks are
H_1(z) = B_0 + B_1 z^{−1} + ··· + B_M z^{−M}
and
H_2(z) = 1/(1 − A_1 z^{−1} − ··· − A_N z^{−N}),
as implemented in the sketch below.
Example 6.1. Find the transfer function of the discrete-time system presented in Fig. 6.5.
⋆The system can be recognized as a direct realization II form. After its blocks are separated and interchanged, the system in the form shown in Fig. 6.6 is obtained.
Figure 6.6 The system from Fig. 6.5, with interchanged blocks.
The difference equation for the whole system is obtained after y_1(n) from (6.3) is replaced into (6.4),
y(n) = (1/2)y(n − 2) − (1/6)y(n − 3) + x(n) − (1/2)x(n − 1) + (1/3)x(n − 2).
The system transfer function of the whole system is
H(z) = H_1(z)H_2(z) = (1 − (1/2)z^{−1} + (1/3)z^{−2}) / (1 − (1/2)z^{−2} + (1/6)z^{−3}).
Systems with a large number of elements in a recursion may be sensitive to errors caused by coefficient deviations. Deviations of the coefficients from the true values are caused by the finite-length registers used to memorize them in a computer. The influence of the finite register lengths on the signal and system realization will be studied later, as a part of random disturbance analysis. Here, we will only consider the influence of this effect on the system coefficients, since it may influence the way how to realize a discrete-time system.
For the first-order system with a real-valued pole,
H(z) = 1/(1 + A_1 z^{−1}) = 1/(1 − z_{p1} z^{−1}),
the error in coefficient A_1 is the same as the error in the system pole z_{p1}. If the coefficient is quantized with a step ∆, then the error in the pole location is of order ∆. The same holds for the system zeros.
For a second-order system with real-valued coefficients and a pair of complex-conjugated poles,
H(z) = 1/(1 + A_1 z^{−1} + A_2 z^{−2}) = 1/[(1 − z_{p1} z^{−1})(1 − z_{p2} z^{−1})],
the relation between the coefficients and the real and imaginary parts of the poles z_{p1/2} = x_p ± jy_p is
H(z) = 1/(1 − 2x_p z^{−1} + (x_p^2 + y_p^2)z^{−2}),
A_1 = −2x_p,
A_2 = x_p^2 + y_p^2.
The error in coefficient A_1 defines the error in the real part of the pole, x_p.
When the coefficient A_2 takes discrete values A_2 = m∆, with A_1 ∼ x_p = n∆, then the imaginary part of the poles may take the values y_p = ±√(A_2 − x_p^2) = ±√(m∆ − n^2∆^2), with n^2 ≤ mN. For small n, that is, for a small real part of the pole, y_p = ±√(∆m). For N discretization levels, assuming that the poles are within the unit circle, x_p^2 + y_p^2 ≤ 1, the first discretization step is changed from order 1/N to order 1/√N. The error, in this case, could be significantly increased. The changes in y_p due to the discretization of A_2 may be large.
The quantization of x p and y p as a result of quantization of − A1 /2 and A2 = x2p + y2p is shown
in Fig. 6.7, for the case of N = 16 and N = 32 quantization levels. We see that the error in y p , when
it assumes small values, can be very large. We can conclude that the poles close to the unit circle
with larger imaginary values y p are less sensitive to the errors. The highest error could appear if the
second-order real-valued pole (with y p = 0) were implemented using the second-order system.
We have concluded that the poles close to the real axis (small y p ) are sensitive to the error in
coefficients even in the second-order systems. The sensitivity increases with the system order, since the
higher powers in the polynomial increase the maximum possible error.
Consider a general form of a polynomial in the transfer function, written in two forms,
P(z) = z^M + A_1 z^{M−1} + ··· + A_M
and
P(z) = (z − z_1)(z − z_2)···(z − z_M).
If the coefficients A_1, A_2, ..., A_M are changed for small ∆A_1, ∆A_2, ..., ∆A_M (due to quantization), then the pole position (without loss of generality and for notation simplicity, consider the pole z_1) is
Figure 6.7 Quantization of the real part and the imaginary part, x_p = Re{z_p} and y_p = Im{z_p}, of poles (zeros) as a result of the quantization in 16 levels (left) and 32 levels (right) of the coefficients A_1 = −2x_p and A_2 = x_p^2 + y_p^2.
changed for
∆z_1 ≅ (∂z_1/∂A_1)∆A_1 + (∂z_1/∂A_2)∆A_2 + ··· + (∂z_1/∂A_M)∆A_M |_{z=z_1}. (6.5)
Since there is no direct relation between z_1 and A_i, we will find ∂z_1/∂A_i implicitly, from the two forms of P(z). The coefficients ∂z_1/∂A_i|_{z=z_1} could be large, especially in the case when there are close poles, with a small distance (z_i − z_k).
In the realization of this system, the coefficients are rounded to two decimal positions, with the absolute error up to 0.005. Find the poles of the system with rounded coefficients.
⋆The polynomial in the denominator is
P(z) ≅ z^4 − 2.4673z^3 + 2.1200z^2 − 0.7336z + 0.0849.
With the rounded coefficients, the polynomial factors as
P̂(z) = (z − 0.5370)(z − 0.2045)(z − 0.7285)(z − 1).
The poles of this function with rounded coefficients can differ significantly from the original pole values in (6.7). The maximum error in the poles is 0.8409 − 0.7285 = 0.1124. One pole is on the unit circle, making the system with rounded coefficients unstable, in contrast to the stable original system. Note that if the system is written as a product of the first-order functions in the denominator,
H(z) = 1 / [(z − 7/29)(z − 12/27)(z − 111/132)(z − 95/101)],
and every pole value is rounded to two decimals,
P(z) ≅ (z − 0.24)(z − 0.44)(z − 0.84)(z − 0.94),
the poles will differ from the original ones for no more than 0.005.
If the poles are grouped into the second-order terms (what should be done if the poles were complex-conjugate, in order to avoid calculation with complex-valued coefficients), then
P(z) ≅ (z^2 − 0.6858z + 0.1073)(z^2 − 1.7815z + 0.7910).
Rounding these coefficients to two decimals, we will get
P̂(z) = (z − 0.25)(z − 0.44)(z − 0.8442)(z − 0.9358),
with a maximum error of 0.01.
The pole values are illustrated in Fig. 6.8.
The sensitivity analysis for this example can be done for each of the poles. Assume that the poles are denoted by z_1 = 12/27, z_2 = 7/29, z_3 = 111/132, and z_4 = 95/101. Then,
Figure 6.8 Poles for a system with errors in coefficients: for the fourth-order polynomial (top) and the product of the two second-order polynomials (bottom).
∆z_1 ≅ 0.0878.
The true error is ∆z_1 = 0.0926. The small difference is due to the linear approximation, assuming small ∆A_i. The obtained result is a good estimate of the order of error for the pole z_1. The error in z_1 is about 18.5 times greater than the maximum error in the coefficients A_i, which is of order 0.005. The experiment is easy to repeat numerically, as in the sketch below.
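A sketch of that experiment, assuming numpy:

    import numpy as np

    poles = np.array([12/27, 7/29, 111/132, 95/101])
    P = np.poly(poles)                        # exact fourth-order coefficients
    print(np.sort(np.roots(np.round(P, 2))))  # poles after rounding the coefficients
    print(np.sort(np.roots(np.poly(np.round(poles, 2)))))  # rounding each factor instead

Rounding the polynomial coefficients moves the poles by an order of 0.1, while rounding the first-order factors keeps the error within 0.005, as discussed above.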
A transfer function of the discrete-time system in (6.2), with M = N, might be written as a product of the first-order subsystems. Commonly, real-valued signals are processed, and the poles and zeros in the transfer function appear in complex-conjugated pairs. In that case, it is better to group these pairs into second-order subsystems.
In the realization, the second-order subsystems are commonly used. Note that it is possible to realize these second-order subsystems using first-order systems with the real-valued coefficients x_pL and y_pL, which are the real and imaginary parts of the complex-conjugated pair of poles, z_pL = x_pL ± jy_pL, respectively. To this aim, consider first an example.
Example 6.3. Find the transfer function of a system with a feedback shown in Fig. 6.10.
⋆The z-transform of the signal at the output of the adder in the system from Fig. 6.10 is
R(z) = X(z) − H(z)Y(z).
The output is then
Y(z) = H(z)R(z) = H(z)X(z) − H^2(z)Y(z),
so that
H_e(z) = Y(z)/X(z) = H(z)/(1 + H^2(z)).
Consider now the second-order subsystem
Q_i(z) = y_pL z^{−1} / (1 + A_1i z^{−1} + A_2i z^{−2}).
Using the real and imaginary parts of the complex-conjugate poles z_pL = x_pL ± jy_pL, the transfer function, Q_i(z), can be expressed as
Q_i(z) = y_pL z^{−1} / (1 − 2x_pL z^{−1} + x_pL^2 z^{−2} + y_pL^2 z^{−2}) = y_pL z^{−1} / [(1 − x_pL z^{−1})^2 + y_pL^2 z^{−2}]
= [y_pL z^{−1}/(1 − x_pL z^{−1})^2] · 1/[1 + (y_pL z^{−1}/(1 − x_pL z^{−1}))^2] = H(z)H_2(z)/(1 + H^2(z)),
where
H(z) = y_pL z^{−1}/(1 − x_pL z^{−1}) and H_2(z) = 1/(1 − x_pL z^{−1}).
Therefore, the second-order system can be implemented as in Fig. 6.11, using the first-order systems
shown in Fig. 6.12. In this case there is no grouping of the coefficients into the second-order polynomials.
Figure 6.11 Complete second-order subsystem with the complex-conjugate pair of poles realized using the
first-order systems.
The error in one coefficient (the real or imaginary part of a pole) does not influence the other coefficients. However, if an error in the signal calculation happens in one cascade, it will propagate as an input to the following cascades. In that sense, it would be best to order the cascades in such a way that the lowest probability of an error appears in the first cascade. From the error analysis we can conclude that the cascades with the poles and zeros close to the origin are more sensitive to the error and should be used in later cascade stages.
Figure 6.12 First-order system used in the realization of the second-order system with the complex-conjugate pair
of poles.
For the system
H(z) = 1.4533(1 + z^{−1})^3 / [(−0.8673z^{−1} + 3.1327)(3.0177z^{−2} − 5.434z^{−1} + 7.54)]
= 0.0615 (1 + z^{−1})/(1 − 0.2769z^{−1}) × (1 + 2z^{−1} + z^{−2})/(1 − 0.7207z^{−1} + 0.4002z^{−2}),
present the cascade realization using:
(a) both the first and the second-order systems;
(b) the first-order systems with real-valued coefficients only.
⋆(a) Realization of the system H(z), when both the first and the second-order subsystems can be used, is done according to the system transfer function, as in Fig. 6.13.
(b) For the first-order subsystems, the realization should be done based on
H(z) = 0.0615 × (1 + z^{−1})/(1 − 0.2769z^{−1}) × (1 + z^{−1}) × (1 + z^{−1}) × 1/(1 − 0.7207z^{−1} + 0.4002z^{−2}),
with
1/(1 − 0.7207z^{−1} + 0.4002z^{−2}) = 1/[(1 − (0.3603 + j0.5199)z^{−1})(1 − (0.3603 − j0.5199)z^{−1})]
= 1/(1 − 2 × 0.3603z^{−1} + 0.3603^2 z^{−2} + 0.5199^2 z^{−2})
= 1/[0.5199^2 z^{−2} + (1 − 0.3603z^{−1})^2] = 1/(1 − 0.3603z^{−1})^2 · 1/[1 + (0.5199z^{−1}/(1 − 0.3603z^{−1}))^2].
In this way, the system can be written and realized in terms of the first-order subsystems,
H(z) = 0.0615 (1 + z^{−1})/(1 − 0.2769z^{−1}) · (1 + z^{−1})/(1 − 0.3603z^{−1}) · (1 + z^{−1})/(1 − 0.3603z^{−1}) · 1/[1 + (0.5199z^{−1}/(1 − 0.3603z^{−1}))^2].
⋆ The parallel realization follows directly from the system transfer function definition. It is
presented in Fig. 6.16.
For the cascade realization, the system transfer function should be written in the form of the product of the second-order transfer functions.
For each of the previous realizations, an inverse form may be implemented by switching the input and the output signals and changing the flow directions of the signals. As an example, consider the direct realization II from Fig. 6.4. This realization, with separated delay circuits, is shown in Fig. 6.18. Its inverse form is presented in Fig. 6.19.
It is easy to conclude that the inverse realization of the direct realization II has the same transfer
function as the direct realization I. Since both realization I and realization II have the same transfer
functions it follows that the inverse realization has the same transfer function as the original realization.
In general, transfer functions of discrete-time systems are obtained in the form of a ratio of two polynomials. The polynomial in the transfer function denominator defines the poles. In the time domain, this means a recursive relation, relating the output signal at the current instant with the previous output signal values. Realization of this kind of systems is efficient, as described in the previous section. When the output signal is a linear combination of the input signal, x(n), and its delayed versions, x(n − m), only, the system does not have recursions. Its difference equation is
y(n) = h(0)x(n) + h(1)x(n − 1) + ··· + h(N − 1)x(n − N + 1).
This system is characterized by a finite impulse response, and it is referred to as the FIR system. This system is always stable. The FIR systems can also have a linear phase. For a linear phase, arg{H(e^{jω})} = −ωq, the group delay
τ_g = −d(arg{H(e^{jω})})/dω = q
is constant, and the system will not distort the signal with respect to the zero-phase system. The impulse response will only be delayed in time for a constant q.
Consider a signal composed of M complex sinusoidal components. After passing through a system with the frequency response H(e^{jω}), this signal is changed to
y(n) = Σ_{m=1}^{M} A_m |H(e^{jω_m})| e^{j(ω_m n + θ_m + arg{H(e^{jω_m})})}.
In general, the phase of every signal component is changed in a different way, by arg{H(e^{jω_m})}, causing signal distortion due to the different delays corresponding to different frequencies. If the phase function of the frequency response is linear, then all signal component phases are changed in the same way, by arg{H(e^{jω_m})} = −ω_m q, corresponding to a constant delay for all components. A delayed signal, without distortion, is obtained,
y(n) = y_0(n − q) = Σ_{m=1}^{M} A_m |H(e^{jω_m})| e^{j(ω_m(n−q) + θ_m)},
where y_0(n) would be the response if the phase of the transfer function were 0. In the case of a linear phase, arg{H(e^{jω})} = −ωq, the phase delay
τ_φ = −arg{H(e^{jω})}/ω = q
and the group delay τ_g are the same. In general, the group delay and the phase delay are different. The group delay, as the notion dual to the instantaneous frequency, is introduced and discussed in the first chapter of this book.
Consider a system with a real-valued impulse response h(n). Its frequency response is
H(e^{jω}) = Σ_{n=0}^{N−1} h(n)e^{−jωn} = Σ_{n=0}^{N−1} h(n) cos(ωn) − j Σ_{n=0}^{N−1} h(n) sin(ωn). (6.9)
Combining the linear phase condition (6.8) with the form in (6.9), we get
Σ_{n=0}^{N−1} h(n) sin(ω(n − q)) = 0. (6.10)
The middle point of the interval where h(n) ≠ 0 is n = (N − 1)/2. If q = (N − 1)/2, then sin(ω(n − q)) is an odd function with respect to n = (N − 1)/2. The summation (6.10) is zero if the impulse response h(n) is an even function with respect to n = (N − 1)/2. Hence, the solution to (6.10) is
q = (N − 1)/2,
h(n) = h(N − 1 − n), 0 ≤ n ≤ N − 1.
Since the Fourier transform is unique, this is the unique solution for the linear phase condition. It is illustrated for an even and odd N in Fig. 6.20. From the symmetry condition, it is easy to conclude that there is no causal linear phase system with an infinite impulse response. The symmetry and the constant group delay are easy to check numerically, as in the sketch below.
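A minimal sketch assuming scipy; the windowed-sinc response below is an arbitrary symmetric example, not a filter from the text:

    import numpy as np
    from scipy import signal

    N = 33
    n = np.arange(N)
    h = np.hanning(N) * np.sinc(0.3 * (n - (N - 1) / 2))  # symmetric h(n)
    print(np.allclose(h, h[::-1]))                        # h(n) = h(N-1-n): True
    w, gd = signal.group_delay((h, [1.0]))
    print(gd[:5])                                         # constant (N-1)/2 = 16 samples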
6.2.2 Windows
When a system obtained from the design procedure is an IIR system and the requirement is to implement it as an FIR system, in order to get a linear phase or to guarantee the system stability (when small changes of the coefficients are possible), the most obvious way is to truncate the desired impulse response h_d(n) of the resulting IIR system. The impulse response of the FIR system is
h(n) = h_d(n), for 0 ≤ n ≤ N − 1,
h(n) = 0, elsewhere.
Figure 6.20 The impulse response of a system with the linear phase for an odd and even N.
In the frequency domain, this truncation (a multiplication by a rectangular window) corresponds to the convolution
H(e^{jω}) = H_d(e^{jω}) ∗_ω W(e^{jω}).
Since the rectangular window function has the Fourier transform of the form
W(e^{jω}) = Σ_{n=0}^{N−1} e^{−jωn} = e^{−jω(N−1)/2} sin(ωN/2)/sin(ω/2),
its convergence is slow, with significant oscillations. These oscillations will cause oscillations in the resulting frequency response H(e^{jω}), Fig. 6.21. By increasing the number of samples N, the convergence speed will increase. However, the amplitude of the oscillations will remain the same, Figs. 6.21(d) and (f). Even with N → ∞ the amplitude oscillations will remain, Fig. 6.21(b). This effect is called the Gibbs phenomenon.
Example 6.7. The desired frequency response of a system is H_d(e^{jω}), with the IIR h_d(n) for −∞ < n < ∞. Find the FIR system impulse response h_c(n) that approximates the desired transfer function with a minimum mean squared error.
⋆Without loss of generality, assume that the most significant values of h_d(n) are within −N/2 ≤ n ≤ N/2 − 1. The impulse response h_c(n) can take nonzero values only within −N/2 ≤ n ≤ N/2 − 1. Therefore,
e^2 = Σ_{n=−N/2}^{N/2−1} |h_d(n) − h_c(n)|^2 + Σ_{n=−∞}^{−N/2−1} |h_d(n)|^2 + Σ_{n=N/2}^{∞} |h_d(n)|^2.
Since the last two terms are h_c(n) independent and all three terms are nonnegative, the error e^2 is minimum if h_c(n) = h_d(n) within −N/2 ≤ n ≤ N/2 − 1. A causal form of this system is
h(n) = h_c(n − N/2).
A shift in time does not change the amplitude of the desired frequency response, since
|H(e^{jω})| = |H_c(e^{jω})|.
In order to reduce the oscillations in the frequency response amplitude, other windows are introduced. They are presented within the introductory chapters, through the examples. Here we will list the basic windows (more details on the window functions will be given in Part V).
The triangular (Bartlett) window is defined as
w(n) = 1 − |n + 1 − N/2|/(N/2), for 0 ≤ n ≤ N − 1,
w(n) = 0, elsewhere.
By avoiding window discontinuities at the ending points, the convergence of its transform is improved. Since this window may be considered as a convolution of the two rectangular windows,
w(n) = (1/(N/2)) [u(n) − u(n − N/2)] ∗_n [u(n) − u(n − N/2)],
its Fourier transform is the product of the corresponding rectangular window Fourier transforms.
Figure 6.21 Impulse response of a FIR system obtained by truncating the desired IIR response (a), (b), using two rectangular windows of different widths (c)-(f), and using a Hann(ing) window (g), (h).
The Hann(ing) window would be continuous in the continuous-time domain. In that domain, its first derivative would be continuous as well. Thus, its Fourier domain convergence is further improved with respect to the rectangular and the Bartlett windows. The Fourier transform of this window is related to the Fourier transform of the rectangular window as W(e^{jω})/2 + W(e^{j(ω+2π/N)})/4 + W(e^{j(ω−2π/N)})/4.
The Hamming window loses the continuity property (in the continuous-time domain). Its convergence for very large values of ω will be slower than in the Hann(ing) window case. However, as it will be shown later, its coefficients are derived in such a way that the first side-lobe is canceled out at its mid point. Then, the immediate convergence, after the main lobe, is much better than in the Hann(ing) window case.
Other windows are derived with other different constraints. Some of them will be reviewed in Part V of this book as well.
Suppose that the desired system frequency response is given in the frequency domain. If we want to get an N point FIR system that approximates the desired frequency response, then it can be obtained by sampling the desired frequency response H_d(e^{jω}) at ω = 2πk/N, k = 0, 1, 2, ..., N − 1, that is,
H(k) = H_d(e^{jω})|_{ω=2πk/N}.
This procedure is illustrated on a lowpass filter design, Fig. 6.22. Note that at the discontinuity points, high oscillations will occur in the resulting H(e^{jω}). The oscillations can be avoided by smoothing the transition intervals. Smoothing by a Hann(ing) window in the frequency domain is shown in Fig. 6.23.
The FIR systems can be realized in the same way as the IIR systems presented in the previous section, without using the recursive coefficients. A common way of presenting the direct realization of the FIR system is shown in Fig. 6.24. It is often referred to as an adder with the weighting coefficients h(n).
A realization of a linear phase FIR system that uses the coefficient symmetry property h(0) = h(N − 1), h(1) = h(N − 2), ..., is shown in Fig. 6.25.
Realization of a frequency sampled FIR filter may be done using the relation between the z-transform and the DFT of a signal. If we want to realize a FIR system with N nonzero samples, then it can be expressed in terms of the DFT of the frequency response (samples of the transfer function H(z) along the unit circle) as
Figure 6.22 Realization of a FIR system with N samples in time, obtained by sampling the desired frequency response with N samples. A direct sampling (left) and the sampling with a smoothed transition (right).
Figure 6.23 A Hann(ing) window for smoothing the frequency response in the frequency domain (left) and in the time domain (right).
h(n) = (1/N) Σ_{k=0}^{N−1} H(k)e^{j2πnk/N}.
Then, the transfer function H(z), calculated using the values of h(n), 0 ≤ n ≤ N − 1, is
H(z) = Σ_{n=0}^{N−1} h(n)z^{−n} = (1/N) Σ_{k=0}^{N−1} Σ_{n=0}^{N−1} H(k)e^{j2πnk/N} z^{−n} = (1/N) Σ_{k=0}^{N−1} H(k) (1 − z^{−N}e^{j2πk})/(1 − z^{−1}e^{j2πk/N}),
as verified numerically in the sketch below.
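A numerical sketch of this identity, assuming numpy (the sample values H(k) are arbitrary):

    import numpy as np

    N = 16
    Hk = np.where(np.arange(N) < 3, 1.0, 0.0)  # arbitrary frequency samples H(k)
    h = np.fft.ifft(Hk)                        # h(n) = IDFT{H(k)}
    z = 1.4 * np.exp(1j * 0.7)                 # an arbitrary test point
    H_direct = np.sum(h * z ** -np.arange(N))
    k = np.arange(N)
    H_fs = (1 - z ** -N) / N * np.sum(Hk / (1 - z ** -1 * np.exp(2j * np.pi * k / N)))
    print(np.allclose(H_direct, H_fs))         # True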
Example 6.8. For a system whose impulse response is the Hamming window function of the length N = 32, present the FIR filter based realization.
⋆For the Hamming window with N = 32, the impulse response is given by
h(n) = 0.52 + 0.48 cos((n − 16)π/16), for 0 ≤ n ≤ 31.
The DFT values are H(0) = 0.52 × 32, H(1) = −0.24 × 32, H(31) = H(−1) = −0.24 × 32, and H(k) = 0 for other k within 0 ≤ k ≤ 31. Therefore, only the k = 0 and k = ±1 terms remain in the frequency-sampling realization, with the k = ±1 terms grouped into
H_3(z) = −2H(1) (1 − cos(π/16)z^{−1}) / (1 − 2 cos(π/16)z^{−1} + z^{−2}).
Example 6.9. For the system whose frequency response H_d(jΩ) in the continuous-time domain is
H_d(jΩ) = π − |Ω|, for |Ω| ≤ π,
with the corresponding H_d(e^{jω}) in the discrete-time domain (∆t = 1 is assumed, Fig. 6.26), find the FIR filter impulse response with N = 7 and N = 8:
(a) Sampling the desired frequency response H_d(e^{jω}) in the frequency domain.
(b) Calculating h_d(n) = IFT{H_d(e^{jω})} and taking its N most significant values, h(n) = h_d(n) for −N/2 ≤ n ≤ N/2 − 1 and h(n) = 0 elsewhere (using a rectangular window).
(c) Comment on the error in both cases.
⋆(a) Sampling in the frequency domain is illustrated in Fig. 6.26. The values of the FIR system frequency response, in this case, are the samples of H_d(e^{jω}),
H(k) = H_d(e^{jω})|_{ω=2πk/N} = π(1 − 2k/N), for 0 ≤ k < N/2,
H(k) = π(2k/N − 1), for N/2 ≤ k ≤ N − 1.
The sampling is illustrated in the second row of Fig. 6.26 for N = 7 and N = 8. The impulse response of the FIR filter is
h(n) = IDFT{H(k)} = (1/N) Σ_{k=0}^{N−1} H(k)e^{j2πnk/N}.
For N = 7,
h(n) = π/7 + (10π/49) cos(2πn/7) + (6π/49) cos(2 · 2πn/7) + (2π/49) cos(3 · 2πn/7), 0 ≤ n ≤ 6.
For N = 8,
h(n) = π/8 + (3π/16) cos(2πn/8) + (π/8) cos(2 · 2πn/8) + (π/16) cos(3 · 2πn/8), 0 ≤ n ≤ 7.
These impulse responses are shown in Fig. 6.26 (third row). The frequency response of the FIR filter is
H(e^{jω}) = FT{h(n)}.
Its values are equal to the desired frequency response at the sampling points,
H(e^{jω})|_{ω=2πk/N} = H_d(e^{jω})|_{ω=2πk/N}.
Outside these points, the frequency responses significantly differ (calculate, for example, the values H(e^{j0}), H(e^{jπ/2}), and H(e^{jπ})). Here, there is no significant discontinuity in the frequency response. It means that the frequency response smoothing, using a window (Hann(ing) or Hamming window in the time domain), would not significantly improve the result.
(b) The impulse response of the desired system is
h_d(n) = IFT{H_d(e^{jω})} = (1/2π) ∫_{−π}^{π} (π − |ω|)e^{jωn} dω = (2/2π) ∫_{0}^{π} (π − ω) cos(ωn) dω = (1 − cos(nπ))/(πn^2).
The frequency response of the FIR filter, obtained by truncating h_d(n), is again
H(e^{jω}) = FT{h(n)}.
Figure 6.26 Design of a FIR filter by the frequency sampling of the desired frequency response.
Figure 6.27 Design of a FIR filter by windowing the impulse response of an IIR filter.
Figure 6.28 Error in the case of the frequency response sampling (top, Er = 0.008092) and the IIR impulse response truncation (bottom, Er = 0.0018945), along with the corresponding mean square error (Er) values.
6.3 PROBLEMS
Problem 6.1. For the system with the transfer function
H(z) = 16(z + 1)z^2 / [(4z^2 − 2z + 1)(4z + 3)],
plot the cascade, parallel, and direct realizations.
Problem 6.2. A discrete-time system is described by the difference equation
y(n) = y(n − 1) − y(n − 2) − 3y(n − 3) + x(n) + x(n − 1) + x(n − 2).
Plot its direct realization I, direct realization II, parallel realization, and cascade realization.
Problem 6.3. Find the transfer function of the discrete system presented in Fig. 6.29.
Problem 6.4. Find the transfer function of the discrete system presented in Fig. 6.30.
For the system with the transfer function
H(z) = [4z^2/(4z^2 − 2z + 1)] · [(4z + 4)/(4z + 3)],
plot its cascade and parallel realization. Write down the difference equation which describes this system.
For the system
H(z) = (1 + z^{−2}) / (1 + 2z^{−1} + 2z^{−2} + z^{−3}),
plot the cascade realization.
Problem 6.8. For the system presented in Fig. 6.31 find the transfer function.
Problem 6.9. A discrete system is defined by the following two equations:
y(n) + (1/4)y(n − 1) + w(n) + (1/2)w(n − 1) = (2/3)x(n)
y(n) − (5/4)y(n − 1) + 2w(n) − 2w(n − 1) = −(5/3)x(n),
where x(n) is the input signal, y(n) is the output signal, and w(n) is a signal within the system. What are the frequency and impulse responses of the system?
Problem 6.10. The system
H(z) = (1 + 2z − z^2 + 4z^3 − z^4 + 2z^5 + z^6) / z^6
has a linear phase function. Find its group delay.
Problem 6.11. Let h(n) be an impulse response of a causal system with the Fourier transform H (e jω ).
A real-valued output signal y1 (n) = x (n) ∗ h(n) of this system is reversed, r (n) = y1 (−n), and
passed through the same system, resulting in the output signal y2 (n) = r (n) ∗ h(n). The final output
is reversed again y(n) = y2 (−n). Find the phase of the frequency response function of the overall
system.
Problem 6.12. For the system whose frequency response in the continuous-time domain is
H_d(jΩ) = 2, for |ω| < π/2,
H_d(jΩ) = 1, for π/2 < |ω| < 3π/4,
H_d(jΩ) = 0, elsewhere,
with the corresponding H_d(e^{jω}) in the discrete-time domain obtained with ∆t = 1, find the FIR filter impulse response with N = 15 and N = 14:
(a) Sampling the desired frequency response H_d(e^{jω}) in the frequency domain,
(b) Calculating h_d(n) = IFT{H_d(e^{jω})} and taking its N most significant values, h(n) = h_d(n) for −N/2 ≤ n ≤ N/2 − 1 and h(n) = 0 elsewhere (rectangular window).
(c) Comment on the sources of error in both cases.
6.4 EXERCISE
Exercise 6.1. For the system
H(z) = (z^2 − 2) / [(z − 1)(z − 2)],
plot the direct realization I, direct realization II, parallel realization, and cascade realization.
Exercise 6.2. For the system
H(z) = (3z^{−2} + 6) / (z^{−3} − 2z^{−2} + 3z^{−1} − 6),
a) plot the direct realization I, direct realization II, cascade realization, and parallel realization;
b) find Σ_{n=−∞}^{∞} h(n), where h(n) is the impulse response of the system.
Exercise 6.4. Find the impulse response of the discrete system presented in Fig. 6.32.
Exercise 6.5. Using the impulse invariance method with the sampling interval ∆t = 0.1, transform the continuous-time system given with the transfer function
H(s) = (1 + 5s)/(8 + 2s + 5s^2)
into a discrete-time system, and plot the direct and the cascade realization of the system. Is the obtained discrete-time system stable?
Exercise 6.6. Using the bilinear transform with the sampling interval ∆t = 1, transform the system given with the transfer function
H(s) = (2 + s)/(8 + 2s + 5s^2)
into a discrete-time system, and plot the direct and the cascade realization of the system. Is the obtained discrete system stable?
Exercise 6.7. Using the bilinear transform with the sampling interval ∆t = 0.2, transform the continuous-time system given with the transfer function
H(s) = (3s + 6)/[(s + 1)(s + 3)]
into a discrete-time system, and plot its direct realization II.
Exercise 6.8. For the system whose frequency response in the continuous-time domain is
H_d(jΩ) = 2 − |Ω|/(π/2), for |ω| < π/2,
H_d(jΩ) = 0, elsewhere,
with the corresponding H_d(e^{jω}) in the discrete-time domain obtained for ∆t = 1, and presented in Fig. 6.33, find the FIR filter impulse response with N = 7 and N = 8:
Figure 6.33 The desired system in the continuous-time domain (left) and discrete-time domain (right).
6.5 SOLUTIONS
Solution 6.1. In order to plot the direct form of realization, the transfer function should be written in a form suitable for this type of realization,
H(z) = 16(z + 1)z^2 / [(4z^2 − 2z + 1)(4z + 3)] = (1 + z^{−1}) / [(1 − (1/2)z^{−1} + (1/4)z^{−2})(1 + (3/4)z^{−1})]
= (1 + z^{−1}) / (1 + (1/4)z^{−1} − (1/8)z^{−2} + (3/16)z^{−3}). (6.11)
According to the previous relation, direct realizations I and II follow. They are presented in Fig.
6.34 and Fig. 6.35, respectively.
For the cascade realization, the transfer function is written in the form
H(z) = (1 + z^{−1}) / [(1 − (1/2)z^{−1} + (1/4)z^{−2})(1 + (3/4)z^{−1})]
= [(1 + z^{−1})/(1 − (1/2)z^{−1} + (1/4)z^{−2})] · [1/(1 + (3/4)z^{−1})] = H_1(z)H_2(z).
The cascade realization, implemented as a product of two blocks, has the form shown in Fig. 6.36.
In order to plot a parallel realization, the transfer function should be written in the form of a partial fractions expansion, which is suitable for this type of realization,
H(z) = (1 + z^{−1}) / [(1 − (1/2)z^{−1} + (1/4)z^{−2})(1 + (3/4)z^{−1})] = (Az^{−1} + B)/(1 − (1/2)z^{−1} + (1/4)z^{−2}) + C/(1 + (3/4)z^{−1}).
Solution 6.2. Using the z-transform properties, the given difference equation can be written as
H(z) = Y(z)/X(z) = (1 + z^{−1} + z^{−2}) / (1 − z^{−1} + z^{−2} + 3z^{−3}). (6.12)
Direct realizations I and II, presented in Fig. 6.38 and Fig. 6.39, respectively, follow from the previous equation.
For the cascade realization, the transfer function should be written in the form of a product of two blocks,
H(z) = [(1 + z^{−1} + z^{−2})/(1 − 2z^{−1} + 3z^{−2})] · [1/(1 + z^{−1})] = H_1(z)H_2(z).
This form is now suitable for the cascade realization given in Fig. 6.40.
For the parallel realization, we will write the transfer function in the form of partial fractions with real-valued coefficients,
H(z) = \frac{1/6}{1+z^{-1}} + \frac{\frac{1}{2}z^{-1}+\frac{5}{6}}{1-2z^{-1}+3z^{-2}}.
Its realization is now straightforward.
Solution 6.3. The system can be recognized as a cascade of two subsystems and its transfer function can be written as a product of the transfer functions of these two blocks,
H(z) = H_1(z)H_2(z),
where H_1(z) denotes the first block and H_2(z) denotes the second block. The first subsystem with the transfer function H_1(z) can be considered as a direct realization II, with the input to output relation
y_1(n) = 2y_1(n-1) + \frac{1}{3}y_1(n-2) + x(n) + \frac{1}{2}x(n-1) - \frac{1}{3}x(n-2),
as shown in Fig. 6.41. Using the z-transform properties, its transfer function is
H_1(z) = \frac{1+\frac{1}{2}z^{-1}-\frac{1}{3}z^{-2}}{1-2z^{-1}-\frac{1}{3}z^{-2}}.
[Figure 6.41 Direct realization II of the first subsystem.]
Now consider the second block whose transfer function is H2 (z). This block can be considered
as a parallel realization of two blocks, H2 (z) = H21 (z) + H22 (z) where H21 (z) = 1.
The second transfer function is the transfer function that corresponds to a direct realization II, of
a subsystem described by
y_2(n) = y_2(n-1) + y_2(n-2) + x_1(n) + \frac{1}{3}x_1(n-1) - \frac{1}{4}x_1(n-2).
Thus, the transfer function of this subsystem is
H_2(z) = H_{21}(z) + H_{22}(z) = 1 + \frac{1+\frac{1}{3}z^{-1}-\frac{1}{4}z^{-2}}{1-z^{-1}-z^{-2}}.
Finally, the transfer function of the whole system is
H(z) = H_1(z)H_2(z) = \frac{1+\frac{1}{2}z^{-1}-\frac{1}{3}z^{-2}}{1-2z^{-1}-\frac{1}{3}z^{-2}}\left(1 + \frac{1+\frac{1}{3}z^{-1}-\frac{1}{4}z^{-2}}{1-z^{-1}-z^{-2}}\right).
Solution 6.4. This realization can be considered as a cascade realization of two blocks H1 (z) and
H2 (z),
H (z) = H1 (z) H2 (z).
The first block is a direct realization II, whose transfer function is
H_1(z) = \frac{1+(\frac{1}{2}+1)z^{-1}-\frac{1}{3}z^{-2}}{1-2z^{-1}-\frac{1}{3}z^{-2}}.
The previous relation holds since the upper delay block (above the obvious direct realization II block) has the same input and output as the first delay block below it.
The block with the transfer function H_2(z) can be considered as a parallel realization of two blocks, similarly to the previous example, with H_{21}(z) and H_{22}(z) defined by
H_{21}(z) = \frac{1+\frac{1}{3}z^{-1}-\frac{1}{4}z^{-2}}{1-z^{-1}-z^{-2}},
and
H22 (z) = z−1 .
Hence, the transfer function of the second block is
H_2(z) = H_{21}(z) + H_{22}(z) = \frac{1+\frac{1}{3}z^{-1}-\frac{1}{4}z^{-2}}{1-z^{-1}-z^{-2}} + z^{-1}.
Now, the resulting transfer function can be written in the form
H(z) = H_1(z)H_2(z) = \frac{1+(\frac{1}{2}+1)z^{-1}-\frac{1}{3}z^{-2}}{1-2z^{-1}-\frac{1}{3}z^{-2}}\left(\frac{1+\frac{1}{3}z^{-1}-\frac{1}{4}z^{-2}}{1-z^{-1}-z^{-2}} + z^{-1}\right).
Solution 6.5. The transfer function of the system can be expressed, using the roots of the numerator and denominator polynomials, as
H(z) = \frac{(1-(0.1+j0.1)z^{-1})(1-(0.1-j0.1)z^{-1})}{(1-(0.85-j0.75)z^{-1})(1-(0.85+j0.75)z^{-1})} \times \frac{(1-(0.9+j0.8)z^{-1})(1-(0.9-j0.8)z^{-1})}{(1-(0.05-j0.1)z^{-1})(1-(0.05+j0.1)z^{-1})}.
Figure 6.42 Cascade realization of the system with blocks ordered in such a way that the whole system is less sensitive to possible quantization errors.
Solution 6.6. For a cascade realization, the transfer function is expressed in the form
H(z) = \frac{1}{1-\frac{1}{2}z^{-1}+\frac{1}{4}z^{-2}} \cdot \frac{1+z^{-1}}{1+\frac{3}{4}z^{-1}}. (6.13)
[Figure 6.43 Cascade realization corresponding to (6.13).]
[Figure 6.44 Parallel realization of the system.]
Solution 6.7. The transfer function form which corresponds to the cascade realization of the system is
H(z) = \frac{1+z^{-2}}{(1+z^{-1})(1+z^{-1}+z^{-2})}.
In order to use the smallest number of delay circuits, it can be expressed in the form
H(z) = H_1(z)H_2(z) = \frac{1}{1+z^{-1}} \cdot \frac{1+z^{-2}}{1+z^{-1}+z^{-2}}. (6.15)
This form corresponds to the cascade realization presented in Fig. 6.45.
[Figure 6.45 Cascade realization of the system (6.15).]
Y(z)\left(1+\frac{1}{4}z^{-1}\right) + W(z)\left(1+\frac{1}{2}z^{-1}\right) = \frac{2}{3}X(z)
Y(z)\left(1-\frac{5}{4}z^{-1}\right) + 2W(z)(1-z^{-1}) = -\frac{5}{3}X(z).
By eliminating W(z) we get
Y(z)\left[\left(2+\frac{1}{2}z^{-1}\right)(1-z^{-1}) - \left(1-\frac{5}{4}z^{-1}\right)\left(1+\frac{1}{2}z^{-1}\right)\right] = X(z)\left[\frac{4}{3}(1-z^{-1}) + \frac{5}{3}\left(1+\frac{1}{2}z^{-1}\right)\right].
The transfer function is
H(z) = \frac{Y(z)}{X(z)} = \frac{3-\frac{1}{2}z^{-1}}{1-\frac{3}{4}z^{-1}+\frac{1}{8}z^{-2}},
with the difference equation describing this system
y(n) - \frac{3}{4}y(n-1) + \frac{1}{8}y(n-2) = 3x(n) - \frac{1}{2}x(n-1).
The frequency response is
H(e^{jω}) = \frac{3-\frac{1}{2}e^{-jω}}{1-\frac{3}{4}e^{-jω}+\frac{1}{8}e^{-j2ω}}.
Based on
H(z) = \frac{Y(z)}{X(z)} = \frac{3-\frac{1}{2}z^{-1}}{1-\frac{3}{4}z^{-1}+\frac{1}{8}z^{-2}} = \frac{4}{1-\frac{1}{2}z^{-1}} - \frac{1}{1-\frac{1}{4}z^{-1}},
the impulse response is
h(n) = [4(1/2)^n - (1/4)^n]u(n).
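This result is easy to verify numerically. The following minimal sketch, assuming numpy is available (all variable names are illustrative, not from the book), runs the difference equation for an impulse input and compares the output with the closed form.

import numpy as np

# Difference equation: y(n) = (3/4)y(n-1) - (1/8)y(n-2) + 3x(n) - (1/2)x(n-1),
# driven by the discrete-time impulse delta(n).
N = 20
x = np.zeros(N); x[0] = 1.0
y = np.zeros(N)
for n in range(N):
    y[n] = 3*x[n] - 0.5*(x[n-1] if n >= 1 else 0.0)
    if n >= 1: y[n] += 0.75*y[n-1]
    if n >= 2: y[n] -= 0.125*y[n-2]
h_closed = 4*0.5**np.arange(N) - 0.25**np.arange(N)   # h(n) = [4(1/2)^n - (1/4)^n]u(n)
print(np.max(np.abs(y - h_closed)))                   # ~1e-15, machine precision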
Solution 6.8. The transfer function of the subsystem denoted by H_1(z) follows from its realization, where x_1(n) is the input signal to this subsystem. The feedback block has the transfer function
H_2(z) = -\frac{z^{-1} r \sin θ}{1 - r\cos θ\, z^{-1}}.
For the feedback,
H_1(z)(X(z) + Y(z)H_2(z)) = Y(z)
holds. This relation produces
H(z) = \frac{Y(z)}{X(z)} = \frac{H_1(z)}{1 - H_1(z)H_2(z)}.
For an FIR filter whose impulse response satisfies the symmetry condition
h(n) = h(N-1-n), \quad 0 \le n \le N-1,
with N = 7, the phase function is linear. Thus, the group delay q is
q = \frac{N-1}{2} = 3.
Solution 6.11. Let h(n) be an impulse response of a causal system with the Fourier transform
H (e jω ). A real-valued output signal y1 (n) = x (n) ∗ h(n) of this system is reversed, r (n) = y1 (−n),
and passed through the same system, resulting in the output signal y2 (n) = r (n) ∗ h(n). The final
output is reversed again y(n) = y2 (−n). Find the phase of the frequency response function of the
overall system. The frequency domain form of the system y1 (n) = x (n) ∗ h(n) is
Y_1(e^{jω}) = H(e^{jω})X(e^{jω}).
For the operation r(n) = y_1(-n) in the time domain, the frequency domain form is
R(e^{jω}) = Y_1^*(e^{jω}) = H^*(e^{jω})X^*(e^{jω}).
When this signal passes through the same system, y_2(n) = r(n) * h(n), we have
Y_2(e^{jω}) = R(e^{jω})H(e^{jω}) = H^*(e^{jω})H(e^{jω})X^*(e^{jω}),
Y(e^{jω}) = Y_2^*(e^{jω}) = H(e^{jω})H^*(e^{jω})X(e^{jω}).
So we get
Y(e^{jω}) = |H(e^{jω})|^2 X(e^{jω}).
Ljubiša Stanković Digital Signal Processing 283
Obviously, the frequency response of the overall system, |H(e^{jω})|^2, is real-valued and nonnegative, so the phase function of the overall system is equal to zero for all ω.
Solution 6.12. (a) The values of the FIR filter, obtained by sampling the desired frequency response in the frequency domain, are
H(k) = H_d(e^{jω})\big|_{ω=2πk/N}.
This sampling is illustrated in the second row of Fig. 6.46 for N = 15 and N = 14.
Figure 6.46 Design of the FIR filter using the frequency sampling of the desired frequency response.
The impulse response of the FIR filter is obtained as the inverse DFT of these samples,
h(n) = \text{IDFT}\{H(k)\} = \frac{1}{N}\sum_{k=0}^{N-1} H(k)e^{j2πnk/N}.
It is shown in Fig. 6.46 (third row). The frequency response of the FIR filter is
H (e jω ) = FT{ h(n)}.
Its values are equal to the desired frequency response at the sampling points,
H(e^{jω})\big|_{ω=2πk/N} = H_d(e^{jω})\big|_{ω=2πk/N}.
(b) The impulse response of the desired system is
h_d(n) = \text{IFT}\{H_d(e^{jω})\} = \frac{\sin(nπ/2)}{πn} + \frac{\sin(3nπ/4)}{πn}.
Using the first N = 15 samples in the discrete-time domain we get
h(n) = \begin{cases} h_d(n), & \text{for } -7 \le n \le 7 \\ 0, & \text{elsewhere} \end{cases}
or for N = 16
h(n) = \begin{cases} h_d(n), & \text{for } -8 \le n \le 7 \\ 0, & \text{elsewhere.} \end{cases}
The frequency response of this FIR filter is
H(e^{jω}) = \text{FT}\{h(n)\}.
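Since h_d(n) is the sum of two ideal low-pass impulse responses with cutoffs π/2 and 3π/4, the desired response equals 2, 1, and 0 on the corresponding bands. Under that assumption, both designs can be sketched in a few lines (numpy assumed; names are illustrative):

import numpy as np

def hd(n):
    # h_d(n) = sin(n*pi/2)/(pi*n) + sin(3*n*pi/4)/(pi*n); the n = 0 value is the
    # limit (1/2 + 3/4) of the two sinc-type terms.
    n = np.asarray(n, dtype=float)
    return np.where(n == 0, 0.5 + 0.75,
                    (np.sin(n*np.pi/2) + np.sin(3*n*np.pi/4)) / (np.pi*np.where(n == 0, 1, n)))

# (b) Truncation (window approach), N = 15 samples, -7 <= n <= 7:
h_trunc = hd(np.arange(-7, 8))

# (a) Frequency sampling, N = 15: sample H_d at w = 2*pi*k/N and take the IDFT.
N = 15
w = 2*np.pi*np.arange(N)/N
w = np.where(w > np.pi, w - 2*np.pi, w)     # map the DFT grid to [-pi, pi)
Hd_samp = np.where(np.abs(w) < np.pi/2, 2.0, np.where(np.abs(w) < 3*np.pi/4, 1.0, 0.0))
h_samp = np.real(np.fft.ifft(Hd_samp))      # impulse response, circularly indexed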
Figure 6.47 The FIR filter design using the N most significant values of the impulse response (window approach).
(c) The error value, as a function of the frequency ω, along with the mean squared absolute error
Er is shown in Fig. 6.48.
Figure 6.48 Error in the case of the frequency response sampling (top, Er = 0.037954) and the IIR impulse response truncation (bottom, Er = 0.028921), along with the corresponding mean square error (Er) values.
Part III
Random Discrete-Time
Signals and Systems
Chapter 7
Discrete-Time Random Signals
Random signal values cannot be defined by simple deterministic mathematical functions. Their values are not known in advance. These signals can be described by stochastic tools only. Here, we will restrict the analysis to discrete-time random signals. The first-order and second-order statistics will be considered.
Statistics is a science or practice dealing with the collection, analysis, interpretation, and presentation of numerical data, inferring parameters from the whole set of data or their representative sample. A statistic is a single numerical fact obtained from the analysis of the considered set of data and used to describe the whole data set.
The first-order statistics is the starting point in describing random signals. The mean value, or the data sample average, of a random signal is one of the parameters of this statistics. If we have a set of signal samples,
X = \{x(n)\,|\, n = 1, 2, \ldots, N\}, (7.1)
the mean value of this set of signal samples is calculated as
µ̂_x = \text{mean}\{x(n)\,|\, n = 1, 2, \ldots, N\} = \frac{1}{N}(x(1) + x(2) + \cdots + x(N)). (7.2)
For notation simplicity, we will also use µ̂ x = mean{ x (n)}, meaning the mean of the dataset { x (n)}
for all indices n where the signal is available.
To distinguish the calculated (statistically estimated) value µ̂ of a signal parameter from the true
one µ (if all possible signal realizations were available) we will use the hat (ˆ) symbol.
Example 7.1. Consider a random signal x(n) whose one realization is given in Table 7.1. Find the mean value of this signal. Find how many samples of the signal are within the intervals [1, 10], [11, 20], \ldots, [91, 100]. Plot the number of occurrences of signal x(n) samples within these intervals as a function of the interval range.
Table 7.1
A realization of random signal
54 62 58 51 70 43 99 52 57 76
56 53 38 61 28 69 87 41 72 80
23 26 66 47 69 71 69 81 68 79
31 55 52 23 60 34 83 39 66 59
37 12 54 42 67 95 89 67 42 63
35 55 54 55 49 77 18 64 73 70
67 56 42 66 50 47 49 25 50 57
61 84 48 67 71 74 35 59 60 42
40 77 52 63 57 42 44 64 36 71
66 39 50 31 11 75 45 62 60 55
⋆The realization of signal x (n) defined in Table 7.1 is presented in Fig. 7.1.
[Figure 7.1 One realization of the random signal x(n) defined in Table 7.1.]
Its mean value is
µ̂_x = \frac{1}{100}\sum_{n=1}^{100} x(n) = 55.76.
From Table 7.1 or its visualized presentation in Fig. 7.1, we can conclude that, for example, there is no signal sample whose value is within the interval [1, 10]. Within [11, 20] there are three signal samples (x(42) = 12, x(57) = 18, and x(95) = 11). In a similar way, the number of signal samples
within other intervals are counted and the result is shown in Fig. 7.2. This kind of random signal
presentation is called a histogram of x (n), with the defined intervals.
Figure 7.2 Histogram of the random signal x (n) from Fig. 7.1, with 10 intervals defined by [10i + 1, 10i + 10],
i = 0, 1, 2, . . . , 9.
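The mean value and the histogram counts of this example can be reproduced with a minimal sketch, assuming numpy is available (the code simply reads the data of Table 7.1 row by row):

import numpy as np

x = np.array([
    54,62,58,51,70,43,99,52,57,76, 56,53,38,61,28,69,87,41,72,80,
    23,26,66,47,69,71,69,81,68,79, 31,55,52,23,60,34,83,39,66,59,
    37,12,54,42,67,95,89,67,42,63, 35,55,54,55,49,77,18,64,73,70,
    67,56,42,66,50,47,49,25,50,57, 61,84,48,67,71,74,35,59,60,42,
    40,77,52,63,57,42,44,64,36,71, 66,39,50,31,11,75,45,62,60,55])
print(x.mean())                                            # 55.76, as in the example
counts, _ = np.histogram(x, bins=np.arange(0.5, 101, 10))  # intervals [1,10], ..., [91,100]
print(counts)                                              # e.g. counts[1] = 3 for [11, 20]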
Example 7.2. For the signal x(n) from the previous example assume that a new random signal y(n) is formed as
y(n) = \text{int}\left\{\frac{x(n)+5}{10}\right\},
where int {·} denotes the nearest integer. This means that y(n) = 1 for 1 ≤ x (n) ≤ 10, y(n) = 2
for 11 ≤ x (n) ≤ 20, . . . , y(n) = i for 10(i − 1) + 1 ≤ x (n) ≤ 10i, up to i = 10. What is the
set of possible values of y(n)? Find and graphically present the number of occurrences of every
possible value of y(n) in this signal realization. Find the mean value of the new signal y(n) and
discuss the result.
⋆ The signal y(n) is shown in Fig. 7.3. It takes the values from the set {2, 3, 4, 5, 6, 7, 8, 9, 10}.
For the signal y(n), instead of the histogram, we can plot a diagram of the number of
occurrences of every value that y(n) can take, as in Fig. 7.4. The mean value of y(n) is
µ̂_y = \frac{1}{100}\sum_{n=1}^{100} y(n) = 6.13.
The mean value can also be written, by grouping the same values of y(n), as
µ̂_y = \frac{1}{100}(1 \cdot n_1 + 2 \cdot n_2 + 3 \cdot n_3 + \cdots + 10 \cdot n_{10}) = 1 \cdot \frac{n_1}{N} + 2 \cdot \frac{n_2}{N} + 3 \cdot \frac{n_3}{N} + \cdots + 10 \cdot \frac{n_{10}}{N},
[Figure 7.3 The signal y(n).]
where N = 100 is the total number of the available signal values and ni is the number showing how
many times each of the values i appeared in y(n). If there is a sufficient number of occurrences
for every outcome value i, then
P_y(i) = \frac{n_i}{N}
can be considered as an estimate of the probability that the value i appears. In that sense
µ̂_y = 1 \cdot P_y(1) + 2 \cdot P_y(2) + 3 \cdot P_y(3) + \cdots + 10 \cdot P_y(10) = \sum_{i=1}^{10} i\,P_y(i),
with
\sum_{i=1}^{10} P_y(i) = \sum_{i=1}^{10} \frac{n_i}{N} = 1.
Values of the probability estimates Py (i ) are shown in Fig. 7.4.
In general, the mean value for every signal sample could be different. For example, if the signal
values represent the highest daily temperature during a year then the mean value is highly dependent
on the considered sample. In order to calculate the mean value of temperature, we have to have several
realizations of these random signals (measurements over M years), denoted by { xi (n)}, where the
argument n = 1, 2, 3, . . . , N is the cardinal number of the day within a year and i = 1, 2, . . . , M is the
index of realization (year index). The mean value is then calculated as
µ̂_x(n) = \frac{1}{M}(x_1(n) + x_2(n) + \cdots + x_M(n)) = \frac{1}{M}\sum_{i=1}^{M} x_i(n), (7.3)
for every n. In this case, we have a set (a signal) of mean values {µ̂ x (n)}, for n = 1, 2, . . . , 365.
Figure 7.4 Number of appearances of every possible value of y(n) (left) and the estimates of the probabilities
that the random signal y(n) takes a value i = 1, 2, . . . , 10 (right).
Example 7.3. Consider the signal x (n) whose realizations are given in Table 7.2. The values of x (n)
are equal to the monthly average of the maximum daily temperatures in a city measured from
year 2001 to 2015. Find the mean of this temperature for each month over the considered period
of years. What is the mean value of the temperature over all months and years? What is the mean
temperature for every year?
⋆ The signal for years 2001 to 2007 is given in Fig. 7.5. The mean temperature for the nth month, over the considered years, is
µ̂_x(n) = \frac{1}{15}\sum_{i=1}^{15} x_{20i}(n),
where the notation 20i is symbolic in the sense that 2001, 2002, . . . 2015 holds for i =
01, 02, . . . , 15. The mean value signal µ̂ x (n) is shown in the last panel of Fig. 7.5. The mean value
over all months and years is
µ̂_x = \frac{1}{15 \cdot 12}\sum_{n=1}^{12}\sum_{i=1}^{15} x_{20i}(n) = 19.84.
The mean temperature for the year 20i is
µ̂_x(20i) = \frac{1}{12}\sum_{n=1}^{12} x_{20i}(n).
The mean value calculated as the sample average is commonly used due to its calculation simplicity. Later, it will be shown that the sample average is optimal in the estimation of the true mean value of a signal sample when its realizations are corrupted by a quite common disturbance called Gaussian noise (it is interesting to notice that Gauss introduced his famous distribution as the best framework for the sample average estimator, see Section 7.4.5).
Table 7.2
Average of maximum temperature values within months over 15 years, 2001-2015.
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
10 4 18 17 22 29 30 28 27 17 17 5
6 7 11 23 22 32 35 33 22 26 22 8
10 11 10 16 21 26 32 31 23 19 17 4
3 11 13 19 22 26 34 29 26 22 12 9
7 10 13 21 27 29 30 34 24 20 16 11
7 11 17 17 27 25 37 34 33 22 14 14
7 12 13 19 23 32 34 38 21 21 12 10
12 5 9 20 21 37 34 34 27 22 20 7
7 12 13 23 27 33 29 31 25 21 6 11
8 12 10 17 27 33 38 32 23 20 15 9
8 10 13 24 23 33 33 31 27 21 16 8
4 6 15 18 25 26 27 33 23 23 13 11
3 6 16 17 27 28 30 32 29 24 12 10
11 12 14 18 22 29 34 34 23 21 20 11
6 13 8 22 22 29 30 34 23 18 15 8
The mean value, calculated as the sample average using (7.2) or (7.3), is the result of the following minimization problem. Given a set of realizations of the random sample x(n), \{x_i(n)\}, where i = 1, 2, \ldots, M is the index of a realization (in (7.2), x_i(n) = x(i)), the aim is to estimate the true mean signal value µ(n) by µ̂(n), such that its squared distance (deviation) from the available realizations x_i(n), i = 1, 2, \ldots, M, is minimum, that is,
µ̂_x(n) = \arg\min_{α}\left[(x_1(n)-α)^2 + (x_2(n)-α)^2 + \cdots + (x_M(n)-α)^2\right] = \arg\min_{α} \|x - α\|_2^2 = \arg\min_{α} f(α),
where x = [x_1(n), x_2(n), \ldots, x_M(n)]^T and f(α) = \|x - α\|_2^2 is the squared two-norm of the vector x - α. The result of this minimization is obtained from
\frac{d}{dα}\left[(x_1(n)-α)^2 + (x_2(n)-α)^2 + \cdots + (x_M(n)-α)^2\right] = 0 (7.4)
in the form given by (7.3) or (7.2).
The sample average estimation is very sensitive to possible wrongly recorded realizations of a
sample x (n) or to the realization with a very high disturbance due to some exceptional circumstances.
These signal realizations, which significantly differ from the true value of the signal sample, are called
outliers, in contrast to the realizations with relatively small errors called inliers.
The sample average calculation will produce a completely wrong (unbounded) result if at least
one outlier happens in the considered set of realizations { xi (n)}. The smallest possible fraction of
samples needed to be replaced by outliers, in order to make an estimator unbounded, is called the
breakdown point of the estimator. For the sample average (7.3) or (7.2), the breakdown point is the smallest possible, 1/M (where M is the number of available realizations), since only one sample (outlier) can make it unbounded.
[Figure 7.5 Several realizations of a random signal x_{20i}(n), for i = 01, 02, \ldots, 07, and the mean value µ̂_x(n) for every sample (month) over the 15 available realizations.]
The estimators which are robust to possible outliers in the data are defined and analyzed within
robust statistics. The simplest tool in this area will be considered next.
7.1.2 Median
In addition to the sample average, a sample median is used as a statistic to describe a set of random values. The median of a dataset is the value in the middle of the set of available samples, after the members of the set are sorted. If we denote the sorted values of x(n) as s(n), then for an odd N the median is the middle sorted sample,
\text{median}\{x(n)\,|\, n = 1, 2, \ldots, N\} = s\left(\frac{N+1}{2}\right), \text{ for an odd } N.
If N is even, then the median is defined as the mean value of the two samples nearest to (N+1)/2,
\text{median}\{x(n)\,|\, n = 1, 2, \ldots, N\} = \frac{s\left(\frac{N}{2}\right) + s\left(\frac{N}{2}+1\right)}{2}, \text{ for an even } N.
⋆(a) After sorting the values in the set A we get A = {−9, −2, −1, 0, 1, 4, 6}. Therefore,
median(A ) = 0.
(b) Similarly, median(B ) = 0. The mean values of these data would significantly differ.
(c) The sorted values of x (n) are shown in Fig. 7.6. Since the number of samples of signal
x (n) is N = 100, there is no single sample in the middle of the sorted sequence. The middle is
between the sorted samples 50 and 51. Thus, the median is defined here as the mean value of the
50th and 51st sorted sample.
[Figure 7.6 Sorted values of the signal x(n) from Example 7.1.]
The median will not be influenced by a possible small number of big outliers (signal values
being significantly different from the values in the rest of the data). In the worst case, we have to
replace N/2 of the realizations in order to be certain that the middle signal sample is among the
outliers and the median result will not be an inlier. Therefore, the breakdown point of this estimator is
( N/2)/N = 1/2.
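This robustness is easy to see numerically; a minimal sketch assuming numpy (the data values are illustrative):

import numpy as np

x = np.array([54., 62., 58., 51., 70., 43., 52., 57.])
x_out = x.copy(); x_out[0] = 5400.0       # one wrongly recorded value (outlier)
print(np.mean(x), np.mean(x_out))         # mean moves from ~55.9 to ~724
print(np.median(x), np.median(x_out))     # median barely moves (55.5 -> 57.5)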
The sample average estimator was introduced by minimizing the squared distance (deviation) from the available realizations x_i(n). Since the square of large errors is very large, this kind of estimator is highly influenced by the outliers. A common way to reduce the influence of large errors is to use the absolute value of the difference, instead of the squared distance, in the minimization function (7.4), that is,
\min_{α}\left(|x_1(n)-α| + |x_2(n)-α| + \cdots + |x_M(n)-α|\right).
The same holds for the case when we consider x(n), n = 1, 2, \ldots, N, with
\min_{α}\left(|x(1)-α| + |x(2)-α| + \cdots + |x(N)-α|\right).
Next, we will show that the result of this minimization is the median of the considered set,
\text{median}_{i=1,2,\ldots,M}\{x_i(n)\} = \arg\min_{α}\left(|x_1(n)-α| + |x_2(n)-α| + \cdots + |x_M(n)-α|\right) = \arg\min_{α} f(α),
and assume, without loss of generality, that the samples in x = [x_1(n), x_2(n), \ldots, x_M(n)]^T are already sorted, x_1(n) \le x_2(n) \le \cdots \le x_M(n), and that M is odd. The minimum of this function cannot be
obtained as in (7.4) since this function is not differentiable at the points α = x1 (n), α = x2 (n), . . . ,
α = x M (n). However, the function f (α) is differentiable for all other values of α and continuous for
any α. We will use this property to establish the intervals of α where it decreases and increases. The
derivative of the function |x_i(n)-α| is equal to
\frac{d|x_i(n)-α|}{dα} = \begin{cases} -1, & \text{for } α < x_i(n) \\ \;\;\,1, & \text{for } α > x_i(n). \end{cases}
Therefore, the derivative of f (α) within the interval on the left of the smallest signal value, α < x1 (n), is
equal to the sum of derivatives d| xi (n) − α|/dα = −1 of all terms, and it is equal to d f (α)/dα = − M.
If we move to the right along the α axis, to the interval x1 (n) < α < x2 (n), then the derivative of
|x_1(n)-α| is changed to 1, while all other M-1 terms have derivatives equal to -1. This means that df(α)/dα = -M+2 in this interval. If we continue and move next to the interval x_2(n) < α < x_3(n), and so on, we get
\frac{df(α)}{dα} = \begin{cases} -M, & \text{for } α < x_1(n) \\ -M+2, & \text{for } x_1(n) < α < x_2(n) \\ \;\;\vdots & \\ -1, & \text{for } x_{(M-1)/2}(n) < α < x_{(M+1)/2}(n) \\ \;\;\,1, & \text{for } x_{(M+1)/2}(n) < α < x_{(M+3)/2}(n) \\ \;\;\vdots & \\ \;\;M, & \text{for } α > x_M(n), \end{cases}
as illustrated in Fig. 7.7 for x = [ x1 (n), x2 (n), . . . , x7 (n)] T = [−0.9, −0.5, 0, 0.2, 0.7, 0.8, 1] T .
Obviously, the cost function f(α) is a decreasing function, df(α)/dα < 0, for α < x_{(M+1)/2}(n), and an increasing function, df(α)/dα > 0, for α > x_{(M+1)/2}(n). Since the function f(α) is continuous, this proves that
\text{median}_{i=1,2,\ldots,M}\{x_i(n)\} = \arg\min_{α} f(α).
[Figure 7.7 The cost function f(α) and its derivative for x = [-0.9, -0.5, 0, 0.2, 0.7, 0.8, 1]^T.]
When M is even, then d f (α)/dα = 0 will be obtained for the interval x M/2 (n) < α <
x M/2+1 (n). This means that the cost function decreases for α < x M/2 (n), it is a constant within the
interval x M/2 (n) < α < x M/2+1 (n), and then increases for α > x M/2+1 (n). In the case of an even
M, the mean value of x M/2 (n) and x M/2+1 (n) is used as the sample median.
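The behavior of f(α) can also be checked numerically; a minimal sketch assuming numpy, with the vector from Fig. 7.7:

import numpy as np

# Evaluate f(alpha) = sum_i |x_i - alpha| on a dense grid and locate its minimum.
x = np.array([-0.9, -0.5, 0.0, 0.2, 0.7, 0.8, 1.0])
alpha = np.linspace(-1.5, 1.5, 3001)
f = np.abs(x[None, :] - alpha[:, None]).sum(axis=1)
print(alpha[np.argmin(f)], np.median(x))   # both equal 0.2, the median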
In some cases the number of outliers is small. Then, the median will neglect many inlier signal values that could produce a good estimate of the mean value. In these cases, the best choice would be to use not only the mid-value of the sorted signal, but several samples of the signal around its median, and to calculate their (trimmed) mean, for an odd N, as
\text{LSmean}\{x(n)\,|\, n = 1, 2, \ldots, N\} = \frac{1}{2L+1}\sum_{i=-L}^{L} s\left(\frac{N+1}{2}+i\right).
With L = (N-1)/2, all signal values are used and LSmean\{x(n)\,|\, n = 1, 2, \ldots, N\} is the standard mean of the signal. With L = 0, the value of LSmean\{x(n)\,|\, n = 1, 2, \ldots, N\} is equal to the sample median. In general, this way of estimating the mean is known as the L-statistics (α-trimmed) based estimation.
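A minimal sketch of the LSmean estimator for an odd N, assuming numpy (the data values are illustrative):

import numpy as np

def ls_mean(x, L):
    s = np.sort(x)                     # s(1) <= s(2) <= ... <= s(N)
    mid = (len(s) - 1) // 2            # 0-based index of s((N+1)/2)
    return s[mid - L: mid + L + 1].mean()

x = np.array([3., 1., 2., 100., 4., 5., 0.])   # one large outlier
print(ls_mean(x, 0))   # L = 0: the median, 3.0
print(ls_mean(x, 2))   # five central sorted values: (1+2+3+4+5)/5 = 3.0
print(ls_mean(x, 3))   # all values: the ordinary mean, ~16.4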
The next important parameter in statistics is a measure of the deviation of the realizations of a random sample from the mean value. The most commonly used parameter for the description of this statistical property is the standard deviation (also called the spread) or its squared value, called the variance. For a random signal x(n) whose values are available in M realizations, the variance is calculated as the mean squared deviation of the signal values from the corresponding true mean values, µ_x(n),
σ̂_x^2(n) = \frac{1}{M}\left(|x_1(n)-µ_x(n)|^2 + \cdots + |x_M(n)-µ_x(n)|^2\right). (7.6)
The standard deviation is the square root of the variance. It can be estimated as the square root of the mean of the squares of the centered data,
σ̂_x(n) = \sqrt{\frac{1}{M}\left(|x_1(n)-µ_x(n)|^2 + \cdots + |x_M(n)-µ_x(n)|^2\right)}. (7.7)
If the mean value is estimated using the same set of data, µ̂_x(n) = \frac{1}{M}\sum_{i=1}^{M} x_i(n), the previous estimate (which assumes the true mean value µ_x(n)) tends to produce lower values of the standard deviation (a biased standard deviation). Thus, an adjusted version, the sample standard deviation, is used as an unbiased spread measure,
σ̂_x(n) = \sqrt{\frac{1}{M-1}\left(|x_1(n)-µ̂_x(n)|^2 + \cdots + |x_M(n)-µ̂_x(n)|^2\right)}. (7.8)
This form confirms the fact that in the case when only one sample is available, M = 1, we should not
be able to estimate the standard deviation (see Problem 7.2).
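The effect of the 1/(M-1) adjustment can be seen numerically; a sketch assuming numpy (the Gaussian data and sizes are illustrative):

import numpy as np

# Many realizations of M = 5 Gaussian samples with true standard deviation 2.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=(100000, 5))
s_biased = x.std(axis=1, ddof=0)     # divides by M, as in (7.7) with estimated mean
s_sample = x.std(axis=1, ddof=1)     # divides by M - 1, the sample std (7.8)
print(s_biased.mean(), s_sample.mean())   # the ddof=0 average falls further below 2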
Example 7.5. For the signal x (n) from Example 7.1 calculate the mean value and the variance.
Compare it with the mean value and the variance of the signal z(n) given in Table 7.3.
Table 7.3
Random signal z(n)
55 57 56 54 59 52 66 54 56 60
55 55 51 56 48 59 63 52 59 61
47 48 58 53 58 59 59 61 58 61
49 55 54 47 56 50 62 51 58 56
50 44 55 50 58 58 63 58 52 57
50 55 55 55 53 60 46 57 59 59
58 55 58 58 54 53 54 48 54 56
57 62 53 58 59 60 50 56 56 50
51 60 54 57 55 52 52 57 50 59
58 51 54 49 44 60 52 57 56 55
⋆The mean value and the variance of signal x (n) are µ̂ x = 55.76 and σ̂x2 = 314.3863. The
standard deviation is σ̂x = 17.7309. It is a measure of the signal value deviations from the mean
value. For the signal z(n), the mean value is µ̂z = 55.14 (very close to µ̂ x ), while the variance
is σ̂z2 = 18.7277 and the standard deviation is σ̂z = 4.3275. Deviations of z(n) from its mean
value are much smaller. If the signals x (n) and z(n) were the measurements of the same physical
value, then the individual measurements from z(n) would be more reliable than the individual
measurements from x (n).
If we denote the sample standard deviation of the data set \{x_i(n)\}, where i = 1, 2, \ldots, M, as S(x_1(n), x_2(n), \ldots, x_M(n)) = σ̂_x(n), then it satisfies the scale property
S(ax_1(n)+b, ax_2(n)+b, \ldots, ax_M(n)+b) = |a|\,σ̂_x(n).
The proof is simple, using (7.8) and the property that the mean value of y_i(n) = ax_i(n)+b is µ̂_y(n) = aµ̂_x(n)+b.
The sample standard deviation is sensitive to outliers. This can be concluded from its definition (7.8). For the spread estimation, when outliers can be expected, the median absolute deviation (MAD) can be used as its robust measure. The MAD is defined as
\text{MAD}_x(n) = \text{median}_{j=1,2,\ldots,M}\left\{|x_j(n) - \text{median}_{i=1,2,\ldots,M}\{x_i(n)\}|\right\},
by analogy with the variance definition in (7.6), σ_x^2(n) = \text{mean}_{j}\{|x_j(n) - \text{mean}_{i}\{x_i(n)\}|^2\}.
The MAD value is related to the sample standard deviation as
σ̂_x(n) \approx 1.4826\,\text{MAD}_x(n)
for the Gaussian random variable (see Problem 7.11). The breakdown point for the MAD is the same as for the sample median.
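As a sketch of this robust spread estimate, assuming numpy (the contamination level is illustrative):

import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(0.0, 2.0, 1000)               # true standard deviation 2
x[:50] = 100.0                               # 5% gross outliers
mad = np.median(np.abs(x - np.median(x)))
print(x.std(ddof=1), 1.4826 * mad)           # the std is ruined; 1.4826*MAD stays near 2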
The regression analysis deals with random variable modeling and is widely used in various areas, including machine learning and data prediction. The most common model is the linear regression, where it is assumed that the outcome random variable fits a linear model of the independent (also random) variable. Within the signal processing framework, we will consider a continuous-time random signal x(t) sampled at random instants t_n. In linear regression, the signal model is a linear function,
x(t_n) = at_n + b + ε(t_n), \quad n = 1, 2, \ldots, N,
where ε(t_n) is a random variable that describes the deviations of the individual realizations, x(t_n), from the assumed linear model, at_n + b, with constant parameters a and b. The values of ε(t_n) are unknown.
The aim is to estimate the linear model parameters a and b from the available data and to use
them for prediction or classification of new data values. Since the values of x (tn ) and tn are available,
the error function is formed as
e(n) = x (tn ) − atn − b.
The cost function that will be used in the minimization process is most commonly defined as the sum of the squared values of the error function,
J(a,b) = \sum_{n=1}^{N} e^2(n) = \sum_{n=1}^{N}\left(x(t_n) - at_n - b\right)^2. (7.9)
This cost function is optimal if the measurement disturbances e(n) = ε(tn ) are Gaussian distributed.
The minimization of this function (least squares - LS minimization) is done using
\frac{\partial J(a,b)}{\partial a} = -2\sum_{n=1}^{N} t_n\left(x(t_n) - at_n - b\right) = 0
and
\frac{\partial J(a,b)}{\partial b} = -2\sum_{n=1}^{N}\left(x(t_n) - at_n - b\right) = 0.
The system of equations
â\sum_{n=1}^{N} t_n^2 + b̂\sum_{n=1}^{N} t_n = \sum_{n=1}^{N} t_n x(t_n) (7.10)
â\sum_{n=1}^{N} t_n + b̂N = \sum_{n=1}^{N} x(t_n) (7.11)
that can be written in the form A[ â b̂] T = B, produces the estimates â and b̂ of the linear regression
model parameters a and b, [ â b̂] T = A−1 B. After the system of equations is solved, the values of
parameters are estimated as
â = \frac{µ̂_{xt} - µ̂_x µ̂_t}{µ̂_{t^2} - µ̂_t^2} \quad \text{and} \quad b̂ = µ̂_x - âµ̂_t,
where µ̂_{xt} = \text{mean}\{x(t_n)t_n\}, µ̂_x = \text{mean}\{x(t_n)\}, µ̂_t = \text{mean}\{t_n\}, and µ̂_{t^2} = \text{mean}\{t_n^2\}.
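The system (7.10)-(7.11) translates directly into a short sketch (numpy assumed; the test data are illustrative, not the data of the following example):

import numpy as np

def fit_line(t, x):
    # Build and solve the 2x2 normal equations (7.10)-(7.11).
    A = np.array([[np.sum(t**2), np.sum(t)],
                  [np.sum(t),    len(t)  ]])
    B = np.array([np.sum(t*x), np.sum(x)])
    a, b = np.linalg.solve(A, B)
    return a, b

t = np.linspace(0, 1, 8)
x = 2.0*t + 3.0 + 0.1*np.random.default_rng(2).standard_normal(8)
print(fit_line(t, x))      # close to the true a = 2, b = 3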
Example 7.6. The random signal x (t), whose behavior is expected to be linear, is sampled at the
instants tn
[ x (t1 ), x (t2 ), . . . , x (t N )] T = [4.81, 3.13, 4.25, 4.04, 4.55, 4.76, 4.16, 3.03] T .
Find the linear regression model using the least squares approach. What is the prediction of the
signal value x (t) at t = 1.1?
⋆ The matrix form of the system of equations (7.10) and (7.11), used for the estimation of the model parameters a and b, is
A = \begin{bmatrix} 3.15 & 4.41 \\ 4.41 & 8.00 \end{bmatrix}, \quad B = \begin{bmatrix} 19.50 \\ 32.73 \end{bmatrix}, \text{ with}
\begin{bmatrix} â \\ b̂ \end{bmatrix} = \begin{bmatrix} 3.15 & 4.41 \\ 4.41 & 8.00 \end{bmatrix}^{-1}\begin{bmatrix} 19.50 \\ 32.73 \end{bmatrix} = \begin{bmatrix} 2.03 \\ 2.97 \end{bmatrix}.
The estimated linear regression model is
x(t_n) = 2.03t_n + 2.97.
For t_n = 1.1, we can predict x(1.1) = 5.2. The data and the results are shown in Fig. 7.9.
Figure 7.9 The data x (tn ) measured at tn for n = 1, 2, 3, 4, 5, 6, 7, 8 (dots) and the linear model, x (tn ) =
2.03tn + 2.97, obtained by the least squares approach (dotted line). The predicted signal value at tn = 1.1 is
marked by the circle.
The solution to the minimization problem is obtained from ∂J (a)/∂a T = 0 or −2T T (x − Tâ) = 0 as
â = (T T T)−1 T T x.
The regression analysis can be generalized to the cases with more than one independent variable.
These regression forms will also be considered in Section 7.1.6, after the RANSAC method is presented
in the next section.
When the mean value of the data and the spread measure (commonly the standard deviation) are known, we can define a criterion to identify the outliers in the data. The function used for this purpose is
z(n) = \frac{x(n) - µ̂_x(n)}{σ̂_x(n)},
and it is called the z-score. It is common to assume the threshold value T = 2.5 and to declare the signal samples with |z(n)| \le T as inliers and |z(n)| > T as outliers. The meaning of the threshold T = 2.5 will be explained later. Since the values of the average µ̂_x(n) and the sample standard deviation σ̂_x(n) can be significantly compromised by possible outliers, it is recommended to use the median and the corresponding MAD in the z-score.
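A minimal sketch of such a median/MAD based z-score, assuming numpy (the data values are illustrative):

import numpy as np

def robust_z(x):
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (1.4826 * mad)    # MAD rescaled to a std-like spread

x = np.array([5.1, 4.9, 5.3, 5.0, 4.8, 12.0, 5.2])
print(np.abs(robust_z(x)) > 2.5)         # flags only the value 12.0 as an outlier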
The random sample consensus (RANSAC) is used for linear regression when outliers in the data are expected. Consider a set of data x(t_n) sampled at random instants t_n and assume that the true data values fit a linear model. Since a large number of outliers is expected, the linear model can be far from most of the data samples. In the RANSAC approach we will:
1. Randomly select a small subset of the data samples, with indices in a set S.
2. The samples with indices in S are used to estimate the linear regression model parameters,
â\sum_{n \in S} t_n^2 + b̂\sum_{n \in S} t_n = \sum_{n \in S} t_n x(t_n)
â\sum_{n \in S} t_n + b̂S = \sum_{n \in S} x(t_n),
where S also denotes the number of the selected samples.
3. With the solution
[â\ b̂]^T = A_S^{-1}B_S,
the line
x = ât + b̂
is defined. The distances d_n of all data points (t_n, x(t_n)), n = 1, 2, \ldots, N, from this line are calculated,
d_n = \frac{|ât_n + b̂ - x(t_n)|}{\sqrt{1+â^2}}.
4. If a sufficient number of data points is such that their distance from the model line is lower than an assumed distance threshold d, then all these points are included into a new set of data, D, and the final estimation of the parameters a and b (for machine learning or prediction) is obtained with all data from D.
5. If there was no sufficient number of data points within the distance d, a new random small set of
data, i ∈ S, is taken and the procedure is repeated from Step 2.
6. The procedure is stopped when the desired number of data points within D is achieved or the
maximum number of trials is reached.
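The steps above can be summarized in a minimal sketch (numpy assumed; the function name, thresholds, and subset size are illustrative, and np.polyfit is used for the subset least squares):

import numpy as np

def ransac_line(t, x, n_subset=4, d=0.25, T=10, max_trials=100, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(max_trials):
        S = rng.choice(len(t), size=n_subset, replace=False)   # step 1: random subset
        a, b = np.polyfit(t[S], x[S], 1)                       # step 2: LS on the subset
        dist = np.abs(a*t + b - x) / np.sqrt(1 + a**2)         # step 3: distances
        D = dist < d                                           # step 4: consensus set
        if D.sum() >= T:
            return np.polyfit(t[D], x[D], 1)                   # final refit on all inliers
    return None                                                # step 6: trials exhausted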
Find the linear regression model parameters a and b for these data. Comment on the results. Next, apply the RANSAC approach as follows: Use S = 4 randomly chosen samples with indices n ∈ S = {8, 10, 18, 19}. Find the linear regression model parameters with this subset of data. How many data points are within the distance d = 0.25 from the obtained linear model line? If the number of the data points within these lines is below the assumed T = 10, choose another random set S. In that case use S = {5, 11, 16, 19} and repeat the procedure. If the number of data points between these lines is not below T = 10, use all the data points within these lines to find the linear regression model parameters.
⋆ For the given data set, the estimates of the linear regression model parameters a and b, obtained using all data points, are
\begin{bmatrix} â \\ b̂ \end{bmatrix} = \begin{bmatrix} 7.8107 & 11.1070 \\ 11.1070 & 20.0000 \end{bmatrix}^{-1}\begin{bmatrix} 44.3483 \\ 81.8400 \end{bmatrix} = \begin{bmatrix} -0.6707 \\ 4.4645 \end{bmatrix}.
The RANSAC approach is applied next. The parameters estimated with the random subset S = {8, 10, 18, 19}, with the linear model x(t_n) = -1.5245t_n + 4.6840, do not fit the data. This is confirmed by the fact that only D = 8 data points are within the lines at the distance d = 0.25, as can be seen in Fig. 7.10(b). Since the number of the data points within D is below T = 10, a new random set, S = {5, 11, 16, 19}, is used. With the data corresponding to this subset, the estimation is obtained
Figure 7.10 (a) The data x (tn ) measured at tn for n = 1, 2, . . . , 10 (dots) and the linear model obtained by the
least squares (dotted line). The RANSAC illustration: (b) The data (dots) and the linear model obtained by the least
squares using a random subset of 4 marked samples at S = {8, 10, 18, 19} (dotted line). (c) The data (dots) and the
linear model obtained by the least squares using another random subset of 4 marked samples at S = {5, 11, 16, 19}
(dotted line). (d) The data (dots) and the linear model obtained by the least squares using all data at marked samples
in D (dotted line).
from
\begin{bmatrix} â \\ b̂ \end{bmatrix} = \begin{bmatrix} 1.9992 & 2.6390 \\ 2.6390 & 4.0000 \end{bmatrix}^{-1}\begin{bmatrix} 11.2537 \\ 16.3400 \end{bmatrix} = \begin{bmatrix} 1.8342 \\ 2.8749 \end{bmatrix},
with the linear model x(t_n) = 1.8342t_n + 2.8749 that fits the data, since D = 16 data points are within the assumed distance. Since this number is above the threshold, T = 10, the algorithm is stopped and the linear regression model is re-estimated using all D = 16 data points from the set D, producing
\begin{bmatrix} â \\ b̂ \end{bmatrix} = \begin{bmatrix} 5.9802 & 8.9180 \\ 8.9180 & 16.0000 \end{bmatrix}^{-1}\begin{bmatrix} 37.7188 \\ 64.1400 \end{bmatrix} = \begin{bmatrix} 1.9502 \\ 2.9218 \end{bmatrix}
and the final estimated linear regression model x(t_n) = 1.9502t_n + 2.9218.
The probability that a subset of S = M data points is free of the I outliers in a data sequence with N samples is calculated in Example 7.10. This probability can be used to estimate the expected number of iterations in the RANSAC approach.
The regression can be generalized to cases with more than one independent variable (the multivariable case), when the considered sample, x(n), is a function of the random variables t_1(n), t_2(n), \ldots, t_M(n), that is, x(n) = x(t_1(n), t_2(n), \ldots, t_M(n)). The regression can be written in the form of a multidimensional linear model,
x(n) = a_1t_1(n) + a_2t_2(n) + \cdots + a_Mt_M(n) + ε(n),
where t_i(n), i = 1, 2, \ldots, M, are the independent variables for the sample x(n). This system of equations can be written in the matrix/vector form as
x = Ta + Ξ, where
x = \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(N) \end{bmatrix}, \quad T = \begin{bmatrix} t_1(1) & t_2(1) & \ldots & t_M(1) \\ t_1(2) & t_2(2) & \ldots & t_M(2) \\ \vdots & \vdots & & \vdots \\ t_1(N) & t_2(N) & \ldots & t_M(N) \end{bmatrix}, \quad a = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_M \end{bmatrix}, \quad Ξ = \begin{bmatrix} ε(1) \\ ε(2) \\ \vdots \\ ε(N) \end{bmatrix}.
The common goal is to minimize the squared error between the data, x, and the model, Ta,
J(a) = \|x - Ta\|_2^2.
The estimates of the regression parameters a = [a_1, a_2, \ldots, a_M]^T that minimize this cost function (the least squares solution) follow from \partial J(a)/\partial a^T = -2T^T(x - Ta) = 0, as
â = (T^T T)^{-1} T^T x = \text{pinv}\{T\}x, (7.12)
where \text{pinv}\{T\} = (T^T T)^{-1} T^T is the so-called pseudo-inverse of the matrix T.
The sensitivity of the reconstructed coefficients to the random variations in data x (n), caused
by the noise ε(n), highly depends on the condition number of the matrix T T T, whose inverse is to
be calculated. For a high condition number of this matrix, corresponding to a relatively small value
of the determinant det{T T T}, a small noise ε(n) in the input data causes very high variations of the
resulting parameters â = [ â1 , â2 , . . . , â M ] T in the model (ill-posed problem). In order to regularize the
inversion (and to limit possible extremely large elements in the inverse matrix) a small value λ is added
before the inversion, and the vector of parameters is calculated using
â = (T^T T + λI)^{-1} T^T x. (7.13)
It can easily be shown that this form of a is the solution to the minimization of the cost function
J (a) = ||x − Ta||22 , when the energy of the coefficients, a = [ a1 , a2 , . . . , a M ] T , constraint is added. The
energy constraint in the minimization keeps the energy of a = [ a1 , a2 , . . . , a M ] T as low as possible (in
order to avoid high values of its elements, due to the possible instability of the inversion of T T T). This
is the reason why the regression estimation is called shrinkage estimation as well. The constrained cost
function is of the form
J(a) = \|x - Ta\|_2^2 + λ\|a\|_2^2,
where \|a\|_2^2 is the squared L_2-norm of a. The minimum value of this cost function is obtained from
\frac{\partial J(a)}{\partial a^T} = -2T^T(x - Ta) + 2λa = 0.
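Both (7.12) and (7.13) translate directly into a few lines; a sketch assuming numpy (the matrix sizes, the true parameters, and the value of λ are illustrative):

import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((20, 3))                     # N = 20 observations, M = 3 variables
x = T @ np.array([1.0, -2.0, 0.5]) + 0.1*rng.standard_normal(20)

a_ls    = np.linalg.pinv(T) @ x                                # (7.12), pseudo-inverse
lam     = 0.1
a_ridge = np.linalg.solve(T.T @ T + lam*np.eye(3), T.T @ x)    # (7.13), regularized
print(a_ls, a_ridge)    # the regularized solution is slightly shrunk toward zero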
⋆ The estimation of the linear model parameters, obtained from the ridge regression (7.13), are
The LASSO minimization, with the same penalty factor λ = 0.01, would produce the estimation
enforcing as many zero-valued elements in a as possible. Due to this property (crucial in sparse
signal processing and compressive sensing), the LASSO minimization would be able to produce
the solution even in the case when the number of observations is smaller than the number of
elements in a. If we keep just the first N = 6 < M = 7 rows in T and x we would get
The regression model can also be polynomial in a single independent variable, t_n. This regression is still linear, since the linearity holds with respect to the model parameters a_1, a_2, \ldots, a_M. The independent variables matrix is given by
T = \begin{bmatrix} 1 & t_1 & t_1^2 & \ldots & t_1^M \\ 1 & t_2 & t_2^2 & \ldots & t_2^M \\ \vdots & & & & \vdots \\ 1 & t_N & t_N^2 & \ldots & t_N^M \end{bmatrix}, (7.14)
with all other vectors, results, and comments as in the previous multivariable case. Within the polynomial fitting framework, the regularization constraints on the solution prevent over-fitting of the model and keep the parameters low (see Problem 7.3).
Probability theory is a scientific discipline dealing with the analysis of random phenomena through a
set of axioms. The outcomes of a random event are determined by chance. Probability is a measure of
the likelihood of an event to occur.
For the calculation of the parameters of the first-order statistics, it is sufficient to know the
probability or the probability density function of a random variable, as its basic probabilistic description.
7.2.1 Probability
Assume that a random signal, x(n), may take one of the discrete values (amplitudes), ξ_i, from the set A = \{ξ_1, ξ_2, \ldots, ξ_N\}. Then, we deal with probabilities that the random signal, x(n), at an instant n takes a specific value ξ_i from the set of all possible values,
P_{x(n)}(ξ_i) = \text{Probability}\{x(n) = ξ_i\}.
The probability function Px(n) (ξ ) satisfies the following conditions (axioms of probability theory):
(1) 0 ≤ Px(n) (ξ ) ≤ 1 for any ξ.
(2) For the events x(n) = ξ_i and x(n) = ξ_j, i ≠ j, which exclude each other,
\text{Probability}\{x(n) = ξ_i \text{ or } x(n) = ξ_j\} = P_{x(n)}(ξ_i) + P_{x(n)}(ξ_j).
(3) The sum of the probabilities that x(n) takes any value ξ_i from the set A of all possible values is the probability of a certain event and is equal to 1, that is,
\sum_{ξ \in A} P_{x(n)}(ξ) = 1.
Independence. Two events (random signal samples) are independent of each other if the probability that one event occurs (one signal sample takes a specific value) does not affect the probability of the other event occurring (does not affect the value of the other signal sample). If the signal samples x(n) and x(m) are statistically independent random variables, then
\text{Probability}\{x(n) = ξ_i \text{ and } x(m) = ξ_j\} = P_{x(n)}(ξ_i)P_{x(m)}(ξ_j).
Exclusiveness (disjointness). Two random events (random signal sample values) are mutually exclusive or disjoint if they cannot both occur at the same time. If the signal sample values x(n) = ξ_i and x(n) = ξ_j are mutually exclusive events, then (Property (2))
\text{Probability}\{x(n) = ξ_i \text{ or } x(n) = ξ_j\} = P_{x(n)}(ξ_i) + P_{x(n)}(ξ_j).
Example 7.9. Consider a random signal whose values are equal to the numbers appearing in a die tossing. The set of possible signal values is ξ_i ∈ A = \{1, 2, 3, 4, 5, 6\}. Find the probability that the signal sample takes the value x(n) = 2 or the value x(n) = 5. Find the probability that the signal sample at an instant n takes the value x(n) = 2 and that in the next tossing the signal takes the value x(n+1) = 5.
⋆ The events x(n) = 2 and x(n) = 5 are obviously mutually exclusive. Thus, the probability of two mutually exclusive events is equal to the sum of their individual probabilities,
\text{Probability}\{x(n) = 2 \text{ or } x(n) = 5\} = P_{x(n)}(2) + P_{x(n)}(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}.
The events x(n) = 2 and x(n+1) = 5 are statistically independent. In this case,
\text{Probability}\{x(n) = 2 \text{ and } x(n+1) = 5\} = P_{x(n)}(2)P_{x(n+1)}(5) = \frac{1}{6}\cdot\frac{1}{6} = \frac{1}{36}.
Conditional probability. Conditional probability is the probability that an event A occurs, given that another event B has already occurred. The conditional probability of A, given B, is written in the form P(A|B). The probability that both events A and B occur is
P(A \text{ and } B) = P(A|B)P(B),
where P(B) is the probability that the event B has occurred, while P(A|B) denotes the probability that the event A occurs subject to the condition that the event B has already occurred.
Example 7.10. Assume that the length of random signal x (n) is N and that the number of samples
disturbed by an extremely high noise is I. The observation set of signal samples is taken as a
subset of M < N randomly positioned signal samples. What is the probability that within M
randomly selected signal samples there are no samples affected by the high noise? If N = 128,
I = 16, and M = 32 find how many sets of M samples without any sample corrupted by the high
noise can be expected in 1000 realizations (trials).
⋆ The probability that the first randomly chosen sample is not affected by the high noise can be calculated as an a priori probability,
P(1) = P(B) = \frac{N-I}{N},
since there are N samples in total and N-I of them are noise-free. After the first noise-free sample is chosen, in the remaining N-1 signal samples there are N-1-I noise-free samples. The probability of choosing a noise-free sample is now P(A|B) = (N-1-I)/(N-1). The probability that the second randomly chosen sample is not affected by the high noise, given that the first randomly chosen sample is not affected, is equal to the product of the probabilities,
P(2) = P(A) = P(A|B)P(B) = \frac{N-1-I}{N-1}\cdot\frac{N-I}{N}.
Here we used the conditional probability property.
Then, we continue the process of random sample selection. In the same way we can calculate the probability that all of the M randomly chosen samples are not affected by the high noise as
P(M) = \frac{N-I}{N}\cdot\frac{N-1-I}{N-1}\cdots\frac{N-(M-1)-I}{N-(M-1)} = \prod_{i=0}^{M-1}\frac{N-I-i}{N-i}.
For N = 128 signal samples, with I = 16 samples affected by the high noise, the probability that M = 32 randomly selected samples are noise-free is equal to P(32) = 0.0071. If we repeat the whole procedure 1000 times, by selecting M = 32 samples, we can expect
1000 \cdot P(32) = 1000 \cdot 0.0071 \approx 7,
that is, about 7 realizations where none of the M signal samples is disturbed by the high noise. One high noise-free realization is expected in about 140 realizations.
In the literature, it is common to use the following calculation for the expected number of iterations needed to get a high-noise-free realization. The probability that one randomly selected sample is high-noise-free (an inlier) is (N-I)/N. It is then assumed that this probability can be used for all M samples (that is, that a sample is returned and may be chosen again). The probability that there is at least one high-noise sample among the M samples is then 1 - ((N-I)/N)^M. Finally, the probability of a high-noise-free realization in N_{it} such trials is
P = 1 - \left[1 - \left(\frac{N-I}{N}\right)^M\right]^{N_{it}}, \quad \text{which gives} \quad N_{it} = \frac{\ln(1-P)}{\ln\left(1 - \left(\frac{N-I}{N}\right)^M\right)}.
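Both calculations of this example can be reproduced with a few lines (numpy assumed; the target probability P = 0.99 in the last step is an illustrative choice):

import numpy as np

N, I, M = 128, 16, 32
i = np.arange(M)
P = np.prod((N - I - i) / (N - i))          # exact P(M), ~0.0071
print(P, 1000*P)                            # about 7 noise-free draws per 1000 trials

P_target = 0.99
Nit = np.log(1 - P_target) / np.log(1 - ((N - I)/N)**M)   # approximate iteration count
print(Nit)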
Bayes' theorem. Consider two events A_i and B. From the probability that both of these events happen, P(A_i|B)P(B) = P(B|A_i)P(A_i), it follows that
P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B)}.
Assume also that the events A_i, i = 1, 2, \ldots, N, are mutually exclusive and exhaustive. It means that two events A_i and A_j, i ≠ j, cannot happen at the same time (exclusiveness) and that one of the events A_i, i = 1, 2, \ldots, N, must happen (exhaustiveness). In that case, we may write
P(A_i|B) = \frac{P(B|A_i)P(A_i)}{P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_N)P(A_N)}.
Since the Bayesian approach is of great importance in modern data processing we will comment
on the terms in more detail.
• The event B is assumed (the evidence has happened). The probability P( B) shows how probable
is the assumed event (evidence) under all possible events (hypotheses) Ai . It can be written in
the form P( B) = P( B| A1 ) P( A1 ) + P( B| A2 ) P( A2 ) + · · · + P( B| A N ) P( A N ).
• The specific event Ai is called the hypothesis, assuming that the event B has occurred. The
probability P( Ai ) shows how probable is the event Ai (hypothesis), independent of the event B
occurrence.
• Probability P( B| Ai ) indicates how probable is the event B (evidence), given that the event Ai
(hypothesis) is true.
• The result P( Ai | B) is the probability of the event Ai (how the hypothesis Ai is probable) given
the fact that the evidence B occurred.
Example 7.11. Consider four images, denoted by A1 , A2 , A3 , and A4 . In two images (A1 and A2 )
there are 20% of red pixels, in the third image (A3 ) there are 30% of red pixels, while in the
fourth image (A4 ) there are 50% of red pixels. One image is chosen randomly and one of its
pixels is observed. The chosen pixel is red (evidence B). What is the probability that the image
A4 was chosen?
⋆The probability that the image A4 was chosen (hypothesis A4 ) when the red pixel is obtained
as the evidence (denoted by B) is equal to
P ( B | A4 ) P ( A4 )
P ( A4 | B ) = ,
P( B| A1 ) P( A1 )+P( B| A2 ) P( A2 )+P( B| A3 ) P( A3 )+P( B| A4 ) P( A4 )
where:
• Each image is chosen with the same probability, P(A_i) = 1/4, and the probabilities of a red pixel within the images are P(B|A_1) = P(B|A_2) = 0.2, P(B|A_3) = 0.3, and P(B|A_4) = 0.5.
• The probability of the red pixel P(B) being obtained, under all possible events (hypotheses) A_i, i = 1, 2, 3, 4, is P(B) = (0.2 + 0.2 + 0.3 + 0.5)/4 = 0.3, so that P(A_4|B) = (0.5 \cdot 0.25)/0.3 = 5/12 \approx 0.42.
Example 7.12. A system used for virus testing has reported its expected reliability. The probability of a correct result, positive or negative, for a tested person is very high and equal to 0.978. The probability of a false-positive result (the tested person does not have the specified virus, but the test is positive) is P_{F+} = 0.018. The probability of a false-negative result (the tested person does have the specified virus, but the test is negative) is P_{F-} = 0.004. What is the expected number of positive results in 1000 randomly tested persons from a country if the expected rate of people contaminated by the virus is: (a) 1 per 1000 people (p = P(V) = 10^{-3}); (b) 1 per 10,000 people (p = P(V) = 10^{-4}); (c) 1.75 per 100 people (p = P(V) = 0.0175); and (d) 25 per 100 of the selected population set for testing (formed by a prior evaluation of other symptoms)?
A randomly selected person is tested for the virus and the result is positive. Find the probability that this person is contaminated by the virus in all four previous cases.
⋆ The probability of a positive result is equal to the sum of the probability that the tested person does not have the virus and that the result is positive, (1 - P(V))P_{F+}, and the probability that the person does have the virus and the result is positive, P(V)(1 - P_{F-}). This means that the test of a randomly selected person is positive with the probability
P(+) = (1 - P(V))P_{F+} + P(V)(1 - P_{F-}).
For the given expected number of the virus contaminated people, p = P(V ), we get: (a)
P(+) = 0.019, meaning that there will be 19 positive results in 1000 randomly tested persons,
although the expected rate is 1 per 1000, in this case. Therefore, most of the test results are
false-positive. (b) P(+) = 0.0181, confirming that the false-positive dominates the test again.
(c) In this case, P(+) = 0.035, meaning that half of the positive tested are indeed the people
contaminated with the virus, since this probability, with an ideal test, would indicate that there are
3.5 contaminated people per 100. (d) For the selected symptomatic set of people, with a relatively
high probability of the virus occurrence, we get P(+) = 0.2625, meaning that the agreement
with the expected number of the contaminated people in this set is high.
The conclusion is that a random screening of the population would not produce a satisfactory
result if the occurrence rate of the virus (disease) is low and the test is not an ideal one with the
zero false-positive and false-negative rates. Testing should be done on a preselected set of people.
This conclusion will be even more obvious from the next Bayes’ analysis.
When a randomly selected person is tested and the result is positive, then the a posteriori probability of the event that the person is virus contaminated, given the positive test, P(V|+), is
P(V|+) = \frac{P(+|V)P(V)}{P(+)},
where P(+|V ) is the probability of the positive test, given that the person is virus contaminated,
p = P(V ) is the probability that a random person has the virus, and P(+) is the probability that
the test is positive, including both cases that the person does and does not have the virus.
For the four considered cases, the values of the probability P(V|+) are: (a) P(V|+) = 0.0525, (b) P(V|+) = 0.0055, and (c) P(V|+) = 0.4964. (d) For the selected set of people, with a significant probability of having the virus, we get P(V|+) = 0.9486, meaning a high reliability of the test results (if the test is positive, the person is contaminated).
These results confirm the conclusion that the random test should not be done, unless the
probability of the virus (disease) in the tested people is increased using other symptoms, meaning
that the set of the tested people will contain the virus (disease) with a significant probability.
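All four cases can be reproduced with a few lines of plain Python:

# P(+) and P(V|+) for the four contamination rates of this example.
PFp, PFn = 0.018, 0.004
for p in (1e-3, 1e-4, 0.0175, 0.25):
    P_pos = (1 - p)*PFp + p*(1 - PFn)        # probability of a positive test
    P_virus_given_pos = p*(1 - PFn)/P_pos    # Bayes' rule
    print(p, round(P_pos, 4), round(P_virus_given_pos, 4))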
In the previous example, we assumed that the probability p = P(V) is known. Commonly, it is not known and should be estimated based on the posterior evidence that k out of N tests were positive, with the given testing system. This problem will be considered in Section 7.4.3.
The mean (average) value is calculated over a set of available samples, resulting from an experiment
and it is also of random nature. If the probabilistic description of a random signal (variable) is known,
then we can predict the mean (average) value of this signal without using its specific random realization
or performing experiments with random trials. This analytically obtained value is called the expected
value and it represents the true value of the mean that would be obtained with a large number of
experiments. The expected value is deterministic.
Expected value. The expected value of the signal sample x(n) is calculated as a sum over the set of possible amplitudes, ξ ∈ A = \{ξ_1, ξ_2, \ldots, ξ_N\}, weighted by the corresponding probabilities,
µ_{x(n)} = E\{x(n)\} = \sum_{i=1}^{N} ξ_i P_{x(n)}(ξ_i).
Variance. The variance of a random signal sample x(n), which takes the values ξ from the discrete set A = \{ξ_1, ξ_2, \ldots, ξ_N\} with the known probabilities P_{x(n)}(ξ_i), is defined as
σ_{x(n)}^2 = E\{|x(n) - µ_{x(n)}|^2\} = \sum_{i=1}^{N} (ξ_i - µ_{x(n)})^2 P_{x(n)}(ξ_i).
Example 7.13. A random signal x(n) can take values from the set ξ ∈ A = \{0, 1, 2, 3, 4, 5\}. It is known that for k = 0, 1, 2, 3, 4 the probability of x(n) = k is twice as high as the probability of x(n) = k+1. Find the probabilities P_{x(n)}(ξ_k) = P\{x(n) = k\}. Find the expected value and the variance of this random signal.
⋆ Assume that P\{x(n) = 5\} = A. Then the probabilities that x(n) takes a value k are
ξ_k = k:                          0     1     2    3    4    5
P_{x(n)}(ξ_k) = P\{x(n) = k\}:  32A   16A   8A   4A   2A    A
Since the probabilities must sum to one, 63A = 1, that is, A = 1/63.
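The remaining numbers of the example follow directly; a sketch assuming numpy:

import numpy as np

# P{x(n) = k} = 2^(5-k) * A with A = 1/63, from the normalization of probabilities.
k = np.arange(6)
P = 2.0**(5 - k)
P /= P.sum()                     # A = 1/63
mu  = np.sum(k*P)                # expected value: 57/63 = 19/21 ~ 0.905
var = np.sum((k - mu)**2 * P)    # variance: 626/441 ~ 1.42
print(P[5], mu, var)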
If a random signal can take continuous amplitude values, then we cannot define the probability that one exact signal amplitude value ξ is taken by the signal sample x(n). In this case, the probability density function p_{x(n)}(ξ) should be used. This function defines the probability that the nth signal sample x(n) takes a value within an infinitesimally small interval dξ around ξ,
\text{Probability}\{ξ < x(n) \le ξ + dξ\} = p_{x(n)}(ξ)dξ,
with
\int_{-\infty}^{\infty} p_{x(n)}(ξ)dξ = 1.
The probability of an event that the signal x(n) value is within a < x(n) \le b is
\text{Probability}\{a < x(n) \le b\} = \int_{a}^{b} p_{x(n)}(ξ)dξ.
Cumulative probability distribution. The distribution F_x(χ) is defined as the probability that the signal x(n) value is lower than or equal to χ,
F_x(χ) = \text{Probability}\{x(n) \le χ\} = \int_{-\infty}^{χ} p_{x(n)}(ξ)dξ.
Obviously, \lim_{χ\to-\infty} F_x(χ) = 0, \lim_{χ\to+\infty} F_x(χ) = 1,
\text{Probability}\{a < x(n) \le b\} = \int_{a}^{b} p_{x(n)}(ξ)dξ = F_x(b) - F_x(a),
and p_{x(n)}(χ) = dF_x(χ)/dχ.
For the case of random signals whose samples take continuous amplitude values, the variance is defined by
σ_x^2(n) = \int_{-\infty}^{\infty}\left(ξ - µ_{x(n)}\right)^2 p_{x(n)}(ξ)dξ,
where p x(n) (ξ ) is the probability density function.
Example 7.14. Consider a real-valued random signal x (n) with samples whose values are uniformly
distributed over the interval −1 ≤ x (n) < 1.
(a) Find the expected value and the variance of the signal x (n) samples.
(b) The signal y(n) is obtained as y(n) = x2 (n). Find the expected value and the variance
of signal y(n).
⋆ Since the random signal x(n) is uniformly distributed within the interval -1 \le x(n) < 1, its probability density function is of the form
p_{x(n)}(ξ) = \begin{cases} A, & \text{for } -1 \le ξ < 1 \\ 0, & \text{elsewhere.} \end{cases}
The value of the constant A, A = 1/2, is obtained from \int_{-\infty}^{\infty} p_{x(n)}(ξ)dξ = 1. The expected value and the variance are given by
µ_x(n) = \int_{-\infty}^{\infty} ξ\, p_{x(n)}(ξ)dξ = \frac{1}{2}\int_{-1}^{1} ξ\, dξ = 0, \quad σ_x^2(n) = \int_{-\infty}^{\infty}(ξ-µ_x(n))^2 p_{x(n)}(ξ)dξ = \frac{1}{2}\int_{-1}^{1} ξ^2 dξ = \frac{1}{3}.
The probability that the amplitude of the signal y(n) is not larger than an assumed χ is, by definition, the probability distribution of y(n). Its form is
F_y(χ) = P\{y(n) \le χ\} = P\{x^2(n) \le χ\} = P\{-\sqrt{χ} \le x(n) \le \sqrt{χ}\} = \begin{cases} 0, & \text{for } χ \le 0 \\ \int_{-\sqrt{χ}}^{\sqrt{χ}} p_{x(n)}(ξ)dξ = \sqrt{χ}, & \text{for } 0 < χ \le 1 \\ 1, & \text{for } χ > 1, \end{cases}
Figure 7.11 Illustration of the probability distribution Fy (χ) calculation for y(n) = x2 (n), when −1 ≤ x (n) ≤ 1.
The probability density function is obtained as the derivative of the probability distribution F_y(χ), that is,
p_{y(n)}(ξ) = \frac{dF_y(ξ)}{dξ} = \begin{cases} \frac{1}{2\sqrt{ξ}}, & \text{for } 0 < ξ \le 1 \\ 0, & \text{otherwise.} \end{cases}
The expected value and the variance of the signal y(n) are
µ_y(n) = µ_y = \int_{0}^{1} ξ\frac{1}{2\sqrt{ξ}}dξ = \frac{1}{3}, \quad σ_y^2(n) = σ_y^2 = \int_{0}^{1}\left(ξ-\frac{1}{3}\right)^2\frac{1}{2\sqrt{ξ}}dξ = \frac{4}{45}.
Example 7.15. Find the probability density function of y(n) for an arbitrary monotonous function
y(n) = f ( x (n)), with inverse f −1 (y(n)) = x (n), if the probability density function of x (n) is
p x ( n ) ( ξ ).
What is the form of py(n) (ξ ) if x (n) is a random variable with the uniform probability
density function, within the interval [−π/2, π/2) and y(n) = tan( x (n)), with the inverse
function x (n) = arctan(y(n))?
⋆ For a monotonously increasing function f,
F_y(χ) = P\{y(n) \le χ\} = P\{f(x(n)) \le χ\} = P\{x(n) \le f^{-1}(χ)\} = \int_{-\infty}^{f^{-1}(χ)} p_{x(n)}(ξ)dξ,
or
F_y(χ) = F_x(f^{-1}(χ)).
The probability density function is
p_{y(n)}(ξ) = \frac{dF_y(ξ)}{dξ} = \frac{dF_x(f^{-1}(ξ))}{dξ} = p_{x(n)}(f^{-1}(ξ))\frac{df^{-1}(ξ)}{dξ}.
This relation can also be obtained from the fact that the probability contained in a differential area must be invariant under the change of variables, that is, p_{y(n)}(ζ)|dζ| = p_{x(n)}(ξ)|dξ|.
For the random variable x(n), with the uniform probability density function within the interval [-π/2, π/2), the random variable y(n) = \tan(x(n)) is distributed from -∞ to ∞. Its probability density function is
p_{y(n)}(ξ) = p_{x(n)}(\arctan(ξ))\frac{d(\arctan(ξ))}{dξ} = \frac{1}{π}\frac{1}{1+ξ^2},
which is the Cauchy probability density function.
As an introduction to the second-order statistics (that will be considered in the next section), consider two signals x(n) and y(n), with continuous amplitude values. The probability that the nth signal sample x(n) takes a value within ξ \le x(n) < ξ + dξ and that y(m) takes a value within ζ \le y(m) < ζ + dζ is
p_{x(n),y(m)}(ξ, ζ)dξdζ,
where p_{x(n),y(m)}(ξ, ζ) is the joint probability density function. The probability of an event a \le x(n) < b and c \le y(m) < d is
\text{Probability}\{a \le x(n) < b, c \le y(m) < d\} = \int_{a}^{b}\int_{c}^{d} p_{x(n),y(m)}(ξ, ζ)dζdξ.
For mutually independent signals p x(n),y(m) (ξ, ζ ) = p x(n) (ξ ) py(m) (ζ ). A special case of the previous
relations is obtained when y(m) = x (m).
Example 7.16. The signal x (n) is defined by x (n) = a(n) + b(n) + c(n), where a(n), b(n), and
c(n) are mutually independent random signals with the uniform probability density function over
the interval [−1, 1). Find the probability density function of the signal x (n), its mean µ x , and the
variance σx2 .
⋆Consider a sum of two independent random signals s(n) = a(n) + b(n). The probability that s(n) = a(n) + b(n) ≤ χ can be calculated from the joint probability distribution of a(n) and b(n) as

F_s(\chi) = P\{s(n) \le \chi\} = \text{Probability}\{-\infty < a(n) < \infty \text{ and } -\infty < a(n) + b(n) \le \chi\}
= \int_{-\infty}^{\infty}\int_{-\infty}^{\chi-\zeta} p_{a(n),b(n)}(\xi, \zeta)\,d\xi\,d\zeta = \int_{-\infty}^{\infty} p_{b(n)}(\zeta)\int_{-\infty}^{\chi-\zeta} p_{a(n)}(\xi)\,d\xi\,d\zeta.
Now, we can calculate the probability density function of s(n) as a derivative of F_s(\chi), that is,

p_{s(n)}(\chi) = \frac{dF_s(\chi)}{d\chi} = \frac{d}{d\chi}\int_{-\infty}^{\infty} p_{b(n)}(\zeta)\int_{-\infty}^{\chi-\zeta} p_{a(n)}(\xi)\,d\xi\,d\zeta
= \int_{-\infty}^{\infty} p_{b(n)}(\zeta)\,p_{a(n)}(\chi - \zeta)\,d\zeta = p_{b(n)}(\chi) \ast_\chi p_{a(n)}(\chi),
meaning that the probability density function of the sum of two independent random variables is the convolution of the individual probability density functions. In a similar way, we can include the third signal and obtain

p_{x(n)}(\chi) = p_{c(n)}(\chi) \ast_\chi p_{b(n)}(\chi) \ast_\chi p_{a(n)}(\chi),

which results in

p_{x(n)}(\chi) = \begin{cases} \dfrac{(\chi+3)^2}{16} & \text{for } -3 \le \chi \le -1 \\[4pt] \dfrac{3-\chi^2}{8} & \text{for } -1 < \chi \le 1 \\[4pt] \dfrac{(\chi-3)^2}{16} & \text{for } 1 < \chi \le 3 \\[4pt] 0 & \text{for } |\chi| > 3. \end{cases}
The mean value and the variance can be calculated using p_{x(n)}(\chi), or in a direct way using the linearity property, as

\mu_x = \mu_a + \mu_b + \mu_c = 0, \qquad \sigma_x^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2 = 3\cdot\frac{1}{3} = 1,

since the signals are independent and each is uniform over [−1, 1), with zero mean and variance 1/3.
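As a numerical cross-check of the convolution result, a minimal Python sketch (NumPy assumed; names are illustrative) can compare a histogram of x(n) = a(n) + b(n) + c(n) with the derived piecewise density and with µ_x = 0, σ_x² = 1:

    import numpy as np

    rng = np.random.default_rng(0)
    a, b, c = rng.uniform(-1, 1, size=(3, 1_000_000))  # three independent uniform signals
    x = a + b + c                                      # x(n) = a(n) + b(n) + c(n)

    print(x.mean(), x.var())                           # approx 0 and 1

    hist, edges = np.histogram(x, bins=120, range=(-3, 3), density=True)
    chi = (edges[:-1] + edges[1:]) / 2
    analytic = np.where(np.abs(chi) <= 1, (3 - chi**2) / 8, (3 - np.abs(chi))**2 / 16)
    print(np.max(np.abs(hist - analytic)))             # close to 0 for large sample sizes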
Example 7.17. Consider two independent random signals x(n) and y(n), with probability density functions p_{x(n)}(\xi) and p_{y(n)}(\xi). A new random signal is defined in such a way that it takes the lower value of the signals x(n) and y(n) at every instant n,

z(n) = \min\{x(n), y(n)\}.

Find the probability distribution and the probability density function of the random signal z(n). What is the probability density function of z(n) if

p_{x(n)}(\xi) = \frac{1}{\beta_x}\, e^{-\xi/\beta_x}\, u(\xi) \quad \text{and} \quad p_{y(n)}(\xi) = \frac{1}{\beta_y}\, e^{-\xi/\beta_y}\, u(\xi)?
⋆ Since the random signal z(n) takes the lower of the values x(n) and y(n), the probability that z(n) = min{x(n), y(n)} is lower than or equal to an assumed χ is equal to the probability that at least one of the random samples x(n) and y(n) is below this assumed χ, that is,

F_z(\chi) = P\{z(n) \le \chi\} = 1 - P\{x(n) > \chi\}\,P\{y(n) > \chi\}.

Since

P\{x(n) > \chi\} = 1 - F_x(\chi) \quad \text{and} \quad P\{y(n) > \chi\} = 1 - F_y(\chi),

we get the probability distribution of the random variable z(n) in the form

F_z(\chi) = F_x(\chi) + F_y(\chi) - F_x(\chi)F_y(\chi).
The probability density function follows as the derivative of the probability distribution,

p_{z(n)}(\xi) = \frac{dF_z(\xi)}{d\xi} = p_{x(n)}(\xi) + p_{y(n)}(\xi) - p_{x(n)}(\xi)F_y(\xi) - F_x(\xi)p_{y(n)}(\xi)
= p_{x(n)}(\xi)\,(1 - F_y(\xi)) + p_{y(n)}(\xi)\,(1 - F_x(\xi)).

For the given exponential distributions, since F_x(\xi) = (1 - e^{-\xi/\beta_x})u(\xi) and F_y(\xi) = (1 - e^{-\xi/\beta_y})u(\xi), this reduces to

p_{z(n)}(\xi) = \frac{1}{\beta_z}\, e^{-\xi/\beta_z}\, u(\xi), \quad \text{with} \quad \frac{1}{\beta_z} = \frac{1}{\beta_x} + \frac{1}{\beta_y}.
See also Problem 7.8 and its solution.
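The result for the minimum of two exponentially distributed signals can also be checked numerically (a minimal Python sketch, NumPy assumed; the β values are arbitrary examples):

    import numpy as np

    rng = np.random.default_rng(0)
    beta_x, beta_y = 2.0, 3.0
    x = rng.exponential(beta_x, size=1_000_000)
    y = rng.exponential(beta_y, size=1_000_000)
    z = np.minimum(x, y)                     # z(n) = min{x(n), y(n)}

    beta_z = 1 / (1 / beta_x + 1 / beta_y)   # combined rate, as derived above
    print(z.mean(), beta_z)                  # both approx 1.2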
The correlations and covariances, as the most important parameters of the second-order statistics, will be analyzed in this section and related to the spectral power density of random signals.
For a real-valued random signal with continuous amplitudes of its samples and the second-order probability density function p_{x(n),x(m)}(\xi_1, \xi_2), the autocorrelation is

r_{xx}(n, m) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \xi_1 \xi_2\, p_{x(n),x(m)}(\xi_1, \xi_2)\,d\xi_1\,d\xi_2. (7.21)
If the real-valued random variables x(n) and x(m) are statistically independent, then

p_{x(n),x(m)}(\xi_1, \xi_2) = p_{x(n)}(\xi_1)\,p_{x(m)}(\xi_2)

and

r_{xx}(n, m) = \mu_x(n)\,\mu_x(m).
For a signal {x_i(n)}, n = 1, 2, . . . , N, with i = 1, 2, . . . , M being the index of realization of this signal, the autocorrelation function is estimated by

\hat{r}_{xx}(n, m) = \frac{1}{M}\sum_{i=1}^{M} x_i(n)\,x_i^*(m). (7.22)

In matrix notation, with the realization vectors

\mathbf{x}_i = [x_i(1), x_i(2), \ldots, x_i(N)]^T,

the autocorrelation matrix is estimated as \hat{R}_x = \frac{1}{M}\sum_{i=1}^{M} \mathbf{x}_i \mathbf{x}_i^H. Diagonal elements in the covariance matrix, C_x, are the variances \sigma_x^2(n).
The cross-correlation and the cross-covariance of two signals x(n) and y(n) are defined as

r_{xy}(n, m) = E\{x(n)\,y^*(m)\}

and

c_{xy}(n, m) = E\{(x(n) - \mu_x(n))\,(y(m) - \mu_y(m))^*\} = r_{xy}(n, m) - \mu_x(n)\,\mu_y^*(m). (7.25)
For the signals whose samples are available, the autocovariance is estimated using the following relation,

\hat{c}_{xx}(n, m) = \frac{1}{M}\sum_{i=1}^{M} (x_i(n) - \hat{\mu}_x(n))\,(x_i(m) - \hat{\mu}_x(m))^*.
The covariance matrix can be written in the form

\hat{C}_x = \text{Cov}(\mathbf{x}) = \frac{1}{M}\sum_{i=1}^{M} (\mathbf{x}_i - \hat{\boldsymbol{\mu}}_x)(\mathbf{x}_i - \hat{\boldsymbol{\mu}}_x)^H = \hat{R}_x - \hat{\boldsymbol{\mu}}_x \hat{\boldsymbol{\mu}}_x^H.
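The estimator (7.22) and the covariance matrix relation can be written compactly for a set of realizations stored as matrix rows. A minimal Python sketch (NumPy assumed; the synthetic data set is illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 10_000, 8                          # M realizations of an N-sample signal
    X = rng.standard_normal((M, N)) + 0.5     # rows are realizations x_i(n), mean 0.5

    mu_hat = X.mean(axis=0)                   # sample mean vector
    R_hat = (X.T @ X.conj()) / M              # element (n, m): mean of x_i(n) x_i*(m)
    C_hat = R_hat - np.outer(mu_hat, mu_hat.conj())   # C = R - mu mu^H

    print(np.diag(C_hat.real))                # approx the variances sigma_x^2(n) = 1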
Example 7.18. Consider the nth set of experiments, where the independent variable in the experiment assumes N random values t_1(n), t_2(n), . . . , t_N(n) and the result of the experiment takes the random values x_1(n), x_2(n), . . . , x_N(n). If the linear model x_i(n) = a t_i(n) + b is assumed, show that the solution for the parameter a in this linear regression problem can be written as \hat{a} = \hat{c}_{tx}(n, n)/\hat{c}_{tt}(n, n), where

\hat{c}_{tx}(n, n) = \hat{r}_{tx}(n, n) - \hat{\mu}_t(n)\hat{\mu}_x(n) = \frac{1}{N}\sum_{i=1}^{N} t_i(n)x_i(n) - \frac{1}{N}\sum_{i=1}^{N} t_i(n)\,\frac{1}{N}\sum_{i=1}^{N} x_i(n),

\hat{c}_{tt}(n, n) = \hat{r}_{tt}(n, n) - \hat{\mu}_t^2(n) = \frac{1}{N}\sum_{i=1}^{N} t_i^2(n) - \Big(\frac{1}{N}\sum_{i=1}^{N} t_i(n)\Big)^2.
⋆ The solution to the linear regression model is obtained from the system (see Section 7.1.4)

\frac{1}{N}\sum_{i=1}^{N} t_i(n)\big(x_i(n) - a t_i(n) - b\big) = 0,

\frac{1}{N}\sum_{i=1}^{N} \big(x_i(n) - a t_i(n) - b\big) = 0,
whose second equation gives b = \hat{\mu}_x(n) - a\hat{\mu}_t(n). Substituting this into the first equation results in

a\,(\hat{r}_{tt}(n, n) - \hat{\mu}_t^2(n)) = \hat{r}_{tx}(n, n) - \hat{\mu}_t(n)\hat{\mu}_x(n),

producing the estimate of parameter a in the form \hat{a} = \hat{c}_{tx}(n, n)/\hat{c}_{tt}(n, n).
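The estimate \hat{a} = \hat{c}_{tx}/\hat{c}_{tt} is straightforward to apply. A minimal Python sketch (NumPy assumed; the model values a = 2.5 and b = 1.0 are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 1000
    t = rng.uniform(0, 10, N)                     # independent variable t_i(n)
    x = 2.5 * t + 1.0 + rng.standard_normal(N)    # x_i(n) = a t_i(n) + b + noise

    c_tx = np.mean(t * x) - t.mean() * x.mean()   # cross-covariance estimate
    c_tt = np.mean(t**2) - t.mean()**2            # variance estimate of t
    a_hat = c_tx / c_tt
    b_hat = x.mean() - a_hat * t.mean()
    print(a_hat, b_hat)                           # approx 2.5 and 1.0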
Signals whose first-order and second-order statistics are invariant to a shift in time are called wide sense stationary (WSS) signals. For WSS signals,

\mu_x(n) = E\{x(n)\} = \mu_x,
r_{xx}(n, m) = E\{x(n)\,x^*(m)\} = r_{xx}(n - m). (7.26)

A signal is stationary in the strict sense (SSS) if the statistics of all orders are invariant to a shift in time. For a real-valued WSS signal, the property r_{xx}(0) \ge r_{xx}(n - m) follows from

\frac{1}{M}\sum_{i=1}^{M} (x_i(n) - x_i(m))^2 \ge 0.

A random signal is periodic, in the wide sense, with a period N if

E\{x(n)\} = E\{x(n + N)\} \quad \text{and} \quad E\{x(n + N)\,x^*(m)\} = r_{xx}(N + n, m) = r_{xx}(n, m).
The relations introduced for the second-order statistics may be extended to the higher-order statistics. For example, the third-order moment of a signal x(n) is defined by

r_{xxx}(n; m, l) = E\{x(n)\,x(n + m)\,x(n + l)\}.

In order to calculate the third-order moment, we should know the third-order statistics, like the third-order probability distribution P_{x(n),x(m),x(l)}(\xi_1, \xi_2, \xi_3) or the corresponding probability density function.
For m = l = 0, the third-order moment of the real-valued random variable x(n) is

M_3 = E\{x^3(n)\} = \int_{-\infty}^{\infty} \xi^3 p_{x(n)}(\xi)\,d\xi.
For an ergodic signal, the ensemble average can be replaced by the time average over a single realization,

\mu_x(n) = \lim_{M\to\infty} \frac{1}{M}\big(x_1(n) + \cdots + x_M(n)\big) = \lim_{N\to\infty} \frac{1}{N}\big(x_i(n) + \cdots + x_i(n - N + 1)\big).
The characteristic function of a random variable x(n) is defined as the expected value of the random variable y(n) = e^{j\theta x(n)}, that is,

\Phi_x(\theta) = E\{e^{j\theta x(n)}\} = \int_{-\infty}^{\infty} p_{x(n)}(\xi)\,e^{j\theta\xi}\,d\xi.

It can be interpreted as the Fourier transform of the probability density function p_{x(n)}(\xi) (with sign + in the exponent instead of −). The characteristic function can be related to the moments of the random variable x(n), using the Taylor series expansion of e^{j\theta\xi} around zero,

e^{j\theta\xi} = 1 + j\theta\xi - \frac{1}{2!}\theta^2\xi^2 - j\frac{1}{3!}\theta^3\xi^3 + \ldots.
The characteristic function can be written in the form

\Phi_x(\theta) = \int_{-\infty}^{\infty} p_{x(n)}(\xi)\,d\xi + j\theta\int_{-\infty}^{\infty} \xi\,p_{x(n)}(\xi)\,d\xi - \frac{1}{2!}\theta^2\int_{-\infty}^{\infty} \xi^2 p_{x(n)}(\xi)\,d\xi - j\frac{1}{3!}\theta^3\int_{-\infty}^{\infty} \xi^3 p_{x(n)}(\xi)\,d\xi + \ldots
= 1 + j\theta M_1 - \frac{1}{2!}\theta^2 M_2 - j\frac{1}{3!}\theta^3 M_3 + \ldots, (7.29)

where the moments M_i are defined by

M_i = \int_{-\infty}^{\infty} \xi^i p_{x(n)}(\xi)\,d\xi.
The moments follow from the derivatives of the characteristic function at θ = 0,

\Phi_x(0) = 1, \quad \frac{d\Phi_x(\theta)}{j\,d\theta}\Big|_{\theta=0} = M_1, \quad \frac{d^2\Phi_x(\theta)}{-d\theta^2}\Big|_{\theta=0} = M_2, \quad \frac{d^3\Phi_x(\theta)}{-j\,d\theta^3}\Big|_{\theta=0} = M_3, \ldots
For the sum of random variables, z(n) = x(n) + y(n), whose probability density function is equal to the convolution of the corresponding probability density functions, p_{z(n)}(\xi) = p_{x(n)}(\xi) \ast p_{y(n)}(\xi) (see Example 7.16), the characteristic function is equal to the product of their individual characteristic functions,

\Phi_z(\theta) = \Phi_x(\theta)\,\Phi_y(\theta). (7.30)
From this relation, we can easily find the moments of the sum of random variables (see also Problem
7.21 and Example 7.24).
For stationary signals, the autocorrelation function is a function of the difference of the time arguments,

r_{xx}(n) = E\{x(n + m)\,x^*(m)\}.

The Fourier transform of the autocorrelation function of a WSS signal is the power spectral density

S_{xx}(e^{j\omega}) = \sum_{n=-\infty}^{\infty} r_{xx}(n)\,e^{-j\omega n} (7.31)

with the inverse transform

r_{xx}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} S_{xx}(e^{j\omega})\,e^{j\omega n}\,d\omega. (7.32)
In particular, the average power of the signal is

\frac{1}{2\pi}\int_{-\pi}^{\pi} S_{xx}(e^{j\omega})\,d\omega = r_{xx}(0) = E\{|x(n)|^2\}. (7.33)
Example 7.20. Find the expected value, the autocorrelation, and the power spectral density of the random signal

x(n) = \sum_{k=1}^{K} a_k\, e^{j(\omega_k n + \theta_k)},

where θ_k are random variables uniformly distributed over −π < θ_k ≤ π. All random variables are statistically independent. The frequencies ω_k satisfy −π < ω_k ≤ π for every k.
⋆ The expected value is

\mu_x = \sum_{k=1}^{K} a_k\, E\{e^{j(\omega_k n + \theta_k)}\} = \sum_{k=1}^{K} a_k\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j(\omega_k n + \theta_k)}\,d\theta_k = 0.
The autocorrelation is

r_{xx}(n) = E\Big\{\sum_{k=1}^{K} a_k\, e^{j(\omega_k(n+m) + \theta_k)} \sum_{l=1}^{K} a_l\, e^{-j(\omega_l m + \theta_l)}\Big\} = \sum_{k=1}^{K} a_k^2\, e^{j\omega_k n},

since the cross terms with k ≠ l vanish due to the statistically independent, uniformly distributed phases. The power spectral density then follows as S_{xx}(e^{j\omega}) = 2\pi\sum_{k=1}^{K} a_k^2\,\delta(\omega - \omega_k).
The average signal power of a signal x(n) has been defined as (2.9)

P_{AV} = \lim_{N\to\infty} \frac{1}{2N+1}\sum_{n=-N}^{N} |x(n)|^2 = \big\langle |x(n)|^2 \big\rangle.

This relation leads to another definition of the power spectral density of random discrete-time signals, given by

P_{xx}(e^{j\omega}) = \lim_{N\to\infty} \frac{1}{2N+1}\,E\{|X_N(e^{j\omega})|^2\} = \lim_{N\to\infty} \frac{1}{2N+1}\,E\Big\{\Big|\sum_{n=-N}^{N} x(n)\,e^{-j\omega n}\Big|^2\Big\}. (7.34)
Different notation is used since the definitions of the power spectral density, (7.31) and (7.34), in general, will not produce the same result. We can write

P_{xx}(e^{j\omega}) = \lim_{N\to\infty} \frac{1}{2N+1}\,E\Big\{\sum_{m=-N}^{N}\sum_{n=-N}^{N} x(m)\,x^*(n)\,e^{-j\omega(m-n)}\Big\}.

With the substitution k = m − n, this reduces to

P_{xx}(e^{j\omega}) = \lim_{N\to\infty} \sum_{k=-2N}^{2N} \Big(1 - \frac{|k|}{2N+1}\Big)\,r_{xx}(k)\,e^{-j\omega k} = \lim_{N\to\infty} \sum_{k=-2N}^{2N} w_B(k)\,r_{xx}(k)\,e^{-j\omega k}.

The function w_B(k) corresponds to the Bartlett (triangular) window over the calculation interval.
Figure 7.12 Illustration of the power spectral density domain and the autocorrelation function r xx (m − n).
If the values of the autocorrelation function r_{xx}(k) are such that the second part of the sum, \sum_k \frac{|k|}{2N+1}\,r_{xx}(k)\,e^{-j\omega k}, is negligible as compared to \sum_k r_{xx}(k)\,e^{-j\omega k}, then

P_{xx}(e^{j\omega}) = \lim_{N\to\infty} \sum_{k=-2N}^{2N} r_{xx}(k)\,e^{-j\omega k} = \text{FT}\{r_{xx}(n)\} = S_{xx}(e^{j\omega}).
This is true for r xx (k) = Cδ(k) or r xx (k) being nonzero within the region |k| < k0 , such that
k0 /(2N + 1) is negligible. Otherwise Pxx (e jω ) is a smoothed version of Sxx (e jω ). Note that Pxx (e jω )
is always nonnegative, by definition (for a numeric illustration see Example 7.52). Estimation of the
power spectral density will be revisited in Section 7.5.6.
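The agreement between the averaged periodogram P_{xx}(e^{j\omega}) and S_{xx}(e^{j\omega}) = FT{r_{xx}(n)} for a short-correlation signal can be illustrated numerically. A minimal Python sketch (NumPy assumed; the two-tap filter is an arbitrary example with r_{xx}(0) = 1.25 and r_{xx}(±1) = 0.5):

    import numpy as np

    rng = np.random.default_rng(0)
    N, trials = 64, 2000
    omega = 2 * np.pi * np.arange(N) / N

    # White noise through h(n) = [1, 0.5]: r_xx(k) is nonzero only for |k| <= 1
    Pxx = np.zeros(N)
    for _ in range(trials):
        x = np.convolve(rng.standard_normal(N + 1), [1, 0.5], mode='valid')
        Pxx += np.abs(np.fft.fft(x))**2 / N
    Pxx /= trials                            # averaged periodogram estimate

    Sxx = 1.25 + np.cos(omega)               # FT of r_xx for this filter
    print(np.max(np.abs(Pxx - Sxx)))         # small when many trials are averaged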
Periodically extended signals. Another way to estimate the power spectral density is to assume that a WSS signal x(n), n = 0, 1, 2, . . . , N − 1, is periodically extended. Then

P_{xx}(k) = \frac{1}{N}\,E\{|X(k)|^2\} = \frac{1}{N}\,E\Big\{\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} x(m)\,x^*(n)\,e^{-j2\pi k(m-n)/N}\Big\}
= \frac{1}{N}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} E\{x(m)\,x^*(n)\}\,e^{-j2\pi k(m-n)/N} = \frac{1}{N}\sum_{m=0}^{N-1}\sum_{n=0}^{N-1} r_{xx}(m-n)\,e^{-j2\pi k(m-n)/N}
= \frac{1}{N}\sum_{m=0}^{N-1}\ \sum_{i=m-N+1}^{m} r_{xx}(i)\,e^{-j2\pi ki/N} = \sum_{i=0}^{N-1} r_{xx}(i)\,e^{-j2\pi ki/N} = S_{xx}(k).
Since the signal, x (n), is periodically extended, the autocorrelation, r xx (n), is periodically extended as
well. This means that r xx ( N ) = E{ x (m + N ) x ∗ (m)} = E{ x (m) x ∗ (m)} = r xx (0), r xx ( N + 1) =
E{ x (m + N + 1) x ∗ (m)} = E{ x (m + 1) x ∗ (m)} = r xx (1), and so on, producing the last equality in
the previous derivation.
R_x = N\,W^H P_x W.
where the WSS random signal ε(n), n = 0, 1, 2, . . . , N − 1, N = 16, with E{ε(n)} = 0 and
rεε (n) = δ(n), is periodically extended in such a way that ε(n + Nk) = ε(n), where k is an integer.
Find the autocorrelation r xx (n) within the basic period, n = 0, 1, 2, . . . , N − 1, and the power
spectral density Sxx (k). Use 10000 realizations of ε(n) to calculate Xi (k) = DFT N { xi (n)} and
plot meani { Xi (k) Xi∗ (l )}/N, for k = 0, 1, 2, . . . , N − 1 and l = 0, 1, 2, . . . , N − 1.
⋆ The autocorrelation, r_{xx}(n) = E\{x(m + n)\,x^*(m)\}, satisfies r_{xx}(\pm n) = 0 for 4 < n < 16 − 4, and r_{xx}(n + 16k) = r_{xx}(n). The exact value of r_{xx}(n) and its estimate using 10000 realizations,

\hat{r}_{xx}(n) = \frac{1}{10000\,N}\sum_{i=1}^{10000}\sum_{m=0}^{N-1} x_i(n + m)\,x_i(m), \quad \text{with } x(N + n) = x(n),
are shown in Fig. 7.13(c). The value of meani { Xi (k) Xi∗ (l )}, averaged over 10000 realizations,
is given in Fig. 7.13(a). For comparison, the DFT of r xx (n) is presented on the diagonal of Fig.
7.13(b), with its exact value mean{ X (k) X ∗ (k)} shown in Fig. 7.13(d).
Figure 7.13 The power spectral matrix illustration. (a) The value of meani { Xi (k) Xi∗ (l )}/N averaged over
10000 realizations. (b) The DFT of r xx (n) as a diagonal matrix. (c) The exact value of r xx (n) and its estimation,
r̂ xx (n), using 10000 realizations. (d) The exact value of E{ X (k) X ∗ (k)}/N from (b).
In many applications, the desired signal is disturbed by various forms of random signals, caused by numerous factors in signal sensing, transmission, and/or processing. Often, a cumulative influence of these factors, disturbing the useful signal, is described by an equivalent random signal, called noise. In most cases, we will use the notation ε(n) for these kinds of signals. They model random disturbances, commonly originating from multiple sources.
A noise is said to be white if its values are uncorrelated,

r_{\varepsilon\varepsilon}(n, m) = \sigma_\varepsilon^2\,\delta(n - m).

The spectral density of this kind of noise is constant (as is the case with white light),

S_{\varepsilon\varepsilon}(e^{j\omega}) = \sigma_\varepsilon^2.

If this property is not satisfied, then the power spectral density is not constant. Such a noise is referred to as colored.
Regarding the distribution of the noise ε(n) amplitudes, the most common types of noise in signal processing are: uniform, binary, Gaussian, Rayleigh, Laplacian, Cauchy, and Poisson noise.
The uniform noise is a discrete-time signal with the probability density function defined by

p_{\varepsilon(n)}(\xi) = \frac{1}{\Delta}, \quad \text{for } -\Delta/2 \le \xi < \Delta/2, (7.36)

and p_{\varepsilon(n)}(\xi) = 0 elsewhere, Fig. 7.14. This noise takes values within the interval [−Δ/2, Δ/2) with equal probability. The variance of the uniform noise is

\sigma_\varepsilon^2 = \int_{-\Delta/2}^{\Delta/2} \xi^2\,p_{\varepsilon(n)}(\xi)\,d\xi = \frac{\Delta^2}{12}.
This kind of noise is used to model rounding errors in the amplitude quantization of discrete-time
signals. Its distribution indicates that all errors within −∆/2 ≤ ξ < ∆/2 are equally probable.
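The quantization-noise model σ_ε² = Δ²/12 is easy to verify numerically (a minimal Python sketch, NumPy assumed; the step Δ = 0.05 is an arbitrary example):

    import numpy as np

    rng = np.random.default_rng(0)
    delta = 0.05                              # quantization step
    x = rng.standard_normal(1_000_000)        # a test signal
    xq = delta * np.round(x / delta)          # uniform (rounding) quantization
    e = xq - x                                # quantization error, uniform on [-delta/2, delta/2)

    print(e.var(), delta**2 / 12)             # both approx 2.08e-4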
Figure 7.14 A realization of the uniform noise (left) with its probability density function (right), when ∆ = 1.
Random binary sequence, or binary noise, is a stochastic signal which randomly takes one of two fixed signal values. Assume that the noise ε(n) values are, for example, {−1, 1} and that the probability that ε(n) takes the value 1 is P_ε(1) = p, while P_ε(−1) = 1 − p. The expected value of this noise is

\mu_\varepsilon = \sum_{\xi=-1,1} \xi\,P_\varepsilon(\xi) = (-1)(1-p) + 1\cdot p = 2p - 1.

The variance is

\sigma_\varepsilon^2 = \sum_{\xi=-1,1} (\xi - \mu_\varepsilon)^2 P_\varepsilon(\xi) = 4p(1-p).

A special case is obtained when the values from the set {−1, 1} are equally probable, that is, when p = 1/2. Then, we get µ_ε = 0 and σ_ε² = 1.
When the random signal ε(n) values are from the set {0, 1}, then this form of binary signal is referred to as the Bernoulli random signal or Bernoulli noise. This signal takes the value ε(n) = 1 with the probability p, while the probability of ε(n) = 0 is equal to (1 − p). The probability that the signal sample ε(n) takes one specific value ξ ∈ {0, 1} can be written as

P(\varepsilon(n) = \xi \mid p) = p^{\xi}(1-p)^{1-\xi},

since P(ε(n) = 1|p) = p and P(ε(n) = 0|p) = 1 − p. The expected value of the Bernoulli noise is µ_ε = p, while the variance is σ_ε² = p(1 − p).
Example 7.22. Consider a set of N → ∞ balls. An equal number of balls is marked with 1 (or white)
and 0 (or black). A random signal x (n), n = 0, 1, 2, 3, corresponds to drawing of four balls in a
row. This signal has four samples x (0), x (1), x (2), and x (3). The signal values x (n) are equal
to the marks on the drawn balls. Write all possible realizations of x (n). If k is the number of
appearances of value 1 in the signal, write the probabilities for each value of k.
⋆Signal realizations, x_m(n), with the number k being equal to the number of appearances of digit 1 in every signal realization, are given in the next table.

x_m(0):  0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1
x_m(1):  0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
x_m(2):  0 0 1 1 0 0 1 1 0 0 1 1 0 0 1 1
x_m(3):  0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
y(m) = \sum_{n=0}^{3} x_m(n) = k:  0 1 1 2 1 2 2 3 1 2 2 3 2 3 3 4

Counting these realizations, the probabilities of k = 0, 1, 2, 3, 4 are P(0) = 1/16, P(1) = 4/16, P(2) = 6/16, P(3) = 4/16, and P(4) = 1/16, that is, P(k) = \binom{4}{k} p^k q^{4-k}, with p = 1/2 and q = 1 − p = 1/2. For the case when N is a finite number see Problem 7.9.
An interesting form of the random variable, resulting from an experiment where the random variable can take only two possible values, {−1, 1} or {0, 1} or {No, Yes} or {A, B}, is the binomial random variable. The binomial random variable is equal to the number, k, of successes (1, or Yes, or B) in a sequence of N independent binary experiments, each of which yields success with probability p. This random variable obeys the binomial distribution, which is the basis for the popular binomial test of statistical significance.
The binomial random variable, k, has been introduced through the previous simple example. In general, if the signal x(n) takes the value B from the set {A, B} with the probability p, then the probability that there are exactly k values of B in a specific order, within a sequence of N samples of x(n), is p^k(1-p)^{N-k}. For k = 1, this result can be achieved in N specific orders (combinations, see Example 7.22). When k = 2, there are N(N-1)/2 = \binom{N}{2} such combinations. In general, for any k, there are \binom{N}{k} orders (combinations) in which x(n) takes the value B exactly k times, that is,

P(k) = \binom{N}{k} p^k (1-p)^{N-k} = \frac{N!}{k!(N-k)!}\,p^k(1-p)^{N-k}. (7.37)
These are the binomial coefficients in the expansion of (p + q)^N = (p + (1-p))^N. The expected value of the number of appearances, y(m) = k, of the event B or “Yes” in N samples is

\mu_y = E\{y\} = \sum_{k=0}^{N} k\,P(k) = \sum_{k=0}^{N} k\,\frac{N!}{k!(N-k)!}\,p^k(1-p)^{N-k}.

Since the first term in the summation is 0, we can shift the summation by one and reindex it to

\mu_y = E\{y\} = \sum_{k=0}^{N-1} (k+1)\,\frac{N!}{(k+1)!(N-(k+1))!}\,p^{k+1}(1-p)^{N-(k+1)}
= Np\sum_{k=0}^{N-1} \frac{(N-1)!}{k!((N-1)-k)!}\,p^k(1-p)^{(N-1)-k},

producing

\mu_y = E\{y\} = Np.

As we could write from the beginning, the expected value of the number of appearances of an event B, whose probability is p, in N realizations is E{y} = Np. This derivation was performed not only to prove this fact, but because it leads us to the next step in deriving the variance of y, by using the second-order factorial moment

E\{y(y-1)\} = \sum_{k=0}^{N} k(k-1)\,\frac{N!}{k!(N-k)!}\,p^k(1-p)^{N-k}.
Since the first two terms are 0, we can reindex the summation as
E\{y(y-1)\} = \sum_{k=0}^{N-2} (k+2)(k+1)\,\frac{N!}{(k+2)!(N-2-k)!}\,p^{k+2}(1-p)^{N-2-k}
= N(N-1)p^2\sum_{k=0}^{N-2} \frac{(N-2)!}{k!(N-2-k)!}\,p^k(1-p)^{N-2-k}.
The relation

\sum_{k=0}^{N-2} \frac{(N-2)!}{k!(N-2-k)!}\,p^k(1-p)^{N-2-k} = (p + (1-p))^{N-2} = 1

is used to get

E\{y(y-1)\} = N(N-1)p^2.
The variance of y follows from

\sigma_y^2 = E\{y^2\} - \mu_y^2 = E\{y(y-1)\} + E\{y\} - \mu_y^2 = N(N-1)p^2 + Np - N^2p^2 = Np(1-p).
Therefore, in a sequence of N values of the signal x(n) that can take values {0, 1}, the expected value of the number of appearances of 1, y = \sum_{n=1}^{N} x(n), divided by N, is

E\{z\} = \frac{1}{N}\sum_{n=1}^{N} E\{x(n)\} = E\Big\{\frac{y}{N}\Big\} = \frac{Np}{N} = p, (7.38)

with the variance

\sigma_z^2 = \frac{1}{N^2}\,\sigma_y^2 = \frac{Np(1-p)}{N^2} = \frac{p(1-p)}{N}.

By increasing the total number of values, N, the variance becomes lower, so a finite set x(n) produces a more reliable estimate of the mean value p (see Example 7.33).
Notice that the random variable z is the mean of the independent random variables x (n).
Here we will reconsider Bayes’ theorem and use Bayesian analysis to test hypotheses about the
probabilistic model parameters in the case of binary random signals. Bayesian inference is a method
in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence samples
(events) become available.
Assume that the random signal x (n) takes the values +1 and 0 (positive or negative test result;
’head’ or ’tail’ in the coin-tossing; value 1 or −1) with probabilities p and 1 − p, respectively, that are
not known.
The event B consists of N observed samples x (n), n = 1, 2, . . . , N, and the number k of x (n) = 1
occurrences in this sequence. The aim is to estimate the probability p based on the results obtained in
the observed event B (that is, based on k and N).
Ljubiša Stanković Digital Signal Processing 331
The classical frequentist approach to the estimation of the probability p is based on (7.38) and

\hat{p} = \frac{1}{N}\sum_{n=1}^{N} x(n) = \frac{k}{N},
assuming that x (n) takes the value of 1 with the probability p and the value of 0 with the probability
q = 1 − p. This problem is elaborated in detail in Example 7.33.
Here, we will consider the problem of the probability p estimation within the Bayes framework.
In this case, Bayes’ relation can be written in the form
P(p|B) = \frac{P(B|p)\,P(p)}{P(B)}, (7.39)
where the hypothesis is that the probability p takes a particular value from the given set of all possible
values and that the event B occurred assuming this probability. The terms in this expression are:
• Prior P(p) for the hypothesis p. It has to be assumed based on our possible knowledge about the resulting p:
1. If all values of p are equally probable, then the uniform prior is assumed, P(p) = C, for a discrete set of possible values 0 ≤ p ≤ 1 with a step of, for example, Δp = 0.01, where C is a constant (as will be shown, not relevant).
2. If we expect that the value of p is close to 0.5, then we can assume, for example, the prior P(p) = 2C(1 − 2|p − 0.5|) for 0 ≤ p ≤ 1, with the step Δp for p, or
3. the Gaussian prior form

P(p) = C\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(p-0.5)^2}{2\sigma^2}},

calculated at the discrete values of p, with the step Δp.
• The likelihood factor P(B|p) is equal to the probability that the event B occurred for the assumed value of p. For the random binary signal, the event B denotes a realization which consists of k samples x(n) = 1 and N − k samples x(n) = 0. The probability of this event B, for the assumed p, is

P(B|p) = \binom{N}{k}\,p^k(1-p)^{N-k}. (7.40)

• The probability of observing the data specified by the event B (k times x(n) = 1 and N − k times x(n) = 0), summed over all hypotheses (all possible values of p), is

P(B) = \sum_{p} \binom{N}{k}\,p^k(1-p)^{N-k}\,P(p).
It is common to avoid the last (marginal) probability, P(B), in (7.39), which does not depend on p (it plays the role of a normalization factor), and to consider only the so-called posterior

P(p|B) \propto P(B|p)\,P(p).

For the probabilistic interpretation, the posterior is normalized so that

\sum_{p} P(p|B) = 1.
Example 7.23. Here, we will calculate the posterior of the hypothesis p, P(p|B) ∝ P(B|p)P(p), for the binary signal x(n) when the evidence B consists of N signal samples, with x(n) = 1 appearing k times. The posterior P(p|B) is updated as N increases. The following events are analyzed:
(a) The event B of N = 6 samples x (n), with k = 2 samples taking the value x (n) = 1.
(b) The event B when the number of available samples (observations) is increased to N = 50
and k = 9 times x (n) = 1 is obtained.
(c) The event B with a large number of available samples, N = 1000, when k = 220 nonzero
samples, x (n) = 1, are observed.
For the hypothesis p use:
(i) the uniform prior P(p) = C, and
(ii) the Gaussian prior P(p) = C\exp(-(p-0.5)^2/(2\sigma^2))/(\sigma\sqrt{2\pi}), with σ = 0.05,
and the set of values 0 ≤ p ≤ 1 with the step Δp = 0.01. The value of the constant C is not relevant, since the results are normalized.
⋆ The results for the posterior, P(p|B) ∝ P(B|p)P(p), with P(B|p) defined in (7.40), are shown in Fig. 7.15, for the uniform prior P(p) (left) and for the Gaussian prior P(p) (right), for various p and the given N and k in (a), (b), and (c), respectively.
We can see that the hypothesis’ probability is influenced by the prior distribution. When
the evidence is large (with large N, as in (c)) both cases produce highly concentrated probability
descriptions (denoted by (c)) close to the expected value of the parameter p.
For the probabilistic interpretation, the values of P( p| B) in Fig. 7.15 should be normalized
so that ∑ p P( p| B) = 1 for every considered case (bottom panels).
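The posterior update described in this example can be sketched in a few lines of Python (NumPy assumed; the grid, priors, and the case N = 50, k = 9 follow the example):

    import numpy as np

    N, k = 50, 9                         # observed: k ones in N binary samples
    p = np.arange(0, 1.01, 0.01)         # hypothesis grid with step 0.01

    likelihood = p**k * (1 - p)**(N - k)           # binomial coefficient is constant in p
    prior_uniform = np.ones_like(p)
    prior_gauss = np.exp(-(p - 0.5)**2 / (2 * 0.05**2))

    for prior in (prior_uniform, prior_gauss):
        posterior = likelihood * prior
        posterior /= posterior.sum()     # normalize so that sum_p P(p|B) = 1
        print(p[np.argmax(posterior)])   # 0.18 for the uniform prior; pulled toward 0.5 by the Gaussian prior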
The maximum position of the likelihood factor P(B|p) (the maximum likelihood estimate) can be found in an analytic way. From

\frac{dP(B|p)}{dp} = 0

follows k(1 − p) = (N − k)p, or p = k/N. This solution also holds for the maximization of the posterior P(p|B) with the uniform prior P(p). However, for other prior functions, the maximum of the posterior depends on the prior function, especially for low N. The influence of the prior will be analyzed next.
Log-Likelihood Function. A specific form of the binary random signal, called the Bernoulli random signal, will be used to introduce a few more important concepts in Bayesian analysis.
For the Bernoulli random signal, the probability that the signal sample x(n) takes one specific value, with the assumed parameter value p (hypothesis), can be written in a compact form as

P(x(n) \mid p) = p^{x(n)}(1-p)^{1-x(n)}, \quad x(n) \in \{0, 1\},

so that, for N independent samples \mathbf{x} = [x(1), x(2), \ldots, x(N)]^T,

P(\mathbf{x} \mid p) = \prod_{n=1}^{N} p^{x(n)}(1-p)^{1-x(n)}.
Figure 7.15 Bayesian approach based estimation of the probability p that a nonzero sample, x (n) = 1, is obtained
in the binary random signal for the different number of available samples (realizations) N and the number of nonzero
samples k: (a) N = 6, k = 2, (b) N = 50, k = 9, and (c) N = 1000, k = 220. The uniform prior P( p) is used for
the left panels and the Gaussian prior centered at p = 0.5 for the right panels. The step in p was ∆p = 0.01. All
shown probabilities are normalized to one (top panels). For the probabilistic interpretation, the values of P( p| B)
should be normalized so that ∑_p P(p|B) = 1 for each of the considered cases (bottom panels).
The aim is to find the maximum of P(x|p). We may use the derivative of P(x|p), or the derivative of its logarithm, since the logarithm is a monotonic function and will not change the maximum position of the considered positive function. In general, we have

\frac{d\ln(f(x))}{dx} = \frac{f'(x)}{f(x)},

meaning that both f(x) and ln(f(x)) produce the same result for the maximum position, f'(x) = 0.
The log-likelihood function is of the form

\ln(P(\mathbf{x}|p)) = \sum_{n=1}^{N} x(n)\ln(p) + \sum_{n=1}^{N} (1 - x(n))\ln(1-p).

The maximum of this function (the maximum likelihood estimate (MLE) of the model parameter p) is obtained from

p_{MLE} = \arg\max_p\{\ln(P(\mathbf{x}|p))\}.

It follows by setting the derivative of ln(P(x|p)) with respect to p equal to 0,

\frac{\sum_{n=1}^{N} x(n)}{p} - \frac{\sum_{n=1}^{N}(1 - x(n))}{1-p} = 0, \quad \text{that is,} \quad p_{MLE} = \frac{1}{N}\sum_{n=1}^{N} x(n) = \frac{k}{N}.
Frequentist versus Bayesian inference. In the previous analysis, we obtained a specific value for the hypothesis parameter p (this is the so-called frequentist inference). It does not give the probability of the hypothesis parameter p, as was the case in Fig. 7.15. The Bayesian analysis, presented in Fig. 7.15, is based on the posterior P(p|B) ∝ P(B|p)P(p), which results not in a single value of the parameter p, but in its probabilistic description (Bayesian inference). This probabilistic description includes our prior belief, P(p), and the current experiment outcome (evidence), P(B|p).
The maximum a posteriori (MAP) solution is obtained as the position of the maximum of the logarithm of the a posteriori probability P(p|B) ∝ P(B|p)P(p), that is,

p_{MAP} = \arg\max_p\{\ln P(B|p) + \ln P(p)\}.

Note that the maximum likelihood solution p_{MLE} is the special case of the maximum a posteriori solution p_{MAP} when the prior P(p) is uniform.
In many optimization approaches, the negative value of the logarithm is used as a cost function. Then, instead of the maximum position, the position of the minimum is the problem solution,

p_{MAP} = \arg\min_p\{-\ln P(B|p) - \ln P(p)\}.

In solving the minimization problem, various gradient-based algorithms can also be used.
The Gaussian (normal) noise is used to model a disturbance caused by many small independent factors. Namely, the central limit theorem states that the sum of a large number of statistically independent random variables, with any distribution (of finite variance), obeys the Gaussian distribution.
Figure 7.16 A realization of Gaussian noise (left) with its probability density function (right).
The variance of this noise is σ_ε² (see Problem 7.10). For a Gaussian distributed random signal with the mean value µ and the variance σ_ε², the probability density function is

p_{\varepsilon(n)}(\xi) = p_{\varepsilon(n)}(\xi|\mu, \sigma_\varepsilon) = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}\,e^{-(\xi-\mu)^2/(2\sigma_\varepsilon^2)}. (7.42)
Example 7.24. Consider N random signals (variables), x_i(n), i = 1, 2, . . . , N, that are independent and identically distributed (i.i.d.), with unit variance and zero mean. The probability density functions of the random signals x_i(n) are the same and equal to p_x(\xi). A new random signal y(n) is formed as the sum

y(n) = \frac{1}{\sqrt{N}}\,x_1(n) + \frac{1}{\sqrt{N}}\,x_2(n) + \cdots + \frac{1}{\sqrt{N}}\,x_N(n).

The factors 1/\sqrt{N} are introduced so that the variance of y(n) is \sigma_{y(n)}^2 = 1. Find the probability density function of the random signal y(n) when N → ∞.
⋆ The probability density function of the random signal x_i(n)/\sqrt{N} is \sqrt{N}\,p_x(\xi\sqrt{N}). The characteristic functions of x_i(n) and x_i(n)/\sqrt{N} are, respectively, (7.29)

\Phi_x(\theta) = 1 + j\theta M_1 - \frac{1}{2!}\theta^2 M_2 - j\frac{1}{3!}\theta^3 M_3 + \ldots = 1 - \frac{1}{2!}\theta^2 - j\frac{1}{3!}\theta^3 M_3 + \ldots,

\Phi_{x/\sqrt{N}}(\theta) = 1 + j\frac{\theta}{\sqrt{N}}M_1 - \frac{1}{2!}\frac{\theta^2}{N}M_2 - j\frac{1}{3!}\frac{\theta^3}{N^{3/2}}M_3 + \cdots = 1 - \frac{1}{2!}\frac{\theta^2}{N} - j\frac{1}{3!}\frac{\theta^3}{N^{3/2}}M_3 + \ldots,

since M_1 = 0 and M_2 = 1. The characteristic function of the sum y(n) of these independent random variables is the product of their characteristic functions,

\Phi_y(\theta) = \Big[\Phi_{x/\sqrt{N}}(\theta)\Big]^N \to e^{-\theta^2/2} \quad \text{as } N \to \infty,

since \lim_{N\to\infty}(1 - \frac{\theta^2}{2N})^N = e^{-\theta^2/2} and the higher-order terms vanish faster.
The inverse Fourier transform (with sign −) of e^{-\theta^2/2} is equal to the probability density function of the unit-variance Gaussian random variable,

\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-\theta^2/2}\,e^{-j\theta\xi}\,d\theta = \frac{1}{\sqrt{2\pi}}\,e^{-\xi^2/2},

which proves the central limit theorem (CLT) for the sum of independent and identically distributed (i.i.d.) random variables.
The probability that the amplitude of a zero-mean Gaussian random variable takes a value smaller than λ is

\text{Probability}\{|\varepsilon(n)| < \lambda\} = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}\int_{-\lambda}^{\lambda} e^{-\xi^2/(2\sigma_\varepsilon^2)}\,d\xi = \text{erf}\Big(\frac{\lambda}{\sqrt{2}\,\sigma_\varepsilon}\Big), (7.43)

where

\text{erf}(\lambda) = \frac{2}{\sqrt{\pi}}\int_{0}^{\lambda} e^{-\xi^2}\,d\xi
is the error function.
Commonly used probabilities that the absolute value of the Gaussian random variable is within one standard deviation, two standard deviations (two-sigma rule), or three standard deviations are

\text{Probability}\{-\sigma_\varepsilon < \varepsilon(n) < \sigma_\varepsilon\} = \text{erf}(1/\sqrt{2}) = 0.6827, (7.44)
\text{Probability}\{-2\sigma_\varepsilon < \varepsilon(n) < 2\sigma_\varepsilon\} = \text{erf}(2/\sqrt{2}) = 0.9545,
\text{Probability}\{-3\sigma_\varepsilon < \varepsilon(n) < 3\sigma_\varepsilon\} = \text{erf}(3/\sqrt{2}) = 0.9973.
Figure 7.17 Probability density function with the intervals corresponding to −σε < ε(n) < σε , −2σε < ε(n) < 2σε ,
and −3σε < ε(n) < 3σε . Value of σε = 1 is used.
Example 7.25. Given 12 measurements of a Gaussian zero-mean noise ε(n) ∈ {−0.7519, 1.5163,
−0.0326, −0.4251, 0.5894, −0.0628, −2.0220, −0.9821, 0.6125, −0.0549, −1.1187, 1.6360}.
Estimate the sample standard deviation of this data and use it to estimate the probability that the
absolute value of this noise will be smaller than 2.5.
⋆The standard deviation of this noise could be estimated using (7.7) with µ = 0 and M = 12 (see also Section 7.4.5). Its value is σ̂ = 1.031. Thus, the absolute value of this noise will be smaller than 2.5 with the probability

P\{|\varepsilon(n)| < 2.5\} = \frac{1}{1.031\sqrt{2\pi}}\int_{-2.5}^{2.5} e^{-\xi^2/(2\cdot 1.031^2)}\,d\xi = \text{erf}\big(2.5/(\sqrt{2}\cdot 1.031)\big) = 0.9847.
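The same computation in Python (the math.erf function from the standard library; data from the example):

    import numpy as np
    from math import erf, sqrt

    eps = np.array([-0.7519, 1.5163, -0.0326, -0.4251, 0.5894, -0.0628,
                    -2.0220, -0.9821, 0.6125, -0.0549, -1.1187, 1.6360])

    sigma = sqrt(np.mean(eps**2))           # zero-mean assumed, so no mean removal
    print(sigma)                            # approx 1.031
    print(erf(2.5 / (sqrt(2) * sigma)))     # approx 0.9847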
Example 7.26. The random signal x (n) is a Gaussian noise with the mean value µ x = 1 and the
variance σx2 = 1. The random sequence y(n) is obtained by omitting samples from the signal
x (n) that are either negative or higher than 1. Find the probability density function of sequence
y(n). Find its mean and variance, µy and σy .
Estimation of the Gaussian distribution parameters based on the observed set of the signal values will
be presented in this section, using the maximum likelihood approach.
Stationary signal. Consider a stationary random Gaussian distributed signal x(n) whose N samples are available. The probability density function of a signal sample x(n) is defined by

p_{x(n)}(\xi|\sigma, \mu) = \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{(\xi-\mu)^2}{2\sigma^2}\Big),

where σ and µ are the assumed (unknown) parameters of the Gaussian distribution.
For statistically independent samples, the joint probability density function is

p_{x(1),\ldots,x(N)}(\xi_1, \ldots, \xi_N|\sigma, \mu) = \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{(\xi_1-\mu)^2}{2\sigma^2}\Big) \times \cdots \times \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{(\xi_N-\mu)^2}{2\sigma^2}\Big). (7.45)

The vector form of this probability density function is

p_{\mathbf{x}}(\boldsymbol{\xi}|\sigma, \mu) = \frac{1}{\sigma^N\sqrt{(2\pi)^N}}\exp\Big(-\frac{\|\boldsymbol{\xi} - \mu\|_2^2}{2\sigma^2}\Big),

where \mathbf{x} = [x(1), x(2), \ldots, x(N)]^T, \boldsymbol{\xi} = [\xi_1, \xi_2, \ldots, \xi_N]^T, and \|\boldsymbol{\xi} - \mu\|_2^2 = (\xi_1-\mu)^2 + \cdots + (\xi_N-\mu)^2.
Within the maximum likelihood estimation (MLE) framework, the goal is to find the unknown (prior) parameters σ and µ so that the distribution best fits the observed data x = [x(1), x(2), . . . , x(N)]^T. The probability of σ and µ, given the observed random signal samples, P(σ, µ|x), can be written using Bayes' relation for the posterior distribution, as in (7.16) and (7.39),

P(\sigma, \mu|\mathbf{x}) = \frac{P(\mathbf{x}|\sigma, \mu)\,P(\sigma, \mu)}{P(\mathbf{x})}.
Since P(x) does not depend on the parameters σ and µ, this (marginal) probability does not influence the optimization with regard to the parameters σ and µ, and is commonly omitted from the analysis. Furthermore, using the uniform prior, P(σ, µ) = c, we can write

P(\sigma, \mu|\mathbf{x}) \propto P(\mathbf{x}|\sigma, \mu).

The probability that the random signal x(n) takes the specific values given by x = [x(1), x(2), . . . , x(N)]^T is

P(\mathbf{x}|\sigma, \mu) = p_{\mathbf{x}}(\mathbf{x}|\sigma, \mu)\,d\mathbf{x} = \frac{1}{\sigma^N\sqrt{(2\pi)^N}}\exp\Big(-\frac{\|\mathbf{x} - \mu\|_2^2}{2\sigma^2}\Big)\,d\mathbf{x}.
Therefore, the best fitting parameter values (σ, µ) can be obtained by maximizing the likelihood function p_x(x|σ, µ). For the Gaussian distributed random signal, the maximization is performed straightforwardly, by differentiating the probability density (likelihood) function p_x(x|σ, µ) or its logarithm (the log-likelihood function).
We will use the negative logarithm function, when the likelihood maximization problem is equivalent to the minimization problem stated as

(\sigma, \mu)_{MLE} = \arg\min_{\sigma,\mu}\big\{-\ln p_{\mathbf{x}}(\mathbf{x}|\sigma, \mu)\big\}. (7.46)

This means that we have to minimize the cost function −ln p_x(x|σ, µ), defined by

J(\sigma, \mu) = -\ln p_{\mathbf{x}}(\mathbf{x}|\sigma, \mu) = \frac{N}{2}\ln(2\pi) + N\ln(\sigma) + \frac{\|\mathbf{x} - \mu\|_2^2}{2\sigma^2}, (7.47)
where \|\mathbf{x} - \mu\|_2^2 is the squared two-norm (L_2-norm) of the vector x − µ. Using ∂J(σ, µ)/∂µ = 0, an estimate of the parameter µ is obtained from

2(x(1) - \mu) + 2(x(2) - \mu) + \cdots + 2(x(N) - \mu) = 0, \quad \text{as} \quad \hat{\mu} = \frac{1}{N}\big(x(1) + x(2) + \cdots + x(N)\big),

while using ∂J(σ, µ)/∂σ = 0, an estimate of the parameter σ is obtained from

\frac{N}{\sigma} - \|\mathbf{x} - \mu\|_2^2\,\frac{1}{\sigma^3} = 0, \quad \text{as} \quad \hat{\sigma}^2 = \frac{1}{N}\|\mathbf{x} - \hat{\mu}\|_2^2.
These are the well-known statistical relations for the mean value and the variance introduced
intuitively (frequentist inference) in Section 7.1. In Bayesian inference, we should provide P(σ, µ|x) ∝
px (x|σ, µ) P(σ, µ), for an assumed prior P(σ, µ) and a set of possible values for µ and σ, rather than
their specific values (as in the next example).
Example 7.27. The concept of finding the parameters µ and σ of the Gaussian distribution, to fit
data, is illustrated on a simple data set. Assume that four observations of a Gaussian stationary
signal x (n) are available, and given by x (1) = 0.2, x (2) = −0.3, x (3) = −0.4, and x (4) = 0.5.
Estimate the expected value, µ, and the variance, σ2 , of the Gaussian distribution from the
observed data. The data set is then increased to N = 20 available samples, whose values are given
in Fig. 7.18(right).
Find the posterior distribution P(σ, µ|x) for the discrete sets −1 ≤ µ ≤ 1 and 0.1 ≤ σ ≤ 1,
with the step 0.05, and the uniform prior P(σ, µ) = C.
⋆ The log-likelihood function of the joint distribution of the observed data is given by (7.47). The differentiation of J(σ, µ) with respect to µ results in

\hat{\mu} = \frac{1}{4}\big(x(1) + x(2) + x(3) + x(4)\big) = 0,

while the differentiation of J(σ, µ) with respect to σ results in

\frac{4}{\sigma} - \big((x(1)-\mu)^2 + (x(2)-\mu)^2 + (x(3)-\mu)^2 + (x(4)-\mu)^2\big)\frac{1}{\sigma^3} = 0,

\hat{\sigma} = \frac{1}{2}\sqrt{0.2^2 + 0.3^2 + 0.4^2 + 0.5^2} = 0.3674.
The Bayesian inference approach, for the uniform prior P(σ, µ) = C, would produce the probability

P(\sigma, \mu|\mathbf{x}) \propto p_{\mathbf{x}}(\mathbf{x}|\sigma, \mu)\,P(\sigma, \mu) \propto p_{\mathbf{x}}(\mathbf{x}|\sigma, \mu) = \frac{1}{\sigma^N\sqrt{(2\pi)^N}}\exp\Big(-\frac{\|\mathbf{x} - \mu\|_2^2}{2\sigma^2}\Big),
as shown in Fig. 7.18(left) for given x = [2, −3, −4, 5] T /10 and variable µ and σ.
In order to show the influence of the number of samples on the reliability of the result for µ
and σ we have also calculated P(σ, µ|x) ∝ px (x|σ, µ) for N = 20 available signal samples and
discrete sets −1 ≤ µ ≤ 1 and 0.1 ≤ σ ≤ 1, with the step 0.05, Fig. 7.18(right).
Both of these sets of the available samples x (n), with N = 4 and N = 20, produce almost
the same result in the frequentist inference approach, µ ≈ 0.00 and σ ≈ 0.37, while their posterior
distributions P(σ, µ|x) are quite different.
x = [2, −3, −4, 5] T /10, x = [2, −3, −4, 5, 8, 1, −1, 3, 0, −7, 0, −5, 2, 4, −1, 4, −6, −1, 1, −1] T /10.
Figure 7.18 Bayesian inference approach based estimation of the parameters µ and σ in the Gaussian random
signal for different numbers of the available samples (realizations) N. All shown probabilities P(σ, µ|x) are
normalized so that ∑σ ∑µ P(σ, µ|x) = 1 holds for considered cases. The uniform prior, P(σ, µ) = C, is used.
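The frequentist point estimates and the posterior grid from this example can be reproduced with a minimal Python sketch (NumPy assumed; the N = 4 data set from the example is used):

    import numpy as np

    x = np.array([2, -3, -4, 5]) / 10                 # observed samples, N = 4
    N = len(x)

    # Frequentist (MLE) point estimates
    mu_mle = x.mean()
    sigma_mle = np.sqrt(np.mean((x - mu_mle)**2))
    print(mu_mle, sigma_mle)                          # 0.0 and approx 0.3674

    # Bayesian posterior on a (mu, sigma) grid, uniform prior
    mu = np.arange(-1, 1.05, 0.05)
    sigma = np.arange(0.1, 1.05, 0.05)
    M, S = np.meshgrid(mu, sigma)
    loglik = -N * np.log(S) - ((x[:, None, None] - M)**2).sum(axis=0) / (2 * S**2)
    post = np.exp(loglik - loglik.max())
    post /= post.sum()                                # normalized posterior P(sigma, mu | x)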
Example 7.28. A noisy random variable x(n) is a linear function of M independent variables t_i(n), i = 1, 2, . . . , M, so that, in vector notation, x = Ta + ε, where ε(n) is a zero-mean i.i.d. Gaussian noise with

p_{\varepsilon(1),\ldots,\varepsilon(N)}(\xi_1, \ldots, \xi_N|\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{\xi_1^2}{2\sigma^2}\Big) \times \cdots \times \frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{\xi_N^2}{2\sigma^2}\Big).

Having in mind that ε = x − Ta, and as explained in this section, the best fitting parameters are obtained by minimizing

J(\sigma, \mathbf{a}, \mathbf{T}) = \frac{N}{2}\ln(2\pi) + N\ln(\sigma) + \frac{\|\mathbf{x} - \mathbf{T}\mathbf{a}\|_2^2}{2\sigma^2}, (7.48)

where T is the matrix whose rows are the vectors t(n).
The minimization with respect to a produces the result a = (T^T T)^{-1} T^T x, as in (7.12). This maximum a posteriori (MAP) solution corresponds to the uniform prior P(σ, a) and is equal to the MLE solution. If a nonuniform prior P(σ, a) for σ and a is added, then the posterior probability is

P(\sigma, \mathbf{a}|\mathbf{x}) \propto p_{\mathbf{x}}(\mathbf{x}|\sigma, \mathbf{a})\,P(\sigma, \mathbf{a}).

Using, for example, a prior of the form P(\mathbf{a}) \propto e^{-\lambda\|\mathbf{a}\|_1}, which penalizes high values of the elements and enforces the solution with the maximum possible number of zero-valued elements in the vector of coefficients a, we get

\mathbf{a}_{MAP} = \arg\min_{\mathbf{a}}\Big\{\frac{\|\mathbf{x} - \mathbf{T}\mathbf{a}\|_2^2}{2\sigma^2} + \lambda\|\mathbf{a}\|_1\Big\}.
The case of a nonstationary Gaussian random signal, when we cannot assume either that the expected value and the variance of the samples are time-invariant or that the samples are statistically independent, is more complex. This case will be considered in Part VI.
Consider the signal x = [x(1), x(2), . . . , x(N)]^T and its true parameter θ, whose unbiased estimate, obtained using the data in x, is θ̂(x). For the unbiased estimate,

E\{\hat{\theta}(\mathbf{x}) - \theta\} = 0,
and therefore

\frac{\partial}{\partial\theta}\,E\{\hat{\theta}(\mathbf{x}) - \theta\} = 0 \quad \text{for all } \theta.
Assuming that the probability density function of x, with an assumed θ, is p_x(x|θ), we can write

\frac{\partial}{\partial\theta}\int_{-\infty}^{\infty} (\hat{\theta}(\mathbf{x}) - \theta)\,p_{\mathbf{x}}(\mathbf{x}|\theta)\,d\mathbf{x} = 0 \quad \text{for all } \theta.

After the differentiation is performed, this equation can be rewritten in the form

-\int_{-\infty}^{\infty} p_{\mathbf{x}}(\mathbf{x}|\theta)\,d\mathbf{x} + \int_{-\infty}^{\infty} (\hat{\theta}(\mathbf{x}) - \theta)\,\frac{\partial p_{\mathbf{x}}(\mathbf{x}|\theta)}{\partial\theta}\,d\mathbf{x} = 0, \quad \text{or}

\int_{-\infty}^{\infty} (\hat{\theta}(\mathbf{x}) - \theta)\,\frac{\partial p_{\mathbf{x}}(\mathbf{x}|\theta)}{\partial\theta}\,d\mathbf{x} = 1, (7.51)
since the first integral is equal to 1, by definition. We know that the derivative of the logarithm of the function p_x(x|θ) gives

\frac{\partial p_{\mathbf{x}}(\mathbf{x}|\theta)}{\partial\theta} = p_{\mathbf{x}}(\mathbf{x}|\theta)\,\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta}.

Now, we adjust the form of this relation for the application of the Schwartz inequality,

\bigg(\int_{-\infty}^{\infty} \Big[(\hat{\theta}(\mathbf{x}) - \theta)\sqrt{p_{\mathbf{x}}(\mathbf{x}|\theta)}\Big]\Big[\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta}\sqrt{p_{\mathbf{x}}(\mathbf{x}|\theta)}\Big]d\mathbf{x}\bigg)^2 = 1. (7.52)
According to the Schwartz inequality, \big(\int f(x)g(x)\,dx\big)^2 \le \int f^2(x)\,dx \int g^2(x)\,dx, it follows that

1 = \bigg(\int_{-\infty}^{\infty} \Big[(\hat{\theta}(\mathbf{x}) - \theta)\sqrt{p_{\mathbf{x}}(\mathbf{x}|\theta)}\Big]\Big[\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta}\sqrt{p_{\mathbf{x}}(\mathbf{x}|\theta)}\Big]d\mathbf{x}\bigg)^2
\le \int_{-\infty}^{\infty} (\hat{\theta}(\mathbf{x}) - \theta)^2\,p_{\mathbf{x}}(\mathbf{x}|\theta)\,d\mathbf{x} \int_{-\infty}^{\infty} \Big(\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta}\Big)^2 p_{\mathbf{x}}(\mathbf{x}|\theta)\,d\mathbf{x}.
Denoting

I(\theta) = E\Big\{\Big(\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta}\Big)^2\Big\},

we finally get the Cramer-Rao bound for the variance of the estimated parameter,

\text{Var}(\hat{\theta}(\mathbf{x})) \ge \frac{1}{E\big\{\big(\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))/\partial\theta\big)^2\big\}} = \frac{1}{I(\theta)}.

The equality in the Schwartz inequality holds if the two functions are proportional, that is,

\hat{\theta}(\mathbf{x}) - \theta = k\,\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta},

where the factor \sqrt{p_{\mathbf{x}}(\mathbf{x}|\theta)} on both sides is omitted. The constant k is obtained as k = 1/I(\theta) from the condition that the integral in (7.52) is equal to 1 for \hat{\theta}(\mathbf{x}) - \theta = k\,\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))/\partial\theta. Therefore, for the optimal estimator and the minimal variance, the following equality

\frac{\partial\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta} = I(\theta)\,(\hat{\theta}(\mathbf{x}) - \theta) (7.54)

holds. This relation can be used to find the optimal estimator, \hat{\theta}(\mathbf{x}), and the minimal variance, 1/I(\theta), without the evaluation of the second-order derivative in

E\Big\{\frac{\partial^2\ln(p_{\mathbf{x}}(\mathbf{x}|\theta))}{\partial\theta^2}\Big\} = -I(\theta). (7.55)
Example 7.29. Consider the signal x (n) = s(n) + ε(n), where ε(n) is a zero-mean Gaussian noise.
The aim is to estimate a parameter a of the sinusoidal signal s(n), for example, its amplitude,
frequency, or phase, from N samples of the signal, x = [ x (1), x (2), . . . , x ( N )] T . Find the
minimum variance estimator and the Cramer-Rao bound.
⋆The random variable x(n) − s(n) = ε(n) is Gaussian distributed. For N statistically independent values of the error ε(n), with the assumed parameter value a, it holds that

p_{\mathbf{x}}(x(1), \ldots, x(N)|a) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x(1)-s(1|a))^2}{2\sigma^2}} \times \cdots \times \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x(N)-s(N|a))^2}{2\sigma^2}}, (7.56)

or in vector form

p_{\mathbf{x}}(\mathbf{x}|a) = \frac{1}{\sigma^N\sqrt{(2\pi)^N}}\exp\Big(-\frac{\|\mathbf{x} - \mathbf{s}|a\|_2^2}{2\sigma^2}\Big),

where s(n|a) is the considered signal with the assumed parameter a, and s|a is its vector form.
The log-likelihood function for this random signal is

J(\mathbf{x}|a) = -\ln p_{\mathbf{x}}(\mathbf{x}|a) = \frac{N}{2}\ln(2\pi) + N\ln(\sigma) + \frac{\|\mathbf{x} - \mathbf{s}|a\|_2^2}{2\sigma^2}. (7.57)
(a) In the case when we want to estimate the amplitude a of the sinusoidal signal s(n|a) = a\cos(2\pi nk_0/N), the derivative of the log-likelihood function is

-\frac{\partial J(\mathbf{x}|a)}{\partial a} = \frac{1}{\sigma^2}\sum_{n=1}^{N}\big(x(n) - a\cos(2\pi nk_0/N)\big)\cos(2\pi nk_0/N)
= \frac{N}{2\sigma^2}\Big(\frac{2}{N}\sum_{n=1}^{N} x(n)\cos(2\pi nk_0/N) - a\Big), (7.59)

since \sum_{n=1}^{N} a\cos^2(2\pi nk_0/N) = aN/2. Comparing relation (7.59) with (7.54), we can conclude that the optimal estimator is the cosine transform

\hat{a} = g(\mathbf{x}) = \frac{2}{N}\sum_{n=1}^{N} x(n)\cos(2\pi nk_0/N),

while for the minimum variance, in this way, we confirm the previous result

\text{Var}\{\hat{a}\} \ge \frac{2\sigma^2}{N}.
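The estimator and its variance can be checked against the bound numerically (a minimal Python sketch, NumPy assumed; N, k₀, a, and σ are arbitrary example values):

    import numpy as np

    rng = np.random.default_rng(0)
    N, k0, a, sigma = 64, 5, 1.7, 1.0
    n = np.arange(1, N + 1)
    c = np.cos(2 * np.pi * n * k0 / N)

    # Many noisy realizations of x(n) = a cos(2 pi n k0 / N) + noise
    x = a * c + sigma * rng.standard_normal((10_000, N))
    a_hat = (2 / N) * (x @ c)               # cosine-transform estimator per realization

    print(a_hat.mean())                     # approx a = 1.7 (unbiased)
    print(a_hat.var(), 2 * sigma**2 / N)    # both approx the bound 0.03125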
(b) When the parameter a is the frequency of the sinusoidal signal, s(n|a) = \sin(an), we have \partial s(n|a)/\partial a = n\cos(an), and the bound for the variance of the frequency estimation is

\text{Var}\{\hat{a}\} \ge \frac{\sigma^2}{\sum_{n=1}^{N}\big(\partial s(n|a)/\partial a\big)^2} = \frac{\sigma^2}{\sum_{n=1}^{N} n^2\cos^2(an)}.
The Cramer-Rao bound is shown in Fig. 7.19 for N = 10 and N = 50, and various values of a,
with σ2 = 1.
Figure 7.19 Cramer-Rao bound for the variance of the frequency estimation.
Example 7.30. Consider the signal x(t_n) = a t_n + ε(n), n = 1, 2, . . . , N, where ε(n) is a zero-mean Gaussian noise. The goal is to revisit the linear regression model and the estimation of the parameter a and its variance. What is the optimal estimator for a from the available data x(t_n), given in the vector x, for the instants being elements of the vector t? What is the variance of the optimal estimator of a (the Cramer-Rao bound)?
⋆The cost function for this random signal, with zero-mean Gaussian noise ε(n), is

J(\mathbf{x}|a) = -\ln p_{\mathbf{x}}(\mathbf{x}|a) = \frac{N}{2}\ln(2\pi) + N\ln(\sigma) + \frac{\|\mathbf{x} - a\mathbf{t}\|_2^2}{2\sigma^2}. (7.61)
The first derivative is given by

-\frac{\partial J(\mathbf{x}|a)}{\partial a} = \frac{1}{\sigma^2}\sum_{n=1}^{N}\big(x(t_n) - a t_n\big)t_n = \frac{\|\mathbf{t}\|_2^2}{\sigma^2}\Big(\frac{\mathbf{t}^T\mathbf{x}}{\|\mathbf{t}\|_2^2} - a\Big).

When this expression is compared to I(\theta)(g(\mathbf{x}) - \theta) in (7.54), we get the optimal estimator form

\hat{a} = g(\mathbf{x}) = \frac{\mathbf{t}^T\mathbf{x}}{\|\mathbf{t}\|_2^2} = \frac{\sum_{n=1}^{N} t_n x(t_n)}{\sum_{n=1}^{N} t_n^2}, \quad \text{with} \quad \text{Var}\{\hat{a}\} \ge \frac{\sigma^2}{\|\mathbf{t}\|_2^2}.
Example 7.31. We can come to the Cramer-Rao relations in an inductive way, analyzing the mean value estimation for the Gaussian distributed random variable, presented in Section 7.4.5, with

p_{\mathbf{x}}(\mathbf{x}|\mu) = \frac{1}{\sigma^N\sqrt{(2\pi)^N}}\exp\Big(-\frac{\|\mathbf{x} - \mu\|_2^2}{2\sigma^2}\Big)

and the log-likelihood function (7.47), used for the estimation of the Gaussian distribution parameters,

J(\mu) = -\ln p_{\mathbf{x}}(\mathbf{x}|\mu) = \frac{N}{2}\ln(2\pi) + N\ln(\sigma) + \frac{\|\mathbf{x} - \mu\|_2^2}{2\sigma^2}. (7.62)

The estimated mean value follows from ∂J(µ)/∂µ = 0, with

\frac{\partial J(\mu)}{\partial\mu} = -\frac{\partial\ln p_{\mathbf{x}}(\mathbf{x}|\mu)}{\partial\mu} = -\frac{N}{\sigma^2}\Big(\frac{1}{N}\sum_{n=1}^{N} x(n) - \mu\Big), (7.63)

while the second-order derivative is

\frac{\partial^2 J(\mu)}{\partial\mu^2} = \frac{N}{\sigma^2}. (7.64)
In addition, it has been shown that an unbiased estimator attains the bound for all θ if and only if the first derivative of the log-likelihood function can be written in the form (7.54),

\frac{\partial\ln p_{\mathbf{x}}(\mathbf{x}|\theta)}{\partial\theta} = I(\theta)\,(g(\mathbf{x}) - \theta), (7.66)

where the estimator with the minimum variance is defined by \hat{\theta}(\mathbf{x}) = g(\mathbf{x}) and the minimum variance value is \text{Var}\{\hat{\theta}\} = \frac{1}{I(\theta)}.
Notice that the relation in (7.63) is of this form, with I(\mu) = N/\sigma^2 and g(\mathbf{x}) = \frac{1}{N}\sum_{n=1}^{N} x(n).
Cramer-Rao bound for the minimum variances in simultaneous estimation of more than one
parameter, for example, θ = (θ1 , θ2 , . . . , θK ), from the data in x can also be derived following similar
concepts.
The result of an experiment or calculation is commonly a random variable. When an estimate x(n) is provided, the main questions are how much confidence we can have in this specific value and how far from it the true (expected) value of the considered physical or mathematical quantity could be. The confidence interval provides a range within which the true value is estimated to lie. This interval quantifies the reliability of the presented estimate.
Consider a Gaussian distributed random variable as the most common case in practice. Assume that the experiment (or calculation) results are Gaussian distributed. The aim of the experiment is to estimate the unknown true value µ_x. For the Gaussian variable, it is known that any result of the experiment, x(n), will be within the interval

[\mu_x - 2\sigma_x,\ \mu_x + 2\sigma_x] (7.67)

with the probability of 0.9545 ≈ 0.95, according to (7.44). This probability is sufficient for most of the experiments. If required, the probability can be increased using wider intervals. Here, the unknown true value µ_x and the interval bounds are fixed values, without any randomness.
The confidence intervals are calculated for the specific outcome of the experiment, x(n), and the a priori estimated spread measure (here the standard deviation σ_x). The confidence interval is defined as

[x(n) - 2\sigma_x,\ x(n) + 2\sigma_x]. (7.68)
Obviously, the confidence interval is not the same as the interval in (7.67), meaning that the 0.95 probability of (7.67) does not mean that any result of the experiment will be within the confidence interval with the same probability. However, since the obtained result x(n) is within the interval in (7.67) with the probability of 0.95, it follows that the true value µ_x is within the confidence interval, [x(n) − 2σ_x, x(n) + 2σ_x], with the same probability, Fig. 7.20.
Example 7.32. A deterministic signal s(n), with an additive Gaussian noise ε(n), is observed at two
instants n1 and n2 . The standard deviation of the measurement method at the instant n1 was
σx (n1 ) = 0.5, while the standard deviation of the measurement method at n2 was σx (n2 ) = 0.2
(different estimation approaches were used, for example, different windows for averaging; for
the same measurement method, σx (n1 ) = σx (n2 ) would hold). The observed values in these two
measurements are denoted by x (n), and they are equal to:
(a) x (n1 ) = 1.1 and x (n2 ) = −0.2;
(b) x (n1 ) = −0.6 and x (n2 ) = 1.8.
Figure 7.20 The interval [µ x − 2σx , µ x + 2σx ] where the Gaussian random variable x (n) lies with the probability
of 0.95, along with the confidence intervals for various x (n) from the defined interval. The common point for all
these confidence intervals is the true mean value µ x (vertical line).
Could we conclude that the true signal s(n) has changed, that is, s(n_1) ≠ s(n_2), for these two cases? (For an experiment, this is the question of how confident we can be that a difference in the true result is obtained under different experiment conditions at n_1 and n_2.) The common probability of 0.95 is assumed for the confidence interval definition.
⋆ (a) For the signal values (experiment outcomes) x(n_1) = 1.1 and x(n_2) = −0.2, the corresponding confidence intervals are

[x(n_1) - 2\sigma_x(n_1),\ x(n_1) + 2\sigma_x(n_1)] = [1.1 - 1,\ 1.1 + 1] = [0.1, 2.1] (7.69)

and

[x(n_2) - 2\sigma_x(n_2),\ x(n_2) + 2\sigma_x(n_2)] = [-0.2 - 0.4,\ -0.2 + 0.4] = [-0.6, 0.2]. (7.70)

These two confidence intervals overlap, meaning that we cannot exclude the case that both true values, s(n_1) and s(n_2), are within the overlapping interval [0.1, 0.2] and that they take the same value within this overlapping interval.
(b) When the obtained signal values are x(n_1) = −0.6 and x(n_2) = 1.8, the corresponding confidence intervals are

[x(n_1) - 2\sigma_x(n_1),\ x(n_1) + 2\sigma_x(n_1)] = [-0.6 - 1,\ -0.6 + 1] = [-1.6, 0.4] (7.71)

and

[x(n_2) - 2\sigma_x(n_2),\ x(n_2) + 2\sigma_x(n_2)] = [1.8 - 0.4,\ 1.8 + 0.4] = [1.4, 2.2]. (7.72)

These two confidence intervals are clearly separated, meaning that the true signal values, s(n_1) and s(n_2), are different with a sufficiently high probability.
Example 7.33. Consider a random signal x (n) that can take values {No, Yes} or {0, 1} with
probabilities 1 − p and p. If a random realization of this signal is available with N = 1000
samples and we obtained that the event Yes appeared k = 555 times, find the interval where the
true p will be with the probability of 0.95.
Notes: The mean value of the samples x(n) is defined by

\hat{p} = \frac{1}{N}\big(x(1) + x(2) + \cdots + x(N)\big) = k/N.

For the binomial distribution, \binom{N}{k} p^k(1-p)^{N-k} with x(n) ∈ {0, 1}, the expected value of p̂ is

E\{\hat{p}\} = E\{k\}/N = \frac{pN}{N} = p.
The variance of p̂ is given by

\sigma_{\hat{p}}^2 = \text{Var}\{\hat{p}\} = \text{Var}\{k\}/N^2 = \frac{p(1-p)}{N}.
Figure 7.21 Binomial distribution for N = 1000 and p = 0.55 as a function of k (left) and the Gaussian distribution with the mean value pN and the variance σ² = p(1 − p)N (right).
⋆For the given observation, with k = 555 responses x(n) = 1, the expected value p of the binomially distributed random variable is estimated as

\hat{p} = \frac{k}{N} = \frac{555}{1000} = 0.555.
For the variance estimation we should know the exact value of p, which is not available. With the assumption that p̂ is not far from the exact p, we can use the value of p̂ in the variance calculation,

\sigma_{\hat{p}}^2 = \frac{p(1-p)}{N} \simeq \frac{\hat{p}(1-\hat{p})}{N} = \frac{\frac{555}{1000}\big(1 - \frac{555}{1000}\big)}{1000} = \frac{0.2470}{1000},

and σ̂_p̂ = 0.0157. Therefore, the estimated value p̂ = 0.555 is within the interval

\hat{p} = 0.555 \in [p - 2\hat{\sigma}_{\hat{p}},\ p + 2\hat{\sigma}_{\hat{p}}] = [p - 0.0314,\ p + 0.0314]

with the probability of 0.95, meaning that the true value p is within

p \in [\hat{p} - 0.0314,\ \hat{p} + 0.0314] = [0.5236, 0.5864]

with the same probability. The true value is around 55.5%, within a ±3.14% range (from 52.36% to 58.64%), with the probability of 0.95.
By increasing the value of N, we can reduce the margin of the estimation error (σ̂_p̂ ∝ 1/√N).
However, about 1000 values are commonly used for various opinion poll estimations.
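The calculation from this example in Python (values from the example):

    import numpy as np

    N, k = 1000, 555
    p_hat = k / N
    sigma_hat = np.sqrt(p_hat * (1 - p_hat) / N)

    # Two-sigma (approx. 0.95) confidence interval for the true p
    print(p_hat - 2 * sigma_hat, p_hat + 2 * sigma_hat)   # approx [0.5236, 0.5864]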
Bayesian analysis. Within the Bayes framework, the probability of the event B (k times x(n) = 1 (Yes) and N − k times x(n) = 0 (No)), with an assumed p, is equal to

P(B|p) = \binom{N}{k}\,p^k(1-p)^{N-k}. (7.73)

The posterior is

P(p|B) = P(B|p)P(p)/P(B) \propto P(B|p)P(p) \propto P(B|p),

for the assumed uniform prior P(p).
The value of the posterior P( p| B) is shown in Fig. 7.22 for 0 ≤ p ≤ 1 with a step of ∆p = 0.005.
From Fig. 7.22 we can conclude that the posterior, P(p|B), reaches its maximum at p = 0.555, while the region of significant P(p|B) values extends about 7 steps of Δp = 0.005 to the left and right of the maximum position, corresponding to 0.555 ± 7 · 0.005 = 0.555 ± 0.035.
Student's t-distribution. In the previous analysis, we assumed that the standard deviation is known. When the true value (mean) estimation is done based on a small number of samples, then the standard deviation has to be estimated as well. For the set of samples x(n), we have the mean and the variance estimates

\hat{\mu}_x(n) = \frac{1}{N}\big(x_1(n) + x_2(n) + \cdots + x_N(n)\big) (7.74)

\hat{\sigma}_x(n) = \sqrt{\frac{1}{N-1}\big(|x_1(n) - \hat{\mu}_x(n)|^2 + |x_2(n) - \hat{\mu}_x(n)|^2 + \cdots + |x_N(n) - \hat{\mu}_x(n)|^2\big)}. (7.75)
Figure 7.22 The posterior P(p|B), proportional to the binomial distribution for the event B when N = 1000 and k = 555, as a function of the probability p.
The new random variable

z(n) = \frac{\hat{\mu}_x(n) - \mu}{\hat{\sigma}_x(n)/\sqrt{N}},

in which both µ̂_x(n) and σ̂_x(n) are random, is t-distributed (Student's distribution). The t-distribution is defined for a given number of degrees of freedom, ν = N − 1, using gamma functions. For large ν it approaches the Gaussian distribution, while for ν = 1 (just two samples) it is quite heavy-tailed and equal to the Cauchy distribution (see Section 7.4.11). The interval −t_ν < z(n) < t_ν, within which a t-distributed random variable z(n) takes its value with the probability 0.95, is defined by

\text{Probability}\{-t_\nu < z(n) < t_\nu\} = 0.95 (7.76)
for the values of t_ν given in Table 7.4 for several values of ν = N − 1. We can conclude that the confidence intervals are very wide for small N (for example, six times wider for N = 2 than for N = 60), while they are almost the same as for the Gaussian distributed random variable for large N, for example, N ≥ 12.

ν = N − 1:  1       2      3      5      12     20     60     120
t_ν:        12.706  4.303  3.182  2.571  2.179  2.086  2.000  1.980

Table 7.4 Values of the interval bounds for the t-distribution for Probability{−t_ν < z(n) < t_ν} = 0.95.
Example 7.34. The available samples of the random signal x, with the elements x (n), are given by
(a) x = (0.93, 0.17, −0.69, −0.72),
(b) x = (0.93, 0.17, −0.69, −0.72, −0.57, −0.31, 0.27, 1.33, −1.33, 1.40, −0.57, −0.35, −0.64).
Find the confidence intervals, where the true mean of this random signal is expected with
the probability of 0.95.
⋆In this experiment, both the mean value and the variance are not known and should be estimated
based on the available data.
(a) For N = 4, ν = N − 1 = 3, and the available realizations of x(n), we have

\hat{\mu}_x = \frac{1}{4}\big(x(0) + x(1) + x(2) + x(3)\big) = -0.08,

\hat{\sigma}_x = \sqrt{\frac{1}{3}\big(|x(0) - \hat{\mu}_x|^2 + |x(1) - \hat{\mu}_x|^2 + |x(2) - \hat{\mu}_x|^2 + |x(3) - \hat{\mu}_x|^2\big)} = 0.79.

The confidence interval of the normalized and centered random signal

z(n) = \frac{\hat{\mu}_x - \mu}{\hat{\sigma}_x/\sqrt{4}} = \frac{-0.08 - \mu}{0.39},

for the probability of 0.95, is defined by −3.182 < z(n) < 3.182 (see Table 7.4), or

[-0.08 - 0.39 \times 3.182,\ -0.08 + 0.39 \times 3.182] = [-0.08 - 1.24,\ -0.08 + 1.24]. (7.77)
(b) For N = 13 and the available realizations of x(n), we have

\hat{\mu}_x = \frac{1}{13}\sum_{n=0}^{12} x(n) = -0.08,

\hat{\sigma}_x = \sqrt{\frac{1}{12}\sum_{n=0}^{12} |x(n) - \hat{\mu}_x|^2} = 0.82,

and the confidence interval, for ν = N − 1 = 12 and the probability of 0.95, is (see Table 7.4)

[-0.08 - 0.23 \times 2.179,\ -0.08 + 0.23 \times 2.179] = [-0.08 - 0.5,\ -0.08 + 0.5]. (7.78)
Although the same mean value is obtained in both cases, with similar standard deviations, the confidence interval in (b) shows that the experiment with N = 13 realizations produces a more reliable estimate of the true mean µ.
Repeat Example 7.27 with the data from this example and comment on the results within
the frequentist and the Bayesian framework.
Variance stabilization – Delta method. Consider again the Bernoulli random variable from Example 7.33. The estimate of the expected value, p, of the probability that x(n) = 1 will appear in the Bernoulli trial is given by

\hat{p} = \frac{k}{N} = \frac{1}{N}\sum_{n=1}^{N} x(n),

where k is the number of x(n) = 1 appearances in N samples. For large N, this estimate is approximately Gaussian distributed, Fig. 7.21, with the expected value E{p̂} = E{k}/N = pN/N = p and the variance

\sigma_{\hat{p}}^2 = \text{Var}\{\hat{p}\} = \text{Var}\{k\}/N^2 = p(1-p)/N.
The property that p̂ − p tends to a Gaussian distributed random variable can be written as

\hat{p} - p \xrightarrow{D} \mathcal{N}(0, \sigma_{\hat{p}}^2).

The problem in the confidence interval definition for p was that the variance σ_p̂² depends on the parameter p, which is to be estimated. This was solved in Example 7.33 using p ≃ p̂. Another approach to this problem is based on the so-called Delta method. This method states that, for any differentiable function g(x),

g(\hat{p}) - g(p) \xrightarrow{D} \mathcal{N}\big(0, (g'(p))^2\sigma_{\hat{p}}^2\big), (7.79)

for g'(p) ≠ 0. The proof is simple since, for a large data set size N, we can assume that p̂ − p is small, so that the linear Taylor series expansion of the function g(p̂) around p holds, that is,

g(\hat{p}) \approx g(p) + g'(p)(\hat{p} - p),

meaning that g(p̂) − g(p) behaves as (p̂ − p), for sufficiently large N, with the deterministic proportionality factor g'(p). Since Var{a(p̂ − p)} = a²Var{p̂}, from the previous relation we get (7.79). Choosing g(p) = \arcsin(\sqrt{p}), with

g'(p) = \frac{1}{2\sqrt{p(1-p)}},

we obtain

(g'(p))^2\sigma_{\hat{p}}^2 = (g'(p))^2\,p(1-p)/N = \frac{1}{4N}.
This means that the random variable \arcsin(\sqrt{\hat{p}}) is Gaussian distributed with the expected value \arcsin(\sqrt{p}) and the variance 1/(4N). The confidence interval for \arcsin(\sqrt{p}), with the probability of 0.95, is defined by the two-sigma rule,

\arcsin(\sqrt{p}) \in \Big[\arcsin(\sqrt{\hat{p}}) - \frac{2}{\sqrt{4N}},\ \arcsin(\sqrt{\hat{p}}) + \frac{2}{\sqrt{4N}}\Big]. (7.81)
The confidence interval for p is then obtained by taking the sine of both bounds and squaring the result,

p \in \Big[\sin^2\Big(\arcsin(\sqrt{\hat{p}}) - \sqrt{\tfrac{1}{N}}\Big),\ \sin^2\Big(\arcsin(\sqrt{\hat{p}}) + \sqrt{\tfrac{1}{N}}\Big)\Big]. (7.82)

In the case when \sin(\arcsin(\sqrt{\hat{p}}) - 1/\sqrt{N}) is negative, the zero value is used as the lower bound.
For the data from Example 7.33 we get

p \in \Big[\sin^2\Big(\arcsin(\sqrt{0.555}) - \sqrt{\tfrac{1}{1000}}\Big),\ \sin^2\Big(\arcsin(\sqrt{0.555}) + \sqrt{\tfrac{1}{1000}}\Big)\Big] = [0.5235, 0.5863].
This interval is almost the same as [0.5236, 0.5864], obtained in Example 7.33, using p ≃ p̂ in the
variance estimation.
The bootstrap is a simple method for statistical inference that exploits remarkable modern computing power, without relying on many assumptions about the random variable. The main idea is to estimate a statistic of the considered signal by increasing the number of signal realizations using the existing data and resampling. Here is the origin of the method's name, “pulling itself up by its own bootstraps.” In producing many realizations, the bootstrap method relies on resampling the existing signal with replacement.
The bootstrap method can be summarized as follows:
1. Consider a signal (data set) { x (n), n = 1, 2, . . . , N }, being a part of a much larger population { x (n), n = 1, 2, . . . , P}, P ≫ N. The aim is to provide a statistic as an estimate of the corresponding large-population parameter, using the available data set with N samples only.
2. The original data set x (n), n = 1, 2, . . . , N, is resampled into new signals of length M. We will consider the cases M = N/2 and M = N; the inference is performed based on these resampled data.
A new resampled realization of the signal is formed as follows: (a) A random signal sample from
x (n), n = 1, 2, . . . , N, is picked up and assigned to x1 (1). Then this sample is “returned” to the
original data set (so that it can be picked up again, by chance – resampling with replacement).
(b) A new signal sample is randomly picked up from the original set x (n), n = 1, 2, . . . , N and
assigned to x1 (2). This sample is also “returned” to the original data set. This procedure is
repeated M times to form a new resampled signal (bootstrap sample) x1 with M elements.
3. The desired statistic (in our example, we will consider the mean value) is estimated using x1 as
µ̂(1) = mean{x1 }.
4. Steps 2 and 3 are repeated for every xb , b = 1, 2, . . . , B, to get
$$\hat{\mu}(b) = \mathrm{mean}\{\mathbf{x}_b\}, \quad b = 1, 2, \dots, B,$$
as illustrated in the sketch below.
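A minimal Python sketch of this procedure follows. Since the data values of Table 7.1 are not reproduced here, a Gaussian sample with the statistics from Example 7.5 (mean 55.76, standard deviation 17.73) stands in for the original data set; this substitution is made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_means(x, M, B=1000):
    """Resample x with replacement B times (M samples each) and
    return the B bootstrap estimates of the mean."""
    x = np.asarray(x)
    idx = rng.integers(0, len(x), size=(B, M))  # resampling with replacement
    return x[idx].mean(axis=1)

# Stand-in for the N = 100 values of Table 7.1 (assumed Gaussian here)
x = rng.normal(55.76, 17.73, 100)
mu_b = bootstrap_means(x, M=50, B=1000)

# Confidence interval from the 0.05 and 0.95 levels of the bootstrap statistics
print(np.quantile(mu_b, [0.05, 0.95]), x.mean())
```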
Example 7.35. In order to introduce the basic definitions and principles of the bootstrap method we
will revisit the introductory Example 7.1 and the signal shown in Fig. 7.1, whose values are given
in Table 7.1. Here, we will assume that this set of N = 100 signal values, x (n), is a sample
of a large population with P ≫ N elements. The aim is to estimate the mean value of a large
population using the statistics of the available data set.
In order to perform the statistical analysis using the bootstrap, new realizations should be
created by resampling the original data with replacement. An illustration of this resampling is
given in Table 7.5 for M = 20 and B = 15. The new resampled signals, xb , are obtained by
sampling the original data with replacement, as described in Step 2. Consider, for example, x7 ,
given in the seventh column of this table. Note that the signal sample x (n) = 48 is repeated,
although there is only one sample x (n) = 48 in the original data set, while many other signal
values do not appear at all in this realization.
A set of B = 1000 resampled realizations of the original signal, xb , is formed next. The
bootstrap is applied to this data set with: (a) M = N/2 = 50 and (b) M = N = 100.
The results are shown in Fig. 7.23. We can see that the maximum of the normalized
histogram of B = 1000 values of µ̂(b) = mean{xb } is close to the sample mean value calculated
as the sample average of all 100 available data values. We can also conclude that the confidence
intervals can be estimated considering the probability distribution, Fµ , and its, for example, 0.05
and 0.95 levels.
Since the considered data in Example 7.1 are of Gaussian nature (which is not an assumption required by the bootstrap method), we can compare this result with the one obtained from the variance of the mean value estimate in the Gaussian distributed variable, (7.65), using the standard deviation calculated in Example 7.5, $\sigma_\mu = \hat{\sigma}_x/\sqrt{M} = 17.73/\sqrt{50} = 2.5$.
For the probability of 0.90 the confidence intervals for the Gaussian distribution would be
[55.76 − 1.65σµ , 55.76 + 1.65σµ ] = [51.63, 59.88]. The confidence intervals of the mean value
estimation obtained with the bootstrap method correspond with the theoretical ones for this
distribution.
Table 7.5
Bootstrap resampling of the signal x (n), n = 1, 2, . . . , 100 from Fig. 7.1. B = 15 new signals xb , b = 1, 2, . . . , B, are
formed. Every new signal is of length M = 20. New resampled signals are formed by randomly picking up a sample from x (n), n = 1, 2, . . . , N, then “returning” this sample into the original set (so that it can be picked up again, by chance), randomly picking up a second sample for xb , “returning” it, and so on, M times. This procedure is repeated for every xb , b = 1, 2, . . . , B. In practice, B is commonly large.
Figure 7.23 Bootstrap statistics of the mean value of a large population with the reduced set of available data
shown in Fig. 7.1. A large point on the horizontal axis stands for the sample average of the considered data set.
Hypothesis testing has been part of statistics since its foundations were established in the first part of the last century. The main goal of hypothesis testing is to provide a statistical decision based on the experimental data (random signal values). Although an answer to this kind of question can be obtained, in an indirect way, using the presented Bayes' inference or the confidence intervals, we will present here the original analysis due to its importance in signal processing and detection theory.
The basic concepts in the hypothesis testing are:
• Null hypothesis, H0 . It assumes that the tested event has not happened and that the experiment
result is obtained by pure chance.
• Alternative hypothesis H1 is contrary to the null hypothesis, meaning that the null hypothesis is
rejected and the experiment result is not obtained by pure chance.
• Level of significance shows how confident we are in the decision made about accepting or rejecting the null hypothesis, since a probability of 1 is not possible in this kind of testing. It is common to assume that the level of significance is equal to α = 0.05 or α = 0.01, corresponding to the probabilities of 0.95 or 0.99.
• Type I error or false-positive result, when the null hypothesis is rejected although this hypothesis was true.
• Type II error or false-negative result, when the null hypothesis is accepted, while this hypothesis was not true.
Example 7.36. Consider a multiple-choice test with 5 answers to each of N = 20 questions. Only one
of these 5 answers is correct for every question. The null hypothesis, H0 , is the assumption that
the person who answers the test does not have any knowledge of the test topic. Find the number
of the correct answers when the null hypothesis can be accepted with the probability of 0.95.
⋆ The probability of a correct answer to a specific question, if the null hypothesis holds, is
p = 1/5. The probability that the person will give k correct answers to N = 20 questions, with
the null hypothesis, is already calculated, (7.37), and it is equal to
$$P(k|H_0) = \binom{20}{k} p^k (1-p)^{20-k}.$$
These probabilities are calculated for every k and shown in Fig. 7.24(a). The probability
distribution is given in Fig. 7.24(b) and (c).
Figure 7.24 Hypothesis testing. (a) The logarithm of the probability of k correct answers with the null hypothesis,
P(k | H0 ). (b) Cumulative probability distribution of P(k| H0 ). (c) Values of the complementary probability
distribution, being equal to the probability that more than k correct answers will be given with the null hypothesis.
Now, for the given probability we can find the limit number, k, of correct answers if the null hypothesis is true. Obviously, it is k = 7 for α = 0.05. This means that if the person has given k < 7 correct answers, the decision should be that the null hypothesis (the person does not have any knowledge of the tested subject) is true, with a significance level of 0.05. The hypothesis rejection region is k ≥ 7.
For the significance level of α = 0.01, the hypothesis rejection region would be k ≥ 8.
For example, if the tested person provided k = 10 correct answers on this multiple-choice test, the so-called p-value of this result is equal to the probability that the considered experiment produces such an outcome or a more extreme one,
$$p = \sum_{k=10}^{20} P(k|H_0) = \sum_{k=10}^{20}\binom{20}{k} p^k (1-p)^{20-k} = 0.0026 < \alpha.$$
This means that for k = 10 correct answers, the null hypothesis should be rejected for both α = 0.05 and α = 0.01.
Finally, we will calculate the Type I error (false-positive result) for the case k = 10. It is equal to the probability that we have decided to reject the null hypothesis, since the p-value is p = 0.0026 < α, but that the person, in reality, does not have any knowledge in this area and the null hypothesis was true. The Type I error is equal to the probability that the null hypothesis holds and that there were k = 10 correct answers,
$$P(k|H_0) = \binom{20}{k} p^k (1-p)^{20-k} = 0.002,$$
meaning that about 1 person in 500 will achieve this kind of result.
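The limit numbers and the p-value of this example can be checked with a short Python sketch; the tail criterion below follows the complementary distribution from Fig. 7.24(c).

```python
from math import comb

N, p = 20, 1/5

# P(k | H0): probability of exactly k correct answers by pure guessing
P = [comb(N, k) * p**k * (1 - p)**(N - k) for k in range(N + 1)]

# Probability of MORE than k correct answers (panel (c) of Fig. 7.24);
# the limit number is the smallest k for which this tail drops below alpha
tail = lambda k: sum(P[k + 1:])
for alpha in (0.05, 0.01):
    k_lim = next(k for k in range(N + 1) if tail(k) < alpha)
    print(f"alpha={alpha}: limit number k = {k_lim}")   # 7 and 8, as in the text

# p-value for an observed result of k = 10 correct answers
print("p-value(k=10) =", sum(P[10:]))                   # ~0.0026
```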
In many practical cases, we may assume that the random variables (random signal samples) in
the considered experiment (in hypothesis testing, called population) are Gaussian distributed, under
the null hypothesis. This assumption also holds if the particular random variable is not Gaussian, but
the total number of samples is sufficiently large so that the distribution of the sample mean value is
approximately Gaussian, for example, as was the case in the poll analysis in Example 7.33, and
proven in Example 7.24.
Consider a random signal, x (n), whose probability density function is
$$p_x(\xi) = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-(\xi-\mu_x)^2/(2\sigma_x^2)}$$
under the null hypothesis. The result of the experiment is the signal sample x (n) = A. Here, we may
consider three possible scenarios of practical interest for the null hypothesis rejection:
• The experiment result is not equal to the expected mean (two-sided test). This case corresponds
to the case when we want to make the decision if any constant value (positive or negative) is
added to the considered random variable under the null hypothesis. For the assumed level of
significance, the region of rejection is obtained from
$$\mathrm{Probability}\{|A - \mu_x| > \lambda\} = 1 - \operatorname{erf}\!\left(\frac{\lambda}{\sqrt{2}\,\sigma_x}\right) < \alpha.$$
For the significance level of α = 0.05, the rejection region for the null hypothesis is
| A − µ x | > 2σx .
• The experiment result is greater than the expected mean (right-tailed test). This scenario appears
when the aim is to establish if a certain action has increased the expected value in a positive
direction. Here, we are not interested in a possible decrease in the mean. For the assumed level
of significance, the region of rejection follows from
$$\mathrm{Probability}\{A - \mu_x > \lambda\} = \frac{1}{2}\left(1 - \operatorname{erf}\!\left(\frac{\lambda}{\sqrt{2}\,\sigma_x}\right)\right) < \alpha.$$
For the significance level of α = 0.05, the rejection region for the null hypothesis is
$$A - \mu_x > 1.645\,\sigma_x. \qquad (7.83)$$
• The experiment result is lower than the expected mean (left-tailed test). This is opposite to the
previous one.
Example 7.37. An author was selling 34 books on average per week. To improve the sales of his book,
the author designed and implemented an advertisement campaign. The following week, he sold
41 books. Can the author reject the null hypothesis (meaning that the advertisement campaign
had no impact on book sales) with a significance level of α = 0.05?
The number of sold books with the null hypothesis obeys the Poisson distribution (Section
7.4.12 and Problem 7.23)
$$P(x(n) = k|H_0) = \frac{\lambda^k e^{-\lambda}}{k!}$$
with λ = 34, which can be approximated (for large λ ≥ 20) by the Gaussian distribution with µ = λ = 34 and σ² = λ = 34, as illustrated in Fig. 7.27,
$$p(\xi|H_0) = \frac{1}{\sqrt{68\pi}}\, e^{-(\xi-34)^2/68}.$$
⋆ Since we are looking for a possible influence of the advertisement campaign on the increase in the number of sold books, we are interested in the right-tailed test, where the criterion for the hypothesis rejection, (7.83), requires
$$41 - 34 > 1.645\sqrt{34},$$
with the observed value A = 41. Since 1.645√34 = 9.5919 > 7, the author cannot reject the null hypothesis, meaning that the hypothesis that the advertisement campaign does not have any influence on the number of sold books cannot be rejected.
Example 7.38. The Fourier transform of a signal is presented in Fig. 7.25(left). The Fourier transform
elements of the noise only (null hypothesis) are zero-mean Gaussian random variables with the
variance σX2 = 1. For every element of the Fourier transform, X ( k ), test the null hypothesis and
indicate the elements for which this hypothesis can be rejected with significance level α = 0.001,
meaning that we can reject the hypothesis that there is no signal component at the considered
frequency index.
⋆ For the significance level of α = 0.001 the rejection region of the null hypothesis, for the
Gaussian random variable with the mean µ X and the variance σX2 , is
$$\mathrm{Probability}\{|X(k) - \mu_X| > \lambda\} = 1 - \operatorname{erf}\!\left(\frac{\lambda}{\sqrt{2}\,\sigma_X}\right) < 0.001,$$
$$|X(k) - \mu_X| > 3.2905\,\sigma_X \quad \text{or} \quad |X(k)| > 3.2905.$$
Therefore, the null hypothesis cannot be rejected for any Fourier transform element (µX = 0), except those at k ∈ {4, 10, 66, 71, 88}, as shown in Fig. 7.25(right), where the rejection region, for the significance level α = 0.001, is shaded.
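A short Python sketch of this test follows. The component amplitude of 5 added at the indicated frequency indices is a hypothetical value, chosen only so that the rejection threshold is exceeded; the noise model (real-valued, unit variance) follows the example.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 128

# Noise-only DFT model: zero-mean Gaussian with variance 1 in each element
X = rng.normal(0, 1, N)

# Hypothetical signal components added at a few frequency indices
for k in (4, 10, 66, 71, 88):
    X[k] += 5.0

# Reject H0 (noise only) where |X(k)| exceeds the alpha = 0.001 threshold
threshold = 3.2905
print("H0 rejected at k =", np.flatnonzero(np.abs(X) > threshold))
```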
When the mean value and the variance of the random variable in the experiment are not known in
advance, then the t-distribution (see Example 7.34) should be used.
In many applications the complex-valued Gaussian noise is used as a model for disturbance. Its form is ε(n) = ε r (n) + jε i (n), where ε r (n) and ε i (n) are real-valued Gaussian noises. Commonly, it is assumed that they are zero-mean, independent, identically distributed (i.i.d.), with variance σ²/2. The mean value of this noise is
$$\mu_\varepsilon = E\{\varepsilon(n)\} = E\{\varepsilon_r(n)\} + jE\{\varepsilon_i(n)\} = 0.$$
Figure 7.25 The null hypothesis testing for the Fourier transform of a signal with zero-mean Gaussian noise with
the variance σX2 = 1 (left). The null hypothesis rejection regions (shaded) for the random variable X (k) with the
significance level of α = 0.001, corresponding to | X (k )| > 3.2905.
The variance is
$$\sigma_\varepsilon^2 = E\{\varepsilon(n)\varepsilon^*(n)\} = E\{\varepsilon_r^2(n)\} + E\{\varepsilon_i^2(n)\} + j\left(E\{\varepsilon_i(n)\varepsilon_r(n)\} - E\{\varepsilon_r(n)\varepsilon_i(n)\}\right) = E\{\varepsilon_r^2(n)\} + E\{\varepsilon_i^2(n)\} = \sigma^2.$$
The amplitude of Gaussian noise |ε(n)| is an important parameter in many detection problems.
The probability density function of the complex-Gaussian noise amplitude is of the form
$$p_{|\varepsilon(n)|}(\xi) = \frac{2\xi}{\sigma^2}\, e^{-\xi^2/\sigma^2}\, u(\xi).$$
The probability density function p|ε(n)| (ξ ) is called the Rayleigh distribution.
In order to prove the previous relation, consider the probability density functions of ε r (n) and ε i (n). Since they are independent and equally distributed, their joint probability density function is
$$p_{\varepsilon_r,\varepsilon_i}(\xi,\zeta) = \frac{1}{\sigma^2\pi}\, e^{-(\xi^2+\zeta^2)/\sigma^2}.$$
With ξ = ρ cos α and ζ = ρ sin α (the Jacobian of the polar coordinate transformation is J = |ρ|), we get
$$P\left\{\sqrt{\varepsilon_r^2(n)+\varepsilon_i^2(n)} < \chi\right\} = \frac{1}{\sigma^2\pi}\int_0^{2\pi}\!\!\int_0^{\chi} e^{-\rho^2/\sigma^2}\rho\, d\rho\, d\alpha = \int_0^{\chi^2/\sigma^2} e^{-\lambda}\, d\lambda = \left(1 - e^{-\chi^2/\sigma^2}\right)u(\chi) = F_{|\varepsilon(n)|}(\chi).$$
$$p_{|\varepsilon(n)|}(\xi) = \frac{dF_{|\varepsilon(n)|}(\xi)}{d\xi} = \frac{2\xi}{\sigma^2}\, e^{-\xi^2/\sigma^2}\, u(\xi). \qquad (7.84)$$
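The Rayleigh form (7.84) can be verified by simulation; a minimal Python sketch follows (the sample size and bin range are arbitrary choices made for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
sigma2, n = 1.0, 100_000

# Complex Gaussian noise with i.i.d. real/imaginary parts, each N(0, sigma2/2)
eps = rng.normal(0, np.sqrt(sigma2 / 2), n) + 1j * rng.normal(0, np.sqrt(sigma2 / 2), n)
amp = np.abs(eps)

# Compare the empirical histogram of |eps| with the Rayleigh pdf (7.84)
hist, edges = np.histogram(amp, bins=50, range=(0, 4), density=True)
centers = (edges[:-1] + edges[1:]) / 2
pdf = 2 * centers / sigma2 * np.exp(-centers**2 / sigma2)
print("max |hist - pdf| ~", np.max(np.abs(hist - pdf)))  # small for large n
```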
Example 7.39. A random signal is defined as y(n) = |ε(n)|, where ε(n) is the Gaussian complex
zero-mean i.i.d. noise with variance σ2 . What is the probability that y(n) ≥ A? Calculate this
probability for A = 2 and σ2 = 1.
⋆ The random variable y(n) is Rayleigh distributed,
$$p_y(\xi) = \frac{2\xi}{\sigma^2}\, e^{-\xi^2/\sigma^2}\, u(\xi).$$
The probability that y(n) ≥ A is
$$P\{y(n) \geq A\} = 1 - P\{y(n) \leq A\} = 1 - \int_0^{A}\frac{2\xi}{\sigma^2}\, e^{-\xi^2/\sigma^2}\, d\xi = 1 - \left(1 - e^{-A^2/\sigma^2}\right) = e^{-A^2/\sigma^2}.$$
For A = 2 and σ² = 1, this probability is e⁻⁴ ≈ 0.0183.
The Rayleigh distribution can be related to the χ-squared distribution, which is obtained as the distribution of the sum of squares of N Gaussian random variables, xi (n), i = 1, 2, . . . , N,
$$z(n) = \sum_{i=1}^{N} x_i^2(n).$$
The distribution of z(n) = |ε(n)|², where |ε(n)| is the Rayleigh distributed variable, is equal to the χ-squared distribution of z(n) with N = 2 (see Example 7.14).
Laplacian noise is used to model disturbances when strong impulses occur more often than in the case of Gaussian noise. Due to possible stronger pulses, its probability density function decays toward ±∞ more slowly than in the case of Gaussian noise (a definition of the so-called heavy-tailed noise is given in Example 7.15).
The Laplacian noise has the probability density function
$$p_{\varepsilon(n)}(\xi) = \frac{1}{2\alpha}\, e^{-|\xi|/\alpha}.$$
It decays much more slowly as |ξ| increases than in the Gaussian noise case.
The Laplacian noise can be generated using
$$\varepsilon(n) = \varepsilon_1(n)\varepsilon_2(n) + \varepsilon_3(n)\varepsilon_4(n),$$
where ε i (n), i = 1, 2, 3, 4, are real-valued, independent, zero-mean Gaussian noises, Fig. 7.26 (for the variance of this noise see Problem 7.20).
The parameters of the Laplace distributed signal can be estimated from data, as it is done in Section
7.4.5. For the stationary Laplacian distributed random variable x (n), x = [ x (1), x (2), . . . , x ( N )] T , with
mean µ, the likelihood maximization problem is equivalent to the log-likelihood minimization problem
again stated as
$$(\alpha, \mu)_{\mathrm{MLE}} = \arg\min\left\{-\ln p_x(\mathbf{x}|\alpha,\mu)\right\}, \qquad (7.85)$$
with
$$p_x(\mathbf{x}|\alpha,\mu) = \frac{1}{2\alpha}\, e^{-|x(1)-\mu|/\alpha} \times \cdots \times \frac{1}{2\alpha}\, e^{-|x(N)-\mu|/\alpha} = \frac{1}{2^N\alpha^N}\, e^{-\|\mathbf{x}-\mu\|_1/\alpha}. \qquad (7.86)$$
Here, we have to minimize the cost function −ln px(x|α, µ), defined by
$$J(\alpha,\mu) = N\ln(2) + N\ln(\alpha) + \frac{\|\mathbf{x}-\mu\|_1}{\alpha}, \qquad (7.87)$$
where ‖x − µ‖₁ is the one-norm (L₁-norm) of the vector x − µ. The solution to the L₁-norm minimization problem is presented in Section 7.1.2,
$$\mu = \mathrm{median}\{\mathbf{x}\}, \qquad \alpha = \frac{1}{N}\|\mathbf{x}-\mu\|_1 = \frac{1}{N}\|\mathbf{x}-\mathrm{median}\{\mathbf{x}\}\|_1.$$
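A minimal Python sketch of this estimation is given below, assuming the product-based generation of Laplacian noise described above; the sample size is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(3)

# Laplacian noise generated as e1*e2 + e3*e4 from four independent
# zero-mean, unit-variance Gaussian noises (cf. Fig. 7.26)
e = rng.normal(0, 1, (4, 10_000))
x = e[0] * e[1] + e[2] * e[3]

# Maximum-likelihood estimates for the Laplace distribution:
# location = median, scale = mean absolute deviation from the median
mu_hat = np.median(x)
alpha_hat = np.mean(np.abs(x - mu_hat))
print(f"mu_hat = {mu_hat:.3f}, alpha_hat = {alpha_hat:.3f}")  # alpha ~ 1 here
```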
Figure 7.26 The Gaussian and Laplacian noise histograms (with 10000 realizations), with corresponding
probability density function (dots).
using N = 1001 realizations of the zero-mean Gaussian distributed random variables, x1 (n),
x2 (n), x3 (n), and x4 (n), with the same variance σx = 1.
The Laplacian distribution parameters, obtained by minimizing the cost function (7.87), are
µ = median{y} = 0.98
and
α = ||y − median{y}||1 /N = 1.04,
where y = [y(1), y(2), . . . , y(1001)]T .
We can also calculate the posterior distribution of the parameters α and µ, with the likelihood
$$p_y(\mathbf{y}|\alpha,\mu) = \frac{1}{(2\alpha)^N}\, e^{-(|y(1)-\mu| + |y(2)-\mu| + \cdots + |y(N)-\mu|)/\alpha},$$
and present it, for a given N, using discrete sets of α and µ, as in Fig. 7.18.
The impulsive noise could be distributed in other ways, like, for example, the Cauchy distributed
noise, whose probability density function is
$$p_{\varepsilon(n)}(\xi) = \frac{1}{\pi(1+\xi^2)}.$$
The Cauchy distributed noise ε(n) is a random signal that can be obtained as a ratio of two independent
Gaussian random signals ε 1 (n) and ε 2 (n), that is, as
$$\varepsilon(n) = \frac{\varepsilon_1(n)}{\varepsilon_2(n)}.$$
Another realization of the Cauchy random signal and the definition of the heavy-tailed noise are given
in Example 7.15.
The Poisson noise (or shot noise) is a random signal ε(n) which can take discrete integer values k with
the probability of
$$P(\varepsilon(n) = k) = P(k) = \frac{\lambda^k e^{-\lambda}}{k!} \quad \text{for } \lambda > 0.$$
The mean value and the variance of ε(n) are µε = λ and σε2 = λ, respectively (see Problem 7.23).
The Poisson random variable is commonly used to model small-probability discrete events. It is typically
concerned with the number of events (for example, the number of phone calls in communications or
the actual number of particles detected in an image sensor) that occur in a certain (unit) time interval.
Figure 7.27 Poisson probability for λ = 5 (left), λ = 10 (middle), and λ = 20 (right), along with the Gaussian
probability density function (crosses) with the mean value µ = λ = 20 and the variance σ2 = λ = 20.
Example 7.41. Within a long duration, continuous-time signal, an impulsive disturbance appears 15
times per minute, on average. What is the probability that there will be less than 3 impulsive
disturbances within a randomly selected continuous-time interval, whose duration is 24 seconds?
⋆ Since the analyzed interval is 24 seconds, all parameters will be reduced to 24 seconds, as the
unit of time. The average number of disturbances within every 24 seconds is 15/60 × 24 = 6.
This means that the parameter λ in the Poisson distribution is λ = 6. The probability that there
are less than 3 disturbing events in 24 seconds is then equal to the probability that there are either
0 disturbances, ε(n) = 0, or 1 disturbance, ε(n) = 1, or 2 disturbances, ε(n) = 2, within the
selected interval, that is
$$P(\varepsilon(n)=0) + P(\varepsilon(n)=1) + P(\varepsilon(n)=2) = \sum_{k=0}^{2}\frac{6^k e^{-6}}{k!} = e^{-6} + 6e^{-6} + \frac{6^2 e^{-6}}{2!} = 0.062.$$
This means that the event of less than 3 disturbances in 24 seconds will occur once in about 16
such intervals.
The probability of, for example, 6 or fewer disturbances in 24 seconds would be 0.6063.
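Both probabilities from this example can be checked with a few lines of Python:

```python
from math import exp, factorial

lam = 6  # average number of disturbances per 24-second interval

# Probability of fewer than 3 disturbances in one interval
p_lt3 = sum(lam**k * exp(-lam) / factorial(k) for k in range(3))
print(f"P(fewer than 3) = {p_lt3:.3f}")   # ~0.062

# Probability of 6 or fewer disturbances
p_le6 = sum(lam**k * exp(-lam) / factorial(k) for k in range(7))
print(f"P(6 or fewer)  = {p_le6:.4f}")    # ~0.6063
```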
A random signal x (n) with the probability density function
$$p_x(\xi) = \frac{1}{\beta}\, e^{-\xi/\beta}\, u(\xi)$$
and β > 0 is called an exponentially distributed signal. The expected value of this signal is µ x = β, since
$$\mu_x = \int_0^\infty \frac{\xi}{\beta}\, e^{-\xi/\beta}\, d\xi = \left.-\xi e^{-\xi/\beta}\right|_0^\infty + \int_0^\infty e^{-\xi/\beta}\, d\xi = \beta.$$
The distribution function is
$$F_x(\chi) = \int_0^{\chi}\frac{1}{\beta}\, e^{-\xi/\beta}\, d\xi = \left(1 - e^{-\chi/\beta}\right)u(\chi).$$
The probability that a random variable x (n) will take a value greater than χ is
$$P\{x(n) > \chi\} = 1 - F_x(\chi) = e^{-\chi/\beta}, \quad \chi \geq 0.$$
Example 7.42. A random signal x (n) is equal to the length of life of the system denoted by the index
n. The average lifetime of this system is 10 years and its life-length is exponentially distributed.
What is the probability that the signal value is x (n) > 20, meaning that the system n will last
more than 20 years?
If the system consists of three components whose life-lengths are statistically independent
and exponentially distributed, with average lifetimes β 1 = 5, β 2 = 10, and β 3 = 15 years,
respectively, and if the system fails if any of its components fails, what is the average lifetime of
the system?
⋆ The value of the parameter β in the exponential distribution is equal to the expected lifetime, β = 10. The probability that the system lasts x (n) > 20 is
$$P\{x(n) > 20\} = e^{-20/10} = e^{-2} \approx 0.135.$$
The probability that the system with three statistically independent components will last longer than χ is equal to the product of the probabilities that each of the components will last longer than χ, that is,
$$P\{x(n) > \chi\} = e^{-\chi/\beta_1}\, e^{-\chi/\beta_2}\, e^{-\chi/\beta_3} = e^{-\chi(1/5 + 1/10 + 1/15)} = e^{-\chi/\beta},$$
with 1/β = 1/5 + 1/10 + 1/15 = 11/30. The average lifetime is β = 30/11 ≈ 2.7 years, and it is shorter than the average life of any of the components.
Example 7.43. Find the Fourier transform, characteristic function, and the moment generating function
of the exponentially distributed random variable x (n). What are the moments of this random
variable?
The exponentially distributed random variable exhibits the memoryless property, since its probability of exceeding the value χ + a, given that it has exceeded the value a, satisfies
$$P\{x(n) > \chi + a \text{ and } x(n) > a\} = P\{x(n) > \chi + a \,|\, x(n) > a\}\, P\{x(n) > a\},$$
and the fact that P{x(n) > χ + a and x(n) > a} = P{x(n) > χ + a}, for χ ≥ 0, since the event x(n) > χ + a includes the event x(n) > a. These two relations produce
$$P\{x(n) > \chi + a \,|\, x(n) > a\} = \frac{e^{-(\chi+a)/\beta}}{e^{-a/\beta}} = e^{-\chi/\beta} = P\{x(n) > \chi\}.$$
In real-world scenarios, signals s(n) are commonly corrupted with additive disturbances, denoted by ε(n). Then, processing methods are applied to the noisy signals,
$$x(n) = s(n) + \varepsilon(n),$$
where ε(n) is the additive noise. For a deterministic signal s(n), the expected value of the noisy signal x(n) is equal to the sum of the deterministic signal value and the expected value of the noise, that is,
$$E\{x(n)\} = s(n) + \mu_\varepsilon.$$
The variance of the noisy signal is not influenced by the deterministic signal,
$$\sigma_x^2 = E\{|x(n) - E\{x(n)\}|^2\} = E\{|\varepsilon(n) - \mu_\varepsilon|^2\} = \sigma_\varepsilon^2.$$
In some application the noise effect is multiplicative and depends on the signal itself. Then, the
noisy signal model is
x (n) = (1 + ε(n))s(n).
The expected value and the variance of the noisy signal, with multiplicative noise, are given by
$$E\{x(n)\} = (1+\mu_\varepsilon)s(n), \qquad \sigma_x^2 = |s(n)|^2\sigma_\varepsilon^2.$$
Both the mean and the variance are signal-dependent in the case of multiplicative noise.
Depending on the type of noise, the results obtained so far for various disturbance forms can be applied to the analysis of noisy signals. This will be the topic of the next sections.
In signal processing, the most common signal models are sinusoidal signals, along with their processing using Fourier analysis. The influence of noise on these signals and transforms will be studied in this section.
we get
$$\sigma_X^2(k) = \sigma_\varepsilon^2 N. \qquad (7.92)$$
If the deterministic signal s(n) is a complex sinusoid, that is,
$$s(n) = A e^{j2\pi k_0 n/N},$$
with the frequency k0 on the grid, ω0 = 2πk0/N, then its DFT is
$$S(k) = AN\delta(k - k_0).$$
The peak signal-to-noise ratio, a relevant parameter for the DFT-based estimation of the signal frequency, is defined by
$$\mathrm{PSNR}_{\mathrm{out}} = \frac{\max_k|S(k)|^2}{\sigma_X^2} = \frac{A^2N^2}{\sigma_\varepsilon^2 N} = \frac{A^2}{\sigma_\varepsilon^2}N. \qquad (7.94)$$
Its logarithmic form, expressed in dB, is 20 log10 ( AN/σε ). The value of the peak signal-to-noise ratio
increases as N increases. This result is expected, since the signal values are added in phase, increasing
the DFT amplitude N times (its power N 2 times), while the noise values are summed up in power.
The noise influence on the DFT of a real-valued sinusoid is illustrated in Fig. 7.28.
Figure 7.28 Illustration of the noise-free signal, x (n) = cos(6πn/64), and its DFT, X(k) (top panels). The same signal, corrupted with an additive zero-mean real-valued Gaussian noise of variance σε² = 1/4, is shown along with its DFT (bottom panels).
The input signal-to-noise ratio (SNR) for the signal in (7.93) is defined by
$$\mathrm{SNR}_{\mathrm{in}} = \frac{E_x}{E_\varepsilon} = \frac{\sum_{n=0}^{N-1}|x(n)|^2}{\sum_{n=0}^{N-1}E\{|\varepsilon(n)|^2\}} = \frac{NA^2}{N\sigma_\varepsilon^2} = \frac{A^2}{\sigma_\varepsilon^2}. \qquad (7.95)$$
If the maximum DFT value is detected, then only its value could be used for the signal reconstruction (equivalent to a filter that passes only the frequency bin k = k0). The DFT of the output signal is then
$$Y(k) = X(k)\delta(k - k_0).$$
The output signal in the discrete-time domain is
$$y(n) = \frac{1}{N}\sum_{k=0}^{N-1}Y(k)e^{j2\pi kn/N} = \frac{1}{N}X(k_0)e^{j2\pi k_0 n/N}.$$
Since X(k0) = AN + Ξ(k0), according to (7.89) and (7.92), where Ξ(k) is the noise in the frequency domain, whose variance is equal to σε²N, we get
$$\mathrm{SNR}_{\mathrm{out}} = \frac{A^2}{\sigma_\varepsilon^2 N/N^2} = N\frac{A^2}{\sigma_\varepsilon^2} = N\,\mathrm{SNR}_{\mathrm{in}}.$$
Taking 10 log(·) of both sides, we get the output-to-input relation for the signal-to-noise ratio in dB,
$$\mathrm{SNR}_{\mathrm{out}}\,[\mathrm{dB}] = \mathrm{SNR}_{\mathrm{in}}\,[\mathrm{dB}] + 10\log_{10}N.$$
In order to improve the representation and estimation performance of the Fourier transform of a noisy
signal s(n) + ε(n), the Fourier transform is commonly calculated using a window function w(n). This
topic will be studied again, in detail, in Part V, since the windows play a crucial role in time-frequency
analysis. Here, we will present the basic forms and results.
The assumed noise is additive and white, rεε = σε2 δ(n), with the zero-mean. The DFT of the
signal, multiplied by the window function, is equal to
$$X(k) = \sum_{n=0}^{N-1}w(n)\left[s(n)+\varepsilon(n)\right]e^{-j2\pi kn/N},$$
with the expected value
$$E\{X(k)\} = \frac{1}{N}W(k) \ast_k S(k),$$
where W(k) = DFT{w(n)} is the DFT of the window and ∗k denotes the convolution in frequency.
The variance of X (k) is given by
$$\sigma_X^2(k) = \sum_{n_1=0}^{N-1}\sum_{n_2=0}^{N-1}w(n_1)w^*(n_2)\,\sigma_\varepsilon^2\delta(n_1-n_2)e^{-j2\pi k(n_1-n_2)/N} = \sigma_\varepsilon^2\sum_{n=0}^{N-1}|w(n)|^2 = \sigma_\varepsilon^2 E_w. \qquad (7.97)$$
Since we will use mathematical tools that require a continuous frequency variable, consider the Fourier transform of the discrete-time noisy signal x (n) = s(n) + ε(n),
$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty}w(n)x(n)e^{-j\omega n}, \qquad (7.98)$$
where w(n) is a real-valued window, such that w(0) = 1. The frequency variable will be kept in
continuous form since we will use its derivatives in the explanations that follow. The signal s(n) is
deterministic and the noise ε(n) = ε r (n) + jε i (n) is a complex-valued white Gaussian noise with independent and identically distributed real and imaginary parts, N(0, σε²/2). The auto-correlation
function of this noise is
rεε (m) = E{ε(n)ε∗ (n − m)} = σε2 δ(m). (7.99)
The expected value of the Fourier transform, for the noisy signal x (n) = s(n) + ε(n), is
$$E\{X(e^{j\omega})\} = E\left\{\sum_{n=-\infty}^{\infty}w(n)[s(n)+\varepsilon(n)]e^{-j\omega n}\right\}.$$
The expected value of the Fourier transform can be written as a convolution of the Fourier transform
W (e jω ) of the window w(n),
$$W(e^{j\omega}) = \sum_{n=-\infty}^{\infty}w(n)e^{-j\omega n},$$
and the original Fourier transform, S(e jω ), of the signal s(n), without the window
$$S(e^{j\omega}) = \sum_{n=-\infty}^{\infty}s(n)e^{-j\omega n}.$$
Thus,
$$E\{X(e^{j\omega})\} = \frac{1}{2\pi}\int_{-\pi}^{\pi}S(e^{j(\omega-\alpha)})W(e^{j\alpha})\,d\alpha, \qquad (7.101)$$
where the integration is performed over the discrete-time Fourier transform period, −π < ω ≤ π.
The Fourier transform calculated with a window is biased. The window w(n) causes the bias in the
Fourier transform, since its application results in a form that differs from the original Fourier transform
without a window. By expanding S(e j(ω −α) ) in (7.101) into a Taylor series, around ω,
$$S(e^{j(\omega-\alpha)}) = S(e^{j\omega}) - \frac{\partial S(e^{j\omega})}{\partial\omega}\alpha + \frac{1}{2}\frac{\partial^2 S(e^{j\omega})}{\partial\omega^2}\alpha^2 + \dots,$$
we get
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}S(e^{j(\omega-\alpha)})W(e^{j\alpha})\,d\alpha = S(e^{j\omega}) + \frac{1}{2}\frac{\partial^2 S(e^{j\omega})}{\partial\omega^2}m_2 + \dots, \qquad (7.102)$$
where
$$m_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi}W(e^{j\omega})\,d\omega = w(0) = 1, \quad m_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi}\omega W(e^{j\omega})\,d\omega = 0, \quad m_2 = \frac{1}{2\pi}\int_{-\pi}^{\pi}\omega^2 W(e^{j\omega})\,d\omega.$$
The first frequency domain moment m1 (and all other odd moments) of W (e jω ) is equal to zero, since
W (e jω ) is an even function (as the Fourier transform of an even, real-valued window function w(n)).
From (7.102) it follows that the first term is the original Fourier transform, while the remaining terms introduce the Fourier transform distortion (bias). They can be approximated by
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}S(e^{j(\omega-\alpha)})W(e^{j\alpha})\,d\alpha - S(e^{j\omega}) = \frac{1}{2}\frac{\partial^2 S(e^{j\omega})}{\partial\omega^2}m_2 + \dots \cong \frac{1}{2}\frac{\partial^2 S(e^{j\omega})}{\partial\omega^2}m_2 = \frac{1}{2}b(\omega)m_2. \qquad (7.103)$$
A complex-valued Gaussian noise with independent and identically distributed real and imaginary parts, N(0, σε²/2), is assumed. For a white noise, the variance of the Fourier transform estimator reduces to
$$\sigma_X^2 = \sum_{n=-\infty}^{\infty}\sigma_\varepsilon^2 w^2(n) = \sigma_\varepsilon^2 E_w, \qquad (7.105)$$
n=−∞
where Ew is the energy of the window. A finite energy window is sufficient to make the variance of
X (e jω ) finite for the Gaussian, zero-mean, white noise. We can conclude that the variance increases as
the energy of the window, Ew , increases. This means that wide windows will produce large variances,
just opposite to the bias which is small for wide windows. Since narrow windows produce large bias
and wide windows are characterized by large variances in the Fourier transform estimation, a trade-off
is required to balance these two sources of the estimation error.
The optimum window width can be obtained by minimizing the mean squared error (MSE) defined as
a sum of the squared bias and variance
$$e^2 = \mathrm{bias}_X^2(\omega) + \sigma_X^2(\omega). \qquad (7.106)$$
Example 7.44. Consider a signal s(n) whose Fourier transform has the second-order derivative ∂²S(e^{jω})/∂ω² (higher-order derivatives can be neglected), and assume that the Hann(ing) window w(n) of width N is used in the calculation. Find the optimum window width.
⋆ For the Hann(ing) window, Ew = 3N/8 and m2 = 2π 2 /N 2 , so using (7.103) and (7.105), we
get
$$e^2 \cong \frac{\pi^4}{N^4}\left(\frac{\partial^2 S(e^{j\omega})}{\partial\omega^2}\right)^2 + \frac{3N}{8}\sigma_\varepsilon^2. \qquad (7.107)$$
It has been assumed that the fourth and other higher-order Fourier transform derivatives can be
neglected. From ∂e2 /∂N = 0, the approximation of the optimum window width follows
$$N_{\mathrm{opt}}(\omega) \cong \sqrt[5]{\frac{40\,b^2(\omega)\,\pi^4}{3\sigma_\varepsilon^2}}, \qquad (7.108)$$
with b(ω ) = ∂2 S(e jω )/∂ω 2 . Roughly speaking, this relation means that small values of the
window width (intensive smoothing in frequency direction) should be used at the points where
there are no variations in frequency of the Fourier transform, that is, where b2 (ω ) is small.
When b²(ω) is large, then the window should be wide, meaning less intensive smoothing, that is, keeping the original Fourier transform form at the points where its variations are high. As far as
the noise is concerned, low noise cases (small σε2 ) do not require any smoothing of the original
Fourier transform in the frequency direction. Thus, wide windows should be used. For a high
noise, the Fourier transform smoothing will improve the results.
Of course, in reality, we do not know anything about the signal or its Fourier transform in advance.
An algorithm for the estimation of Nopt (ω ), without using the value of b2 (ω ), will be presented in the
next example.
Example 7.45. A noisy signal, defined within −512 ≤ n ≤ 511, where ε(n) is the zero-mean, unit-variance Gaussian noise, σε = 1, is analyzed using the Fourier transforms, X N (k), with two Hann(ing) windows, one whose width is N = 1024 and the other with N = 128. For each frequency index k, we will use the better of these two Fourier transforms by checking the intersection of their confidence intervals.
To simplify the problem, a real-valued and even signal is assumed, whose Fourier transform is real-valued. The standard deviation of the real part of the Fourier transform, X N (k), calculated using the Hann(ing) window of width N, is
$$\sigma_{X_N} = \frac{\sigma_\varepsilon}{\sqrt{2}}\sqrt{\frac{3N}{8}},$$
while the confidence interval is
$$\left[X_N(k) - 2.5\,\sigma_{X_N},\ X_N(k) + 2.5\,\sigma_{X_N}\right],$$
where the factor of 2.5 is used for the confidence intervals (probability of almost 0.99), assuming that the noise variance can be estimated from the data. The standard deviation σε/√2 was used in σ_{X_N} since the noise is not even, so only half of its power is in the real-valued part of the Fourier transform.
⋆ For each frequency index k, with the corresponding continuous frequency ω = 2πk/1024,
the Fourier transform is calculated using N = 128, zero-padded up to 1024. This value is denoted
by X128 (k). Then the Fourier transform with N = 1024 is calculated and denoted by X1024 (k).
The confidence intervals are formed for these two Fourier transform values calculated with two
window widths,
$$\left[X_{128}(k) - 7.1\sqrt{\frac{3}{8}128},\ X_{128}(k) + 7.1\sqrt{\frac{3}{8}128}\right]$$
$$\left[X_{1024}(k) - 7.1\sqrt{\frac{3}{8}1024},\ X_{1024}(k) + 7.1\sqrt{\frac{3}{8}1024}\right].$$
If these intervals intersect, then X (k) = X128 (k), otherwise X (k) = X1024 (k). Namely, if the bias
is small, then the Fourier transform X N (k) calculated using both windows will contain the true
value of the Fourier transform (of the noise-free signal). Therefore, for small bias the confidence
intervals will intersect, meaning we should use the window with a smaller variance, which is in
our experiment N = 128. If the bias is large, then it will highly depend on the window width
and will move the obtained Fourier transform X N (k) from its true position. Then, the confidence
intervals will be dominated by the bias (different for two windows) and will not contain the
true Fourier transform value, meaning that they will not intersect. Since the bias is large, in this
case, we should use a small bias window with N = 1024. The result is shown in Fig. 7.29. The
improvement in the SNR ratio is evident.
This is a simplified version of the intersection of confidence intervals (ICI) method for window width optimization (the Katkovnik-Stankovic method for window width optimization in
time-frequency analysis). For practical applications, the noise variance should also be estimated
from the data (see Problem 7.12).
Calculation of higher-order moments and the cross-correlation functions for the Fourier transform
of noisy signals could be found in the literature (for the correlation calculation, see the problems).
7.5.6 Periodogram
The power spectral density of a signal is commonly estimated using the squared absolute value of the Fourier transform of the signal, called the periodogram,
$$P_x(e^{j\omega}) = \frac{1}{2N+1}\left|X(e^{j\omega})\right|^2 = \frac{1}{2N+1}\left|\sum_{n=-N}^{N}x(n)e^{-j\omega n}\right|^2. \qquad (7.109)$$
Figure 7.29 Spectral analysis of a signal with two windows in order to approximate optimal window width for
Example 7.45.
As it has been shown in Section 7.3.4, the periodogram is equal to the power spectral density calculated (windowed) by a Bartlett window, that is,
$$P_{xx}(e^{j\omega}) = \lim_{N\to\infty}\sum_{k=-2N}^{2N}\left(1 - \frac{|k|}{2N+1}\right)r_{xx}(k)e^{-j\omega k} = S_{xx}(e^{j\omega}) \ast_\omega W(e^{j\omega}), \qquad (7.110)$$
where $W(e^{j\omega}) = \mathrm{FT}\{(1 - \frac{|n|}{2N+1})\}$. This means that the periodogram is a biased estimate of the power spectral density for any signal, except when r xx (k) = Cδ(k).
Example 7.46. Find the power spectral density of the random signal
where ε(n) is zero-mean Gaussian noise with unit variance. Find the power spectral density calculated using the periodogram with a window of width N.
The periodogram of a noisy signal is also a biased estimator of the noise-free periodogram of
deterministic signals. Consider the signal x (n) = s(n) + ε(n), where s(n) is deterministic and ε(n) is
white complex-valued i.i.d. noise with the variance σε2 . Its periodogram is
$$P_x(e^{j\omega}) = \frac{1}{N}\left|\sum_{n=-N/2}^{N/2-1}(s(n)+\varepsilon(n))e^{-j\omega n}\right|^2. \qquad (7.111)$$
Figure 7.30 Periodogram of a chirp signal (a), the noisy chirp signal (b), and the difference of the previous two periodograms, which is highly signal dependent, with variations (and variance) proportional to |S(k)|²/N (c).
The window, w(k), decays smoothly from w(0) = 1 toward zero for k = ±N. The frequency domain form of this estimator is equal to the convolution of the true power spectral density and the Fourier transform of the window, W(e^{jω}) = FT{w(n)}. In the discrete frequency domain, the Blackman–Tukey periodogram can be calculated using either the unbiased estimate of the autocorrelation function,
$$\hat{r}_{xx}(\pm k) = \frac{1}{N-k}\sum_{i=0}^{N-k-1}x(k+i)x(i),$$
or its biased estimate,
$$\hat{r}_{xx}(\pm k) = \frac{1}{N}\sum_{i=0}^{N-k-1}x(k+i)x(i).$$
The biased estimator under-estimates the r xx (k) values for large |k|; however, these values should be small anyway. This estimator avoids possible large outliers in estimating r xx (k) from a small number of samples at large |k|.
Daniell periodogram. In order to reduce the noise influence, the smoothed versions of the periodogram
are used as the spectral estimators. The simplest smoothed form of the periodogram is
$$P_x^S(k) = \frac{1}{2L+1}\sum_{i=-L}^{L}P_x(k-i) = \frac{1}{2L+1}\sum_{i=-L}^{L}\frac{1}{N}|X(k-i)|^2.$$
Here, the frequency domain window, W (k), takes the simplest possible form of the rectangular window
in the Blackman-Tukey method, where W (i )| X (k − i )|2 /N was used. Therefore, the Daniell spectral
estimator is a particular case of the Blackman–Tukey class of spectral estimators. It can easily be
related to the Blackman–Tukey periodogram estimator (7.110) using
$$S_{xx}^A(e^{j\omega}) = \sum_{n=-N/2}^{N/2-1}w(n)\left(1 - \frac{|n|}{N}\right)r_{xx}(n)e^{-j\omega n} = W(k) \ast_k S_{xx}(k),$$
where the smoothing window in the frequency domain is the Fourier transform of the auto-correlation function window, w(n)(1 − |n|/N), and corresponds to
$$P_x^S(k) = \frac{1}{2L+1}\sum_{i=-L}^{L}W(i)P_x(k-i) = \frac{1}{2L+1}\sum_{i=-L}^{L}W(i)\frac{1}{N}|X(k-i)|^2. \qquad (7.115)$$
S-method. In the analysis of signals with varying spectral content, the Fourier transform is spread due
to the frequency variations of the spectral content within the window (see the stationary phase method in
Chapter 1). Then, instead of smoothing the periodogram in the same direction (in-direction smoothing)
in (7.115), the counter-direction cross-multiplication can be done, and the spectral estimator
$$\mathrm{SM}_x(k) = \frac{1}{2L+1}\sum_{i=-L}^{L}W(i)\frac{1}{N}X(k+i)X^*(k-i)$$
is obtained. This is the so-called S-method based spectral estimator. By increasing the width L in this method, we could arrive at the Wigner distribution.
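A direct (non-optimized) Python sketch of the S-method estimator follows. The test signal is an illustrative chirp plus sinusoid in noise, not the exact signal of Example 7.47, whose full definition is assumed here.

```python
import numpy as np

def s_method(x, L=7):
    """S-method spectral estimator: counter-direction cross-products of
    DFT values, with a rectangular frequency window W(i) = 1."""
    N = len(x)
    X = np.fft.fft(x)
    SM = np.zeros(N)
    for k in range(N):
        for i in range(-L, L + 1):
            SM[k] += np.real(X[(k + i) % N] * np.conj(X[(k - i) % N]))
    return SM / ((2 * L + 1) * N)

# Illustrative chirp plus sinusoid in unit-variance Gaussian noise
n = np.arange(-128, 128)
x = np.exp(1j * (np.pi / 256) * n**2) + np.exp(1j * np.pi * n / 4) \
    + np.random.default_rng(4).normal(0, 1, len(n))
print(s_method(x, L=7).max())
```

Note that the i = 0 term alone reproduces the periodogram |X(k)|²; the cross-terms with i ≠ 0 are what improve the concentration of frequency-varying components.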
where −128 ≤ n ≤ 127, and ε(n) is a zero-mean Gaussian noise with the unit variance. Use
the periodogram with a Hann(ing) window of the width N = 256, the Daniell (Blackman-Tukey
smoothed) estimator, and the S-method, with the same window. In both the Daniell (Blackman-Tukey smoothed) estimator and the S-method estimator, use L = 7 and W(i) = 1.
⋆ Spectral analysis of this random noisy signal using the periodogram, the Daniell (Blackman-
Tukey smoothed) estimator, and the S-method based estimator is shown in Fig. 7.31. The
periodogram of the noise-free signal is shown in Fig. 7.31(a). Two highly concentrated sinusoidal
components and one spread (chirp) component can be noticed. For the noisy signal, the noise
almost completely degrades the chirp component in the periodogram with the Hann(ing) window,
Fig. 7.31(b). The visibility of this component is significantly improved by smoothing the
periodogram as in the Daniell (Blackman-Tukey smoothed) estimator, given in Fig. 7.31(c).
In this case, the highly concentrated sinusoidal components are spread as well. Combining the
Fourier transform values in the counter-direction, the S-method based spectral estimation is
obtained. This estimator preserves a high concentration of the sinusoidal components while
improving the concentration of the chirp signal, as shown in Fig. 7.31(d). The S-method based
spectral estimator of the noise-free signal is given in Fig. 7.31(e).
Bartlett Method and Welch periodogram. The Fourier transform of the signal x (n), whose duration is N, is calculated here over K shorter intervals. The duration of these intervals is M, commonly with the step R = M (Bartlett periodogram) or R = M/2 (Welch periodogram; in this case a window can also be used). The Fourier transforms of x (n), within these shorter intervals, are
$$X_i(e^{j\omega}) = \frac{1}{M}\sum_{n=0}^{M-1}x(iR+n)e^{-j\omega n},$$
and the averaged periodogram is
$$P_x^S(\omega) = \frac{1}{K}\sum_{i=0}^{K-1}\left|X_i(e^{j\omega})\right|^2.$$
For a numeric illustration of the Welch periodogram calculation see Example 7.52.
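A minimal Python sketch of the averaged periodogram is given below; the segment parameters and the noisy sinusoid are illustrative choices (for R = M the sketch reduces to the Bartlett method).

```python
import numpy as np

def averaged_periodogram(x, M=32, R=16):
    """Averaged periodogram over segments of length M with step R
    (R = M gives the Bartlett method, R = M/2 the Welch method)."""
    x = np.asarray(x)
    K = (len(x) - M) // R + 1
    P = np.zeros(M)
    for i in range(K):
        Xi = np.fft.fft(x[i * R : i * R + M]) / M
        P += np.abs(Xi) ** 2
    return P / K

# Illustrative noisy sinusoid at frequency bin 8 of a 32-point segment
rng = np.random.default_rng(5)
n = np.arange(128)
x = np.cos(2 * np.pi * 8 * n / 32) + rng.normal(0, 1, len(n))
print(averaged_periodogram(x).argmax())  # dominant frequency bin
```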
Figure 7.31 Spectral analysis of the random noisy signal, x (n), using the periodogram, the Daniell (Blackman-
Tukey smoothed) estimator, and the S-method based spectral estimator. Order and the description of the panels
correspond to the task order in Example 7.47.
Consider a set of data x (n), for 0 ≤ n ≤ N − 1. Assume that this set of data are noisy samples of the
signal
s(n) = Ae j2πk0 n/N .
The additive noise ε(n) is white, complex-valued Gaussian, with zero-mean and independent real and
imaginary parts. The variance of noise is σε2 . The aim is to find the signal s(n) parameters from the
noisy observations x (n). Since the signal form is known, we look for a solution of the same form, using the model be^{j2πkn/N}, where b and k are parameters that have to be determined, and
$$\alpha = \{b, k\}$$
is the set of these parameters. The parameter b is complex-valued. It includes the amplitude and the initial phase of the signal model. For every value of x (n) we may define an error as the difference between the true value x (n) and the assumed model at the considered instant n,
$$e(n,\alpha) = x(n) - b e^{j2\pi kn/N}.$$
Since the noise is Gaussian, the probability density function of the error is
$$p(e(n,\alpha)) = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}\, e^{-|e(n,\alpha)|^2/(2\sigma_\varepsilon^2)}.$$
The joint probability density function, for all signal samples from the data set, is equal to the product
of the individual probability density functions
$$p_e(e(0,\alpha), e(1,\alpha), \dots, e(N-1,\alpha)) = \frac{1}{(2\pi\sigma_\varepsilon^2)^{N/2}}\, e^{-\sum_{n=0}^{N-1}|e(n,\alpha)|^2/(2\sigma_\varepsilon^2)}.$$
The maximum-likelihood solution for the parameters α = {b, k} is obtained by maximizing the probability density function for given values of x (n). Maximization of pe(e(0, α), e(1, α), . . . , e(N − 1, α)) is the same as the minimization of the total squared error,
$$\epsilon(\alpha) = \sum_{n=0}^{N-1}|e(n,\alpha)|^2 = \sum_{n=0}^{N-1}\left|x(n) - be^{j2\pi kn/N}\right|^2. \qquad (7.117)$$
The solution to this problem is obtained from ∂ǫ(α)/∂b∗ = 0 (see Example 1.3). It is in the form of a
standard DFT of signal x (n),
$$b = \frac{1}{N}\sum_{n=0}^{N-1}x(n)e^{-j2\pi kn/N} = \mathrm{mean}\left\{x(n)e^{-j2\pi kn/N}\right\} = \frac{1}{N}X(k).$$
A specific value of parameter k, that minimizes ǫ(α) and gives the estimate of the signal frequency
index k0 , is obtained by replacing the obtained b back into relation (7.117), defining ǫ(α),
$$\epsilon(\alpha) = \sum_{n=0}^{N-1}|x(n) - be^{j2\pi kn/N}|^2 = \left(\sum_{n=0}^{N-1}|x(n)|^2\right) - N|b|^2.$$
The minimum value of ǫ(α) is achieved when |b|² (or |X(k)|²) is maximum, that is, for
$$\hat{k}_0 = \arg\max_k|X(k)|^2.$$
The same analysis holds for a signal of the form
$$s(n) = Ae^{j\omega_0 n}.$$
Assuming the solution in the form be^{jωn}, the Fourier transform of discrete-time signals would follow.
If the additive noise were, for example, impulsive with the Laplacian distribution, then the
probability density function would be
$$p(e(n,\alpha)) = \frac{1}{2\sigma_\varepsilon}\, e^{-|e(n,\alpha)|/\sigma_\varepsilon},$$
and the solution to the minimization of $\epsilon(\alpha) = \sum_{n=0}^{N-1}|e(n,\alpha)|$ would follow from
$$X(k) = N\underset{n=0,1,\dots,N-1}{\mathrm{median}}\left\{x(n)e^{-j2\pi kn/N}\right\}.$$
Example 7.48. The DFT definition, for a given frequency index k, can be understood as
$$X(k) = \sum_{n=0}^{N-1}(s(n)+\varepsilon(n))e^{-j2\pi kn/N} = N\underset{n=0,1,\dots,N-1}{\mathrm{mean}}\left\{(s(n)+\varepsilon(n))e^{-j2\pi kn/N}\right\}. \qquad (7.118)$$
Show that, in the presence of impulsive noise, the median-based form
$$X(k) = N\underset{n=0,1,\dots,N-1}{\mathrm{median}}\left\{(s(n)+\varepsilon(n))e^{-j2\pi kn/N}\right\} \qquad (7.119)$$
can produce better results than (7.118). Calculate the value of X(0) using (7.118) and estimate its value by (7.119) for the signal
value by (7.119) for the signal
s(n) = exp( j4πn/N )
with N = 8, and the additive noise
⋆ If a strong impulsive noise is expected in the signal, then the mean value will be highly sensitive to this kind of noise. As stated, the median-based calculation is less sensitive to strong impulsive disturbances. For the signal, the median-based estimate is obviously not influenced by this impulsive noise. In this case it produced a better estimate (the exact value) of the considered noise-free DFT element X(0).
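This comparison can be illustrated with a short Python sketch; the single strong impulse used as the disturbance is a hypothetical stand-in for the noise specified in the example, and the complex median is taken component-wise over the real and imaginary parts.

```python
import numpy as np

N = 8
n = np.arange(N)
s = np.exp(1j * 4 * np.pi * n / N)        # signal from Example 7.48

# Hypothetical impulsive disturbance: one very strong sample
eps = np.zeros(N, complex)
eps[3] = 100.0
x = s + eps

k = 0
terms = x * np.exp(-1j * 2 * np.pi * k * n / N)

X_mean = N * np.mean(terms)                                       # DFT, (7.118)
X_med = N * (np.median(terms.real) + 1j * np.median(terms.imag))  # (7.119)
print(X_mean, X_med)   # the impulse shifts the mean-based value, not the median
```

Here the noise-free X(0) is exactly 0; the mean-based estimate is pulled to 100 by the impulse, while the median-based estimate remains 0.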
Now we will analyze the signal frequency estimation for a single component sinusoidal signal
s(n), with unknown discrete frequency ω0 = 2πk0 /N using the DFT. Since the signal frequency is
assumed on the frequency grid, this case can be understood as the signal frequency position detection.
Available observations of the signal are
$$x(n) = s(n) + \varepsilon(n) = Ae^{j2\pi k_0 n/N} + \varepsilon(n),$$
where ε(n) is a complex zero-mean i.i.d. Gaussian white noise, with variance σε2 . Its DFT is
$$X(k) = \sum_{n=0}^{N-1}(s(n)+\varepsilon(n))e^{-j2\pi kn/N} = NA\,\delta(k-k_0) + \Xi(k),$$
with σX²(k) = σε²N and E{Ξ(k)} = 0. The real and imaginary parts of the DFT X(k), at the signal position k = k0, are Gaussian random variables with the total variance σε²N, or
$$\mathrm{Re}\{X(k_0)\} \sim \mathcal{N}\!\left(\mathrm{Re}\{NA\}, \frac{\sigma_\varepsilon^2 N}{2}\right), \qquad \mathrm{Im}\{X(k_0)\} \sim \mathcal{N}\!\left(\mathrm{Im}\{NA\}, \frac{\sigma_\varepsilon^2 N}{2}\right). \qquad (7.121)$$
Next, we will find the probability that a DFT value of noise at any k 6= k0 is higher than the
signal DFT value at k = k0 . This case corresponds to a false detection of the signal frequency position,
resulting in an arbitrary large and uniform estimation error (within the considered frequency range).
The probability density function for the absolute DFT values outside the signal frequency, k 6= k0 ,
is Rayleigh-distributed (7.84)
$$q(\xi) = \frac{2\xi}{\sigma_\varepsilon^2 N}\, e^{-\xi^2/(\sigma_\varepsilon^2 N)}, \quad \xi \geq 0.$$
The DFT at a noise only position takes a value greater than χ, with probability
$$Q(\chi) = \int_{\chi}^{\infty}\frac{2\xi}{\sigma_\varepsilon^2 N}\, e^{-\xi^2/(\sigma_\varepsilon^2 N)}\, d\xi = \exp\!\left(-\frac{\chi^2}{\sigma_\varepsilon^2 N}\right). \qquad (7.122)$$
The probability that a noise-only DFT value is lower than χ is [1 − Q(χ)]. The total number of noise-only points in the DFT is M = N − 1. The probability that M independent noise-only DFT values are all lower than χ is [1 − Q(χ)]^M. The probability that at least one of the M noise-only DFT values is greater than χ is
$$G(\chi) = 1 - [1 - Q(\chi)]^M. \qquad (7.123)$$
The probability density function for the absolute DFT values at the position of the signal (whose
real and imaginary parts are described by (7.121)) is Rice-distributed
$$p(\xi) = \frac{2\xi}{\sigma_\varepsilon^2 N}\, e^{-(\xi^2 + N^2A^2)/(\sigma_\varepsilon^2 N)}\, I_0\!\left(\frac{2NA\xi}{\sigma_\varepsilon^2 N}\right), \quad \xi \geq 0, \qquad (7.124)$$
where I0 (ξ ) is the zero-order modified Bessel function (for A = 0, when I0 (0) = 1 the Rayleigh
distribution is obtained).
When a noise only DFT value surpasses the DFT signal value, then an error in the estimation
occurs. To calculate this probability, consider the absolute DFT value of a signal at and around ξ. The
DFT value at the signal position is within ξ and ξ + dξ with the probability p(ξ )dξ , where p(ξ )
is defined by (7.124). The probability that at least one of M DFT noise only values is above ξ in
amplitude, is equal to
G (ξ ) = 1 − [1 − Q(ξ )] M .
Thus, the probability that the absolute value of the DFT of the signal component is within ξ and ξ + dξ
and that at least one of the absolute DFT noise only values exceeds the DFT signal value is equal to
G (ξ ) p(ξ )dξ. Considering all possible values of ξ, from (7.122) and (7.123), the probability of the
wrong signal frequency detection follows as
$$P_E = \int_0^\infty G(\xi)p(\xi)\,d\xi = \int_0^\infty\left(1 - \left[1 - \exp\!\left(-\frac{\xi^2}{\sigma_\varepsilon^2 N}\right)\right]^M\right)\frac{2\xi}{\sigma_\varepsilon^2 N}\, e^{-(\xi^2 + N^2A^2)/(\sigma_\varepsilon^2 N)}\, I_0\!\left(\frac{2NA\xi}{\sigma_\varepsilon^2 N}\right)d\xi. \qquad (7.125)$$
An approximation of this expression can be obtained by assuming that the DFT of the signal component is not random and that it is equal to NA (positioned at the mean value of the signal's DFT),
$$P_E \cong 1 - \left[1 - \exp\!\left(-\frac{NA^2}{\sigma_\varepsilon^2}\right)\right]^M. \qquad (7.126)$$
Analysis can easily be generalized to the case with K signal components, $s(n) = \sum_{k=1}^{K}A_k e^{j\omega_k n}$.
In many cases, the discrete frequency of the deterministic signal does not satisfy the relation ω0 = 2πk0/N, where k0 is an integer. In such cases, when ω0 ≠ 2πk0/N, the frequency estimation result can be improved, for example, by zero-padding before the Fourier transform calculation or by using a finer grid around the detected maximum. Comments on the estimation of a signal frequency outside the grid are given in Chapter III as well.
If a random signal x (n) passes through a linear time-invariant system, with an impulse response h(n),
then the expected value of the output signal y(n) is given by
$$\mu_y(n) = E\{y(n)\} = \sum_{k=-\infty}^{\infty}h(k)E\{x(n-k)\} = \sum_{k=-\infty}^{\infty}h(k)\mu_x(n-k) = h(n) \ast_n \mu_x(n). \qquad (7.127)$$
The cross-correlation of the input signal, x (n), and the output signal, y(n), can be expressed through the input signal autocorrelation and the impulse response, h(n), as follows.
For a stationary signal, with n − m = l and n − k = p, we get that the cross-correlation of the input
signal and the output signal is equal to the convolution of the input signal autocorrelation and the
reversed and conjugated impulse response, that is
$$r_{xy}(l) = \sum_{p=-\infty}^{\infty}r_{xx}(p)h^*(p-l) = r_{xx}(l) \ast_l h^*(-l).$$
The z-transform of both sides maps this equation into the z-transform domain,
$$\sum_{l=-\infty}^{\infty}r_{xy}(l)z^{-l} = \sum_{l=-\infty}^{\infty}\sum_{p=-\infty}^{\infty}r_{xx}(p)h^*(p-l)z^{-l} = \sum_{k=-\infty}^{\infty}\sum_{p=-\infty}^{\infty}r_{xx}(p)h^*(k)z^{-p}\left(z^{-1}\right)^{-k},$$
$$R_{xy}(z) = R_{xx}(z)H^*\!\left(\frac{1}{z^*}\right).$$
The Fourier transform follows from the last equation as
$$S_{xy}(e^{j\omega}) = S_{xx}(e^{j\omega})H^*(e^{j\omega}).$$
After some straightforward transformations, we get the z-transform domain relation between the
autocorrelations of the stationary input and the output signal
$$R_{yy}(z) = R_{xx}(z)H(z)H^*\!\left(\frac{1}{z^*}\right).$$
The Fourier transform of output signal autocorrelation function in terms of the Fourier transform of the
input signal autocorrelation function and the system frequency response is given by
$$S_{yy}(e^{j\omega}) = S_{xx}(e^{j\omega})\left|H(e^{j\omega})\right|^2, \qquad (7.132)$$
proving that Sxx(e^{jω}) is indeed the power spectral density. By taking a narrow bandpass filter with unit amplitude, |H(e^{jω})|² = 1 for ω0 ≤ ω < ω0 + dω, we will get the spectral density of the signal x (n) within that small frequency range.
The input signal is a zero-mean white noise ε(n) with the variance σε2 . Find the cross-correlation
of the input signal and the output signal and the autocorrelation of the output signal. For a = −1
find the power spectral density of the output signal.
Since the input signal is a white noise of variance σε², its autocorrelation is, by definition,
$$r_{xx}(n) = \sigma_\varepsilon^2\delta(n).$$
The power spectral density of the input signal is obtained as the Fourier transform of the
autocorrelation function, that is
$$S_{xx}(\omega) = \sum_{n=-\infty}^{\infty}r_{xx}(n)e^{-j\omega n} = \sigma_\varepsilon^2.$$
The z-transform of the autocorrelation function of the output signal, for the linear time-invariant
system, is equal to
The autocorrelation function of the output signal is equal to the inverse z-transform of Ryy (z),
while the z-transform of the cross-correlation of the input and output signal is
Its inverse z-transform of Ryx (z) is equal to the cross-correlation, ryx (n),
with the random input signal x (n) = ε(n), µε = 0 and rεε (n) = δ(n), find:
(a) The expected value µy (n) and the autocorrelation ryy (n) of the output signal,
(b) The power spectral density functions Syy (ω ) and Syx (ω ).
⋆ The expected value of the output is
$$\mu_y = \mu_x H(e^{j0}) = \mu_\varepsilon H(e^{j0}) = 0,$$
and R xx (z) = 1.
The transfer function of the considered system has the following form
$$H(z) = \frac{1}{1 - 1.3z^{-1} + 0.36z^{-2}} = \frac{1}{(1-0.9z^{-1})(1-0.4z^{-1})}.$$
while the cross-power spectral density function Syx(ω) is equal to Ryx(z) at z = e^{jω}.
Example 7.51. The white noise ε(n) with variance σε2 and zero mean is an input to a linear time-
invariant system. If the impulse response of the system is h(n) show that
$$E\{x(n)y(n)\} = h(0)\sigma_\varepsilon^2$$
and
$$\sigma_y^2 = \sigma_\varepsilon^2\sum_{n=-\infty}^{\infty}|h(n)|^2 = \sigma_\varepsilon^2 E_h,$$
where y(n) is the output of this system.
⋆The expected value of the product of the input signal and the output signal is
$$E\{x(n)y(n)\} = E\left\{\sum_{k=-\infty}^{\infty}h(k)x(n)x(n-k)\right\},$$
producing
$$E\{x(n)y(n)\} = \sum_{k=-\infty}^{\infty}h(k)\sigma_\varepsilon^2\delta(k) = h(0)\sigma_\varepsilon^2.$$
The variance of the output signal is defined by
$$\sigma_y^2 = E\{y(n)y^*(n)\} - E\{y(n)\}E\{y^*(n)\},$$
or
$$\sigma_y^2 = E\left\{\sum_{k=-\infty}^{\infty}h(k)x(n-k)\sum_{l=-\infty}^{\infty}h^*(l)x^*(n-l)\right\} - E\left\{\sum_{k=-\infty}^{\infty}h(k)x(n-k)\right\}E\left\{\sum_{l=-\infty}^{\infty}h^*(l)x^*(n-l)\right\}.$$
The output signal is a zero-mean signal, since
$$E\{y(n)\} = E\{y^*(n)\} = \sum_{k=-\infty}^{\infty}h(k)E\{x(n-k)\} = 0.$$
Since r xx (n) = σε2 δ(n) , that is, r xx (l − k) = σε2 δ(l − k) , only the terms with l = k remain in
the double summation expression for the variance σy2 , producing
$$\sigma_y^2 = \sigma_\varepsilon^2\sum_{k=-\infty}^{\infty}|h(k)|^2 = \sigma_\varepsilon^2 E_h.$$
A narrowband random signal with Np components around the frequencies ω1, ω2, . . . , ωNp can be considered, from a spectral point of view, as an output of the system whose transfer function is
$$H(z) = \frac{G}{(1-r_1e^{j\omega_1}z^{-1})(1-r_2e^{j\omega_2}z^{-1})\cdots(1-r_{N_p}e^{j\omega_{N_p}}z^{-1})} = \frac{G}{1 + a_1z^{-1} + a_2z^{-2} + \cdots + a_{N_p}z^{-N_p}}$$
when the input is a white noise. The amplitudes of the poles ri , i = 1, 2, . . . , Np , are inside (and close
to) the unit circle. The discrete-time domain description of this system is
$$y(n) = -a_1y(n-1) - a_2y(n-2) - \cdots - a_{N_p}y(n-N_p) + Gx(n), \qquad (7.133)$$
where x (n) is a white noise with variance σx² = 1, the autocorrelation r xx (k) = δ(k), and the spectral energy density S xx (ω) = 1. For a given narrowband random signal y(n), the task is to find the coefficients ai and G.
The autocorrelation of the real-valued output signal is obtained after the multiplication of the difference equation (7.133) by y(n − k) and taking the expectation,
$$r_{yy}(k) + a_1r_{yy}(k-1) + \cdots + a_{N_p}r_{yy}(k-N_p) = G\,E\{x(n)y(n-k)\}.$$
For k = 0, it follows that
$$r_{yy}(0) + a_1r_{yy}(1) + \cdots + a_{N_p}r_{yy}(N_p) = G^2,$$
since E{x(n)y(n)} = h(0)σx² = G. For k > 0 and a causal system, we may find that r xy (k) = h(−k) = 0. It is also clear from (7.133) that x (n) is related to y(n) and that any y(n − k), for k > 0, does not include x (n), meaning that E{x(n)y(n − k)} = 0, and
$$r_{yy}(k) + a_1r_{yy}(k-1) + \cdots + a_{N_p}r_{yy}(k-N_p) = 0, \quad k > 0.$$
The previous equations are known as the Yule-Walker equations. The matrix form of this system of equations is
$$\begin{bmatrix} r_{yy}(0) & r_{yy}(1) & \dots & r_{yy}(N_p) \\ r_{yy}(1) & r_{yy}(0) & \dots & r_{yy}(N_p-1) \\ \vdots & \vdots & \ddots & \vdots \\ r_{yy}(N_p) & r_{yy}(N_p-1) & \dots & r_{yy}(0) \end{bmatrix}\begin{bmatrix} 1 \\ a_1 \\ \vdots \\ a_{N_p} \end{bmatrix} = \begin{bmatrix} G^2 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \qquad (7.134)$$
The system is solved for the unknown system coefficients [a0, a1, a2, . . . , aNp] with G = 1. Then, the coefficients are normalized as [a0, a1, a2, . . . , aNp]/a0, with G = 1/a0. The spectral energy density of y(n) follows, with S xx (ω) = 1, as
$$S_{yy}(\omega) = \left|\frac{G}{1 + a_1e^{-j\omega} + a_2e^{-j2\omega} + \cdots + a_{N_p}e^{-jN_p\omega}}\right|^2. \qquad (7.135)$$
In practice, the autocorrelation is estimated from the available data as
$$r_{yy}(k) = \frac{1}{N-k}\sum_{n=0}^{N-1-k}y(n+k)y(n), \quad \text{for } 0 \leq k \leq N-1, \qquad (7.136)$$
and ryy(k) = ryy(−k) for −N + 1 ≤ k < 0. These values are then used in (7.134) for the autoregressive spectral estimation.
Next, we will comment on the estimated autocorrelation within the basic definition of the power
spectral density framework, Section 7.3.4. Relation (7.136) corresponds to the unbiased estimation
of the autocorrelation function. The power spectral density, according to (7.32), is calculated as
Syy (ω ) = FT{ryy (k)}.
Since the autocorrelation estimates for a large k use only a small number of signal samples
in averaging, they are not reliable. It is common to apply a triangular (Bartlett) window function
(w(k) = (N − |k|)/N) to reduce the weight of these estimates in the Fourier transform calculation,
$$w(k)r_{yy}(k) = w(k)\frac{1}{N-k}\sum_{n=0}^{N-1-k}y(n+k)y(n) = \frac{1}{N}\sum_{n=0}^{N-1-k}y(n+k)y(n), \qquad (7.137)$$
for 0 ≤ k ≤ N − 1. Since the window is used, this autocorrelation function estimate is biased. The
Fourier transform of the biased autocorrelation function w(k)ryy (k) = (1 − |k|/N )ryy (k) is the power
spectral density Pyy (ω ) = FT{(1 − |k|/N )ryy (k)} defined by (7.34).
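A compact Python sketch of the autoregressive spectral estimation based on (7.134)–(7.136) follows; the test signal and the model order are illustrative choices, not taken from the text.

```python
import numpy as np

def ar_spectrum(y, Np, n_freq=512):
    """Autoregressive spectral estimate via the Yule-Walker equations (7.134),
    using the unbiased autocorrelation estimate (7.136). A sketch."""
    N = len(y)
    r = np.array([np.sum(y[k:] * y[:N - k]) / (N - k) for k in range(Np + 1)])
    # Equations for k = 1..Np: sum_i a_i r(|k-i|) = -r(k)
    R = np.array([[r[abs(i - j)] for j in range(Np)] for i in range(Np)])
    a = np.linalg.solve(R, -r[1:Np + 1])
    G2 = r[0] + np.dot(a, r[1:Np + 1])           # gain from the k = 0 equation
    w = np.linspace(-np.pi, np.pi, n_freq)
    denom = 1 + sum(a[i] * np.exp(-1j * (i + 1) * w) for i in range(Np))
    return w, G2 / np.abs(denom) ** 2            # Syy(w), cf. (7.135)

rng = np.random.default_rng(6)
n = np.arange(256)
y = np.cos(2 * np.pi * 0.1 * n) + 0.5 * rng.normal(0, 1, len(n))
w, S = ar_spectrum(y, Np=4)
print(abs(w[np.argmax(S)]) / (2 * np.pi))  # ~0.1, the sinusoid frequency
```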
within 0 ≤ n ≤ N − 1 = 127, where ϕ1 and ϕ2 are random variables uniformly distributed from
−1/2 rad to 1/2 rad, while ε(n) is the zero-mean, unit-variance Gaussian noise. Plot the power
spectral density calculated using:
(a) The Fourier transform of ryy (k)
$$S_{yy}(\omega) = \mathrm{FT}\{r_{yy}(k)\} = \sum_{k=-N+1}^{N-1}r_{yy}(k)e^{-j\omega k},$$
where the autocorrelation is estimated as
$$r_{yy}(\pm k) = \frac{1}{N-k}\sum_{i=0}^{N-k-1}y(k+i)y(i).$$
(c) The power spectrum in (b) corresponds to FT{w B (k)ryy (k)}, where w B (k) is the
Bartlett window whose width is equal to the width of the autocorrelation function ryy .
(d) The Fourier transform of the signal y(n) over K = 7 shorter intervals. The duration of
these intervals is M = 32, with the step R = M/2. The Fourier transforms of y(n), within these
shorter intervals, are
$$Y_i(e^{j\omega}) = \frac{1}{M}\sum_{n=0}^{M-1}y(iR+n)e^{-j\omega n}$$
for i = 0, 1, . . . , 6. The power spectral densities |Yi (e jω )|2 are averaged to produce (Welch
periodogram)
$$S_{yy}^A(\omega) = \frac{1}{K}\sum_{i=0}^{K-1}\left|Y_i(e^{j\omega})\right|^2.$$
(e) Relation (7.135) with appropriately estimated coefficients ai and G, along with the
relations (7.134) and (7.136).
⋆The results are shown in Fig. 7.32, in order from (a) to (e).
Figure 7.32 Spectral analysis of a signal with random phases (normalized values). The top panel shows the noise-free spectrum; the order and description of the remaining panels correspond to the task order in Example 7.52.
Detection of a known deterministic signal in a high-noise environment is of crucial interest in many real-world applications. In this case the problem is in testing the hypothesis
y(n) = ys(n) + yε(n),
where ys(n) and yε(n) are the system outputs to the input signals s(n) and ε(n), respectively. For the output signal ys(n), it holds that
Ys(e^{jω}) = H(e^{jω}) S(e^{jω}).
The power spectral density of ys (n) is equal to
|Ys(e^{jω})|² = |H(e^{jω})|² |S(e^{jω})|².
The aim is to maximize the output signal at an instant n0 if the input signal contains s(n). According to the Schwarz inequality (for its discrete form see Part VI),
|(1/(2π)) ∫_{−π}^{π} H(e^{jω}) S(e^{jω}) e^{jωn0} dω|² ≤ [(1/(2π)) ∫_{−π}^{π} |S(e^{jω})|² dω] [(1/(2π)) ∫_{−π}^{π} |H(e^{jω})|² dω],
This ratio reaches its maximum when the equality sign holds,
PSNRmax = (1/(2πσε²)) ∫_{−π}^{π} |S(e^{jω})|² dω = Es/σε².
This system is called a matched filter. Its impulse response is matched to the signal form. The matched filter maximizes the ratio of the output signal and the noise, and it is used in detection to decide whether the known signal s(n) exists in the noisy signal x(n).
If the additive noise is Gaussian distributed, then the null hypothesis (there is no deterministic signal in the input signal) rejection region, for the significance level α = 0.001, is
Probability{|y(n0)| > λ} = 1 − erf(λ/(√2 σy)) < 0.001,
that is,
|y(n0)| > 3.2905 σy.
For an application where on the order of 1000 nonzero samples is expected in the output signal, the significance level must be this small; otherwise, we will have many false positive results.
Example 7.53. The matched filter is illustrated on the detection of the chirp signal
s(n) = e^{−2(n/128)²} cos(8π(n/128)² + πn/8)
in a Gaussian white noise of the variance σε2 = 1. The output of the matched filter is calculated
for n0 = 0 using the known signal,
y(n) = x(n) ∗n s(−n) = Σ_{m=−∞}^{∞} (s(m) + ε(m)) s(m − n).
Figure 7.33 Illustration of the matched filter: the signal s(t), s(n); the input noisy signal x(n) = s(n) + ε(n), which contains the signal s(n); and the input signal x(n) = ε(n), which does not contain the signal s(n). The corresponding outputs of the matched filter y(n) = x(n) ∗ s(−n) are presented below the input signal panels. The null hypothesis rejection region is shaded.
Two cases are shown in Fig. 7.33: (1) when the input signal contains s(n) and (2) when the input signal does not contain s(n). We can see that the output of the matched filter has an easily detectable peak at n = 0 when the input signal contains s(n). There is no such peak in y(n) when the input signal x(n) is noise only. Therefore, the null hypothesis can be rejected in the case presented in the left panels, while it cannot be rejected for the case shown in the right panels.
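A short simulation of the matched-filter detector from Example 7.53 can reproduce this behavior; the chirp and the noise variance follow the example, while the test statistic is evaluated at the matching lag n0 = 0 with the rejection bound 3.2905σy derived above.

```python
import numpy as np

n = np.arange(-128, 129)
s = np.exp(-2 * (n / 128) ** 2) * np.cos(8 * np.pi * (n / 128) ** 2 + np.pi * n / 8)

rng = np.random.default_rng(2)
eps = rng.standard_normal(s.size)              # white Gaussian noise, variance 1

# matched filter output y(n) = x(n) * s(-n), i.e., the correlation of x with s
y_signal = np.correlate(s + eps, s, mode="full")   # input contains s(n)
y_noise = np.correlate(eps, s, mode="full")        # input is noise only

mid = s.size - 1                               # index of the lag n = 0 in "full" mode
sigma_y = np.sqrt(np.sum(s ** 2))              # output noise deviation, sqrt(Es), for unit noise variance
threshold = 3.2905 * sigma_y                   # rejection bound for alpha = 0.001
print(abs(y_signal[mid]) > threshold)          # True: the null hypothesis is rejected
print(abs(y_noise[mid]) > threshold)           # False with probability 0.999
```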
Consider a signal s(n) that can take one of two constant values s(n) = A1 or s(n) = A2 , corrupted by
an additive random noise ε(n):
(1) x (n) = A1 + ε(n) or
(2) x (n) = A2 + ε(n).
Assume that the probabilities of these two signal states are P(A1) and P(A2), such that P(A1) + P(A2) = 1. In this experiment, a value of the signal x(n) = y is observed and the question is which of these two hypotheses is true:
where p(y)dy is the probability that y takes a specific value within [y, y + dy). This relation can be
written as
P(A1|y) = p(y|A1) P(A1) / p(y).
Similarly, Bayes’ formula for the state x (n) = A2 + ε(n) produces
P(A2|y) = p(y|A2) P(A2) / p(y).
Since P( A1 |y) is the probability of A1 if y occurred and P( A2 |y) is the probability of A2 if the same
y occurred, then a criterion to state that the hypothesis H1 is true can be defined by
P ( A1 | y ) > P ( A2 | y ).
For the Gaussian probability density function of the disturbance ε(n), the signal x(n) in the case x(n) = A1 + ε(n) is distributed as
p(ξ|A1) = (1/(σx√(2π))) e^{−(ξ−A1)²/(2σx²)}.
For x(n) = A2 + ε(n), the probability density is
p(ξ|A2) = (1/(σx√(2π))) e^{−(ξ−A2)²/(2σx²)}.
The decision threshold is then obtained from
e^{−((d−A1)² − (d−A2)²)/(2σx²)} = P(A2)/P(A1),
that is,
2d(A1 − A2) − A1² + A2² = 2σx² ln(P(A2)/P(A1)).
The threshold value, for the Gaussian distribution, is
d = σx² ln(P(A2)/P(A1)) / (A1 − A2) + (A1 + A2)/2.
Example 7.54. Consider the random signal with two possible forms: (1) x (n) = A1 + ε(n) = 1 + ε(n)
or (2) x (n) = A2 + ε(n) = −1 + ε(n), where ε(n) is the zero-mean Gaussian distributed
random variable with variance σε2 = 0.5. Assume that the probabilities of these two states are
P( A1 ) = 1/3 and P( A2 ) = 2/3. Find the decision threshold d.
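A quick numeric check of the threshold formula above, for the values given in Example 7.54, is shown in this short sketch.

```python
import numpy as np

A1, A2 = 1.0, -1.0
P1, P2 = 1 / 3, 2 / 3
sigma2 = 0.5                                   # variance of eps(n)

d = sigma2 * np.log(P2 / P1) / (A1 - A2) + (A1 + A2) / 2
print(d)   # 0.25*ln(2) = 0.1733..., moved from the midpoint toward the less probable A1
```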
Assume that the input signal is x(n) and that it contains information about the desired signal d(n). The input signal is processed by a system whose impulse response is h(n). The output signal is y(n) = h(n) ∗n x(n). The task here is to find the impulse response h(n) of the system such that the difference between the desired signal and the output signal, denoted as the error
e(n) = d(n) − y(n),
This relation states that the expected value of the product of the error signal e(n) = d(n) − y(n) and the input signal x*(n − k) is zero,
E{2e(n) x*(n − k)} = 0,
for any k. Signals satisfying this relation are said to be orthogonal to each other.
Relation (7.143) can be written as
E{ Σ_{m=−∞}^{∞} h(m) x(n − m) x*(n − k) } = E{ d(n) x*(n − k) }
or
Σ_{m=−∞}^{∞} h(m) rxx(k − m) = rdx(k).
In the z-domain, this relation gives
H(z) = Rdx(z) / Rxx(z).
For a special case, when the input signal is the desired signal d(n) with an additive noise
x ( n ) = d ( n ) + ε ( n ),
where ε(n) is uncorrelated with the desired signal, the optimal (Wiener) filtering relation follows
H(z) = Rdd(z) / (Rdd(z) + Rεε(z))
since
Here we used E {d(n)ε∗ (n − k)} = 0, since d(n) and ε(n) are uncorrelated. Also
H(e^{jω}) = Sdd(ω) / (Sdd(ω) + Sεε(ω)).
Example 7.55. A signal x(n) = d(n) + ε(n) is processed by an optimal filter. The power spectral density of d(n) is Sdd(ω). If the signal d(n) and the additive noise ε(n), whose power spectral density is Sεε(ω), are independent, find the output signal-to-noise ratio.
The optimal prediction system follows with the input signal x(n) = d(n − 1) + ε(n − 1) and the desired signal d(n). The transfer function of the optimal predictor is obtained from
and
as
H(z) = z Sdd(z) / (Sdd(z) + Sεε(z))
since
Σ_{k=−∞}^{∞} rdd(k + 1) z^{−k} = Σ_{k=−∞}^{∞} rdd(k) z^{−k+1} = z Sdd(z).
The optimal smoothing is the case when the desired signal is d(n) and we can use its future
value(s). This processing follows with x (n) = d(n + 1) + ε(n + 1) as
Example 7.56. The input signal is x (n) = s(n) + ε(n), where d(n) = s(n) is the desired signal and
ε(n) is a noise. If the autocorrelation functions of the signal and noise are rss (n) = 4−|n| and
rεε (n) = 2δ(n), respectively, and the cross-correlation of the signal and noise is rsε (n) = δ(n),
design the optimal filter.
where
Rdx (z) = Rss (z) + Rsε (z) and R xx (z) = Rss (z) + 2Rsε (z) + Rεε (z).
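A numeric sketch for Example 7.56 may evaluate the optimal filter on a frequency grid, H(e^{jω}) = Sdx(ω)/Sxx(ω), using the closed-form spectra of the given correlation sequences; the two-sided geometric sum Σ a^{|n|} e^{−jωn} = (1 − a²)/(1 − 2a cos ω + a²) is used for rss(n) = 4^{−|n|}.

```python
import numpy as np

omega = np.linspace(-np.pi, np.pi, 512)
a = 0.25

Sss = (1 - a**2) / (1 - 2 * a * np.cos(omega) + a**2)   # FT of rss(n) = 4^{-|n|}
See = 2.0                                               # FT of 2*delta(n)
Sse = 1.0                                               # FT of delta(n)

Sdx = Sss + Sse                # cross power spectral density of d(n) = s(n) and x(n)
Sxx = Sss + 2 * Sse + See      # input power spectral density
H = Sdx / Sxx                  # optimal (Wiener) filter frequency response
```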
The realization of optimal systems using FIR filters will be presented in the introductory part of the chapter dealing with adaptive discrete systems in Part III.
In order to process analog signals using computers, they have to be converted into numbers stored in registers of finite precision. Continuous-time signals are transformed into digital signals using
analog-to-digital (A/D) converters. This operation is done in two steps. First, the continuous-time signal
is converted into a discrete-time signal by taking samples of the continuous-time signal at discrete-time
instants (sampling)
x (n) = x (n∆t)∆t.
Next, the discrete-time signal, with continuous amplitudes of samples, is converted into a digital signal
xQ (n) = Q[ x (n)]
with discrete-valued amplitudes (quantization). This process is illustrated in Fig. 7.35. The error caused
by the quantization of the discrete-time signal amplitudes is called the quantization noise.
Figure 7.35 Illustration of a continuous signal and its discrete-time and digital version.
For registers with b bits, the digital signal values xQ(n) are coded into a binary format. Assume that registers with b bits are used and that all input signals are normalized to the range
0 ≤ x(n) < 1.
A value is then represented by its fractional binary digits a−1 a−2 a−3 . . . a−b, for example, 10110010 for b = 8.
The quantization error is the difference between the amplitude of the original signal and the quantized signal,
e(n) = x(n) − xQ(n).
When the rounding approach is used, the maximum absolute error can be a half of the last digit weight,
−(1/2) 2^{−b} ≤ x(n) − xQ(n) < (1/2) 2^{−b}
or
−∆/2 ≤ x(n) − xQ(n) < ∆/2,
where ∆ = 2^{−b}. We can also write |e(n)| ≤ 2^{−(b+1)} = ∆/2.
In the example from Fig. 7.35, the quantization step is 2^{−4} = 1/16 and the error is within |e(n)| ≤ (1/2)(1/16) = 1/32.
The error values are equally probable within the defined interval, with the probability density function
pe(ξ) = 1/∆ for −∆/2 ≤ ξ < ∆/2, and pe(ξ) = 0 elsewhere.
The quantization error of the signal x (n) may be described as an additive uniform white noise.
The expected value of the quantization error, with the rounding approach, is
µe = E{e(n)} = ∫_{−∆/2}^{∆/2} ξ pe(ξ) dξ = 0.
With the truncation approach, the expected value is
µe = E{e(n)} = ∆/2,
and the variance is
σe² = (1/∆) ∫_{0}^{∆} (ξ − ∆/2)² dξ = ∆²/12.
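The uniform error model can be verified numerically; the following sketch rounds a random signal to b fractional bits and compares the empirical mean and variance of e(n) = x(n) − xQ(n) with the values 0 and ∆²/12 derived above.

```python
import numpy as np

b = 8
Delta = 2.0 ** (-b)
rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 100000)          # signal normalized to 0 <= x(n) < 1

xQ = np.round(x / Delta) * Delta       # rounding to b fractional bits
e = x - xQ

print(e.mean())                        # close to 0
print(e.var(), Delta ** 2 / 12)        # close to Delta^2 / 12
```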
Example 7.58. The DFT of a signal x(n) is calculated using the quantized version xQ(n) of this signal. Quantization is done by an A/D converter with b + 1 = 8 bits, using rounding. The DFT is
calculated on a high precision computer with N = 1024 signal samples. Find the expected value
and variance of the calculated DFT.
The variance is
σ²_{XQ}(k) = Σ_{n1=0}^{N−1} Σ_{n2=0}^{N−1} σe² δ(n1 − n2) e^{−j2πk(n1−n2)/N} = σe² N = (1/12) 2^{−14} 2^{10} = 1/192.
The noise in the DFT is a sum of many independent noises from the input signal and coefficients.
Thus, it is Gaussian distributed with standard deviation σXQ = 0.072. It may significantly influence
the signal DFT values, especially if they are not well concentrated or if there are signal components
with small amplitudes.
Example 7.59. How does the input signal quantization error influence the results of:
(a) the weighted sum
Xs = Σ_{n=0}^{N−1} an x(n), and
(b) the product
XP = Π_{n=0}^{N−1} x(n)?
⋆If the quantized values xQ (n) = Q[ x (n)] = x (n) + e(n) of the signal x (n) are used in
calculation instead of the true signal values then:
(a) The estimator of a weighted sum is
X̂s = Σ_{n=0}^{N−1} an xQ(n) = Σ_{n=0}^{N−1} an x(n) + Σ_{n=0}^{N−1} an e(n).
(b) Assuming that the individual errors are small, so that all the higher-order error terms containing the error products, for example, e(n)e(m) or e(n)e(m)e(l), can be neglected, we get
X̂P ≅ Π_{n=0}^{N−1} x(n) + Σ_{m=0}^{N−1} e(m) Π_{n=0, n≠m}^{N−1} x(n).
The expected value is zero if the rounding is used. The variance is signal-dependent,
σ²_{XP} = Σ_{m=0}^{N−1} Π_{n=0, n≠m}^{N−1} x²(n) var{e(n)} = (1/12) ∆² Σ_{m=0}^{N−1} Π_{n=0, n≠m}^{N−1} x²(n).
In the quantization of results, after basic arithmetic operations are performed, we can distinguish two cases. The first case is when fixed-point arithmetic is used. The register here assumes that the decimal point is positioned at a fixed place. All data are written with respect to this assumed decimal point position. In floating-point arithmetic, the numbers are written in the sign-mantissa-exponent format. The quantization error is then produced in the mantissa only.
Fixed point arithmetic assumes that the decimal point position is at a fixed place. The common
assumption is that all input values and the intermediate results, in this case, are normalized so that
0 ≤ x (n) < 1 or −1 < x (n) < 1, if the sign bit is used.
The multiplication of two quantized values, xQ(n) xQ(m), will, in general, produce a result with 2b digits. It should be quantized in the same way as the input signal,
Q[xQ(n) xQ(m)] = xQ(n) xQ(m) + e(n, m),
where e(n, m) is the quantization error satisfying all the previous properties, with
−∆/2 ≤ e(n, m) ≤ ∆/2.
Example 7.60. Find the expected value of the quantization error for
r(n) = Σ_{m=0}^{N−1} x(n + m) x(n − m),
where x (n) is quantized and the product of signals is quantized to b bits as well. Assume that the
signal values are such that their additions will not cause overflow.
In the case when complex-valued numbers are used in calculation, the quantization of the real part and the imaginary part is done separately,
xQ(n) = Q[x(n)] = Q[Re{x(n)}] + j Q[Im{x(n)}] = x(n) + er(n) + j ei(n).
Since the real and imaginary part are independent, with the same variance, the variance of the
quantization error for a complex-valued signal is given by
σe² = 2 · (1/12) ∆² = (1/6) ∆².
For the additions the variance is doubled as well.
In the case of multiplications, one complex-valued multiplication requires four real-valued multiplications, introducing four errors. The quantization variance of a complex-valued multiplication is
σe² = 4 · (1/12) ∆² = (1/3) ∆².
If the values of a signal x (n) are not small we have to ensure that no overflow occurs during
the calculations using the fixed-point arithmetic. Consider a real-valued random white signal whose
samples are within −1 < x (n) < 1, with the variance σx2 . The registers of b + 1 bits are assumed, with
one bit being used for the sign. As an example consider the expected value calculation
XN = (1/N) Σ_{n=0}^{N−1} x(n).
We have to be sure that an overflow will not occur during the expected value calculation. All sums
should stay within the interval (−1, 1).
One approach to calculate XN is to divide the input signal values by N and sum them, that is,
XN = x(0)/N + x(1)/N + · · · + x(N − 1)/N.
Then we are sure that no result will be outside the interval (−1, 1). Division of the signal samples by
N introduces an additive quantization noise,
x (0) x (1) x ( N − 1)
X̂ N = + e (0) + + e (1) + · · · + + e ( N − 1).
N N N
Variance of the equivalent noise e(0) + e(1) + · · · + e( N − 1) is
σe² = (1/12) ∆² N = (1/12) 2^{−2b} N.
Since the variance of x (n)/N is σx2 /N 2 , the variance of X̂ N is
σ²_{X̂N} = N σx²/N² + (1/12) ∆² N.
Ratio of the variances corresponding to the signal and the noise in the result is
SNR = (σx²/N) / ((1/12) ∆² N) = (1/N²) σx² / ((1/12) ∆²) = (1/N²) σx² / ((1/12) 2^{−2b}),
or in [dB],
SNR = 10 log((1/N²) σx² / ((1/12) 2^{−2b})) = 20 log σx − 20 log N − 20 log 2^{−b} + 10 log 12
= 20 log σx − 20 log2(N)/log2(10) − 20 log2(2^{−b})/log2(10) + 10.8 = 20 log σx − 6.02(m − b) + 10.8,
where N = 2^m. Obviously, increasing the number of samples from N to 2N will keep the same SNR if b is increased by one bit, since (m + 1) − (b + 1) = m − b.
Another way to calculate the mean is to perform the summation step by step, according to the following scheme, presented for N = 8,
XN = [(x(0)/2 + x(1)/2)/2 + (x(2)/2 + x(3)/2)/2]/2 + [(x(4)/2 + x(5)/2)/2 + (x(6)/2 + x(7)/2)/2]/2.
Here, two adjacent signal values x(n) are first divided by 2. They are then added, avoiding possible overflows. The error in one step is
x(n)/2 + e(n) + x(n + 1)/2 + e(n + 1) = (x(n) + x(n + 1))/2 + e_n^(2).
The error
e_n^(2) = e(n) + e(n + 1)
has the variance
var{e_n^(2)} = (1/12) ∆² + (1/12) ∆² = (1/6) ∆².
After every division by 2, the result is shifted in the register to the right and a quantization error is created. Thus, the error model, due to the addition quantization, is
X̂N = [(x(0)/2 + x(1)/2 + e0^(2))/2 + (x(2)/2 + x(3)/2 + e2^(2))/2 + e0^(4)]/2
+ [(x(4)/2 + x(5)/2 + e4^(2))/2 + (x(6)/2 + x(7)/2 + e6^(2))/2 + e4^(4)]/2 + e0^(8)
= x(0)/N + x(1)/N + · · · + x(N − 1)/N
+ e0^(2)/(N/2) + e2^(2)/(N/2) + · · · + e_{N−2}^(2)/(N/2)
+ e0^(4)/(N/4) + · · · + e_{N−4}^(4)/(N/4)
+ · · ·
+ e0^(N)/(N/N).    (7.144)
The variance of all quantization noises is the same, σe² = (1/6) ∆² = (1/6) 2^{−2b}. Notice that the noises in the first stage are divided by N/2, due to the divisions by 2 in the next stages of summation. Their variance is reduced by the factor N²/4. The values of the variances of the errors in these stages are
var{e0^(2)/(N/2) + e2^(2)/(N/2) + · · · + e_{N−2}^(2)/(N/2)} = (1/6) ∆² (1/(N²/4)) (N/2) = (1/6) ∆² (2/N),
var{e0^(4)/(N/4) + · · · + e_{N−4}^(4)/(N/4)} = (1/6) ∆² (1/(N²/16)) (N/4) = (1/6) ∆² (4/N),
· · ·
var{e0^(N)/(N/N)} = (1/6) ∆² (1/(N²/N²)) (N/N) = (1/6) ∆² (2^m/N).
σ²_{X̂N} = N σx²/N² + (1/6) ∆² (2/N) + (1/6) ∆² (4/N) + · · · + (1/6) ∆² (2^m/N)    (7.145)
= σx²/N + (1/6) ∆² (2/N)(1 + 2 + · · · + 2^{m−1}) = σx²/N + (1/6) ∆² (2/N) (1 − 2^m)/(1 − 2)
= σx²/N + (1/6) ∆² (2/N)(N − 1) = σx²/N + (1/3) ∆² (1 − 1/N).
Ratio of the variances σx²/N and (1/3) ∆² (1 − 1/N), corresponding to the output signal-to-noise ratio, is
SNR = (σx²/N) / ((1/3) ∆² (1 − 1/N)) = σx² / ((1/3) ∆² (N − 1)) ≅ σx² / ((N/3) 2^{−2b}) = 3 σx² 2^{2(b−m/2)}.
A significant improvement (of the order of N) is obtained using this scheme for the summation, instead of the direct one. In dB, the ratio is
SNR ≅ 10 log(3 σx² 2^{2(b−m/2)}) = 20 log σx − 6.02(m/2 − b) + 4.8.
If the signal values were complex then 2−2b /12 would be changed to 2−2b /6.
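The two summation strategies can be compared by a simple simulation; the sketch below models every register write with b-bit rounding, an assumption consistent with the analysis above.

```python
import numpy as np

b = 12
Delta = 2.0 ** (-b)
rng = np.random.default_rng(4)

def quantize(v):
    """Model of storing values into (b+1)-bit registers with rounding."""
    return np.round(v / Delta) * Delta

def mean_direct(x):
    """Divide every sample by N first, then accumulate."""
    return quantize(x / len(x)).sum()

def mean_pairwise(x):
    """Halve and add pairwise, quantizing after every division by 2."""
    while len(x) > 1:
        x = quantize(x[0::2] / 2) + quantize(x[1::2] / 2)
    return x[0]

m = 10
x = rng.uniform(-1, 1, 2 ** m)
exact = x.mean()
print(abs(mean_direct(x) - exact))     # direct scheme error
print(abs(mean_pairwise(x) - exact))   # typically smaller, as derived above
```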
The previous results are common in the literature. They are derived assuming that the variances of the errors are the same and obtained assuming the uniform nature of the quantization errors. However, these results differ from the ones obtained by statistical analysis. The reason is in the quantization error distribution and variance. Namely, after the high-precision signal x(n) is divided by 2 and stored into (b + 1)-bit registers, the errors in x(n)/2 + e(n) are uniform, with −∆/2 ≤ e(n) < ∆/2. When these values are stored into registers, then in every next stage, when we calculate {[x(n)/2 + e(n)] + [x(n + 1)/2 + e(n + 1)]}/2, the input values x(n)/2 + e(n) and x(n + 1)/2 + e(n + 1) are already stored in the (b + 1)-bit registers. Division by 2 is just a one-bit shift to the right. This shift causes a one-bit error. Therefore, this one-bit error is discrete in amplitude,
ed ∈ {−∆/2, 0, ∆/2},
with probabilities
Pd (±∆/2) = 1/4 and Pd (0) = 1/2.
The expected value of this kind of error is zero, provided that the rounding is done in such a way that it
takes values ±∆/2 with equal probability (various tie-breaking algorithms for rounding exist). The variance of ed^(i) is
var{e_n^(i)} = 2 var{ed} = 2[(1/4)(−∆/2)² + (1/4)(∆/2)²] = (1/4) ∆², for i > 2.
The total variance of X̂N is then of the form
σ²_{X̂N} = N σx²/N² + (1/6) ∆² (2/N) + (1/4) ∆² (4/N) + · · · + (1/4) ∆² (2^m/N) = σx²/N + (1/2) ∆² (1 − 4/(3N)),
The previous analysis corresponds to the calculation of the DFT coefficient X (0) when the input
signal is a random uniform signal, whose values are within −1 < x (n) < 1, with variance σx2 . A model
for the element X (k), with all quantization errors included, is
X̂(k) = (1/N) Σ_{n=0}^{N−1} { [x(n) + ei(n)] W_N^{nk} + em(n) } = Σ_{n=0}^{N−1} y(n),
where ei (n) is the input signal quantization error and em (n) is the multiplication quantization error.
The variances for complex-valued signals are
var{ei(n)} = 2 · (1/12) ∆² = (1/6) ∆²,   var{em(n)} = 4 · (1/12) ∆² = (1/3) ∆².
Moreover, we have to ensure that the additions do not produce an overflow. If we use the calculation scheme, presented for N = 8, as
X̂(k) = [(y(0)/2 + y(1)/2 + e0^(2))/2 + (y(2)/2 + y(3)/2 + e2^(2))/2 + e0^(4)]/2
+ [(y(4)/2 + y(5)/2 + e4^(2))/2 + (y(6)/2 + y(7)/2 + e6^(2))/2 + e4^(4)]/2 + e0^(8),
then in every addition, the terms should be divided by 2. This division introduces the quantization error.
In the first step,
y(n)/2 + e(n) + y(n + 1)/2 + e(n + 1) = (1/2){[x(n) + ei(n)] W_N^{nk} + em(n) + [x(n + 1) + ei(n + 1)] W_N^{(n+1)k} + em(n + 1)} + e(n) + e(n + 1),
so that
e_n^(2) = [ei(n) W_N^{nk} + em(n) + ei(n + 1) W_N^{(n+1)k} + em(n + 1)]/2 + e(n) + e(n + 1),
with the variance
var{e_n^(2)} = (1/4)[(1/6) ∆² + (1/3) ∆² + (1/6) ∆² + (1/3) ∆²] + 2 · (1/6) ∆² = (7/12) ∆².
In all other steps, within the errors e0^(4) to e0^(N), just the addition errors appear. Their variance, for complex-valued terms, is
var{e_n^(i)} = 2 · (1/6) ∆² = (1/3) ∆².
Therefore, the variance of
X̂N = x(0)/N + x(1)/N + · · · + x(N − 1)/N + e0^(2)/(N/2) + e2^(2)/(N/2) + · · · + e_{N−2}^(2)/(N/2)
+ e0^(4)/(N/4) + · · · + e_{N−4}^(4)/(N/4) + · · · + e0^(N)/(N/N)    (7.146)
is obtained using
var{e0^(2)/(N/2) + e2^(2)/(N/2) + · · · + e_{N−2}^(2)/(N/2)} = (7/12) ∆² (1/(N²/4)) (N/2) = (7/12) ∆² (2/N),
var{e0^(4)/(N/4) + · · · + e_{N−4}^(4)/(N/4)} = (1/3) ∆² (1/(N²/16)) (N/4) = (1/3) ∆² (4/N),
· · ·
var{e0^(N)/(N/N)} = (1/3) ∆² (1/(N²/N²)) (N/N) = (1/3) ∆² (2^m/N).
The total variance is
σ²_{X̂N} = σx²/N + (1/3) ∆² (2/N)(3/4 + 1 + 2 + · · · + 2^{m−1})
= σx²/N + (2/3) ∆² (N − 1/4)/N ≅ σx²/N + (2/3) ∆²,
with
SNR = 10 log(3σx²/(2N∆²)) = 20 log σx − 6.02(m/2 − b) + 1.76.
If the described discrete nature of the quantization error amplitude, after the first quantization
step, is taken into account (provided that the rounding is done in such a way that the error takes values
±∆/2 with equal probability), then with
var{e_n^(i)} = 4 var{ed} = (1/2) ∆²,
for i > 2, the variance of X̂N follows as
σ²_{X̂N} = σx²/N + (∆²/N)(7/6 + 2 + 4 + · · · + 2^{m−1}) = σx²/N + ∆² (N − 5/6)/N ≅ σx²/N + ∆².
If the FFT is calculated using fixed-point arithmetic and the signal is uniformly distributed within −1 < x(n) < 1, with the variance σx², then in order to avoid an overflow the signal could be divided at the input by N and the standard FFT could be used, as in Fig. 7.36.
An improvement in the SNR can be achieved if the scaling is done not on the input signal x(n) by N, but by 1/2 in every butterfly, as shown in Fig. 7.37. The improvement achieved here is due to the fact that the quantization errors appearing in the early butterfly stages are scaled by 1/2 in each subsequent stage and therefore reduced at the output, as in (7.144). An improvement of the order of N is achieved in the output signal-to-noise ratio.
Fixed point arithmetic is simple, but could be inefficient if the signal values within a wide range of
amplitudes are expected. For example, if we can expect the signal values
xQ (n1 ) = 1011111110101.010
xQ (n2 ) = 0.0000000000110101,
Figure 7.36 The FFT calculation scheme obtained using the decimation-in-frequency for N = 8 with the signal
being divided by N in order to avoid an overflow when the fixed point arithmetic is used.
Figure 7.37 The FFT calculation scheme obtained using the decimation-in-frequency for N = 8, with the signal being scaled by 1/2 in every butterfly in order to avoid an overflow when the fixed-point arithmetic is used.
then obviously fixed-point arithmetic would require large registers so that both values can be stored without losing their significant digits. However, we can represent these signal values in the exponential form as
The exponential format of numbers is then written within the register in the following form:
sn se e1 e2 e3 e4 e5 e6 e7 m−1 m−2 m−3 . . . m−b
where:
sn is the sign of the number (1 for a positive number and 0 for a negative number),
se is the sign of the exponent (1 for a positive exponent and 0 for a negative exponent),
e1 e2 . . . e7 is the binary format of the exponent, and
m−1 m−2 . . . m−b is the mantissa; assuming that its integer part is always 1, it is omitted.
Within this format, the previous signal value xQ (n1 ), with a register of 19 bits in total, is
1 1 0 0 0 1 1 0 0 0 1 1 1 1 1 1 1 0 1,
while xQ (n2 ) is
1 0 0 0 0 1 0 1 1 1 0 1 0 1 0 0 0 0 0.
If the exponent cannot be written within the defined number of bits (here 7), the computer has to stop
the calculation and indicate "overflow", that is, the number cannot fit into the register. For mantissa,
the values are just rounded to the available number of bits. In the implementations based on the
floating-point arithmetic, the quantization affects the mantissa only. The relative error in mantissa is
again
|e(n)| ≤ 2^{−(b+1)} = ∆/2.
The error in signal is multiplied by the exponent. Since we can say that the exponent value is of the
signal order, we can write
The quantization error behaves here as a multiplicative uniform noise. Thus, for the floating-point
representation, multiplicative errors appear.
The floating-point additions also produce quantization errors, which are represented by a multiplicative noise. During additions, the number of bits may increase. This increase in the number of bits requires a mantissa shift, which causes a multiplicative error.
In addition to the IEEE standard format, where the total number of bits is 32 (23 bits for the mantissa and 8 bits for the exponent), we will mention two standard formats for telephone signal coding. The µ-law pulse-code modulation (PCM) is used in North America and the A-law PCM is used in European telephone networks. They use 8-bit representations with a sign bit, 3 exponent bits, and 4 mantissa bits,
s e1 e2 e3 m1 m2 m3 m4.
The µ-law encoding takes a 14-bit signed signal value (its two’s complement representation) as input,
adds 33 (binary 100001) and converts it to an 8-bit value. The encoding formula in the µ-law is
(−1)^s [2^{e+1}(m + 16.5) − 33].
The sign bit s is set to 1 if the input sample is negative. It is set to 0 if the input sample is positive.
Number 0 is written as
0 0 0 0 0 0 0 0.
Example 7.61. As an example, consider the positive numbers from +1 to +30. They are written as +2^1(m + 16.5) − 33 with 15 quantization steps equal to 2 (starting from m = 1 to m = 15). Then the numbers from +31 to +94 are written as +2^2(m + 16.5) − 33 with 16 quantization steps equal to 4 (with m from 0 to 15). The last interval for positive numbers is from +4063 to +8158, written as +2^8(m + 16.5) − 33 with 16 quantization intervals (with m from 0 to 15) of width 256. The range of the input values is from −8159 to +8159 (about ±2^13), with the minimum step size 2 for the smallest amplitudes.
This encoding approximates the µ-law compression function
F(x) = sign(x) ln(1 + µ|x|) / ln(1 + µ),
with µ = 255.
Example 7.62. Write the number a = 456 in the binary µ-law format.
⋆The number to be represented by 2^{e+1}(m + 16.5) is 456 + 33 = 489. The mantissa range is 0 ≤ m ≤ 15. This means that the exponent (e + 1) should be such that
0 + 16.5 ≤ 489/2^{e+1} ≤ 15 + 16.5,
for the range 16.5 ≤ m + 16.5 ≤ 31.5. It is easy to conclude that 489/16 = 30.5625, meaning e + 1 = 4 with m + 16.5 = 30.5625. The nearest integer value of m is m = 14. Therefore, â = 2^{3+1} × (14 + 16.5) − 33 = 455 is the nearest µ-law format number to a. The binary form is
0 0 1 1 1 1 1 0.
The quantization step for this range of numbers is 2^4 = 16. It means that the closest possible smaller number is 439, while the next possible larger number would be 471. It is the last number with the quantization step 2^{e+1} = 16.
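The positive branch of this µ-law quantization can be sketched in a few lines, reproducing â = 455 for a = 456 from Example 7.62; sign handling and the 8-bit packing are omitted for brevity.

```python
def mulaw_quantize(a):
    """Round a positive sample to the nearest 2^(e+1)*(m + 16.5) - 33,
    with exponent e in 0..7 and mantissa m in 0..15, as in the text."""
    v = a + 33                                    # add 33, as in the mu-law encoder
    for e in range(8):
        if v / 2 ** (e + 1) <= 31.5:              # mantissa range 16.5 <= m + 16.5 <= 31.5
            m = round(v / 2 ** (e + 1) - 16.5)
            return 2 ** (e + 1) * (m + 16.5) - 33, e, m

print(mulaw_quantize(456))   # (455.0, 3, 14): binary 0 011 1110
```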
r (n, m) = x (n + m) x (n − m)
if the quantization error is caused by the floating-point registers with b bits for the mantissa. What
is the expected value? Write the model for
y ( n ) = x ( n ) + x ( n + 1).
where e(n, n + 1) is the multiplicative noise that models the addition error.
7.8 PROBLEMS
Problem 7.1. Signal x20i (n), for i = 01, 02, .., 15, is the monthly average of the maximum daily
temperatures in a city, measured from the year 2001 to 2015. The values of this signal are given in
Table 7.2. If we can assume that the signal for every individual month is Gaussian, find the probability
that the average of maximum daily temperatures: (a) in January is lower than 2, (b) in January is higher
than 12.
Problem 7.2. Available are M realizations of the random variable xi(n), i = 1, 2, . . . , M, at an instant n. The variance of x(n) is estimated in two possible scenarios:
(a) The mean value is known in advance and it is equal to zero, µx(n) = 0.
(b) The mean value is not known and it is estimated from the data as
µx(n) = (1/M)(x1(n) + x2(n) + · · · + xM(n)).
How is the estimate of the variance in (a) related to the variance estimate in (b)?
x = [0.26, 0.31, 0.64, 0.99, 1.00, 0.92, 0.85, 0.73, 0.58, 0.15] T ,
with different independent random variable, tn , values given in the vector form
t = [−0.8, −0.83, −0.60, −0.10, −0.01, 0.28, 0.39, 0.52, 0.65, 0.92] T .
â = (T^T T)^{−1} T^T x.    (7.147)
(b) Estimate the model parameters with the ridge regression model (the solution to the
minimization of J (a) = ||x − Ta||22 + λ||a||22 ) in the form
â = (T T T + λI)−1 T T x, (7.148)
with λ = 0.1.
(c) Repeat the calculations in (a) and (b) with an increased additive noise in the data
x = [0.35, 0.33, 0.57, 0.92, 0.94, 0.89, 0.87, 0.86, 0.44, 0.29] T .
(d) Predict the value x (1.12) in all considered cases. Use the result in (a) as the reference.
(e) Find the bias and the covariance matrix of the regression ridge estimator as a function of λ,
when the noise in the true data s is white, with the variance σε2 and the considered signal is x = s + ε
(advanced topic).
Find the probability density function p(ξ ) and the probability that x (n) < 2.5.
where a and b are constants. Find the relation between a and b. What is the cumulative probability distribution function for a = 1?
Problem 7.6. A random signal x (n) is characterized by the probability density function
px(ξ) = (λ/2) e^{−λ|ξ|},   λ > 0.
Find the expected value and variance of x (n).
Problem 7.7. The joint probability density function of signals x (n) and y(n) is
pxy(ξ, ζ) = k ξ e^{−ξ(ζ+1)} for 0 ≤ ξ < ∞, 0 ≤ ζ < ∞, and pxy(ξ, ζ) = 0 elsewhere.
Problem 7.8. Consider two independent random signals x(n) and y(n) with probability density functions px(n)(ξ) and py(n)(ξ). A new random signal is defined in such a way that it takes the greater value of the signals x(n) and y(n) at each instant n,
Find the probability distribution and the probability density function of the random signal z(n).
Problem 7.9. A set of N = 10 balls is considered, with an equal number of balls being marked with 1
(or white) and 0 (or black). A random signal x (n) corresponds to drawing four balls in a row. It has
four samples x (0), x (1), x (2), and x (3) corresponding to these draws. The signal values are equal to
the number (color) associated with the randomly drawn ball. If k is the number of values 0 that appear
in the signal (number of black balls), write the probability for k = 0. Generalize the result for N balls
and M signal samples.
Problem 7.10. The random signal x(n) is zero-mean Gaussian distributed with the probability density function
px(ξ) = (1/(σx√(2π))) e^{−ξ²/(2σx²)}.
Show that the variance of this random variable is equal to σx2 .
Problem 7.11. The random signal x(n) is a zero-mean Gaussian distributed random variable with the variance σx². Find the median of x(n) and the median of |x(n)|.
Problem 7.12. (a) Consider a zero-mean Gaussian distributed random noise ε(n) with variance
σε2 . Find the variance of y(n) = ε(n) − ε(n − 1) and relate it to the sample median of |y(n)| =
|ε(n) − ε(n − 1)|.
(b) Show that, if a signal x(n) consists of a slowly varying deterministic signal s(n) such that |s(n) + ε(n) − s(n − 1) − ε(n − 1)| ≈ |ε(n) − ε(n − 1)|, the noise standard deviation can be estimated using
σ̂ε = (1/(√2 · 0.6745)) median_{n=2,3,...,N}{|x(n) − x(n − 1)|}.
(c) Check this result on the signal and noise from Example 7.45.
Problem 7.13. The random signal x (n) is such that x (n) = 0 with probability 0.8. In all other cases
x (n) is Gaussian random variable with the expected value 3 and the variance equal to 2. Find the
expected value and the variance of x (n).
Problem 7.14. The signal ε(n) is a Gaussian noise with the expected value µε = 0 and the variance
σε2 . Find the probability that |ε(n)| > A. If the signal length is N = 2000, find the expected number of
samples with amplitudes higher than A = 10, assuming that σε² = 2. What is the result for A = 4 and σε² = 2?
Problem 7.15. The random signal x (n) is a Gaussian noise with the expected value 0 and the variance
σx2 . The signal has a large number N of samples. A random sequence y(n) is formed using M samples
from the signal x (n) with the lowest amplitudes. Find µy and σy .
Problem 7.16. Consider the signal s(n) = Aδ(n − n0 ) and a zero-mean Gaussian noise ε(n) with
variance σε2 , within the interval 0 ≤ n ≤ N − 1, where n0 is a constant integer within 0 ≤ n0 ≤ N − 1.
Find the probability of the event A that the maximum value of x (n) = s(n) + ε(n) is obtained at
n = n0 .
Problem 7.17. The random signal x (n) is a Gaussian noise with the expected value 0 and the variance
σx2 . A random sequence y(n) is formed by omitting the samples from the signal x (n) whose amplitudes
are higher than A. Find the probability density function of the sequence y(n). Find µy and σy .
Problem 7.18. The signal samples x (n) are such that
A + ε(n), for n ∈ N x
x (n) =
ε ( n ), otherwise
where ε(n) is a Gaussian noise with the expected value µε = 0 and the variance σε2 , A > 0 is a constant
and N x is a nonempty set of discrete-time instants. The threshold-based criterion is used to detect if an
arbitrary time instant n belongs to the set N x
n ∈ Nx if x (n) > T,
where T is the threshold. Find the value of threshold T if the probability of false detection is 0.01.
Problem 7.19. The signal x (n) is a random Gaussian sequence with the expected value µ x = 5 and
the variance σx2 = 1. The signal y(n) is a random Gaussian sequence, independent from x (n), with the
expected value µy = 1 and the variance σy2 = 1. If we consider N = 1000 samples of these signals,
find the expected number of time instants where x (n) > y(n) holds.
Problem 7.20. Let x (n) and y(n) be independent real-valued white Gaussian random variables with
expected values µ x = µy = 0 and the variances σx2 and σy2 . Show that the random variable
z = (1/M) Σ_{n=1}^{M} x(n) y(n)
Problem 7.22. A random signal ε(n) is stationary and Cauchy distributed with the probability density
function
a
pε(n) (ξ ) = .
1 + ξ2
Find the coefficient a, expected value, and the variance of this signal.
Problem 7.23. Find the expected value and the variance of the Poisson distributed random variable,
P(x(n) = k) = P(k) = λ^k e^{−λ}/k!,   for λ > 0.
Problem 7.24. The causal system is defined by
The input signal is x (n) = aδ(n) with the random amplitude a. The random variable a is uniformly
distributed within the interval from 4 to 5. Find the expected value and autocorrelation of the output
signal. Is the output signal WSS?
Problem 7.25. Consider the Hilbert transformer with the impulse response
h(n) = (2/π) sin²(nπ/2)/n for n ≠ 0, and h(n) = 0 for n = 0.
The input signal to this transformer is a white noise with the variance equal to 1.
(a) Find the autocorrelation function of the output signal.
(b) Find the cross-correlation of the input and the output signal. Show that the cross-correlation is
an antisymmetric function.
(c) Find the autocorrelation and the power spectral density function of the analytic signal
ε a (n) = ε(n) + jε h (n), where ε h (n) = ε(n) ∗n h(n).
If the input signal is a white noise x (n) = ε(n), with the autocorrelation function rεε (n) = σε2 δ(n),
find the autocorrelation and the power spectral density of the output signal.
x (n) = ε(n)u(n)
Problem 7.28. Find the expected value, the autocorrelation, and the power spectral density of the
random signal
x(n) = ε(n) + Σ_{k=1}^{N} ak e^{j(ωk n + θk)},
where ε(n) is a stationary real-valued noise with the expected value µε and the autocorrelation
rεε (n, m) = σε2 δ(n − m) + µ2ε and θk are random variables, uniformly distributed over the interval
−π < θk ≤ π. All random variables are statistically independent.
Problem 7.29. Find a stable optimal filter if the correlation functions for the desired signal and additive
noise are rss (n) = 0.25|n| , rsε (n) = 0 and rεε (n) = δ(n). Discuss the filter causality.
Problem 7.30. Calculate the DFT value X (2) of the signal s(n) = exp( j4πn/N ), with N = 8,
corrupted by the additive noise ε(n) = 2001δ(n) − 204δ(n − 3), using
X(k) = Σ_{n=0}^{N−1} (s(n) + ε(n)) e^{−j2πkn/N}.
Problem 7.31. The spectrogram is one of the most commonly used tools in time-frequency analysis.
Its form is
Sx(n, k) = |Σ_{i=0}^{N−1} x(n + i) w(i) e^{−j(2π/N)ik}|²,
where the signal is x (n) = s(n) + ε(n), with s(n) being the desired deterministic signal and ε(n)
being a complex-valued, zero-mean white Gaussian noise, with the variance σε2 and independent
and identically distributed (i.i.d.) real and imaginary parts. The window function is w(i ). Using the
rectangular window of the width N find:
a) the expected value of Sx (n, k),
b) the variance of Sx (n, k).
Note: For a Gaussian random signal ε(n), holds
E{ε(l )ε∗ (m)ε∗ (n)ε( p)} = E{ε(l )ε∗ (m)} E{ε∗ (n)ε( p)}
+ E{ε(l )ε∗ (n)} E{ε∗ (m)ε( p)} + E{ε(l )ε( p)} E{ε∗ (m)ε∗ (n)}. (7.151)
Problem 7.32. The basic quadratic time-frequency distribution is the Wigner distribution, whose discrete-time form reads
Wx(n, ω) = Σ_{k=−L}^{L} x(n + k) x*(n − k) e^{−j2ωk},
where the signal is given by x (n) = s(n) + ε(n), with s(n) being the desired deterministic signal and
ε(n) being the complex-valued, zero-mean white Gaussian noise whose variance is σε2 . The real and
imaginary parts of the noise are independent and identically distributed (i.i.d.). Find:
a) the expected value of Wx (n, ω ),
b) the variance of Wx (n, ω ).
Use the previous problem note. Find the variance for an FM signal, when |s(n)| = A.
Problem 7.33. A random signal s(n) carries information. Its autocorrelation function is rss(n) = 4(0.5)^{|n|}. A noise with the autocorrelation rεε(n) = 2δ(n) is added to the signal. Find the optimal filter for:
optimal filter for:
(a) d(n) = s(n) - optimal filtering,
(b) d(n) = s(n − 1) - optimal smoothing,
(c) d(n) = s(n + 1) - optimal prediction.
Problem 7.34. Design an optimal filter if the autocorrelation function of the signal is rss (n) =
3(0.9)|n| . The autocorrelation of noise is rεε (n) = 4δ(n), while the cross–correlation of the signal and
noise is rsε (n) = 2δ(n) .
Problem 7.35. The power spectral densities of the signal, Sdd(e^{jω}), and of the input noise, Sεε(e^{jω}), are given in Fig. 7.38. Show that the frequency response of the optimal filter H(e^{jω}) is as presented in Fig. 7.38 (bottom). Find the SNR at the input and the output of the optimal filter.
Problem 7.36. Find the expected value of the quantization error of the Fourier transform (its pseudo
form over-sampled in frequency)
Wx(n, k) = Σ_{m=0}^{N−1} x(n + m) x(n − m) e^{−j2πmk/N},
where x (n) is a real-valued quantized signal. The product of signals is quantized to b bits as well.
Neglect the quantization of the coefficients e− j2πmk/N and the quantization of their products with the
signal.
7.9 EXERCISES
Exercise 7.1. Signal x20i (n) is equal to the monthly average of the maximum daily temperatures in
a city measured from year 2001 to 2015. If we can assume that the signal for an individual month is
Gaussian find the probability that the average of the maximum temperatures: (a) in July is lower than
25, (b) in August is higher than 39.
Exercise 7.2. The random signal x (n) is such that x (n) = x1 (n) with probability p. In all other cases
x (n) is x2 (n). If the expected value and the variance of x1 (n) and x2 (n) are µ x1 , σx21 and µ x2 , σx22 ,
respectively, find the expected value and the variance of x (n).
Result: µ x = pµ x1 + (1 − p)µ x2 and
σx² = p[E{x1²(n)} − µx²] + (1 − p)[E{x2²(n)} − µx²].
Figure 7.38 Power spectral densities of the signal, Sdd(e^{jω}), and the input noise, Sεε(e^{jω}), along with the frequency response of an optimal filter H(e^{jω}).
Exercise 7.3. Find the expected value and the variance of a white uniform noise whose values are
within the interval − a ≤ x (n) ≤ a. If this signal is an input to the FIR system with the impulse response
h(n) = 1 for 1 ≤ n ≤ N and h(n) = 0 elsewhere, find the expected value and the variance of the
output signal.
Exercise 7.4. Consider the signal x (n) equal to the Gaussian zero-mean noise with the variance σε2 . A
new noise y(n) is formed using the values of x (n) lower than median value. Find the expected value
and the variance of this new noise y(n). Result: σy2 = 0.1426σε2 .
x (n) = ε(n)u(n)
where µε = 0 and rεε(n) = σε²δ(n). Find the expected value and the autocorrelation ryy(n, m) of the output signal. What is the cross-correlation between the input signal and the output signal, ryx(n, m)? Show that for n → ∞ the output signal tends to a WSS signal.
Exercise 7.6. (a) Calculate the DFT value X (4) for x (n) = exp( j4πn/N ) with N = 16.
(b) Calculate the DFT of a noisy signal x (n) + ε(n), where the noise realization is ε(n) =
1001δ(n) − 899δ(n − 3) + 561δ(n − 11) − 32δ(n − 14).
(c) Estimate the DFT using the noisy signal x(n) + ε(n) and
XR(k) = N median_{n=0,1,...,N−1}{Re[(x(n) + ε(n)) e^{−j2πkn/N}]} + jN median_{n=0,1,...,N−1}{Im[(x(n) + ε(n)) e^{−j2πkn/N}]}.
Exercise 7.7. The power spectral densities of the desired signal Sdd (e jω ) and the input noise Sεε (e jω )
are given in Fig. 7.39 for two cases. One on the left panels and the other on the right panels. Show that
the frequency response of the optimal filter H (e jω ) is given in Fig. 7.39(bottom panel for both cases of
the signal and noise). Find the SNR at the input and the output of the optimal filter in both cases.
Exercise 7.8. Find the transfer function of the optimal filter for the signal x (n) = s(n) + ε(n), where
ε(n) is a white noise with the autocorrelation rεε (n) = Nδ(n), and s(n) is the random signal obtained
as the output of the first-order linear system to the white noise with the autocorrelation rss (n) = a|n| ,
0 < a < 1. The signal and noise are not correlated.
Exercise 7.9. A random signal s(n) carries information. Its autocorrelation function is rss(n) = (1/4)^{|n|}. A noise with the autocorrelation rεε(n) = 0.5δ(n) is added to the signal. Find the optimal filter for:
(a) d(n) = s(n) - optimal filtering,
(b) d(n) = s(n − 1) - optimal smoothing,
(c) d(n) = s(n + 1) - optimal prediction.
Exercise 7.10. Find the power spectral densities of the signals whose autocorrelation functions are:
(a) r xx (n) = δ(n) + 2 cos(0.πn),
(b) rxx(n) = −4δ(n + 1) + 7δ(n) − 4δ(n − 1), and
(c) rxx(n) = 2a cos(ω0 n) + Σ_{k=0}^{∞} σ²(1/2)^k δ(n − k).
Exercise 7.11. Find the expected value and variance of the periodogram, Pxx(e^{jω}), of a deterministic signal s(n) corrupted by a white noise with variance σε²,
Pxx(e^{jω}) = (1/N) |Σ_{n=−N/2}^{N/2−1} (s(n) + ε(n)) e^{−jωn}|².    (7.152)
Figure 7.39 Power spectral densities of the signal, Sdd(e^{jω}), and the input noise, Sεε(e^{jω}), along with the frequency responses of the optimal filters H(e^{jω}). Two cases are shown, one on the left panels and the other on the right panels.
7.10 SOLUTIONS
Solution 7.1. (a) The expected value of the temperature for January, Table 7.2, is
µ x (1) = 7.2667.
The standard deviation for January, calculated over 15 years, is σx (1) = 2.6196. The probability that
the average maximum temperature in January is lower than 2 is
P(x(1) < 2) = (1/(σx(1)√(2π))) ∫_{−∞}^{2} e^{−(ξ−µx(1))²/(2σx²(1))} dξ = 0.5[1 − erf((7.2667 − 2)/(2.7115√2))] = 0.0260.
This means that this event will occur once in about 40 years.
(b) The average maximum temperature is higher than 12 with the probability
P(x(1) > 12) = (1/(σx(1)√(2π))) ∫_{12}^{∞} e^{−(ξ−µx(1))²/(2σx²(1))} dξ = 0.5[1 − erf((12 − 7.2667)/(2.7115√2))] = 0.0404.
This means that this event will happen once in about 25 years.
Solution 7.2. (a) In the scenario when the expected value is a priori known, µx(n) = 0, the variance estimate is
σx²(n) = (1/M)(x1²(n) + x2²(n) + · · · + xM²(n)) − µx²(n).
(b) When the mean is also estimated from the data, the variance will be denoted by s2x (n), and it
is equal to
sx²(n) = (1/M)[(x1(n) − (1/M) Σ_{i=1}^{M} xi(n))² + · · · + (xM(n) − (1/M) Σ_{i=1}^{M} xi(n))²]
= (1/M) Σ_{j=1}^{M} xj²(n) − ((1/M) Σ_{i=1}^{M} xi(n))²
= (1/M) Σ_{j=1}^{M} xj²(n) − (1/M²) Σ_{j=1}^{M} Σ_{i=1}^{M} xi(n) xj(n)
= (1/M) Σ_{j=1}^{M} xj²(n) − (1/M²) Σ_{j=1}^{M} xj²(n) − (1/M²) Σ_{j=1}^{M} Σ_{i=1, i≠j}^{M} xi(n) xj(n)
= ((M − 1)/M²) Σ_{j=1}^{M} xj²(n) − ((M² − M)/M²) (1/(M² − M)) Σ_{j=1}^{M} Σ_{i=1, i≠j}^{M} xi(n) xj(n).
In the second summation, the denominator ( M2 − M) is used since there are exactly ( M2 − M ) terms
in it, and the mean value estimate is
µ̂x²(n) ≈ (1/(M² − M)) Σ_{j=1}^{M} Σ_{i=1, i≠j}^{M} xi(n) xj(n).
Therefore, the variance sx²(n) can be written in the form
sx²(n) = ((M − 1)/M)[(1/M) Σ_{j=1}^{M} xj²(n) − (1/(M² − M)) Σ_{j=1}^{M} Σ_{i=1, i≠j}^{M} xi(n) xj(n)]
= ((M − 1)/M)[(1/M) Σ_{j=1}^{M} xj²(n) − µ̂x²(n)] ≈ ((M − 1)/M) σx²(n).
This means that the variance with the true mean value, σx2 (n), is (approximately) related to the variance
with the estimated mean value, s2x (n), as
σx²(n) ≈ (M/(M − 1)) sx²(n) = (1/(M − 1))[(x1(n) − (1/M) Σ_{i=1}^{M} xi(n))² + · · · + (xM(n) − (1/M) Σ_{i=1}^{M} xi(n))²].
Solution 7.3. (a) The model parameters, obtained as the solution to the least squares minimization
problem of J (a) = ||x − Ta||22 , for the data
t = [−0.8, −0.83, −0.60, −0.10, −0.01, 0.28, 0.39, 0.52, 0.65, 0.92] T
and
x = [0.26, 0.31, 0.64, 0.99, 1.00, 0.92, 0.85, 0.73, 0.58, 0.15] T .
are given by (see the matrix T definition in (7.14))
The estimated model is shown in Fig. 7.40(a) by the dotted line. Since the noise is small (caused by rounding the data to two decimals), the model fits the data accurately.
(b) When the regularization constant is added, the solution to the ridge regression minimization of J(a) = ||x − Ta||₂² + λ||a||₂² is obtained in the form (7.148). In this case, a small bias in fitting the data can be observed in Fig. 7.40(b).
(c) When a stronger additive noise is present in the data
x = [0.35, 0.33, 0.57, 0.92, 0.94, 0.89, 0.87, 0.86, 0.44, 0.29] T ,
the least squares and the ridge regression solutions are, respectively,
and
The model results are shown in Fig. 7.40(c) and (d), respectively. The higher-order model coefficients
in â are larger in the solution when the regularization is not used.
(d) The predicted values of x (1.12) in all considered cases are indicated by a circle. We can see
that the moderate noise causes significant deviation (over-fitting) of the results, Fig. 7.40(c), if the
regularization is not used. In the case with a very small noise, the regularizations slightly worsen the
results, by introducing the bias, as shown in Fig. 7.40(b).
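The two estimators compared in this solution can be sketched as follows, using the data vectors given in the problem; the polynomial order of the model matrix T is an assumption here, since its exact definition is given in (7.14).

```python
import numpy as np

t = np.array([-0.8, -0.83, -0.60, -0.10, -0.01, 0.28, 0.39, 0.52, 0.65, 0.92])
x = np.array([0.26, 0.31, 0.64, 0.99, 1.00, 0.92, 0.85, 0.73, 0.58, 0.15])

order = 5                                        # assumed polynomial order
T = np.vander(t, order + 1, increasing=True)     # columns 1, t, t^2, ..., t^order

a_ls = np.linalg.solve(T.T @ T, T.T @ x)         # least squares, relation (7.147)
lam = 0.1
a_ridge = np.linalg.solve(T.T @ T + lam * np.eye(order + 1), T.T @ x)   # ridge (7.148)

t_new = np.vander([1.12], order + 1, increasing=True)   # prediction point from part (d)
print(t_new @ a_ls, t_new @ a_ridge)
```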
Figure 7.40 Polynomial fitting example. (a) Data with a small noise (the black dots) and the polynomial fitting
with the least-squares solution (the dotted line). (b) Data with a small noise and the polynomial fitting with the
regression model using regularization constant λ = 0.1, producing a small deviation from the data. (c) Data with
moderate noise and the polynomial fitting with the least-squares solution. The noise causes significant deviations
and an over-fitted model. (d) Data with moderate noise and the polynomial fitting with the regression model using
the regularization constant λ = 0.1, keeping the energy of all model coefficients low. The predicted value at x (1.12)
is marked by the circle.
(e) Advanced topic: Assume that the observed signal consists of the true data s and the noise ε, that is, x = s + ε. The parameters of the true model, Ta = s, are the solution to the least-squares minimization of J(a) = ||s − Ta||₂², that is,
a = (T T T)−1 T T s.
For λ = 0 the estimator is unbiased, bias(â) = 0, while for large λ the bias increases toward
|bias(â)| = |(T T T)−1 T T s|.
The covariance matrix of the estimator is, by definition,
Cov(â) = E{(â − E{â})(â − E{â})^T} = E{[(T^T T + λI)^{−1} T^T ε][(T^T T + λI)^{−1} T^T ε]^T}
= σε² (T^T T + λI)^{−1} T^T [(T^T T + λI)^{−1} T^T]^T = σε² (T^T T + λI)^{−1} T^T T (T^T T + λI)^{−1}.
The probability of x (n) < 2.5 is P( x (n) < 2.5) = F (2.5) = 0.75.
Therefore,
∫_{−∞}^{∞} a e^{−b|ξ|} dξ = a(∫_{−∞}^{0} e^{bξ} dξ + ∫_{0}^{∞} e^{−bξ} dξ) = 2a/b = 1,
resulting in b = 2a.
For a = 1, the probability density function is px(ξ) = e^{−2|ξ|} for −∞ < ξ < ∞. The probability distribution function is
Fx(χ) = ∫_{−∞}^{χ} px(ξ) dξ = ∫_{−∞}^{χ} e^{2ξ} dξ = e^{2χ}/2
for −∞ < χ < 0, and
Fx(χ) = 1/2 + ∫_{0}^{χ} px(ξ) dξ = 1/2 + ∫_{0}^{χ} e^{−2ξ} dξ = 1 − e^{−2χ}/2
for 0 ≤ χ < ∞.
Solution 7.8. Since the random signal z(n) takes the greater of the values x (n) and y(n) the
probability that z(n) = max{ x (n), y(n)} is lower than or equal to an assumed χ is equal to the
probability that both random samples x (n) and y(n) are lower than or equal to this assumed χ, that is
Since
P{ x (n) ≤ χ} = Fx(n) (χ) and P{y(n) ≤ χ} = Fy(n) (χ),
we get the probability distribution of the random variable z(n) in the form
The probability density function follows as the derivative of the probability distribution,
pz(n)(ξ) = dFz(n)(ξ)/dξ = px(n)(ξ) Fy(n)(ξ) + Fx(n)(ξ) py(n)(ξ).
Solution 7.9. There are 5 out of 10 black balls. The probability that x(0) = 0 is
P0 = 5/10.
If the first ball was 0, then we have 9 balls for the second draw, with 4 balls marked with 0. The probability that x(1) = 0, if x(0) = 0, is
P1 = 4/9.
If x(0) = 0 and x(1) = 0, then there are 8 remaining balls with 3 of them being marked with 0. The probability that x(2) = 0, with x(0) = 0 and x(1) = 0, is
P2 = 3/8.
The probability for k = 0 is
P(k = 0) = (5/10)(4/9)(3/8)(2/7).
In general, if there were N balls, with an equal number of balls being marked with 1 (or white) and 0 (or black), and we considered M signal samples (drawings), the probability P(k = 0) would be
P(k = 0) = Π_{i=0}^{M−1} (N/2 − i)/(N − i).
Var{x(n)} = ∫_{−∞}^{∞} ξ² px(ξ) dξ = ∫_{−∞}^{∞} ξ² (1/(σx√(2π))) e^{−ξ²/(2σx²)} dξ.
and the substitution of the variables to polar coordinates, ξ = σx ρ cos(φ), ζ = σx ρ sin(φ), we get
Var{x(n)} Var{x(n)} = (1/(2π)) ∫_{0}^{∞} ∫_{0}^{2π} ρ⁴ cos²(φ) sin²(φ) e^{−ρ²/2} σx⁴ ρ dρ dφ
= σx⁴ (1/8) ∫_{0}^{∞} ρ⁴ e^{−ρ²/2} ρ dρ = (1/2) σx⁴ ∫_{0}^{∞} t² e^{−t} dt = σx⁴,
where cos²(φ) sin²(φ) = sin²(2φ)/4 = (1 − cos(4φ))/8 and the substitution ρ²/2 = t are used. This means that Var{x(n)} = σx².
Solution 7.11. For a random signal x(n), with a probability density function px(ξ), the median is defined as the value mx such that
∫_{−∞}^{mx} px(ξ) dξ = ∫_{mx}^{∞} px(ξ) dξ.
For the zero-mean Gaussian distributed random variable, mx = 0. For the random variable |x(n)|, the probability density function is
p|x|(ξ) = 2 px(ξ) u(ξ) = (2/(σx√(2π))) e^{−ξ²/(2σx²)} u(ξ).
The median of |x(n)| is obtained from
∫_{0}^{mx} (2/(σx√(2π))) e^{−ξ²/(2σx²)} dξ = ∫_{mx}^{∞} (2/(σx√(2π))) e^{−ξ²/(2σx²)} dξ = 1/2,
or
∫_{mx}^{∞} (2/(σx√(2π))) e^{−ξ²/(2σx²)} dξ = 1 − erf(mx/(σx√2)) = 1/2.
The solution is
mx = 0.6745 σx.
Solution 7.12. (a) When ε(n) is a zero-mean Gaussian distributed random noise, with variance σε2 ,
the variance of y(n) = ε(n) − ε(n − 1) is equal to
σy2 = 2σε2 .
Based on the result from Problem 7.11, the median of |y(n)| = |ε(n) − ε(n − 1)| is related to the standard deviation as
my = 0.6745 σy = √2 · 0.6745 σε.
(b) For the signal x(n) such that |x(n) − x(n − 1)| = |s(n) + ε(n) − s(n − 1) − ε(n − 1)| ≈ |ε(n) − ε(n − 1)| holds, we have
median_{n=2,3,...,N}{|x(n) − x(n − 1)|} ≈ median_{n=2,3,...,N}{|ε(n) − ε(n − 1)|} ≈ √2 · 0.6745 σε,
so the previous relation can be used to estimate σε and, in turn, the input SNR value.
(c) The true standard deviation of the noise in Example 7.45 was σε = 4. The value estimated using the previous relation is
σ̂ε = (1/(√2 · 0.6745)) median_{n=2,3,...,N}{|x(n) − x(n − 1)|} = 4.5.
The difference between σ̂ε and σε is due to the fact that the signal variations |s(n) − s(n − 1)| are small, but not negligible. However, the estimate σ̂ε is sufficiently accurate for the presented algorithm, since its slightly higher value than the true standard deviation will affect the confidence intervals only, by increasing their bounds and the corresponding probabilities (from the factor of 7.1 in Example 7.45 to the factor of 8, corresponding to the case as if the true standard deviation σε = 4 were used with the confidence interval bounds defined by 2.82σXN, instead of the assumed bounds defined by 2.5σXN).
The standard deviation estimate could also be obtained using the variance definition for x(n) − x(n − 1), given by
mean_{n=2,3,...,N}{|x(n) − x(n − 1)|²} ≈ mean_{n=2,3,...,N}{|ε(n) − ε(n − 1)|²} = 2σ̂ε².
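A numeric check of the median-based estimator from part (b); the smooth sinusoid below is only a stand-in for the slowly varying signal s(n).

```python
import numpy as np

rng = np.random.default_rng(5)
n = np.arange(1000)
s = np.sin(2 * np.pi * n / 500)                  # slowly varying stand-in signal
sigma = 4.0
x = s + sigma * rng.standard_normal(n.size)

d = np.abs(np.diff(x))                           # |x(n) - x(n-1)|
sigma_hat = np.median(d) / (np.sqrt(2) * 0.6745)
print(sigma_hat)                                 # close to the true value 4
```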
Solution 7.13. Let us find the probability that x(n) < ξ, for arbitrary ξ. Consider the case when ξ < 0,
P{x(n) < ξ} = (0.2/2)[1 + erf((ξ − 3)/2)].
It has been taken into account that the considered sample is Gaussian (with probability 0.2), along with
the probability that the sample value is smaller than ξ.
For ξ > 0, we should take into account that the signal takes x(n) = 0 with the probability 80%, as well as that in the remaining 20% of cases the Gaussian random value could be smaller than ξ. So, we get
P{x(n) < ξ} = 0.8 + (0.2/2)[1 + erf((ξ − 3)/2)].
Now, we have
P{x(n) < ξ} = (0.2/2)[1 + erf((ξ − 3)/2)] for ξ < 0, and
P{x(n) < ξ} = 0.8 + (0.2/2)[1 + erf((ξ − 3)/2)] for ξ > 0.
This function has a discontinuity at ξ = 0. It is not differentiable at this point as well. The derivative of
P{ x (n) < ξ } can be expressed in the form of the generalized functions (Dirac delta function) as
(d/dξ) P{x(n) < ξ} = py(n)(ξ) = (0.2/(2√π)) e^{−(ξ−3)²/4} + 0.8 δ(ξ).
The expected value and the variance are
µy(n) = ∫_{−∞}^{∞} ξ py(n)(ξ) dξ = 0.2 × 3 + 0.8 × 0 = 0.6,
σ²y(n) = ∫_{−∞}^{∞} (ξ − 0.6)² py(n)(ξ) dξ = 0.2 × 7.76 + 0.8 × (0.6)² = 1.84.
For N = 2000, the expected number of samples with amplitude above A is P{|ε(n)| > 10} × 2000 ≈
3 × 10−9 ≈ 0. This means that we do not expect any sample with amplitude higher than 10.
For A = 4, we have P{|ε(n)| > 4} ≈ 4.7 × 10⁻³, with 2000 × 4.7 × 10⁻³ = 9.4 ≈ 9 samples among the considered 2000 having an amplitude higher than 4.
Solution 7.15. If we are in the position to use a reduced set of the signal samples for processing, then
the ideal scenario would be to eliminate signal samples with the higher noise values and to keep for
processing the samples with the lower noise values. For the case of N, the signal samples and signal
processing based on M samples, we can find the interval of amplitudes A for the lowest M noisy
samples. The probability that | x (n)| < Aσε is equal to
P{|x(n)| < Aσε} = (1/(σε√(2π))) ∫_{−Aσε}^{Aσε} e^{−ξ²/(2σε²)} dξ.
(1/√(2π)) ∫_{−A}^{A} e^{−ξ²/2} dξ = erf(A/√2) = M/N.
The constant k is obtained from the condition that ∫_{−∞}^{∞} py(ξ) dξ = 1. Its value is k = N/M.
The variance of this new noise, formed from the Gaussian noise after the largest N − M values are removed, is much lower than the variance of the whole noise. It is given by
σy² = (N/M) (1/(σε√(2π))) ∫_{−√2 erfinv(M/N) σε}^{√2 erfinv(M/N) σε} ξ² e^{−ξ²/(2σε²)} dξ.    (7.157)
Solution 7.16. The probability density function for any sample x(n), n ≠ n0, is
p_{x(n), n≠n0}(ξ) = (1/(σε√(2π))) e^{−ξ²/(2σε²)}.
The probability that any of these samples is smaller than a value of λ could be defined using (7.43). Since the random variables x(n), 0 ≤ n ≤ N − 1, n ≠ n0, are statistically independent, the probability that all of them are smaller than λ is
P⁻_{N−1}(λ) = Probability{all N − 1 values of x(n) < λ, n ≠ n0} = [0.5 + 0.5 erf(λ/(√2 σε))]^{N−1}.
The probability density function of the sample x(n0) is a Gaussian function with the mean value A, that is,
p_{x(n0)}(ξ) = (1/(σε√(2π))) e^{−(ξ−A)²/(2σε²)}.
The probability that the random variable x (n0 ) takes a value around λ, λ ≤ x (n0 ) < λ + dλ, is
$$P^{+}_{n_0}(\lambda) = \operatorname{Probability}\{\lambda \le x(n_0) < \lambda + d\lambda\} = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}}e^{-(\lambda-A)^2/(2\sigma_\varepsilon^2)}\,d\lambda. \qquad (7.158)$$
The probability that all values of x (n), 0 ≤ n ≤ N − 1, n 6= n0 are smaller than λ and that, at the same
time, λ ≤ x (n0 ) < λ + dλ is
$$P_A(\lambda) = P^{-}_{N-1}(\lambda)P^{+}_{n_0}(\lambda) = \left[0.5 + 0.5\operatorname{erf}\!\left(\frac{\lambda}{\sqrt{2}\sigma_\varepsilon}\right)\right]^{N-1}\frac{1}{\sigma_\varepsilon\sqrt{2\pi}}e^{-(\lambda-A)^2/(2\sigma_\varepsilon^2)}\,d\lambda,$$
while the total probability that all x(n), 0 ≤ n ≤ N − 1, n ≠ n0, are below x(n0) is an integral over all possible values of λ,
$$P_A = \int_{-\infty}^{\infty}P_A(\lambda) = \int_{-\infty}^{\infty}\left[0.5 + 0.5\operatorname{erf}\!\left(\frac{\lambda}{\sqrt{2}\sigma_\varepsilon}\right)\right]^{N-1}\frac{1}{\sigma_\varepsilon\sqrt{2\pi}}e^{-(\lambda-A)^2/(2\sigma_\varepsilon^2)}\,d\lambda. \qquad (7.159)$$
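The probability (7.159) has no simple closed form, but it is easy to evaluate numerically. The sketch below assumes particular values of A, σε, and N, which are not fixed by the problem statement, and uses a standard numerical integration routine.

import numpy as np
from scipy import integrate, special

# Sketch: numerical evaluation of (7.159), the probability that the
# sample x(n0), with mean A, is the largest among N samples.
def P_A(A, sigma=1.0, N=64):
    def integrand(lam):
        cdf = 0.5 + 0.5 * special.erf(lam / (np.sqrt(2) * sigma))
        pdf = np.exp(-(lam - A)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
        return cdf**(N - 1) * pdf
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return val

print(P_A(4.0))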
Solution 7.17. The probability density function for the sequence y(n) is
$$p_{y(n)}(\zeta) = \begin{cases} B\dfrac{1}{\sigma_x\sqrt{2\pi}}e^{-\zeta^2/(2\sigma_x^2)} & \text{for } -A < \zeta \le A \\ 0 & \text{otherwise.} \end{cases}$$
The constant B can be calculated from $\int_{-\infty}^{\infty}p_{y(n)}(\zeta)\,d\zeta = 1$. Its value is $B = 1/\operatorname{erf}\!\left(\frac{A}{\sigma_x\sqrt{2}}\right)$. Now, we have µ_{y(n)} = 0 and
$$\sigma^2_{y(n)} = \frac{1}{\operatorname{erf}\!\left(\frac{A}{\sigma_x\sqrt{2}}\right)}\int_{-A}^{A}\zeta^2\frac{1}{\sigma_x\sqrt{2\pi}}e^{-\zeta^2/(2\sigma_x^2)}\,d\zeta = \sigma_x^2\left(1 - \frac{A\sqrt{2}\,e^{-A^2/(2\sigma_x^2)}}{\sigma_x\sqrt{\pi}\operatorname{erf}\!\left(\frac{A}{\sigma_x\sqrt{2}}\right)}\right).$$
By denoting β = A/(√2σx), the variance σ²_{y(n)} can be written as
$$\sigma^2_{y(n)} = \sigma_x^2\left(1 - 2\beta\frac{e^{-\beta^2}}{\sqrt{\pi}\operatorname{erf}(\beta)}\right).$$
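The derived formula is readily verified by simulation. In the following sketch, the values σx = 1 and A = 1.5 are arbitrary assumptions; the samples with |x| > A are discarded and the empirical variance of the remaining samples is compared with the closed-form expression.

import numpy as np
from scipy import special

# Sketch: Monte Carlo check of the truncated-Gaussian variance formula.
rng = np.random.default_rng(2)
sigma_x, A = 1.0, 1.5
samples = rng.normal(0.0, sigma_x, 2_000_000)
y = samples[np.abs(samples) <= A]          # keep only |x| <= A

beta = A / (np.sqrt(2) * sigma_x)
theory = sigma_x**2 * (1 - 2 * beta * np.exp(-beta**2)
                       / (np.sqrt(np.pi) * special.erf(beta)))
print(y.var(), theory)                     # the two values agree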
Solution 7.18. False detection means that we make a wrong decision by classifying the instant n into the set Nx. The probability is
$$P_F = P\{\varepsilon(n) > T\} = \frac{1}{2} - \frac{1}{2}\operatorname{erf}\!\left(\frac{T}{\sqrt{2}\sigma_\varepsilon}\right).$$
Now, we can find T as
$$T = \sqrt{2}\sigma_\varepsilon\operatorname{erfinv}(1-2P_F) \approx 2.33\sigma_\varepsilon,$$
where erfinv(·) is the inverse erf function. Note that the threshold does not depend on A.
since the signals are mutually independent. The probability that x(n) > y(n) can be obtained by integrating p_{x(n),y(n)}(ξ, ζ) over the region ξ > ζ,
$$P\{x(n) > y(n)\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\xi}\frac{1}{\sqrt{2\pi}}e^{-(\xi-5)^2/2}\frac{1}{\sqrt{2\pi}}e^{-(\zeta-1)^2/2}\,d\zeta\,d\xi \approx 0.99766.$$
For 1000 instants, we expect that x (n) > y(n) is satisfied in about 998 instants.
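A direct Monte Carlo estimate confirms this probability; the means 5 and 1 and the unit variances below are taken from the densities in the integral above, while the number of trials is arbitrary.

import numpy as np

# Sketch: Monte Carlo confirmation that P{x(n) > y(n)} ~ 0.99766 for
# independent unit-variance Gaussian signals with means 5 and 1.
rng = np.random.default_rng(3)
M = 5_000_000
x = rng.normal(5.0, 1.0, M)
y = rng.normal(1.0, 1.0, M)
print(np.mean(x > y))   # approximately 0.99766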
$$= \frac{1}{M^2}\sum_{n=1}^{M}\sum_{m=1}^{M}E[x(n)y(n)x(m)y(m)] = \frac{1}{M^2}\sum_{n=1}^{M}\sum_{m=1}^{M}E[x(n)x(m)]\,E[y(n)y(m)]$$
$$= \frac{1}{M^2}\sum_{n=1}^{M}E[x^2(n)]\,E[y^2(n)] = \frac{1}{M^2}\sum_{n=1}^{M}\sigma_x^2\sigma_y^2 = \frac{1}{M}\sigma_x^2\sigma_y^2.$$
Solution 7.21. The moments of the Gaussian distributed random variable follow from the moment generating function, related to the Fourier transform of the Gaussian distribution (the characteristic function), as
$$M_x(\theta) = \Phi_x(-j\theta) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-(\xi-\mu)^2/(2\sigma^2)}e^{j(-j\theta)\xi}\,d\xi = e^{-\sigma^2(-j\theta)^2/2}e^{j(-j\theta)\mu} = e^{\sigma^2\theta^2/2}e^{\theta\mu}.$$
Expanding the moment generating function Mx(θ) into a Taylor series around θ = 0, we get
$$e^{\theta\mu}e^{\sigma^2\theta^2/2} = \left(1 + \theta\mu + \frac{1}{2!}(\theta\mu)^2 + \frac{1}{3!}(\theta\mu)^3 + \frac{1}{4!}(\theta\mu)^4 + \dots\right)$$
$$\times\left(1 + \frac{\sigma^2\theta^2}{2} + \frac{1}{2!}\left(\frac{\sigma^2\theta^2}{2}\right)^2 + \frac{1}{3!}\left(\frac{\sigma^2\theta^2}{2}\right)^3 + \frac{1}{4!}\left(\frac{\sigma^2\theta^2}{2}\right)^4 + \dots\right)$$
$$= 1 + \theta\mu + \frac{1}{2!}\theta^2(\mu^2+\sigma^2) + \frac{1}{3!}\theta^3(\mu^3+3\mu\sigma^2) + \frac{1}{4!}\theta^4(\mu^4+6\mu^2\sigma^2+3\sigma^4) + \dots$$
The moments of the Gaussian distributed random variable are given by (on the right for µ = 0)
$$\begin{aligned} M_1 &= \mu, & M_1 &= 0,\\ M_2 &= \mu^2+\sigma^2, & M_2 &= \sigma^2,\\ M_3 &= \mu^3+3\mu\sigma^2, & M_3 &= 0,\\ M_4 &= \mu^4+6\mu^2\sigma^2+3\sigma^4, & M_4 &= 3\sigma^4. \end{aligned}$$
The cumulants Ki of the random variable x(n) are obtained, by definition, from the Taylor series around θ = 0 of the logarithm of the moment generating function, ln(Mx(θ)). In the case of the Gaussian distributed variable,
$$\ln(M_x(\theta)) = \theta\mu + \sigma^2\theta^2/2.$$
Obviously,
$$\begin{aligned} K_1 &= \mu, & K_1 &= 0,\\ K_2 &= \sigma^2, & K_2 &= \sigma^2,\\ K_3 &= 0, & K_3 &= 0,\\ K_4 &= 0, & K_4 &= 0, \end{aligned}$$
and Ki = 0 for i > 2. This is a well-known criterion to test whether a random variable is Gaussian distributed, since the cumulants of this distribution should be zero for i > 2. Since the third-order moments are zero-valued for any even distribution function, the fourth-order cumulant is used to check whether a random variable is Gaussian distributed. For a random variable with an even distribution, the fourth-order cumulant is related to the moments as K4 = M4 − 3M2², where Mi is statistically estimated as Mi = mean(x^i(n)).
The kurtosis is defined as the fourth-order moment of the centered and normalized random variable,
$$\operatorname{Kurt}_x = E\left\{\left(\frac{x(n)-\mu_x}{\sigma_x}\right)^4\right\}.$$
For the Gaussian random variable, Kurtx = 3. Any value different from Kurtx = 3 produces a nonzero excess kurtosis,
$$\operatorname{ExcessKurt}_x = \operatorname{Kurt}_x - 3,$$
which is an indicator of the deviation of the distribution from the Gaussian distribution.
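A numerical sketch of this Gaussianity check follows; the two test distributions (Gaussian and uniform) are arbitrary choices for the illustration.

import numpy as np

# Sketch: Gaussianity check via the fourth-order cumulant
# K4 = M4 - 3*M2^2 and the excess kurtosis, for zero-mean data.
rng = np.random.default_rng(4)
M = 1_000_000
gauss = rng.normal(0.0, 2.0, M)
unif = rng.uniform(-1.0, 1.0, M)

for name, x in (("Gaussian", gauss), ("uniform", unif)):
    M2 = np.mean(x**2)
    M4 = np.mean(x**4)
    K4 = M4 - 3 * M2**2          # ~0 only for the Gaussian data
    kurt = M4 / M2**2            # kurtosis; 3 for Gaussian data
    print(name, K4, kurt - 3)    # excess kurtosis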
Solution 7.22. The probability that the random variable is within −∞ < ξ < ∞ is equal to
$$1 = \int_{-\infty}^{\infty}p_{\varepsilon(n)}(\xi)\,d\xi = \int_{-\infty}^{\infty}\frac{a}{1+\xi^2}\,d\xi = a\arctan(\xi)\Big|_{-\infty}^{\infty} = a\pi,$$
resulting in a = 1/π.
Solution 7.23. The expected value of the Poisson distributed random variable is
$$\mu_x = \sum_{k=0}^{\infty}kP(k) = \sum_{k=0}^{\infty}k\frac{\lambda^k e^{-\lambda}}{k!} = \sum_{k=1}^{\infty}\frac{\lambda^k e^{-\lambda}}{(k-1)!} = \lambda e^{-\lambda}\sum_{k=1}^{\infty}\frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = \lambda e^{-\lambda}e^{\lambda} = \lambda.$$
Notice that the variance is mean-value dependent, σx² = µx = λ. In the confidence interval calculation, this problem can be solved using variance stabilization (see Section 7.4.7 and Fig. 7.27). The transformation that would produce a mean-value independent estimate of the variance is g(λ) = √λ.
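The effect of the square-root transformation is easy to observe numerically. In the sketch below, the values of λ are arbitrary; the variance of the transformed data stays close to 1/4 regardless of λ.

import numpy as np

# Sketch: variance stabilization g(lambda) = sqrt(lambda); after the
# square-root transform the variance is nearly 1/4 for any mean lambda.
rng = np.random.default_rng(5)
for lam in (5.0, 20.0, 100.0):
    x = rng.poisson(lam, 1_000_000)
    print(lam, x.var(), np.sqrt(x).var())  # var(x) ~ lam, var(sqrt(x)) ~ 0.25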
$$H(z) = \frac{1}{1-0.5z^{-1}}.$$
The z-transform of the input signal x(n) is
$$X(z) = \sum_{n=-\infty}^{\infty}x(n)z^{-n} = \sum_{n=-\infty}^{\infty}a\delta(n)z^{-n} = a.$$
$$r_{yy}(n,m) = E\{y(n)y^*(m)\} = \frac{16}{3}\,2^{-(n+m)}u(n)u(m).$$
The output signal y(n) is not WSS.
(b) The z-transform of the cross-correlation of the input and the output signal, y(n) = ε(n) ∗n h(n) = εh(n), is
$$R_{xy}(z) = R_{xx}(z)H(z).$$
For z = e^{jω}, we get
$$R_{\varepsilon\varepsilon_h}(e^{j\omega}) = S_{\varepsilon\varepsilon}(\omega)H(e^{j\omega}) = H(e^{j\omega}),$$
resulting in
$$r_{\varepsilon\varepsilon_h}(n) = h(n) = \begin{cases} \dfrac{2}{\pi}\dfrac{\sin^2(n\pi/2)}{n}, & n \neq 0 \\ 0, & n = 0. \end{cases}$$
It is easy to conclude that the cross-correlation function is antisymmetric, r_xy(−n) = −r_xy(n).
$$X_a(e^{j\omega}) = X(e^{j\omega}) + jH(e^{j\omega})X(e^{j\omega}).$$
$$r_{yy}(n) = \frac{a^{|n|}}{1-a^2}\sigma_\varepsilon^2.$$
$$S_{yy}(\omega) = R_{yy}(e^{j\omega}) = \frac{\sigma_\varepsilon^2}{(1-ae^{j\omega})(1-ae^{-j\omega})} = \frac{\sigma_\varepsilon^2}{1-2a\cos\omega+a^2}.$$
The variance is
$$\sigma_y^2(n) = E\left\{\left[y(n)-\mu_y(n)\right]^2\right\} = E\{y^2(n)\} - \mu_y^2(n)$$
$$= \sum_{k_1=0}^{n}\sum_{k_2=0}^{n}a^{k_1}a^{k_2}E\{\varepsilon(n-k_1)\varepsilon(n-k_2)\}u(n) - \left(\mu_\varepsilon\frac{1-a^{n+1}}{1-a}\right)^2 u(n),$$
resulting in
$$\sigma_y^2(n) = \sigma_\varepsilon^2\frac{1-a^{2(n+1)}}{1-a^2}u(n).$$
$$\mu_x = \mu_\varepsilon + \sum_{k=1}^{N}a_k E\{e^{j(\omega_k n+\theta_k)}\} = \mu_\varepsilon,$$
since
$$E\{e^{j(\omega_k n+\theta_k)}\} = \frac{1}{2\pi}\int_{-\pi}^{\pi}e^{j(\omega_k n+\theta_k)}\,d\theta_k = 0.$$
The autocorrelation is
$$r_{xx}(n) = \sigma_\varepsilon^2\delta(n) + \mu_\varepsilon^2 + \sum_{k=1}^{N}a_k^2 e^{j\omega_k n},$$
Solution 7.29. For the optimal filtering, d(n) = s(n). The cross-correlation of the input signal and the desired signal is the signal autocorrelation. Its z-transform is
$$R_{dx}(z) = R_{ss}(z) = \frac{-15z/4}{(z-1/4)(z-4)}.$$
A stable system requires the region of convergence 0.127 < |z| < 7.873. This region of convergence
does not correspond to a causal system.
and
$$X_R(k) = 8.$$
Note that the noise-free DFT value X(2) is 8.
Solution 7.31. With the rectangular window, the spectrogram is given by
$$S_x(n,k) = \left|\sum_{i=0}^{N-1}x(n+i)e^{-j\frac{2\pi}{N}ik}\right|^2 = \sum_{i_1=0}^{N-1}\sum_{i_2=0}^{N-1}x(n+i_1)x^*(n+i_2)e^{-j\frac{2\pi}{N}(i_1-i_2)k}.$$
Using the fact that the signal s(n) is deterministic and the noise ε(n) is zero-mean, we get
$$E\{S_x(n,k)\} = \sum_{i_1=0}^{N-1}\sum_{i_2=0}^{N-1}s(n+i_1)s^*(n+i_2)e^{-j\frac{2\pi}{N}(i_1-i_2)k} + \sum_{i_1=0}^{N-1}\sum_{i_2=0}^{N-1}E\{\varepsilon(n+i_1)\varepsilon^*(n+i_2)\}e^{-j\frac{2\pi}{N}(i_1-i_2)k}$$
or
$$E\{S_x(n,k)\} = S_s(n,k) + \sigma_\varepsilon^2\sum_{i_1=0}^{N-1}\sum_{i_2=0}^{N-1}\delta(i_1-i_2)e^{-j\frac{2\pi}{N}(i_1-i_2)k} = S_s(n,k) + \sigma_\varepsilon^2\sum_{i=0}^{N-1}1 = S_s(n,k) + N\sigma_\varepsilon^2.$$
The variance of the spectrogram is
$$\sigma^2 = E\{S_x(n,k)S_x^*(n,k)\} - E\{S_x(n,k)\}E\{S_x^*(n,k)\},$$
where
$$\begin{aligned} E\{x(n+i_1)x^*(n+i_2)x^*(n+i_3)x(n+i_4)\} &= s(n+i_1)s^*(n+i_2)s^*(n+i_3)s(n+i_4)\\ &\quad + s(n+i_1)s^*(n+i_2)r_{\varepsilon\varepsilon}(i_4-i_3) + s(n+i_1)s^*(n+i_3)r_{\varepsilon\varepsilon}(i_4-i_2)\\ &\quad + s^*(n+i_2)s(n+i_4)r_{\varepsilon\varepsilon}(i_1-i_3) + s^*(n+i_3)s(n+i_4)r_{\varepsilon\varepsilon}(i_1-i_2)\\ &\quad + E\{\varepsilon(n+i_1)\varepsilon^*(n+i_2)\varepsilon^*(n+i_3)\varepsilon(n+i_4)\}. \end{aligned}$$
The facts that the odd-order moments of a Gaussian zero-mean noise are zero, and that r_{εε*}(k) = r_{ε*ε}(k) = 0 for a complex-valued noise with i.i.d. real and imaginary parts, are used. According to relation (7.151) from the note,
$$E\{\varepsilon(n+i_1)\varepsilon^*(n+i_2)\varepsilon^*(n+i_3)\varepsilon(n+i_4)\} = r_{\varepsilon\varepsilon}(i_1-i_2)r_{\varepsilon\varepsilon}(i_4-i_3) + r_{\varepsilon\varepsilon}(i_1-i_3)r_{\varepsilon\varepsilon}(i_4-i_2) \qquad (7.160)$$
holds, so that
$$\sigma^2 = S_s^2(n,k) + 4N\sigma_\varepsilon^2 S_s(n,k) + 2N^2\sigma_\varepsilon^4 - (S_s(n,k)+N\sigma_\varepsilon^2)^2 = 2N\sigma_\varepsilon^2 S_s(n,k) + N^2\sigma_\varepsilon^4. \qquad (7.161)$$
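The mean and variance relations can be verified by simulation. The sketch below assumes a complex sinusoid as the deterministic signal and a circular complex Gaussian noise with i.i.d. real and imaginary parts; N, σε, and the analyzed DFT bin are arbitrary choices.

import numpy as np

# Sketch: Monte Carlo check of E{S} = Ss + N*sigma^2 and
# var{S} = 2*N*sigma^2*Ss + N^2*sigma^4.
rng = np.random.default_rng(10)
N, sigma, trials = 16, 0.5, 200_000
i = np.arange(N)
s = np.exp(2j * np.pi * 3 * i / N)                    # deterministic signal
k = 3                                                 # analyzed DFT bin
F = np.exp(-2j * np.pi * i * k / N)
Ss = np.abs(np.sum(s * F))**2                         # noise-free spectrogram

eps = (rng.normal(0, sigma / np.sqrt(2), (trials, N))
       + 1j * rng.normal(0, sigma / np.sqrt(2), (trials, N)))
S = np.abs((s + eps) @ F)**2                          # spectrogram realizations
print(S.mean(), Ss + N * sigma**2)                    # mean check
print(S.var(), 2 * N * sigma**2 * Ss + N**2 * sigma**4)  # variance check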
Solution 7.32. (a) The expected value of the Wigner distribution Wx(n,ω) of x(n) is
$$E\{W_x(n,\omega)\} = \sum_{k=-L}^{L}E\{x(n+k)x^*(n-k)\}e^{-j2\omega k},$$
since the signal is deterministic and is not correlated with the white noise ε(n). Here, rεε(2k) denotes the autocorrelation function of the additive noise ε(n), whose variance is σε². In the case of a Gaussian zero-mean white stationary complex-valued noise with i.i.d. real and imaginary parts, r_{εε*}(k) = r_{ε*ε}(k) = 0, and we can write
$$\begin{aligned} E\{x(n+k_1)x^*(n-k_1)x^*(n+k_2)x(n-k_2)\} &= s(n+k_1)s^*(n-k_1)s^*(n+k_2)s(n-k_2) + s(n+k_1)s^*(n-k_1)r_{\varepsilon\varepsilon}(-2k_2)\\ &\quad + s(n+k_1)s^*(n+k_2)r_{\varepsilon\varepsilon}(k_2-k_1) + s^*(n-k_1)s(n-k_2)r_{\varepsilon\varepsilon}(k_1-k_2)\\ &\quad + s^*(n+k_2)s(n-k_2)r_{\varepsilon\varepsilon}(2k_1) + r_{\varepsilon\varepsilon}(2k_1)r_{\varepsilon\varepsilon}(-2k_2) + r_{\varepsilon\varepsilon}^2(k_1-k_2). \end{aligned}$$
Solution 7.33. The signal s(n) and the noise ε(n) are not correlated. In this case:
(a) For the optimal filtering, d(n) = s(n). The cross-correlation of the desired and input signal is
$$r_{dx}(n) = E\{d(k)x(n-k)\} = E\{s(k)[s(k-n)+\varepsilon(k-n)]\} = r_{ss}(n) = 4(0.5)^{|n|}.$$
(b) In the case of smoothing, d(n) = s(n − 1) and
$$R_{dx}(z) = \sum_{n=-\infty}^{\infty}4(0.5)^{|n-1|}z^{-n} = zR_{ss}(z) = \frac{3z^2}{(2z-1)(2-z)},$$
from which
$$H(z) = \frac{3z^2}{-2z^2+8z-2}$$
follows.
(c) In the case of prediction, d(n) = s(n + 1) and
$$r_{dx}(n) = E\{d(k)x(n-k)\} = E\{s(k+1)[s(k-n)+\varepsilon(k-n)]\} = r_{ss}(n+1),$$
$$R_{dx}(z) = \sum_{n=-\infty}^{\infty}4(0.5)^{|n+1|}z^{-n} = z^{-1}R_{ss}(z) = \frac{3}{(2z-1)(2-z)},$$
with
$$H(z) = \frac{3}{-2z^2+8z-2}.$$
Solution 7.34. For the optimal filter, the desired signal is d(n) = s(n). The correlation function of the desired and the input signal is
$$r_{dx}(n) = E\{s(k)[s^*(k-n)+\varepsilon^*(k-n)]\} = r_{ss}(n) + r_{s\varepsilon}(n) = 3(0.9)^{|n|} + 2\delta(n).$$
Calculation of the z-transforms and of the filter transfer function is left to the reader.
Solution 7.35. The power spectral densities of the signal and the input noise are
$$S_{dd}(e^{j\omega}) = \begin{cases} 1-|\omega/2| & \text{for } |\omega/2| < 1 \\ 0 & \text{elsewhere} \end{cases}$$
and
$$S_{\varepsilon\varepsilon}(e^{j\omega}) = \begin{cases} 1-\left||\omega|-2\right| & \text{for } \left||\omega|-2\right| < 1 \\ 0 & \text{elsewhere.} \end{cases}$$
The optimal filter frequency response is
$$H(e^{j\omega}) = \frac{S_{dd}(e^{j\omega})}{S_{dd}(e^{j\omega}) + S_{\varepsilon\varepsilon}(e^{j\omega})}.$$
The result for −π ≤ ω < 0 is symmetric and is shown in Fig. 7.38 (bottom). The input SNR is
$$SNR_i = \frac{E_s}{E_\varepsilon} = \frac{2}{2} = 1$$
or 0 [dB]. The output SNR is
$$SNR_o = \frac{\frac{1}{2\pi}\int_{-\pi}^{\pi}S_{dd}(e^{j\omega})\left|H(e^{j\omega})\right|^2 d\omega}{\frac{1}{2\pi}\int_{-\pi}^{\pi}S_{\varepsilon\varepsilon}(e^{j\omega})\left|H(e^{j\omega})\right|^2 d\omega} = \frac{3/2 + 2\int_1^2\left(1-\frac{\omega}{2}\right)\left(\frac{2-\omega}{\omega}\right)^2 d\omega}{2\int_1^2\left(1+(\omega-2)\right)\left(\frac{2-\omega}{\omega}\right)^2 d\omega} = \frac{10-12\ln 2}{16\ln 2-11} = 18.6181$$
or 12.7 [dB].
$$W_{x_Q}(n,k) = \sum_{m=0}^{N-1}\left[x_Q(n+m)x_Q(n-m) + e(n+m,n-m)\right]e^{-j2\pi mk/N}$$
$$= \sum_{m=0}^{N-1}\left\{[x(n+m)+e(n+m)][x(n-m)+e(n-m)] + e(n+m,n-m)\right\}e^{-j2\pi mk/N}.$$
It has been assumed that the errors in two different signal samples are not correlated, E{e(n+m)e(n−m)} = 0 for m ≠ 0, and that the signal and the error are not correlated, E{x(n+m)e(n−m)} = 0 for any m and n.
Part IV
Chapter 8
Adaptive Systems
8.1 INTRODUCTION
Classic systems for signal processing are designed to satisfy properties defined in advance. Their parameters are time-invariant. Adaptive systems change their parameters or form in order to achieve the best possible performance. These systems are characterized by the ability to observe variations in the input signal behavior and to react to these changes by adapting their parameters, in order to improve the desired performance of the output signal. Adaptive systems have the ability to "learn", so that they can appropriately adapt their performance when the system environment changes. By definition, adaptive systems are time-variant. These systems are often nonlinear as well. These two facts make the design and analysis of adaptive systems more difficult than in the case of classic time-invariant systems. Adaptive systems are the topic of this chapter.
Consider an adaptive system with one input and one output signal, as in Figure 8.1. In addition to the algorithm that transforms the input signal into the output signal, the adaptive system has a part that tracks the system performance and implements appropriate system changes. This control system takes into account the input signal, the output signal, and some additional information that can help in making a decision on how the system parameters should change.
(Figure 8.1: an adaptive system; the adaptation rule uses the input signal, the output signal, and other data.)
________________________________________
Authors: Ljubiša Stanković, Miloš Daković
Part V
Time-Frequency Analysis
Chapter 10
Linear Time-Frequency Representations
The Fourier transform provides a unique mapping of a signal from the time domain to the
frequency domain. The frequency domain representation provides the signal’s spectral content.
Although the phase characteristic of the Fourier transform contains information about the time
distribution of the spectral content, it is very difficult to use this information. Therefore, one may say
that the Fourier transform is practically useless for this purpose, that is, that the Fourier transform does
not provide a time distribution of the spectral components.
Depending on problems encountered in practice, various representations have been proposed to analyze non-stationary signals, in order to provide a time-varying spectral description. The field of time-frequency signal analysis deals with these representations of non-stationary signals and their properties. Time-frequency representations may roughly be classified as linear, quadratic, and higher-order representations.
Linear time-frequency representations exhibit linearity, that is, the representation of a linear combination of signals equals the linear combination of the individual representations. From this class, the most important is the short-time Fourier transform (STFT) and its variations. The energetic version of the STFT is called the spectrogram. It is the most frequently used tool in time-frequency signal analysis.
The second class of time-frequency representations are the quadratic ones. The most interesting representations of this class are those which provide a distribution of the signal energy in the time-frequency plane. They will be referred to as distributions. The concept of a distribution is borrowed from probability theory, although there is a fundamental difference. For example, in time-frequency analysis, distributions may take negative values. Other possible domains for quadratic signal representations are the ambiguity domain, the time-lag domain, and the frequency-Doppler frequency domain. In order to improve the time-frequency representation, various higher-order distributions have been defined as well.
The idea behind the short-time Fourier transform (STFT) is to apply the Fourier transform to a portion
of the original signal, obtained by introducing a sliding window function w(t) to localize the analyzed
signal x (t). The Fourier transform is calculated for the localized part of the signal. It produces the
spectral content of the portion of the analyzed signal within the time interval defined by the width of
the window function. The STFT (a time-frequency representation of the signal) is then obtained by
sliding the window along the signal. Illustration of the STFT calculation is presented in Fig.10.1.
_________________________________________________
Authors: Ljubiša Stanković, Miloš Daković, Thayaparan Thayananthan
(Figure 10.1: a signal x(t), a localization window w(τ) centered at the instant t, and the localized signal x(t+τ)w(τ).)
The STFT is defined as
$$STFT(t,\Omega) = \int_{-\infty}^{\infty}x(t+\tau)w(\tau)e^{-j\Omega\tau}\,d\tau. \qquad (10.1)$$
From (10.1) it is apparent that the STFT actually represents the Fourier transform of a signal x (t),
truncated by the window w(τ ) centered at instant t (see Fig. 10.1). From the definition, it is clear that
the STFT satisfies properties inherited from the Fourier transform (e.g., linearity).
By denoting xt (τ ) = x (t + τ ) we can conclude that the STFT is the Fourier transform of the
signal xt (τ )w(τ ),
STFT (t, Ω) = FTτ { xt (τ )w(τ )}.
Another form of the STFT, with the same time-frequency performance, is
$$STFT_{II}(t,\Omega) = \int_{-\infty}^{\infty}x(\tau)w^*(\tau-t)e^{-j\Omega\tau}\,d\tau. \qquad (10.2)$$
Example 10.1. To illustrate the STFT application, let us perform a time-frequency analysis of the following signal
$$x(t) = \delta(t-t_1) + \delta(t-t_2) + e^{j\Omega_1 t} + e^{j\Omega_2 t}. \qquad (10.3)$$
The STFT of this signal equals
$$STFT(t,\Omega) = w(t_1-t)e^{-j\Omega(t_1-t)} + w(t_2-t)e^{-j\Omega(t_2-t)} + W(\Omega-\Omega_1)e^{j\Omega_1 t} + W(\Omega-\Omega_2)e^{j\Omega_2 t}, \qquad (10.4)$$
where W(Ω) is the Fourier transform of the used window. The STFT is depicted in Fig. 10.2 for various window lengths, along with the ideal representation. A wide window w(t) in the time domain is characterized by a narrow Fourier transform W(Ω), and vice versa. The influence of the window on the results will be studied later.
Figure 10.2 Time-frequency representation of a sum of two delta pulses and two sinusoids obtained using: (a) a
wide window, (b) a narrow window, (c) a medium width window, and (d) an ideal time-frequency representation.
Consider now a linear frequency modulated signal x(t) = e^{jat²}, analyzed using a window w(τ) of the width 2T. The stationary phase method gives
$$STFT(t,\Omega) = \int_{-\infty}^{\infty}e^{ja(t+\tau)^2}w(\tau)e^{-j\Omega\tau}\,d\tau = \int_{-T}^{T}e^{ja(t+\tau)^2}w(\tau)e^{-j\Omega\tau}\,d\tau$$
$$\simeq e^{jat^2}e^{j(2at-\Omega)\tau_0}e^{ja\tau_0^2}w(\tau_0)\sqrt{\frac{2\pi j}{2a}} = e^{jat^2}e^{-j(2at-\Omega)^2/(4a)}\,w\!\left(\frac{\Omega-2at}{2a}\right)\sqrt{\frac{\pi j}{a}}, \qquad (10.6)$$
since the stationary phase point τ0 follows from 2a(t + τ0) = Ω.
In this case, the width of |STFT(t,Ω)| along frequency does not decrease with an increase of the window w(τ) width. The width of |STFT(t,Ω)| around the central frequency Ω = 2at is
$$D = 4aT,$$
where 2T is the window width in the time domain. Note that this relation holds for a wide window w(τ), such that the stationary phase method may be applied. If the window is narrow with respect to the phase variations of the signal, the STFT width is defined by the width of the Fourier transform of the window. It is proportional to 1/T. Thus, the overall STFT width could be approximated by a sum of the width caused by the frequency variation and the width of the window's Fourier transform, that is,
$$D_o = 4aT + \frac{2c}{T}, \qquad (10.8)$$
where c is a constant defined by the window shape (by using the main lobe as the window width, it will be shown later that c = 2π for a rectangular window and c = 4π for a Hann(ing) window). This relation corresponds to the STFT calculated as a convolution of an appropriately scaled time domain window, whose width is |τ| < 2aT, and the frequency domain form of the window, W(Ω). The approximation is checked against the exact STFT calculated by definition. The agreement is almost complete, Fig. 10.3.
Figure 10.3 Exact absolute STFT value of a linear FM signal at t = 0 for various window widths T =
2, 4, 8, 16, .., 1024 (left) and its approximation calculated as an appropriately scaled convolution of the time and
frequency domain window w(τ ) (right).
Therefore, there is a window width T producing the narrowest possible STFT for this signal.
It is obtained by equating the derivative of the overall width to zero,
$$4a - \frac{2c}{T^2} = 0,$$
which results in
$$T_o = \sqrt{\frac{c}{2a}}. \qquad (10.9)$$
As expected, for a sinusoid, a → 0, To → ∞. This is just an approximation of the optimal window,
since for narrow windows we may not apply the stationary phase method (the term 4aT is then
much smaller than 2c/T and may be neglected anyway).
Note that for a = 1/2, when the instantaneous frequency is a symmetry line for the time and the frequency axes, 2 − 2c/T² = 0 or 2T = 2c/T, meaning that the optimal window should have equal widths in the time domain, 2T, and in the frequency domain, 2c/T (the main lobe width).
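The optimal width (10.9) can also be checked numerically. The sketch below is only a rough illustration: the STFT width at t = 0 is measured as the half-maximum extent of |STFT(0,Ω)| (one possible width definition, an assumption here) and compared with Do = 4aT + 2c/T for a rectangular window (c = 2π); only qualitative agreement should be expected.

import numpy as np

# Sketch: measured STFT width of x(t) = exp(j*a*t^2) at t = 0 versus
# the approximation Do = 4aT + 2c/T, rectangular window (c = 2*pi).
a, c = 16.0, 2 * np.pi
dt = 1 / 1024
tau = np.arange(-2048, 2048) * dt                 # lag grid
dOmega = 2 * np.pi / (tau.size * dt)              # frequency grid step
x = np.exp(1j * a * tau**2)

for T in (0.1, np.sqrt(c / (2 * a)), 2.0):        # narrow, optimal, wide
    w = (np.abs(tau) <= T).astype(float)          # rectangular window
    S = np.abs(np.fft.fft(x * w))
    width = (S > S.max() / 2).sum() * dOmega      # half-maximum width
    print(T, width, 4 * a * T + 2 * c / T)        # compare with Do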
Example 10.3. For illustration, consider two different signals x1(t) and x2(t) producing the same amplitude of the Fourier transform, Fig. 10.4,
$$\begin{aligned} x_1(t) &= \sin\!\left(122\pi\frac{t}{128}\right) - \cos\!\left(42\pi\frac{t}{128} - \frac{16}{11}\pi\left(\frac{t-128}{64}\right)^2\right)\\ &\quad - 1.2\cos\!\left(94\pi\frac{t}{128} - 2\pi\left(\frac{t-128}{64}\right)^2 - \pi\left(\frac{t-120}{64}\right)^3\right)e^{-\left(\frac{t-140}{75}\right)^2}\\ &\quad - 1.6\cos\!\left(15\pi\frac{t}{128} - 2\pi\left(\frac{t-50}{64}\right)^2\right)e^{-\left(\frac{t-50}{16}\right)^2} \end{aligned} \qquad (10.11)$$
$$x_2(t) = x_1(255-t).$$
Their spectrograms are shown in Fig.10.5. From the spectrograms, we can follow time variations
of the spectral content. The signals obviously consist of one constant high frequency component,
one linear frequency component (in the first signal with decreasing frequency as time progresses,
and in the second signal with increasing frequency), and two chirps (one appearing at different
time instants and the other having different frequency variations).
Figure 10.4 Two different signals x1 (t) 6= x2 (t) with the same amplitudes of their Fourier transforms, | X1 (Ω)| =
| X2 (Ω)|.
The signal can be obtained from the STFT calculated at an instant t, STFT(t,Ω), as its inverse Fourier transform
$$x(t+\tau)w(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(t,\Omega)e^{j\Omega\tau}\,d\Omega.$$
This relation can theoretically be used to obtain the signal within the region where w(τ) ≠ 0,
$$x(t+\tau) = \frac{1}{w(\tau)}\frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(t,\Omega)e^{j\Omega\tau}\,d\Omega.$$
If the value of the step R is smaller than the window duration, then the same signal value is used within two (or several) windows. For the correct reconstruction, the segments ri(τ) = x(iR+τ)w(τ), whose positions on the time axis are illustrated in Fig. 10.1, should be properly repositioned on the t-axis, using τ = t − iR, and summed.
(Figure 10.5: the spectrograms SPEC1(t,Ω) and SPEC2(t,Ω) of the signals x1(t) and x2(t).)
If the sum of the shifted versions of the window is constant (without loss of generality, assume that it is equal to 1), then
$$x(t) = \sum_i\frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(iR,\Omega)e^{j\Omega(t-iR)}\,d\Omega.$$
The condition ∑i w(τ − iR) = 1 is equivalent to the requirement that the periodic extension of the window, with the period R, is constant (see Fig. 10.6). The periodic extension of a continuous signal corresponds to the sampling of the window's Fourier transform at Ω = 2πk/R in the Fourier domain, (1.66). This means that
$$W\!\left(\frac{2\pi}{R}k\right) = 0,\quad \text{for } k \neq 0.$$
If this STFT is available at a discrete set of instants, t = iR (or any other set of discrete instants ti), then the summation of STFT(t,Ω) over all values calculated at different instants t is
$$\sum_i\frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(iR,\Omega)e^{j\Omega\tau}\,d\Omega = \sum_i x(\tau)w^*(\tau-iR) = x(\tau) \qquad (10.15)$$
or, with an additional weighting by the window,
$$\sum_i\left[\frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(iR,\Omega)e^{j\Omega\tau}\,d\Omega\right]w(\tau-iR) = \sum_i x(\tau)w^2(\tau-iR) = x(\tau) \qquad (10.16)$$
if the condition
$$\sum_i w^2(\tau-iR) = 1 \qquad (10.17)$$
holds. The same condition can be derived from the following analysis.
The STFT can be considered as a projection (inner product) of the signal x(τ) onto the time-frequency kernel function
$$h_{t,\Omega}(\tau) = w(\tau-t)e^{j\Omega\tau},$$
that is,
$$STFT(t,\Omega) = \left\langle x(\tau), h_{t,\Omega}(\tau)\right\rangle_\tau = \left\langle x(\tau), w(\tau-t)e^{j\Omega\tau}\right\rangle_\tau = \int_{-\infty}^{\infty}x(\tau)w^*(\tau-t)e^{-j\Omega\tau}\,d\tau. \qquad (10.18)$$
The back-projection of the STFT onto the same (conjugate) kernel is $\left\langle STFT(t,\Omega), h^*_{t,\Omega}(\tau)\right\rangle_{t,\Omega}$, or
$$\left\langle STFT(t,\Omega), w^*(\tau-t)e^{-j\Omega\tau}\right\rangle_{t,\Omega} = \sum_{t_i}\frac{1}{2\pi}\int_{-\infty}^{\infty}STFT(t_i,\Omega)w(\tau-t_i)e^{j\Omega\tau}\,d\Omega. \qquad (10.19)$$
With t = ti = iR, relation (10.19) reduces to (10.16), with the same reconstruction condition.
10.3 WINDOWS
The window function plays a crucial role in the localization of the signal in the time-frequency plane.
The most commonly used windows will be presented next.
The rectangular window function has very strong and oscillatory sidelobes in the frequency domain, since the function sin(ΩT)/Ω converges very slowly toward zero as Ω → ±∞. The slow convergence in the Fourier domain is caused by the significant discontinuity in the time domain, at t = ±T. The mainlobe width of WR(Ω) is dΩ = 2π/T. In order to enhance the signal localization in the frequency domain, other window functions have been introduced.
The discrete-time form of the rectangular window is w(n) = u(n + N/2) − u(n − N/2).
The triangular (Bartlett) window can be considered as a convolution of the rectangular window of the duration T with itself. The Fourier transform of the triangular window is, therefore, a product of two Fourier transforms of the rectangular window of the width T,
$$W_T(\Omega) = \frac{4\sin^2(\Omega T/2)}{\Omega^2}. \qquad (10.23)$$
Convergence of this function toward zero, as Ω → ±∞, is of the order 1/Ω². It is a continuous function of time, with discontinuities in the first derivative at t = 0 and t = ±T. The mainlobe of this window function is twice as wide in the frequency domain as in the rectangular window case. Its width follows from ΩT/2 = π as dΩ = 4π/T.
The discrete-time form is
$$w(n) = \left(1 - \frac{2|n|}{N}\right)[u(n+N/2) - u(n-N/2)].$$
In the frequency domain its form is
$$W(e^{j\omega}) = \sum_{n=-N/2}^{N/2-1}\left(1 - \frac{2|n|}{N}\right)e^{-j\omega n} = \frac{\sin^2(\omega N/4)}{\sin^2(\omega/2)}.$$
The Hann(ing) window is defined by w(τ) = 0.5(1 + cos(πτ/T)) = cos²(πτ/(2T)) for |τ| < T and w(τ) = 0 elsewhere. Since cos(πτ/T) = [e^{jπτ/T} + e^{−jπτ/T}]/2, the Fourier transform of this window is related to the Fourier transform of the rectangular window of the same width as
$$W_H(\Omega) = \frac{1}{2}W_R(\Omega) + \frac{1}{4}W_R(\Omega-\pi/T) + \frac{1}{4}W_R(\Omega+\pi/T) = \frac{\pi^2\sin(\Omega T)}{\Omega(\pi^2-\Omega^2T^2)}. \qquad (10.25)$$
Example 10.4. Find the window that corresponds to the frequency smoothing (X(k+1) + X(k) + X(k−1))/3, that is, to
$$\operatorname{DFT}\{x(n)w(n)\} = \frac{1}{N}\operatorname{DFT}\{x(n)\}*_k\operatorname{DFT}\{w(n)\} = \frac{1}{3}X(k+1) + \frac{1}{3}X(k) + \frac{1}{3}X(k-1).$$
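By the DFT shift property, a shift by ±1 in the DFT index corresponds to a multiplication of x(n) by e^{∓j2πn/N}, so the smoothing above corresponds to w(n) = (1 + 2cos(2πn/N))/3. The sketch below states this as a derived claim (the closed form is not reproduced from the book) and verifies it numerically.

import numpy as np

# Sketch: verify that multiplying x(n) by w(n) = (1 + 2cos(2pi n/N))/3
# equals the frequency smoothing (X(k+1) + X(k) + X(k-1))/3.
rng = np.random.default_rng(6)
N = 64
n = np.arange(N)
x = rng.normal(size=N) + 1j * rng.normal(size=N)

X = np.fft.fft(x)
smoothed = (np.roll(X, -1) + X + np.roll(X, 1)) / 3   # (X(k+1)+X(k)+X(k-1))/3
w = (1 + 2 * np.cos(2 * np.pi * n / N)) / 3
print(np.allclose(np.fft.fft(x * w), smoothed))       # True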
Example 10.5. Find the formula to calculate the STFT with a Hann(ing) window, if the STFT calculated with a rectangular window is known.
For the Hann(ing) window w(τ) of the width 2T, we may roughly assume that its Fourier transform WH(Ω) is nonzero within the mainlobe |Ω| < 2π/T only, since the sidelobes decay very fast. Then, we may write dΩ = 4π/T. It means that the STFT is nonzero valued in the shaded regions in Fig. 10.2.
We see that the duration in time of the STFT of a delta pulse is equal to the window width, dt = 2T. The STFTs of two delta pulses (very short duration signals) do not overlap in the time-frequency domain if their distance is greater than the window duration, |t1 − t2| > dt. Then, these two pulses can be resolved. Thus, the window width is here a measure of the time resolution. Since the Fourier transform of the Hann(ing) window converges fast, we can roughly assume that a measure of duration in frequency is the width of its mainlobe, dΩ = 4π/T. Then, we may say that the Fourier transforms of two sinusoidal signals do not overlap in frequency if the condition |Ω1 − Ω2| > dΩ holds. It is important to observe that the product of the window durations in time and frequency is a constant. In this example, considering the time domain duration of the Hann(ing) window and the width of its mainlobe in the frequency domain, this product is dt dΩ = 8π. Therefore, if we improve the resolution in the time domain dt, by decreasing T, we inherently increase the value of dΩ in the frequency domain. This essentially prevents us from achieving the ideal resolution (dt = 0 and dΩ = 0) in both domains. A general formulation of this principle, stating that the product of the effective window durations in time and in frequency cannot be arbitrarily small, will be presented later.
The Hann(ing) window satisfies the constant overlap-add (COLA) reconstruction condition,
∑i w(τ − iR) = 1, with R = T, as shown in Fig. 10.6. This property follows from cos2 (πτ/(2T )) +
cos2 (π (τ ± T )/(2T )) = cos2 (πτ/(2T )) + sin2 (πτ/(2T )) = 1.
The same condition can be satisfied with R = T/2, R = T/4, . . . , after an appropriate scaling of
the window amplitude.
Figure 10.6 Hann(ing) window and its shifted versions, that satisfy the constant overlap-add (COLA) reconstruc-
tion condition ∑i w(τ − iR) = 1, with R = 1.
A relation between the Hamming and the rectangular window transforms, similar to that in the case of the Hann(ing) window, holds. The Hamming window was derived starting from w(τ) = a + (1 − a)cos(πτ/T) for |τ| < T. If we choose the value of a so as to cancel out the second sidelobe at its maximum (that is, at ΩT ≅ 2.5π), then we get
$$0 = \frac{2aT}{2.5\pi} - (1-a)\left(\frac{T}{1.5\pi} + \frac{T}{3.5\pi}\right),$$
resulting in
$$a = 25/46 \cong 0.54. \qquad (10.28)$$
This window has several sidelobes, next to the mainlobe, lower than in the previous two windows. However, since it is not continuous at t = ±T, its decay in frequency, as Ω → ±∞, is not fast. Note that we let the mainlobe be twice as wide as in the rectangular window case, so we cancel out not the first but the second sidelobe, at its maximum.
The discrete-time domain form is
$$w(n) = \left[0.54 + 0.46\cos\left(\frac{2\pi n}{N}\right)\right][u(n+N/2) - u(n-N/2)]$$
with
$$W(k) = 0.54N\delta(k) + 0.23N\delta(k+1) + 0.23N\delta(k-1).$$
In some applications, it is crucial that the sidelobes are suppressed as much as possible. This is achieved using windows of more complicated forms, like the Blackman window. It is defined by
$$w(\tau) = \begin{cases} 0.42 + 0.5\cos(\pi\tau/T) + 0.08\cos(2\pi\tau/T) & \text{for } |\tau| < T \\ 0 & \text{elsewhere,} \end{cases} \qquad (10.29)$$
with a0 + a1 + a2 = 1 and canceling out the Fourier transform values W(Ω) at the positions of the third and the fourth sidelobe maxima (that is, at ΩT ≅ 3.5π and ΩT ≅ 4.5π). Here, we let the mainlobe be three times as wide as in the rectangular window case, so we cancel out neither the first nor the second but the third and fourth sidelobes, at their maxima.
The discrete-time and frequency domain forms are
$$w(n) = \left[0.42 + 0.5\cos\left(\frac{2\pi n}{N}\right) + 0.08\cos\left(\frac{4\pi n}{N}\right)\right]\left[u\left(n+\frac{N}{2}\right) - u\left(n-\frac{N}{2}\right)\right],$$
$$W(k) = \left[0.42\delta(k) + 0.25(\delta(k+1)+\delta(k-1)) + 0.04(\delta(k+2)+\delta(k-2))\right]N.$$
Further reduction of the sidelobes can be achieved by, for example, the Kaiser (Kaiser-Bessel) window. It is an approximation to a restricted time duration function with minimum energy outside the mainlobe. This window is defined by using the zeroth-order Bessel function, with a localization parameter. It has the ability to keep the maximum energy within the mainlobe, while minimizing the sidelobe energy. The sidelobe level can be as low as −70 dB, as compared to the mainlobe, and even lower. This kind of window is used in the analysis of signals with significantly different amplitudes, when the sidelobe of one component can be much higher than the mainlobe of other components.
These are just a few of the windows used in signal processing. Some windows, along with the
corresponding Fourier transforms, are presented in Fig. 10.7.
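A short sketch (not from the book) that reproduces the content of Fig. 10.7 numerically: the discrete windows defined above are generated and their zero-padded spectra are displayed on a logarithmic scale; the window length and padding factor are arbitrary.

import numpy as np
import matplotlib.pyplot as plt

# Sketch: generate the discrete windows on n = -N/2, ..., N/2-1 and
# plot 10*log|W| of their zero-padded spectra.
N = 64
n = np.arange(-N // 2, N // 2)

windows = {
    "rectangular": np.ones(N),
    "triangular": 1 - 2 * np.abs(n) / N,
    "Hann(ing)": 0.5 + 0.5 * np.cos(2 * np.pi * n / N),
    "Hamming": 0.54 + 0.46 * np.cos(2 * np.pi * n / N),
    "Blackman": 0.42 + 0.5 * np.cos(2 * np.pi * n / N)
                + 0.08 * np.cos(4 * np.pi * n / N),
}

for name, w in windows.items():
    W = np.abs(np.fft.fftshift(np.fft.fft(w, 16 * N)))   # zero-padded spectrum
    plt.plot(10 * np.log10(W / W.max() + 1e-8), label=name)
plt.legend(); plt.ylabel("10 log|W|"); plt.show()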
Figure 10.7 Windows in the time and frequency domains: rectangular window (first row), triangular (Bartlett)
window (second row), Hann(ing) window (third row), Hamming window (fourth row), and Blackman window (fifth
row).
Example 10.6. Calculate the STFT of the signals x1(t) = 2cos(4πt/T) + 2cos(12πt/T) and x2(t) = 2cos(4πt/T) + 0.001cos(64πt/T) at t = 0. Use a Hamming and a Blackman window with T = 128 and ∆t = 1. Comment on the results.
⋆ The STFT at t = 0 is shown in Fig. 10.8. The resolution of the close components in x1(t) is better when the Hamming window is used, since the mainlobe of the Blackman window is wider. The small signal component in x2(t) is visible in the STFT with the Blackman window, since its sidelobes are lower.
Figure 10.8 The STFT at n = 0 calculated using the Hamming window (left) and the Blackman window (right)
of the signals x1 (n) (top) and x2 (n) (bottom).
Discretization and realizations of the STFT will be discussed in this section. A recursive realization, appropriate for the on-line implementation of the STFT, will be presented, along with the filter bank form of the STFT.
In numerical calculations, the integral form of the STFT should be discretized. By sampling the signal with a sampling interval ∆t, we get
$$STFT(t,\Omega) = \int_{-\infty}^{\infty}x(t+\tau)w(\tau)e^{-j\Omega\tau}\,d\tau \simeq \sum_{m=-\infty}^{\infty}x((n+m)\Delta t)\,w(m\Delta t)\,e^{-jm\Delta t\,\Omega}\,\Delta t.$$
By denoting x(n) = x(n∆t)∆t and normalizing the frequency Ω by ∆t, ω = ∆tΩ, we get the discrete-time form of the STFT as
$$STFT(n,\omega) = \sum_{m=-\infty}^{\infty}w(m)x(n+m)e^{-jm\omega}. \qquad (10.30)$$
We will use the same notation for continuous-time and discrete-time signals, x(t) and x(n). However, we hope that this will not cause any confusion, since we will use different sets of variables, for example, t and τ for continuous time and n and m for discrete time. Also, we hope that the context will always be clear, so that there is no doubt about what kind of signal is considered.
It is important to note that STFT(n,ω) is periodic in frequency with period 2π. The relation between the analog and the discrete-time form is
$$STFT(n,\omega) = \sum_{k=-\infty}^{\infty}STFT(n\Delta t, \Omega + 2k\Omega_0),\quad \text{with } \omega = \Delta t\,\Omega.$$
The sampling interval ∆t is related to the period in frequency as ∆t = π/Ω0. According to the sampling theorem, in order to avoid overlapping of the STFT periods (aliasing), we should take
$$\Delta t = \frac{\pi}{\Omega_0} \le \frac{\pi}{\Omega_m},$$
where Ωm is the maximum frequency in the STFT. Strictly speaking, the windowed signal x(t+τ)w(τ) is time-limited, thus it is not frequency-limited. Theoretically, there is no maximum frequency, since the width of the window's Fourier transform is infinite. However, in practice we can always assume that the spectral content of x(t+τ)w(τ) above a frequency Ωm, that is, for |Ω| > Ωm, can be neglected, and that the overlapping of the frequency content above Ωm does not degrade the basic frequency period.
The discretization in frequency should be done with a number of samples greater than or equal to the window length N. If we assume that the number of discrete frequency points is equal to the window length, then
$$STFT(n,k) = STFT(n,\omega)\big|_{\omega=\frac{2\pi}{N}k} = \sum_{m=-N/2}^{N/2-1}w(m)x(n+m)e^{-j2\pi mk/N} \qquad (10.31)$$
for a given instant n. When the DFT routines with indices from 0 to N − 1 are used, then a shifted
version of w(m) x (n + m) should be formed for the calculation for N/2 ≤ m ≤ N − 1. It is obtained
as w(m − N ) x (n + m − N ), since in the DFT calculation periodicity of the signal w(m) x (n + m),
with period N, is inherently assumed.
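A minimal sketch of (10.31) in Python follows; the window, the test signal, and the calculation step are arbitrary choices, and the reordering of w(m)x(n+m) described above is performed with an index shift before the FFT.

import numpy as np

# Sketch: discrete STFT (10.31) computed column by column with the FFT.
def stft(x, w, R=1):
    N = len(w)
    cols = []
    for n in range(N // 2, len(x) - N // 2 + 1, R):
        seg = x[n - N // 2 : n + N // 2] * w          # w(m) x(n+m), m = -N/2..N/2-1
        # reorder so that m = 0 is the first DFT sample (periodicity in N)
        cols.append(np.fft.fft(np.fft.ifftshift(seg)))
    return np.array(cols).T                            # rows: frequency index k

M, N = 256, 32
n = np.arange(M)
x = np.exp(1j * np.pi * n**2 / M)                      # linear FM test signal
S = stft(x, np.hanning(N), R=4)
print(S.shape)                                         # (32, 57)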
Example 10.7. Consider a signal with M = 16 samples, x(0), x(1), ..., x(15), and write a matrix form for the calculation of a four-sample STFT. Present the nonoverlapping and overlapping cases of the STFT calculation.
⋆ For the calculation of (10.31) with N = 4 and k = −2, −1, 0, 1, at a given instant n, the following matrix notation can be used:
$$\begin{bmatrix} STFT(n,-2)\\ STFT(n,-1)\\ STFT(n,0)\\ STFT(n,1) \end{bmatrix} = \begin{bmatrix} W_4^{4} & W_4^{2} & 1 & W_4^{-2}\\ W_4^{2} & W_4^{1} & 1 & W_4^{-1}\\ 1 & 1 & 1 & 1\\ W_4^{-2} & W_4^{-1} & 1 & W_4^{1} \end{bmatrix}\begin{bmatrix} x(n-2)\\ x(n-1)\\ x(n)\\ x(n+1) \end{bmatrix}$$
or
$$\mathbf{STFT}(n) = \mathbf{W}_4\mathbf{x}(n),$$
with STFT(n) = [STFT(n,−2) STFT(n,−1) STFT(n,0) STFT(n,1)]^T, x(n) = [x(n−2) x(n−1) x(n) x(n+1)]^T, where W4 is the DFT matrix of order four with elements W4^{mk} = exp(−j2πmk/N). Here, a rectangular window is assumed. Including the window function, the previous relation can be written as
$$\mathbf{STFT}(n) = \mathbf{W}_4\mathbf{H}_4\mathbf{x}(n),$$
with
$$\mathbf{H}_4 = \begin{bmatrix} w(-2) & 0 & 0 & 0\\ 0 & w(-1) & 0 & 0\\ 0 & 0 & w(0) & 0\\ 0 & 0 & 0 & w(1) \end{bmatrix}$$
being a diagonal matrix whose elements are the window values w(m), H4 = diag(w(m)), m = −2, −1, 0, 1, and
$$\mathbf{W}_4\mathbf{H}_4 = \begin{bmatrix} w(-2)W_4^{4} & w(-1)W_4^{2} & w(0) & w(1)W_4^{-2}\\ w(-2)W_4^{2} & w(-1)W_4^{1} & w(0) & w(1)W_4^{-1}\\ w(-2) & w(-1) & w(0) & w(1)\\ w(-2)W_4^{-2} & w(-1)W_4^{-1} & w(0) & w(1)W_4^{1} \end{bmatrix}.$$
For the nonoverlapping calculation of the whole signal, STFT = W4H4X4,4, where STFT is a matrix of the STFT values whose columns correspond to the calculation instants and whose rows correspond to the frequencies. This matrix is of the form
$$\mathbf{STFT} = \begin{bmatrix} STFT(2,-2) & STFT(6,-2) & STFT(10,-2) & STFT(14,-2)\\ STFT(2,-1) & STFT(6,-1) & STFT(10,-1) & STFT(14,-1)\\ STFT(2,0) & STFT(6,0) & STFT(10,0) & STFT(14,0)\\ STFT(2,1) & STFT(6,1) & STFT(10,1) & STFT(14,1) \end{bmatrix}.$$
The matrix X4,4 is formed of four successive signal values in each column. The notation X_{N,R} will be used to denote the signal matrix whose columns contain N signal values and where the difference between the first signal value indices in successive columns is R. For R = N, the nonoverlapping calculation is performed.
For an STFT calculation with overlapping, R < N, for example with the time step R = 1, we get
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} x(0) & x(1) & x(2) & \dots & x(10) & x(11) & x(12)\\ x(1) & x(2) & x(3) & \dots & x(11) & x(12) & x(13)\\ x(2) & x(3) & x(4) & \dots & x(12) & x(13) & x(14)\\ x(3) & x(4) & x(5) & \dots & x(13) & x(14) & x(15) \end{bmatrix} = \mathbf{W}_4\mathbf{H}_4\mathbf{X}_{4,1}.$$
The step R defines the difference between the arguments in two neighboring columns. In the first case, the difference of the arguments in two neighboring columns was 4 (the time step in the STFT calculation was R = 4, equal to the window width, meaning a nonoverlapped calculation). In the second example, the difference is R = 1 < 4, meaning an overlapped STFT calculation. Note that the window matrix H_N and the DFT matrix W_N remain the same in both cases.
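The matrix formulation is compact to implement. The following sketch builds W4, H4 (a Hann(ing) window is an arbitrary choice here), and the signal matrix X_{N,R} for both cases; the rows are indexed 0, ..., N−1 rather than −N/2, ..., N/2−1, which only reorders them.

import numpy as np

# Sketch: STFT = W4 H4 X_{4,R} for R = 4 (nonoverlapping) and R = 1.
rng = np.random.default_rng(7)
M, N = 16, 4
x = rng.normal(size=M)

m = np.arange(N)
W4 = np.exp(-2j * np.pi * np.outer(m, m) / N)    # DFT matrix of order 4
H4 = np.diag(np.hanning(N))                      # window values on the diagonal

for R in (N, 1):
    X = np.array([x[i:i + N] for i in range(0, M - N + 1, R)]).T   # X_{N,R}
    STFT = W4 @ H4 @ X
    print(R, STFT.shape)                         # (4, 4) and (4, 13)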
Example 10.8. Assuming that the values of the signal with amplitudes below 1/e⁴ can be neglected, find the sampling rate for the STFT-based analysis of this signal. Write the approximate spectrogram expression for the Hann(ing) window of N = 32 samples in the analysis. What signal will be presented in the time-frequency plane, within the basic frequency period, if the signal is sampled at ∆t = 1/128?
⋆ The time interval with significant signal content for the first signal component is −2 ≤ t ≤ 2, with the frequency content within −56π ≤ Ω ≤ −8π, since the instantaneous frequency is Ω(t) = −12πt − 32π. For the second component, these intervals are 0 ≤ t ≤ 2 and 160π ≤ Ω ≤ 224π. The maximum frequency in the signal is Ωm = 224π. Here, we have to take into account the possible spreading of the spectrum caused by the lag window. Its width in the time domain is dt = 2T = N∆t = 32∆t. The width of the mainlobe in the frequency domain, dw, is defined by 32dw∆t = 4π, or dw = π/(8∆t). Thus, taking the sampling interval ∆t = 1/256, we will satisfy the sampling theorem condition in the worst instant case, since π/(Ωm + dw) = 1/256.
In the case of the Hann(ing) window with N = 32 and ∆t = 1/256, the lag interval is N∆t = 1/8. We will assume that the amplitude variations within the window are small, that is, w(τ)e^{−(t+τ)²} ≅ w(τ)e^{−t²} for −1/16 < τ ≤ 1/16. Then, according to the stationary phase method, we can write the STFT approximation
$$|STFT(t,\Omega)|^2 = \frac{1}{6}e^{-2t^2}w^2\!\left(\frac{\Omega+12\pi t+32\pi}{12\pi}\right) + \frac{1}{32}e^{-8(t-1)^2}w^2\!\left(\frac{\Omega-32\pi t-160\pi}{32\pi}\right).$$
If the signal is sampled at ∆t = 1/128, the frequency content of the second component will overlap into the range −96π ≤ Ω ≤ −32π. Thus, the signal represented by the STFT in this case will correspond to
$$x_r(t) = e^{-t^2}e^{-j6\pi t^2 - j32\pi t} + e^{-4(t-1)^2}e^{j16\pi t^2 + j(160-256)\pi t},$$
with the approximation
$$|STFT(t,\Omega)|^2 = \frac{1}{6}e^{-2t^2}w^2\!\left(\frac{\Omega+12\pi t+32\pi}{12\pi}\right) + \frac{1}{32}e^{-8(t-1)^2}w^2\!\left(\frac{\Omega-32\pi t+96\pi}{32\pi}\right). \qquad (10.32)$$
For the rectangular window, the STFT values at an instant n can be calculated recursively from the STFT values at n − 1 as
$$STFT_R(n,k) = e^{j2\pi k/N}\left\{STFT_R(n-1,k) + (-1)^k\left[x(n+N/2-1) - x(n-N/2-1)\right]\right\}.$$
This recursive formula follows easily from the STFT definition (10.31).
For other window forms, the STFT can be obtained from the STFT calculated using the rectangular window. For example, according to (10.26), the STFT with the Hann(ing) window, STFTH(n,k), is related to the STFT with the rectangular window, STFTR(n,k), as
$$STFT_H(n,k) = \frac{1}{2}STFT_R(n,k) + \frac{1}{4}STFT_R(n,k-1) + \frac{1}{4}STFT_R(n,k+1).$$
This recursive calculation is important for hardware implementations of the STFT and of other related time-frequency representations (e.g., higher-order representations based on the STFT).
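The recursive relation above (reconstructed here from the STFT definition, and consistent with the structure of Fig. 10.9) can be verified directly, as in the following sketch with an arbitrary test signal.

import numpy as np

# Sketch: check the sliding (recursive) STFT update for the rectangular
# window against the direct definition (10.31).
rng = np.random.default_rng(8)
M, N = 64, 8
x = rng.normal(size=M) + 1j * rng.normal(size=M)
k = np.arange(N)

def stft_direct(n):
    m = np.arange(-N // 2, N // 2)
    return np.array([np.sum(x[n + m] * np.exp(-2j * np.pi * m * kk / N))
                     for kk in k])

n0 = 20
S = stft_direct(n0 - 1)
S_rec = np.exp(2j * np.pi * k / N) * (S + (-1.0)**k * (x[n0 + N // 2 - 1]
                                                       - x[n0 - N // 2 - 1]))
print(np.allclose(S_rec, stft_direct(n0)))   # True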
Figure 10.9 Recursive implementation of the STFT for the rectangular and other windows.
A system for the recursive implementation of the STFT is shown in Fig. 10.9. The STFT obtained using the rectangular window is denoted by STFTR(n,k), Fig. 10.9, while the values of the coefficients are
(a−1, a0, a1) = (1/4, 1/2, 1/4) for the Hann(ing) window,
(a−1, a0, a1) = (0.23, 0.54, 0.23) for the Hamming window, and
(a−2, a−1, a0, a1, a2) = (0.04, 0.25, 0.42, 0.25, 0.04) for the Blackman window.
The STFT can also be written as a convolution,
$$STFT(t,\Omega) = \int_{-\infty}^{\infty}x(t+\tau)w(\tau)e^{-j\Omega\tau}\,d\tau = \int_{-\infty}^{\infty}x(t-\tau)w(\tau)e^{j\Omega\tau}\,d\tau = x(t)*_t\left[w(t)e^{j\Omega t}\right],$$
where an even, real-valued window function is assumed, w(τ) = w(−τ). For a discrete set of frequencies Ωk = k∆Ω = 2πk/(N∆t), k = 0, 1, 2, ..., N−1, and discrete values of the signal, we get that the discrete STFT, (10.31), is an output of the filter bank with the impulse responses hk(n) = w(n)e^{j2πkn/N},
$$STFT(n,k) = x(n)*_n\left[w(n)e^{j2\pi kn/N}\right] = x(n)*_n h_k(n),$$
as illustrated in Fig. 10.10. The next STFT can be calculated with the time step R∆t, meaning downsampling in time by a factor 1 ≤ R ≤ N. Two special cases are: no downsampling, R = 1, and the nonoverlapping calculation, R = N. The influence of R on the signal reconstruction will be discussed later.
Nonoverlapping cases are important and easy to analyze. They also keep the number of the STFT coefficients equal to the number of the signal samples. However, the STFT is commonly calculated using overlapping windows. There are several reasons for introducing overlapped STFT representations. Rectangular windows have a poor localization in the frequency domain. The localization is improved by other window forms. In the case of nonrectangular windows, some of the signal samples are weighted in such a way that their contribution to the final representation is small. Then we want to use an additional STFT with a window positioned in such a way that these samples contribute more to the STFT calculation. Also, in parameter estimation and detection, the task is to achieve the best possible estimation or detection at each time instant, instead of using interpolations for the skipped instants when the STFT with a large step (equal to the window width) is calculated. Commonly, the overlapped STFTs are calculated using, for example, a rectangular, Hann(ing), Hamming, Bartlett, Kaiser, or Blackman window of a constant window width N, with steps N/2, N/4, N/8, . . . in time. The computational cost is increased in the overlapped STFTs, since more STFTs are calculated. A way of composing STFTs
(Figure 10.10: the filter bank implementation of the STFT; each channel filters x(n) with the impulse response w(n)e^{j2πnk/N}, k = 0, 1, ..., N−1, followed by downsampling by R.)
calculated with a rectangular window into a STFT with, for example, the Hann(ing), Hamming, or
Blackman window, is presented in Fig.10.9.
If a signal x(n) is of the duration M, with 0 ≤ n ≤ M−1, in some cases, in addition to the overlapping in time, an interpolation in frequency is done, for example, up to the DFT grid with M samples. The overlapped and interpolated STFT of this signal is calculated, using a window w(m) whose width is N ≤ M, as
$$STFT_N(n,k) = \sum_{m=-N/2}^{N/2-1}w(m)x(n+m)e^{-j2\pi mk/M},\quad k = -M/2, -M/2+1, \dots, M/2-1.$$
Example 10.9. The STFT calculation of a signal whose frequency changes linearly is done by using a rectangular window. Signal samples within 0 ≤ n ≤ M − 1 with M = 64 were available. The nonoverlapping STFT of this signal is calculated with a rectangular window of the width N = 8 and presented in Fig. 10.11. Its values are STFT8(n,k) at n = 4, 12, 20, ..., 60 and −4 ≤ k ≤ 3. The nonoverlapping STFT values obtained using the rectangular window are shifted in frequency, scaled, and added up, Fig. 10.12, to produce the STFT with a Hamming window, Fig. 10.13.
The STFT calculation for the same linear FM signal will be repeated for the overlapping STFT with the step R = 1, when n = 0, 1, 2, 3, ..., 63 is used. Here, it has been assumed that the linear FM signal is available for all −N/2 ≤ m + n ≤ M − 1 + N/2 − 1. The results for the rectangular and the Hamming window (obtained by a simple matrix calculation from the rectangular window case) are presented in Fig. 10.14. Three window widths are used here.
The same procedure is repeated with the windows zero-padded up to the widest used window (interpolation in frequency). The results are presented in Fig. 10.15. Note that, regarding the amount of information, all these figures do not differ from the basic time-frequency representation presented in Fig. 10.11.
Figure 10.11 The STFT of a linear FM signal x (n) calculated using a rectangular window of the width N = 8.
Figure 10.12 The STFT of a linear FM signal calculated using a rectangular window (from the previous figure),
along with its frequency shifted versions STFTR (n, k − 1) and STFTR (n, k + 1). Their weighted sum produces
the STFT of the same signal with a Hamming window STFTH (n, k ).
572 Linear Time-Frequency Representations
Figure 10.13 The STFT of a linear FM signal x (n) calculated using the Hamming window with N = 8.
Calculation is illustrated in the previous figure.
Signal reconstruction from nonoverlapping STFT values is obvious for a rectangular window. A simple illustration is presented in Fig. 10.16. Windowed signal values are reconstructed from the STFTs by a simple inversion of each STFT,
$$\mathbf{STFT}(n) = \mathbf{W}_N\mathbf{H}_w\mathbf{x}(n)$$
$$\mathbf{H}_w\mathbf{x}(n) = \operatorname{IDFT}\{\mathbf{STFT}(n)\} = \mathbf{W}_N^{-1}\mathbf{STFT}(n),$$
where Hw is a diagonal matrix with the window values as its elements, Hw = diag(w(m)).
(Fig. 10.14 panels: the STFT with a rectangular window (left) and with a Hamming window (right), for N = 48, 16, and 8.)
Figure 10.14 Time-frequency analysis of a linear frequency modulated signal with overlapping windows of
various widths. Time step in the STFT calculation is R = 1.
(Fig. 10.15 panels: the STFT with a rectangular window (left) and with a Hamming window (right), for N = 48, 16, and 8.)
Figure 10.15 Time-frequency analysis of a linear frequency modulated signal with overlapping windows of various
widths. Time step in the STFT calculation is R = 1. For each window width the frequency axis is interpolated
(signal in time is zero padded) up to the total number of available signal samples M = 64.
Figure 10.16 Illustration of the signal reconstruction from the STFT with nonoverlapping windows.
Example 10.10. Consider a signal with M = 16 samples, x(0), x(1), ..., x(15). Write a matrix form for the signal inversion using a four-sample STFT (N = 4), calculated with the rectangular and a Hann(ing) window: (a) without overlapping, R = 4; (b) with a time step in the STFT calculation of R = 2.
⋆ (a) For the nonoverlapping case, the STFT calculation is done according to
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} x(0) & x(4) & x(8) & x(12)\\ x(1) & x(5) & x(9) & x(13)\\ x(2) & x(6) & x(10) & x(14)\\ x(3) & x(7) & x(11) & x(15) \end{bmatrix},$$
with H4 = diag([w(−2) w(−1) w(0) w(1)]) and W4 the corresponding four-sample DFT matrix.
The inversion relation is
$$\begin{bmatrix} x(0) & x(4) & x(8) & x(12)\\ x(1) & x(5) & x(9) & x(13)\\ x(2) & x(6) & x(10) & x(14)\\ x(3) & x(7) & x(11) & x(15) \end{bmatrix} = \mathbf{H}_4^{-1}\mathbf{W}_4^{-1}\mathbf{STFT}.$$
(b) With the time step R = 2, the inversion is
$$\mathbf{W}_4^{-1}\mathbf{STFT} = \mathbf{H}_4\mathbf{X} = \begin{bmatrix} 0 & x(0)w(-2) & x(2)w(-2) & x(4)w(-2) & \dots & x(14)w(-2)\\ 0 & x(1)w(-1) & x(3)w(-1) & x(5)w(-1) & \dots & x(15)w(-1)\\ x(0)w(0) & x(2)w(0) & x(4)w(0) & x(6)w(0) & \dots & 0\\ x(1)w(1) & x(3)w(1) & x(5)w(1) & x(7)w(1) & \dots & 0 \end{bmatrix},$$
where X is the matrix with the signal elements. The window matrix is left on the right-hand side since, in general, it may not be invertible. By calculating W4^{−1}STFT we can then recombine the signal values. For example, the element producing x(0)w(0) in the first column is combined with the element producing x(0)w(−2) in the second column to get x(0)w(0) + x(0)w(−2) = x(0), since for the Hann(ing) window of the width N, w(n) + w(n − N/2) = 1 holds. The same is done for the other signal values in the matrix obtained after the inversion.
Note that the same relation would hold for a triangular window, while for a Hamming window a similar relation, w(n) + w(n − N/2) = 1.08, would hold. The results should be corrected in that case by a constant factor of 1.08.
An illustration of the STFT calculation for an arbitrary window width N at n = n0 is presented in Fig. 10.17. Its inversion produces x(n0+m)w(m) = IDFT{STFTN(n0,k)}. Consider the previous STFT value in the case of nonoverlapping windows. It would be STFTN(n0−N,k). Its inverse,
$$\operatorname{IDFT}\{STFT_N(n_0-N,k)\} = x(n_0-N+m)w(m),$$
is also presented in Fig. 10.17. As can be seen, by combining these two inverse transforms we would get a signal with very low values around n = n0 − N/2. If one more STFT is calculated at n = n0 − N/2 and its inverse is combined with the previous two, it will improve the signal presentation within the overlapping region n0 − N ≤ n < n0. In addition, for most of the common windows, w(m−N) + w(m−N/2) + w(m) = 1 (or a constant) within 0 ≤ m < N, meaning that the sum of the overlapped inverse STFTs, as in Fig. 10.17, will give the original signal within n0 − N ≤ n < n0.
In general, let us consider the STFT calculation with overlapping windows. Assume that the STFTs are calculated with a step 1 ≤ R ≤ N in time. The available STFT values are
$$\dots,\ \mathbf{STFT}(n_0-2R),\ \mathbf{STFT}(n_0-R),\ \mathbf{STFT}(n_0),\ \mathbf{STFT}(n_0+R),\ \mathbf{STFT}(n_0+2R),\ \dots \qquad (10.33)$$
Based on the available STFT values (10.33), the windowed signal values can be reconstructed as
$$\mathbf{H}_w\mathbf{x}(n_0+iR) = \mathbf{W}_N^{-1}\mathbf{STFT}(n_0+iR),\quad i = \dots,-2,-1,0,1,2,\dots$$
$$w(m)x(n_0+iR+m) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0+iR,k)e^{j2\pi mk/N}. \qquad (10.34)$$
Since R < N, we will get the same signal value within different STFTs, for different i. For example, for N = 8, R = 2, and n0 = 0, we will get the value x(0) for m = 0 and i = 0, but also for m = −2 and i = 1, or m = 2 and i = −1, and so on. Then, in the reconstruction, we should use all these values to get the most reliable reconstruction.
Let us re-index the reconstructed signal values (10.34) by the substitution m = l − iR, as in (10.12),
$$w(l-iR)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0+iR,k)e^{j2\pi lk/N}e^{-j2\pi iRk/N},\quad -N/2 \le l-iR \le N/2-1.$$
Figure 10.17 Illustration of the STFT calculation with windows overlapping in order to produce an inverse STFT
whose sum will give the original signal within n0 − N ≤ n < n0 .
If R < N, then a value of the signal x(n0+l) will be obtained by inverting the STFT,
$$w(l)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0,k)e^{j2\pi lk/N}.$$
The same signal value, x(n0+l), will be obtained within the other overlapping inversions:
$$\vdots$$
$$w(l-2R)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0+2R,k)e^{j2\pi lk/N}e^{-j2\pi 2Rk/N}$$
$$w(l-R)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0+R,k)e^{j2\pi lk/N}e^{-j2\pi Rk/N}$$
$$w(l+R)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0-R,k)e^{j2\pi lk/N}e^{j2\pi Rk/N}$$
$$w(l+2R)x(n_0+l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n_0-2R,k)e^{j2\pi lk/N}e^{j2\pi 2Rk/N}$$
$$\vdots$$
By summing all the reconstructions over i satisfying −N/2 ≤ l − iR ≤ N/2 − 1, we get the final reconstructed signal x(n0 + l). Obviously, this sum produces the exact, up to a constant, undistorted signal value if
$$\sum_i w(l-iR) = 1 \qquad (10.35)$$
or
$$c(l) = \sum_i w(l-iR) = \text{const.} = C, \qquad (10.36)$$
since
$$\sum_i w(l-iR)x(n_0+l) = Cx(n_0+l)$$
for any n0 and l. Note that ∑i w(l−iR) is a periodic extension of w(l) with a period R. If W(e^{jω}) is the Fourier transform of w(l), then the Fourier transform of its periodic extension is equal to the samples of W(e^{jω}) at ω = 2πk/R. The condition (10.36) is equivalent to W(e^{j2πk/R}) = 0 for k ≠ 0.
Special cases:
1. For R = N (nonoverlapping), relation (10.36) is satisfied for the rectangular window only.
2. For a half of the overlapping period, R = N/2, the condition (10.36) is met for the rectangular, Hann(ing), Hamming, and triangular windows. A realization for N = 8 and R = N/2 = 4 is presented in Fig. 10.18. Signal values with a delay of N/2 = 4 samples are obtained at the output. The STFT calculation process is repeated after every 4 samples, producing blocks of 4 signal samples at the output.
3. The same holds for R = N/2, N/4, N/8, . . . , if the values of R are integers. A numerical check of this condition is sketched after this list.
Figure 10.18 Signal reconstruction from the STFT for the case N = 8, when the STFT is calculated with step
R = N/2 = 4 and the window satisfies w(m) + w(m − N/2) = 1. This is the case for the rectangular, Hann(ing),
Blackman and triangular windows. The same holds for the Hamming window up to a constant scaling factor of
1.08.
4. For R = 1 (the STFT calculation at each available time instant), any window satisfies the inversion relation. In this case we may also use a simple reconstruction formula, Fig. 10.19,
$$\frac{1}{N}\sum_{k=-N/2}^{N/2-1}STFT(n,k) = \frac{1}{N}\sum_{m=-N/2}^{N/2-1}w(m)x(n+m)\left(\sum_{k=-N/2}^{N/2-1}e^{-j2\pi mk/N}\right) = w(0)x(n).$$
Very efficient realizations for this case are the recursive ones, instead of the direct DFT calculation, Fig. 10.9.
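The reconstruction conditions above are easy to examine numerically. The sketch below forms c(l) = Σi w(l − iR) for a sampled Hann(ing) window and shows that it is constant in the interior for R = N/2 but not for R = N; the window length is an arbitrary choice.

import numpy as np

# Sketch: check the overlap-add condition (10.36) for a Hann(ing) window.
N = 8
m = np.arange(-N // 2, N // 2)
w = 0.5 * (1 + np.cos(2 * np.pi * m / N))     # Hann(ing) window samples

L = 10 * N
for R in (N // 2, N):
    c = np.zeros(L + N)
    for i in range(L // R + 1):
        c[i * R : i * R + N] += w             # sum of shifted windows
    print(R, np.unique(np.round(c[N : L - N], 6)))  # one value only for R = N/2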
In the analysis of non-stationary signals, our primary interest is not in the signal reconstruction with the fewest number of calculation points. Rather, we are interested in tracking the signal's non-stationary parameters, like, for example, the instantaneous frequency. These parameters may significantly vary between the neighboring time instants n and n + 1. The quasi-stationarity of the signal within R samples (implicitly assumed when the downsampling by a factor of R is done) is not a good starting point for the analysis in this case. Here, we have to use the time-frequency analysis of the signal at each instant n, without any downsampling.
If the reconstructed signal is weighted by the same analysis window, that is, if w(l−iR)x(n0+l) is multiplied by w(l−iR), then the reconstruction condition for the weighted overlap-add method is
$$\sum_i w^2(l-iR) = 1. \qquad (10.37)$$
For more details on this form and its kernel framework interpretation, see Section 10.2.
Figure 10.19 Signal reconstruction when the STFT is calculated with step R = 1.
The window width and form can vary for different time instants or frequency bands, or can be time-frequency varying. These forms of the windows will be presented next.
In general, varying window widths could be used for different time-frequency points. When Ni changes with ni, we have the case of a time-varying window. Assuming a rectangular window, we can write
$$STFT_{N_i}(n_i,k) = \sum_{m=-N_i/2}^{N_i/2-1}x(n_i+m)e^{-j\frac{2\pi}{N_i}mk}. \qquad (10.38)$$
The notation STFTNi(ni,k) means that the STFT is calculated using the signal samples within the window [ni − Ni/2, ni + Ni/2 − 1] for −Ni/2 ≤ k ≤ Ni/2 − 1, corresponding to an even number of Ni discrete frequencies from −π to π. For an odd Ni, the summation limits are ±(Ni − 1)/2. Let us restate that a wide window includes signal samples over a wide time interval, losing the possibility to detect fast changes in time, but achieving a high frequency resolution. A narrow window in the STFT will track the time changes, but with a low resolution in frequency. Two extreme cases are Ni = 1, when
$$STFT_1(n,k) = x(n),$$
and Ni = M, when
$$STFT_M(n,k) = X(k),$$
where M is the total number of all available signal samples and X (k) = DFT{ x (n)}.
In vector notation,
$$\mathbf{STFT}_{N_i}(n_i) = \mathbf{W}_{N_i}\mathbf{x}_{N_i}(n_i),$$
where STFT_{Ni}(ni) and x_{Ni}(ni) are column vectors. Their elements are STFTNi(ni,k), k = −Ni/2, ..., Ni/2 − 1, and x(ni+m), m = −Ni/2, ..., Ni/2 − 1, respectively,
where m is the column index and k is the row index of the matrix. The STFT value STFTNi(ni,k) is presented as a block in the time-frequency plane of the width Ni in the time direction, covering all the time instants [ni − Ni/2, ni + Ni/2 − 1] used in its calculation. The frequency axis can be labeled with the DFT indices p = −M/2, ..., M/2 − 1, corresponding to the DFT frequencies 2πp/M (dots in Fig. 10.20). With respect to this axis labeling, the block STFTNi(ni,k) will be positioned at the frequency 2πk/Ni = 2π(kM/Ni)/M, that is, at p = kM/Ni. The block width in frequency is M/Ni DFT samples. Therefore, the block area in time and DFT frequency is always equal to the number of all available signal samples, M, as shown in Fig. 10.20, where M = 16.
Example 10.11. Consider a signal x (n) with M = 16 samples. Write the expression for calculation of
the STFT value STFT4 (2, 1) with a rectangular window. Indicate graphically the region of time
instants used in the calculation and the frequency range in terms of the DFT frequency values
included in the calculation of STFT4 (2, 1)?
−2 ≤ 2 + m < 1
0 ≤ n ≤ 3.
The frequency term is exp(− j2πm/4). For the DFT of a signal with M = 16
X(k) = ∑_{m=0}^{15} x(m) e^{−j2πmk/16},  k = −8, −7, …, −1, 0, 1, …, 6, 7,
this frequency would correspond to the term exp(− j2π4m/16). Therefore k = 1 corresponds
to the frequency index p = 4 in the DFT. Since the whole frequency range −π ≤ ω < π in the case of Ni = 4 is covered with 4 STFT values, STFT_4(2, −2), STFT_4(2, −1), STFT_4(2, 0), and STFT_4(2, 1), and the same frequency range in the DFT has 16 frequency samples, each STFT value calculated with Ni = 4 corresponds to a range of 4 DFT frequency values, as illustrated in Fig. 10.20.
Figure 10.20 The nonoverlapping STFTs with: (a) constant window of the width N = 4, (b) constant window
of the width N = 2, (c)-(d) time-varying windows. Time index is presented on the horizontal axis, while the DFT
frequency index is shown on the vertical axis (the STFT is denoted by S for notation simplicity).
For a time-varying window, all of the nonoverlapping STFT vectors can be stacked into a single relation, STFT = W̃ x,
where STFT is a column vector containing all STFT vectors STFT Ni (ni ), i = 0, 1, . . . , K, X = W M x
is a DFT of the whole signal x (n), while W̃ is a block matrix (M × M) formed from the smaller
DFT matrices W N0 , W N1 , . . . ,W NK , as in (10.38). Since the time-varying nonoverlapping STFT
corresponds to a decimation-in-time DFT scheme, its calculation is more efficient than the DFT
calculation of the whole signal. Illustration of time-varying window STFTs is shown in Fig.10.20(c),
(d). For a signal with M samples, there is a large number of possible nonoverlapping STFTs with a
time-varying window Ni ∈ {1, 2, 3, . . . , M }. The exact number will be derived later.
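The block structure of W̃ can be made explicit in a short MATLAB sketch; the particular sequence of window widths below is an assumption, chosen only so that the widths sum up to M = 16.

% Nonoverlapping time-varying STFT as a block-diagonal transform (a sketch)
Nlist = [4 4 2 2 2 2];                   % assumed window widths, sum = M = 16
x  = randn(sum(Nlist), 1);               % assumed test signal (column vector)
Wt = [];                                 % the block matrix W~
for Ni = Nlist
    WNi = exp(-2j*pi*(0:Ni-1).' * (0:Ni-1) / Ni);   % Ni-point DFT matrix
    Wt  = blkdiag(Wt, WNi);
end
STFT = Wt * x;                           % all nonoverlapping STFT values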
Example 10.12. Consider a signal x (n) with M = 16 samples, whose values are x = [0.5, 0.5,
−0.25, j0.25, 0.25, − j0.25, −0.25, 0.25, −0.25, 0.25, 0.5, 0.5, − j0.5, j0.5, 0, −1]. Some of its
nonoverlapping STFTs are calculated according to (10.38) and shown in Fig.10.20. Different
representations can be compared based on concentration measures. The best STFT representation, in this sense, would be the one with the smallest measure value µ[STFT_N(n, k)].
For the considered signal and its four representations shown in Fig.10.20 the best representation,
according to this criterion, is the one shown in Fig.10.20(b).
Example 10.13. Consider a signal x (n) with M = 8 samples. Its values are x (0) = 0, x (1) = 1,
x (2) = 1/2, x (3) = −1/2, x (4) = 1/4, x (5) = − j/4, x (6) = −1/4, and x (7) = j/4.
(a) Calculate the STFTs of this signal with rectangular window of the widths N = 1, N = 2,
N = 4. Use the following STFT definition
STFT_N(n, k) = ∑_{m=−N/2}^{N/2−1} x(n + m) e^{−j2πmk/N}.
For an odd N, the summation limits are ±( N − 1)/2. Calculate STFT1 (n, k) for n =
0, 1, 2, 3, 4, 5, 6, 7, then STFT2 (n, k) for n = 1, 3, 5, 7, then STFT4 (n, k) for n = 2, 6 and
STFT8 (n, k) for n = 4. For frequency axis use notation k = 0, 1, 2, 3, 4, 5, 6, 7.
(b) Assuming that the time-varying approach is used in the nonoverlapping STFT calculation, find the total number of possible representations.
(c) Calculate the concentration measure µ[STFT(n, k)]^{1/2} for each of the cases in (b) and find the representation (nonoverlapping combination of the previous STFTs) for which the signal is represented with the smallest number of coefficients. Does it correspond to the minimum of µ[STFT(n, k)]^{1/2}?
⋆ (a) The STFT values follow directly from the definition: for N = 1 they are the signal samples themselves, for N = 2 they are calculated at n = 1, 3, 5, 7, and for N = 4 at n = 2, 6 (their absolute values are shown in Fig. 10.21).
(b) Now we have to make all possible nonoverlapping combinations of these transforms and calculate the concentration measure for each of them. The total number of combinations is 25. The absolute STFT values are shown in Fig. 10.21, along with the measure values.
Figure 10.21 Time-frequency representation in various lattices (grid-lines are shown), with the concentration measure M = µ[SPEC(n, k)]^{1/2}. The optimal representation, with respect to this measure, is presented with thicker grid-lines. The time axis is n = 0, 1, 2, 3, 4, 5, 6, 7 and the frequency axis is k = 0, 1, 2, 3, 4, 5, 6, 7.
(c) By measuring the concentration for all of them, we find that the optimal combination covering the time-frequency plane represents the signal with just three nonzero transformation coefficients. It corresponds to the minimum of µ[SPEC(n, k)].
In this case there is an algorithm for efficient determination of the optimal lattice, based on the consideration of two regions, starting from lattices 1, 19, and 25 in Fig. 10.21, which correspond to the constant window widths of N = 1, N = 2, and N = 4 samples.
(Figure: the optimal nonoverlapping lattice, composed of the blocks STFT_1(2, 0), STFT_2(1, 0), STFT_2(1, 1), STFT_3(4, 0), STFT_3(4, 1), and STFT_3(4, 2), shown over the time axis 0, 1, …, 5 and the frequency axis from 0 to 3π/4.)
for an odd and an even number of samples N, respectively.
Example 10.15. A discrete signal x (n) is considered for 0 ≤ n < M. Find the number of the STFTs
of this signal with time-varying windows.
(a) Consider arbitrary window widths from 1 to M.
(b) Consider dyadic windows, that is, windows whose width is 2^m, where m is an integer such that 2^m ≤ M. In this case find the number of time-varying window STFTs for M = 1, 2, 3, …, 15, 16.
⋆ (a) Let us analyze the problem recursively. Denote by F ( M) the number of STFTs for a signal
with M samples. It is obvious that F (1) = 1, that is, for one-sample signal there is only one
STFT (the signal sample itself). If M > 1, we can use a window of width k = 1, 2, …, M as the first analysis window. The remaining (M − k) samples can then be analyzed in all possible ways, so we can
write a recursive relation for the total number of the STFTs. If the first window is one-sample
window, then the number of the STFTs is F ( M − 1). When the first window is a two-sample
window, then the total number of the STFTs is F ( M − 2), and so on, until the first window is the
M-sample window, when F ( M − M) = 1. Thus, the total number of the STFTs for all cases is
F ( M) = F ( M − 1) + F ( M − 2) + . . . + F (1) + 1.
We can introduce F (0) = 1 (meaning that if there are no signal samples we have only one way to
calculate time-varying window STFT) and obtain
F(M) = F(M − 1) + F(M − 2) + … + F(1) + F(0) = ∑_{k=1}^{M} F(M − k)

and

F(M) − F(M − 1) = ∑_{k=1}^{M} F(M − k) − ∑_{k=2}^{M} F(M − k) = F(M − 1),

that is, F(M) = 2F(M − 1), resulting in F(M) = 2^{M−1}.
(b) In a similar way, following the previous analysis, we can write

F(M) = F(M − 2^0) + F(M − 2^1) + F(M − 2^2) + … + F(M − 2^{⌊log₂ M⌋}) = ∑_{m=0}^{⌊log₂ M⌋} F(M − 2^m),
where ⌊log₂ M⌋ is the integer part of log₂ M. Here we cannot write a simple recurrent relation as in the previous case. It is obvious that F(1) = 1. We can also assume that F(0) = 1. By unfolding the recursion, the following values are obtained:

M      1    2    3    4    5    6    7    8
F(M)   1    2    3    6   10   18   31   56
M      9   10   11   12   13   14   15   16
F(M)  98  174  306  542  956 1690 2983 5272
An approximate closed-form expression for F(M) (involving the integer part [·] of its argument) holds with a relative error smaller than 0.4% for 1 ≤ M ≤ 1024. For example, for M = 16 there are 5272 different ways to split the time-frequency plane into non-overlapping time-frequency regions.
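The dyadic recursion is easily checked numerically; the following MATLAB sketch (a direct implementation of the recursion, with an offset index since MATLAB arrays start at 1) reproduces the values from the table above.

% Number of dyadic time-varying window STFTs, F(M) (a sketch)
M = 16;
F = zeros(1, M + 1);                     % F(n+1) stores F(n)
F(1) = 1;                                % F(0) = 1 by convention
for n = 1:M
    p = 1;                               % p runs over 2^m <= n
    while p <= n
        F(n + 1) = F(n + 1) + F(n - p + 1);
        p = 2 * p;
    end
end
F(M + 1)                                 % returns 5272 for M = 16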
where (a_k, b_k] and (b_k, c_k] define the width of w_k(τ − b_k). For consecutive intervals the relations a_{k+1} = b_k and b_{k+1} = c_k hold, as shown in Fig. 10.24. These windows satisfy the constant overlap-add relation
∑_{k=0}^{K−1} w(τ − b_k) = 1,    (10.41)
since the squared sine and cosine sum up in two consecutive windows with the same parameters.
The initial window function, w_0(τ − b_0), is defined as

w_0(τ − b_0) =
  1,                                           for a_0 = 0 < τ ≤ b_0,
  cos²( (π/2) (b_0/(c_0 − b_0)) (τ/b_0 − 1) ), for b_0 < τ ≤ c_0,    (10.42)
  0,                                           elsewhere,
A simple way to construct a window (function) for the weighted overlap-add method is to take the square root of the constant overlap-add window (for example, the sine window as the square root of the Hann window). In this case, the window functions become
w_k(τ − b_k) =
  sin( (π/2) (a_k/(b_k − a_k)) (τ/a_k − 1) ),  for a_k < τ ≤ b_k,
  cos( (π/2) (b_k/(c_k − b_k)) (τ/b_k − 1) ),  for b_k < τ ≤ c_k,    (10.46)
  0,                                           elsewhere,
Figure 10.25 (a) Time-varying asymmetric Hann(ing) windows, 0 ≤ t ≤ 8, that satisfy the constant overlap-
add (COLA) reconstruction condition ∑k w(τ − bk ) = 1, with bk ∈ {0.1, 0.6, 1.6, 2.2, 3.3, 4.5, 6.0, 8.0},
k = 0, 1, 2, 3, 4, 5, 6, 7. (b) Time-varying asymmetric square root of the Hann(ing) windows (sine window),
0 ≤ t ≤ 8, that satisfy the weighted overlap-add (WOLA) reconstruction condition ∑k w2 (τ − bk ) = 1, with
bk ∈ {0.1, 0.6, 1.6, 2.2, 3.3, 4.5, 6.0, 8.0}, k = 0, 1, 2, 3, 4, 5, 6, 7.
with a_{k+1} = b_k, b_{k+1} = c_k, and the initial and the final intervals defined as in (10.42) and (10.43). The problem with this window is its lack of differentiability at the interval end points, as can be seen in Fig. 10.25(b), causing slow frequency-domain convergence. This problem will be addressed later, within the continuous wavelet transform analysis.
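The constant overlap-add condition (10.41) can be verified numerically. The following MATLAB sketch builds the asymmetric Hann windows with the breakpoints b_k from Fig. 10.25(a) (the grid density is an arbitrary assumption) and checks that they sum to one; replacing the squared sine and cosine edges by their square roots would give the WOLA windows of (10.46).

% Numerical check of the COLA condition (10.41) for asymmetric Hann windows
b   = [0.1 0.6 1.6 2.2 3.3 4.5 6.0 8.0];     % breakpoints b_k from Fig. 10.25
tau = linspace(0, 8, 4001);
S   = zeros(size(tau));
for k = 1:numel(b)
    w = zeros(size(tau));
    if k == 1                                % initial window, flat part (10.42)
        w(tau <= b(1)) = 1;
    else                                     % rising sin^2 edge on (b_{k-1}, b_k]
        r = tau > b(k-1) & tau <= b(k);
        w(r) = sin(pi/2 * (tau(r) - b(k-1)) / (b(k) - b(k-1))).^2;
    end
    if k < numel(b)                          % falling cos^2 edge on (b_k, b_{k+1}]
        f = tau > b(k) & tau <= b(k+1);
        w(f) = cos(pi/2 * (tau(f) - b(k)) / (b(k+1) - b(k))).^2;
    else
        w(tau > b(end)) = 1;                 % final window, flat part
    end
    S = S + w;
end
max(abs(S - 1))                              % numerically zero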
The STFT may use a frequency-varying window as well. For a given DFT frequency p_i, the window width in time is constant, Fig. 10.26,
STFT_{Ni}(n, k_i) = ∑_{m=−Ni/2}^{Ni/2−1} w(m) x(n + m) e^{−j2πmk_i/Ni}.
Figure 10.26 Time-frequency analysis with the STFT using frequency-varying windows.
For the signal used to illustrate the frequency-varying STFT in Fig. 10.26, the best concentration (out
of the presented four) is the one shown in the last subplot. Optimization can be done in the same way
as in the case of time-varying windows.
The STFT can be calculated using the signal’s DFT instead of the signal. There is a direct relation
between the time and the frequency domain STFT via coefficients of the form exp( j2πnk/M ). A dual
form of the STFT is
STFT(n, k) = (1/M) ∑_{i=0}^{M−1} P(i) X(k + i) e^{j2πin/M},    (10.47)
or, in vector notation, STFT_M(k) = W_M^{−1} P_M X(k).
The frequency domain window P(i) may have a frequency-varying width. This form is dual to the time-varying form. Forms corresponding to frequency-varying windows, dual to the ones for the time-varying windows, can easily be defined, for example, for a rectangular frequency domain window, as the block-diagonal relation

STFT = blockdiag( W_{N0}^{−1}, W_{N1}^{−1}, …, W_{NK}^{−1} ) X,    (10.48)
where X = [ X (0), X (1), . . . , X ( M − 1)] T is the DFT vector. A specific form of the STFT with the
frequency-varying windows is called the wavelet transform and will be considered later in the book.
In general, the spectral content of a signal changes in time and frequency in an arbitrary manner. Combining time-varying and frequency-varying windows, we get hybrid time–frequency-varying windows with

STFT_{N(i,l)}(n_i, k_l) = ∑_{m=−N(i,l)/2}^{N(i,l)/2−1} w_{(i,l)}(m) x(n_i + m) e^{−j2πmk_l/N(i,l)}.    (10.49)
For a graphical representation of the STFT with varying windows, the corresponding STFT value
should be assigned to each instant n = 0, 1, . . . , M − 1 and each DFT frequency p = − M/2, − M/2 +
1, . . . , M/2 − 1 within a block. In the case of a hybrid time–frequency-varying window the matrix
form is obtained from the definition for each STFT value. For example, for the STFT calculated as in
Fig.10.27, for each STFT value an expression based on (10.49) should be written. Then the resulting
matrix STFT can be formed.
There are several methods in the literature that adapt windows or basis functions to the signal
form for each time instant or even for every considered time and frequency point in the time-frequency
plane. Selection of the most appropriate form of the basis functions (windows) for each time-frequency
point includes a criterion for selecting the optimal window width (basis function scale) for each point.
After the presentation of the wavelet transform we will shift back our attention to the frequency of the
signal, rather than to its amplitude values. There are signals whose instantaneous frequency variations
are known up to an unknown set of parameters. For example, many signals could be expressed as
polynomial-phase signals
x(t) = A e^{j(Ω_0 t + a_1 t² + a_2 t³ + ⋯ + a_N t^{N+1})},
where the parameters Ω0 , a1 , a2 , . . . , a N are unknown. For nonstationary signals, this approach may be
used if the nonstationary signal could be considered as a polynomial phase signal within the analysis
window. In that case, the local polynomial Fourier transform (LPFT) may be used. It is defined as
LPFT_{Ω1,Ω2,…,ΩN}(t, Ω) = ∫_{−∞}^{∞} x(t + τ) w(τ) e^{−j(Ωτ + Ω_1 τ² + Ω_2 τ³ + ⋯ + Ω_N τ^{N+1})} dτ.    (10.50)
In general, parameters Ω1 , Ω2 , . . . , Ω N could be time dependent, that is, for each time instant t, the set
of optimal parameters could be different.
(Figure 10.27 The STFT calculated with a hybrid time–frequency-varying window; time is on the horizontal axis and frequency on the vertical axis, with blocks such as STFT_4(2, 1), STFT_8(12, 3), and STFT_16(8, 0).)
Realization of the LPFT reduces to the demodulation of the local signal x(t + τ) by e^{−j(Ω_1 τ² + Ω_2 τ³ + ⋯ + Ω_N τ^{N+1})}, followed by the STFT calculation.
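A minimal MATLAB sketch of this realization for the first-order LPFT follows; the signal, its parameters, and the window are assumptions chosen for the illustration.

% First-order LPFT by demodulation followed by the STFT (a sketch)
N   = 64; dt = 1/128;
tau = (-N/2 : N/2 - 1).' * dt;           % lag axis
W0  = 100; a1 = 40;                      % assumed LFM parameters
xt  = exp(1j*(W0*tau + a1*tau.^2));      % local signal x(t + tau) at t = 0
w   = 0.5 + 0.5*cos(2*pi*(-N/2 : N/2-1).'/N);   % Hann lag window
W1  = a1;                                % matched chirp-rate parameter
LPFT = fftshift(fft(ifftshift(xt .* w .* exp(-1j*W1*tau.^2))));
% abs(LPFT) is concentrated around the bin nearest to W0, as for a pure
% sinusoid; with W1 = 0 (the plain STFT) the peak is spread by the chirp.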
Example 10.16. Consider a linear frequency-modulated signal x(t) = e^{j(Ω_0 t + a_1 t²)}. Show that its LPFT could be completely concentrated along the instantaneous frequency.
⋆ For Ω_1 = a_1, the second-order phase term does not introduce any distortion to the local polynomial spectrogram,

|LPFT_{Ω1=a1}(t, Ω)|² = |W(Ω − Ω_0 − 2a_1 t)|²,
with respect to the spectrogram of a sinusoid with constant frequency. For a wide window w(τ ),
like in the case of the STFT of a pure sinusoid, we achieve high concentration.
The LPFT could be considered as the Fourier transform of the windowed signal demodulated with exp(−j(Ω_1 τ² + Ω_2 τ³ + ⋯ + Ω_N τ^{N+1})). Thus, if we are interested in signal filtering, we can find the coefficients Ω_1, Ω_2, …, Ω_N, demodulate the signal by multiplying it with exp(−j(Ω_1 τ² + Ω_2 τ³ + ⋯ + Ω_N τ^{N+1})), and use a standard filter for an almost pure sinusoid. In general, we can extend this approach to any signal x(t) = e^{jφ(t)} by estimating its phase φ(t) with φ̂(t) (using the instantaneous frequency estimation that will be discussed later) and filtering the demodulated signal x(t) exp(−jφ̂(t)) by a lowpass filter. The resulting signal is obtained when the filtered signal is returned back to the original frequencies, by modulation with exp(jφ̂(t)).
Example 10.17. Consider the first-order LPFT of a signal x (t). Show that the second-order moments
of the LPFT could be calculated based on the windowed signal moment, windowed signal’s
Fourier transform moment and one more LPFT moment for any Ω1 in (10.50), for example for
Ω1 = 1.
⋆ The second-order moment of the LPFT, defined by

M_{Ω1} = (1/(2π)) ∫_{−∞}^{∞} Ω² |LPFT_{Ω1}(t, Ω)|² dΩ,    (10.52)

is equal to

M_{Ω1} = ∫_{−∞}^{∞} |d[x_t(τ) e^{−jΩ1τ²}]/dτ|² dτ,

since the LPFT could be considered as the Fourier transform of x_t(τ) e^{−jΩ1τ²}, that is, LPFT_{Ω1}(t, Ω) = FT{x_t(τ) e^{−jΩ1τ²}}, and Parseval's theorem is used. After the derivative calculation,
M_{Ω1} = ∫_{−∞}^{∞} |dx_t(τ)/dτ − j2Ω_1 τ x_t(τ)|² dτ
       = ∫_{−∞}^{∞} ( |dx_t(τ)/dτ|² + j2Ω_1 τ x_t*(τ) dx_t(τ)/dτ − j2Ω_1 τ x_t(τ) dx_t*(τ)/dτ + |2Ω_1 τ x_t(τ)|² ) dτ.
The first term is the moment of X_t(Ω) = FT{x_t(τ)}, since the integral of |dx_t(τ)/dτ|² over τ is equal to the integral of |jΩ X_t(Ω)|² over Ω, according to Parseval's theorem. Also, we can see that the last term in M_{Ω1} contains the signal moment,
m_x = ∫_{−∞}^{∞} τ² |x_t(τ)|² dτ,    (10.53)
so that

M_{Ω1} − M_0 − 4m_x Ω_1² = Ω_1 ∫_{−∞}^{∞} ( j2τ x_t*(τ) dx_t(τ)/dτ − j2τ x_t(τ) dx_t*(τ)/dτ ) dτ.
Note that the last integral does not depend on the parameter Ω_1. Thus, the relation among the LPFT moments at any two values of Ω_1, for example, Ω_1 = a and an arbitrary Ω_1, easily follows as the ratio

(M_{Ω1=a} − M_0 − 4a² m_x) / (M_{Ω1} − M_0 − 4Ω_1² m_x) = a / Ω_1,    (10.54)

with M_1 = M_{Ω1=1} for a = 1.
Obviously, the second-order moment, for any Ω_1, can be expressed as a function of the other three moments. In this case the relation reads

M_{Ω1} = 4Ω_1² m_x + Ω_1 (M_1 − M_0 − 4m_x) + M_0.
Example 10.18. Find the position and the value of the second-order moment minimum of the LPFT,
based on the windowed signal moment, the windowed signal’s Fourier transform moment, and
the LPFT moment for Ω1 = 1.
⋆ The minimum value of the second-order moment (meaning the best concentrated LPFT in the
sense of the duration measures) could be calculated from
dM_{Ω1}/dΩ_1 = 0

as

Ω_1 = −(M_1 − M_0 − 4m_x) / (8m_x).
Since m x > 0 this is a minimum of the function MΩ1 . Thus, in general, there is no need for a
direct search for the best concentrated LPFT over all possible values of Ω1 . It can be found based
on three moments.
The corresponding minimum value of the moment is

M_{Ω1} = M_0 − (M_1 − M_0 − 4m_x)² / (16m_x).    (10.56)
Note that any two moments, instead of M0 and M1 , could be used in the derivation.
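The moment relations can be verified numerically. A minimal MATLAB sketch follows; the windowed chirp, its rate, and the discretization grid are assumptions for the illustration.

% Optimal Omega_1 from three moments, as in Example 10.18 (a sketch)
dt  = 1/256; tau = (-128:127).' * dt;           % assumed lag grid
xt  = exp(1j*40*tau.^2) .* exp(-tau.^2/0.05);   % windowed chirp, rate 40
mx  = sum(tau.^2 .* abs(xt).^2) * dt;           % signal moment (10.53)
M   = @(W1) sum(abs(gradient(xt .* exp(-1j*W1*tau.^2), dt)).^2) * dt;
M0  = M(0); M1 = M(1);                          % moments for W1 = 0 and 1
W1opt = -(M1 - M0 - 4*mx) / (8*mx)              % approximately 40, no search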
where

K_α(u, τ) = √((1 − j cot α)/(2π)) e^{j(u²/2) cot α} e^{j(τ²/2) cot α} e^{−juτ csc α}.    (10.58)
It can be considered as a rotation of the signal in the time-frequency plane by an angle α. Its inverse can be considered as a rotation by the angle −α,

x(t) = ∫_{−∞}^{∞} X_α(u) K_{−α}(u, t) du.
Special cases of the FRFT reduce to X_0(u) = x(u) and X_{π/2}(u) = X(u)/√(2π), that is, to the signal and its Fourier transform.
The windowed FRFT is

X_{w,α}(t, u) = √((1 − j cot α)/(2π)) e^{j(u²/2) cot α} ∫_{−∞}^{∞} x(t + τ) w(τ) e^{j(τ²/2) cot α} e^{−juτ csc α} dτ,    (10.59)
meaning that the lag truncation could be applied after signal rotation or prior to the rotation. Results
are similar. A similar relation for the moments, like (10.55) in the case of LPFT, could be derived here.
It states that any FRFT moment can be calculated if we know just any three of its moments.
High-resolution techniques are developed for efficient processing and separation of very close sinusoidal
signals (in array signal processing, separation of sources with very close DOAs). Among these
techniques the most widely used are Capon’s method, MUSIC, and ESPRIT. The formulation of
high-resolution techniques could be extended to the time-frequency representations. Here we will
present a simple formulation of the STFT and the LPFT within Capon’s method framework.
Here we will present the STFT formulation in a common array signal-processing notation. The STFT of a discrete-time signal x(n), in the (causal) notation

STFT(ω, n) = (1/N) ∑_{m=0}^{N−1} x(n + m) e^{−jωm},
can be written as
STFT(ω, n) = ŝ_ω(n) = h^H x(n) = (1/N) a^H(ω) x(n),

a^H(ω) = [1  e^{−jω}  e^{−j2ω}  …  e^{−j(N−1)ω}],    (10.63)

x(n) = [x(n)  x(n + 1)  x(n + 2)  …  x(n + N − 1)]^T,
where T denotes the transpose operation, and H denotes the conjugate and transpose (Hermitian)
operation. Normalization of the STFT with N is done, as in the robust signal analysis.
The average power of the output signal ŝω (n), over M samples (ergodicity over M samples
around n is assumed), for a frequency ω, is
P(ω) = (1/M) ∑_n |ŝ_ω(n)|²    (10.64)
     = (1/N²) a^H(ω) [ (1/M) ∑_n x(n) x^H(n) ] a(ω) = (1/N²) a^H(ω) R̂_x a(ω).
The standard STFT (10.63) can be derived based on the following consideration. Find h as a solution
of the problem
min_h {h^H h}  subject to  h^H a(ω) = 1.    (10.65)
This minimization problem will be explained through the next example.
600 Linear Time-Frequency Representations
Example 10.19. Show that the output power of the filter producing s(n) = h^H x(n) is minimized, for the input x(n) = A a(ω) + ε(n), with respect to the input white noise ε(n), whose autocorrelation function is R̂_ε = ρI, if h^H h is minimal subject to h^H a(ω) = 1.
⋆ The output for the noise only is s_ε(n) = h^H ε(n), while its average power is

(1/M) ∑_n |h^H ε(n)|² = (1/M) ∑_n h^H ε(n) ε^H(n) h = h^H ( (1/M) ∑_n ε(n) ε^H(n) ) h = ρ h^H h.

For the noise-free part of the input, the output is

h^H x(n) = h^H A a(ω) = A.
Thus, the condition h H a(ω ) = 1 means that the estimate is unbiased with respect to input
sinusoidal signal with amplitude A.
resulting in

h = a(ω) / (a^H(ω) a(ω)) = (1/N) a(ω),    (10.66)

and the estimate (10.63), which is the standard STFT, follows.
Consider now a different optimization problem, defined by

min_h { (1/M) ∑_n |h^H x(n)|² }  subject to  h^H a(ω) = 1.    (10.67)
Two points are emphasized in this optimization problem. First, the weights are selected to minimize the average power (1/M) ∑_n |h^H x(n)|² of the filter output signal. It means that the filter should give the best possible suppression of all signal-plus-noise components of the observations, including the components of the desired signal, over all time instants (minimization of the output power). Second, the condition h^H a(ω) = 1 ensures that, at the considered time instant n, the signal amplitude is preserved at the output.
The optimization problem can be rewritten in the form

min_h { (1/M) ∑_n h^H x(n) x^H(n) h }  subject to  h^H a(ω) = 1.
By denoting

R̂_x = (1/M) ∑_n x(n) x^H(n),
we get

∂/∂h^H { h^H R̂_x h + λ (h^H a(ω) − 1) } = 0,  subject to  h^H a(ω) = 1,

which gives the solution

h = −R̂_x^{−1} λ a(ω)/2,  subject to  h^H a(ω) = 1.    (10.68)
The solution can be written in the form

ĥ = R̂_x^{−1} a(ω) / (a^H(ω) R̂_x^{−1} a(ω)),    (10.69)
where

R̂_x = (1/M) ∑_n x(n) x^H(n).    (10.70)
The output signal power, in these cases, corresponds to Capon's form of the STFT, defined by

S_Capon(ω) = (1/M) ∑_n |h^H x(n)|² = h^H R̂_x h    (10.71)
           = ( R̂_x^{−1} a(ω) / (a^H(ω) R̂_x^{−1} a(ω)) )^H R̂_x ( R̂_x^{−1} a(ω) / (a^H(ω) R̂_x^{−1} a(ω)) )    (10.72)
           = 1 / (a^H(ω) R̂_x^{−1} a(ω)).    (10.73)
For a time-localized analysis, the autocorrelation matrix is estimated over a symmetric sliding window around the instant n,

R̂_x(n, K) = (1/(K + 1)) ∑_{p=n−K/2}^{n+K/2} x(p) x^H(p),    (10.74)

where K is a parameter defining the width of the window. Inserting R̂_x(n, K) instead of R̂_x in (10.71) gives the STFT with weights minimizing the output power in (10.67), for the observations in the neighborhood of the time instant of interest n.
The mean value of this power function, calculated in the neighborhood of the time instant n over the window used in (10.74), gives an averaged Capon's STFT as follows:

S_Capon(n, ω) = 1 / (a^H(ω) R̂_x^{−1}(n) a(ω)),    (10.75)
where n indicates the time instant of interest and the mean is calculated over the observations in the corresponding window.
In the realization, the autocorrelation function is regularized by the identity matrix I; thus, we use

R̂(n) = (1/(K + 1)) ∑_{p=n−K/2}^{n+K/2} x(p) x^H(p) + ρI    (10.76)
instead of R̂x (n) for the inverse calculation in (10.75) and (10.71).
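A minimal MATLAB sketch of this calculation is given below; the two-component test signal and all of the parameter values are assumptions chosen to mimic Example 10.20 that follows.

% Capon's STFT (10.75) with the regularized matrix (10.76) (a sketch)
N = 16; K = 14; rho = 1e-4; n = 128;
x = @(p) exp(1j*1.00*p) + exp(1j*1.10*p);    % assumed two close sinusoids
R = rho * eye(N);
for p = n - K/2 : n + K/2                    % autocorrelation estimate (10.76)
    xp = x((p : p + N - 1).');
    R  = R + xp * xp' / (K + 1);
end
w = linspace(0, 2*pi, 2048);                 % frequencies of interest
S = zeros(size(w));
for k = 1:numel(w)
    a = exp(1j*w(k)*(0:N-1).');              % vector a(omega), as in (10.63)
    S(k) = 1 / real(a' * (R \ a));           % Capon spectrogram (10.75)
end
% S peaks sharply at the two signal frequencies, far below the resolution
% limit of the N = 16 rectangular-window STFT.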
In the MUSIC formulation of the high-resolution STFT, the eigenvalue decomposition of the autocorrelation matrix (10.76) is used,

R̂(n) = (1/(K + 1)) ∑_{p=n−K/2}^{n+K/2} x(p) x^H(p) + ρI = V^H(n) Λ(n) V(n).
Note that the Capon spectrogram, using the eigenvalues and eigenvectors of the autocorrelation matrix, can be written as

S_Capon(n, ω) = 1 / (a^H(ω) V^H(n) Λ^{−1}(n) V(n) a(ω)) = 1 / ( ∑_{k=1}^{N} (1/λ_k) |STFT_k(n, ω)|² ),
where
STFTk (n, ω ) = a H (ω )vk (n)
is the STFT of the kth eigenvector (column) of the autocorrelation matrix R̂(n), corresponding to
the eigenvalue λk . If the signal has N − M components then the first N − M largest eigenvalues λk
(corresponding to the smallest values 1/λk ) will represent the signal space (components), and the
remaining M eigenvalues will correspond to the noise space (represented by ρI in the definition of
autocorrelation matrix R̂(n)).
If a frequency ω corresponds to a signal component, then all eigenvectors corresponding to the noise space will be orthogonal to that harmonic, represented by a^H(ω). It means that the spectrograms of all noise-space components will be very small at the frequencies corresponding to the signal frequencies.
The MUSIC STFT is defined based on this fact. It is calculated using the eigenvectors corresponding to the noise space, as

S_MUSIC(n, ω) = 1 / (a^H(ω) V_M^H V_M a(ω)) = 1 / ( ∑_{k=N−M+1}^{N} |STFT_k(n, ω)|² ),    (10.77)
where V M is the eigenvector matrix containing only M eigenvectors corresponding to the M lowest
eigenvalues in Λ, representing the space of noise. In this case the signal has N − M components
corresponding to the largest eigenvalues. A special case with M = 1 is the Pisarenko method.
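Continuing the previous sketch, the MUSIC spectrogram (10.77) follows from the eigen-decomposition of the same matrix R; the choice of two signal eigenvectors matches the assumed two-component signal.

% MUSIC spectrogram (10.77) from the noise-space eigenvectors (a sketch)
[V, D] = eig(R);                             % R from the Capon sketch above
[~, idx] = sort(real(diag(D)), 'descend');
Vn = V(:, idx(3:end));                       % noise space (all but 2 vectors)
Smusic = zeros(size(w));
for k = 1:numel(w)
    a = exp(1j*w(k)*(0:N-1).');
    Smusic(k) = 1 / real(a' * (Vn * Vn') * a);    % as in (10.77)
end
% Keeping only the single weakest eigenvector in Vn gives the Pisarenko form.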
Example 10.20. Calculate the high-resolution forms of the spectrogram for a two-component signal whose frequencies ω_0 + ∆ω and ω_0 − ∆ω may be considered as constants around the instant of interest n = 128.
In the STFT calculation use a rectangular window of the width N = 16. Use 15 samples for
averaging (estimation) of the autocorrelation matrix, as well as its regularization by a 0.0001 · I
(corresponding to noise signal x (n) + ε(n), where ε(n) is complex white noise with variance
σε2 = 0.0001). Assume that signal samples needed for autocorrelation function estimation are
also available.
⋆ Signal values around n = 128 are considered. The STFT is calculated using N = 16 signal samples,

x(128) = [x(128) x(129) x(130) … x(143)]^T,

and a rectangular window. The mainlobe width of this window is D = 4π/N = π/4 = 0.7854. It will not be able to resolve two components closer than 2∆ω ∼ D/2 = 0.3927. The considered ∆ω = 0.05 is well below this limit. The STFT is interpolated in frequency up to 2048 samples.
The result is shown in Fig. 10.28(a). Next, the autocorrelation matrix

R̂(128) = (1/15) ∑_{p=128−7}^{128+7} x(p) x^H(p) + 0.0001·I

is estimated, and the high-resolution forms are calculated at the frequencies of interest ω = 2πk/2048, for k = 0, 1, 2, …, 1023. The Capon's STFT is then

S_Capon(128, ω) = 1 / (a^H(ω) R̂^{−1}(128) a(ω)) = 1 / ( ∑_{k=1}^{16} (1/λ_k) |STFT_k(128, ω)|² ).
Figure 10.28 (a) The standard STFT using a rectangular window N = 16. The STFT is interpolated in frequency
up to 2048 samples. (b) Capon’s spectrogram calculated in 2048 frequency points. (c) MUSIC spectrogram calculated
in 2048 frequency points. (d) Capon’s spectrogram zoomed to the signal components. (e) MUSIC spectrogram
zoomed to the signal components. (f) Pisarenko spectrogram zoomed to the signal components.
With varying coefficients or appropriate signal multiplication, before the STFT calculation, a local
polynomial version of Capon’s transform could be defined. For example, for a linear frequency-
modulated signal of the form

x(n) = A e^{j(α_0 n² + ω_0 n + φ_0)},

we should use (10.75) or (10.71) with the matrix

R̂_x(n, K, α) = (1/(K + 1)) ∑_{p=n−K/2}^{n+K/2} x_α(p) x_α^H(p),  with  x_α(p) = x(p) e^{−jαp²},
where α is a parameter. The high-resolution form of the LPFT can be used for efficient processing of close linear frequency-modulated signals with the same rate within the considered interval.
Example 10.21. The Capon LPFT form is illustrated on an example of a signal with two close components that, in addition to the linear frequency modulation, contain a small disturbing cubic phase term. The considered time interval was −1 ≤ t ≤ 1 − ∆t with ∆t = 2/512, ρ = 0.5, K = 30, and the frequency domain is interpolated eight times. The standard STFT, LPFT, Capon's STFT, and Capon's LPFT-based representations are presented in Fig. 10.29.
Figure 10.29 (a) The standard STFT, (b) the LPFT, (c) Capon’s STFT, and (d) Capon’s LPFT-based representations
of two close almost linear frequency-modulated signals.
A way to improve the time-frequency representation of this signal is to transform the signal into a sinusoid whose constant frequency is equal to the instantaneous frequency value of the linear frequency-modulated signal at the considered instant. Then, a wide window can be used, with a high frequency resolution. The obtained result is valid for the considered instant only, and the signal transformation procedure should be repeated for each instant of interest.
A simple way to introduce this kind of signal representation is presented. Consider an LFM
signal,

x(t) = A exp(jφ(t)) = A exp(j(at²/2 + bt + c)).
Its instantaneous frequency changes in time as
Ωi (t) = dφ(t)/dt = at + b.
One of the goals of time-frequency analysis is to obtain a function that will (in an ideal case) fully concentrate the signal power along its instantaneous frequency. The ideal representation would be fully concentrated at Ω = Ω_i(t). For the considered signal, the phase difference satisfies

(dφ(t)/dt) τ = φ(t + τ/2) − φ(t − τ/2) = τ(at + b) = τ Ω_i(t).
Figure 11.1 Optimal STFT (absolute value, calculated with optimal window width) and the Wigner distribution
of a linear frequency modulated signal.
This property can easily be converted into an ideal time-frequency representation for the linear frequency-modulated signal, by using the product x(t + τ/2) x*(t − τ/2) = |A|² e^{jΩ_i(t)τ}, whose Fourier transform over τ is fully concentrated along Ω_i(t).
The Fourier transform of x (t + τ/2) x ∗ (t − τ/2) over τ, for a given t, is called the Wigner distribution.
It is defined as

WD(t, Ω) = ∫_{−∞}^{∞} x(t + τ/2) x*(t − τ/2) e^{−jΩτ} dτ.    (11.1)
The Wigner distribution is originally introduced in quantum mechanics. The illustration of the Wigner
distribution calculation is presented in Fig. 11.2.
Expressing x(t) in terms of X(Ω) and substituting it into (11.1), we get

WD(t, Ω) = (1/(2π)) ∫_{−∞}^{∞} X(Ω + θ/2) X*(Ω − θ/2) e^{jθt} dθ.    (11.2)
Figure 11.2 Illustration of the Wigner distribution calculation, for a considered time instant t. Real values of a
linear frequency modulated signal (linear chirp) are presented.
Based on the definition of the Wigner distribution in the frequency domain, (11.2), one may easily
prove the fulfillment of the frequency marginal.
Example 11.1. Find the Wigner distribution of signals: (a) x (t) = δ(t − t1 ) and (b) x (t) = exp( jΩ1 t).
⋆ For x(t) = δ(t − t_1),

WD(t, Ω) = ∫_{−∞}^{∞} δ(t − t_1 + τ/2) δ(t − t_1 − τ/2) e^{−jΩτ} dτ = δ(t − t_1),

since |a| δ(at) x(t) = δ(t) x(0). From the Wigner distribution definition in terms of the Fourier transform, for x(t) = exp(jΩ_1 t) with X(Ω) = 2πδ(Ω − Ω_1), follows

WD(t, Ω) = 2πδ(Ω − Ω_1).
Example 11.2. Consider a linear frequency-modulated signal, x(t) = A e^{jbt²/2}. Find its Wigner distribution.
⋆ The local autocorrelation is x(t + τ/2) x*(t − τ/2) = |A|² e^{jbtτ}, with

WD(t, Ω) = 2π |A|² δ(Ω − bt).
Again, a high concentration along the instantaneous frequency in the time-frequency plane may be achieved for linear frequency-modulated signals. These two examples demonstrate that the Wigner distribution can provide a superior time-frequency representation of a one-component signal, in comparison to the STFT.
Example 11.3. Calculate the Wigner distribution for a linear frequency-modulated signal with a Gaussian amplitude (Gaussian chirp signal),

x(t) = A e^{−at²/2} e^{j(bt²/2 + ct)}.
For a multicomponent signal x(t) = ∑_{m=1}^{M} x_m(t), the Wigner distribution contains all of the pairwise terms,

WD(t, Ω) = ∑_{m=1}^{M} ∑_{n=1}^{M} ∫_{−∞}^{∞} x_m(t + τ/2) x_n*(t − τ/2) e^{−jΩτ} dτ.

In addition to the auto-terms (the terms with m = n), it contains the cross-terms

WD_ct(t, Ω) = ∑_{m=1}^{M} ∑_{n=1, n≠m}^{M} ∫_{−∞}^{∞} x_m(t + τ/2) x_n*(t − τ/2) e^{−jΩτ} dτ.
Usually, they are not desirable in time-frequency signal analysis. Cross-terms can mask the presence of auto-terms, which makes the Wigner distribution unsuitable for the time-frequency analysis of multicomponent signals.
For a two-component signal with auto-terms located around (t1 , Ω1 ) and (t2 , Ω2 ) (see Fig.11.3)
the oscillatory cross-terms are located around ((t1 + t2 )/2, (Ω1 + Ω2 )/2).
(Figure 11.3 Auto-terms, located around (t_1, Ω_1) and (t_2, Ω_2), and the oscillatory cross-term of a two-component signal in the time-frequency plane.)
Example 11.4. Analyze auto-terms and cross-terms for two-component signal of the form
x(t) = e^{−(t−t_1)²/2} e^{jΩ_1 t} + e^{−(t+t_1)²/2} e^{−jΩ_1 t}.

⋆ The Wigner distribution of this signal consists of three terms,
where the first and second terms represent auto-terms while the third term is a cross-term. Note
that the cross-term is oscillatory in both directions. The oscillation rate along the time axis is
proportional to the frequency distance between components 2Ω1 , while the oscillation rate along
frequency axis is proportional to the distance in time of components, 2t1 . The oscillatory nature
of cross-terms will be used for their suppression.
To analyze auto-terms and cross-terms, the well-known ambiguity function can be used as well.
It is defined as

AF(θ, τ) = ∫_{−∞}^{∞} x(t + τ/2) x*(t − τ/2) e^{−jθt} dt.    (11.6)
It is already a classical tool in optics as well as in radar and sonar signal analysis.
The ambiguity function and the Wigner distribution form a two-dimensional Fourier transform pair,

AF(θ, τ) = FT²D_{t,Ω}{WD(t, Ω)},

WD(t, Ω) = (1/(2π)) ∫_{−∞}^{∞} ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} x(u + τ/2) x*(u − τ/2) e^{−jθu} du ] e^{jθt − jΩτ} dτ dθ,
where the integration over frequency related variable θ assumes factor 1/(2π ) and the positive sign in
the exponent exp( jθt).
Consider a signal whose components are limited in time, x_m(t) ≠ 0 only for |t − t_m| < T_m. It means that x_m(t + τ/2) x_m*(t − τ/2) is located within |τ| < 2T_m, that is, around the θ-axis, independently of the signal's position t_m. The cross-term between the signal's m-th and n-th component is located within |τ + t_n − t_m| < T_m + T_n. It is dislocated from τ = 0 for two components that do not occur simultaneously, that is, when t_m ≠ t_n.
From the frequency domain definition of the Wigner distribution, a corresponding ambiguity function form follows,

AF(θ, τ) = (1/(2π)) ∫_{−∞}^{∞} X(Ω + θ/2) X*(Ω − θ/2) e^{jΩτ} dΩ.    (11.7)
From this form we can conclude that the auto-terms of the components, limited in frequency to X_m(Ω) ≠ 0 only for |Ω − Ω_m| < W_m, are located in the ambiguity domain around the τ-axis, within |θ| < 2W_m, while the cross-terms are located within

|θ + Ω_n − Ω_m| < W_m + W_n,
where Ωm and Ωn are the frequencies around which the Fourier transform of each component lies.
Therefore, all auto-terms are located along and around the ambiguity domain axis. The cross-terms,
for the components which do not overlap in the time and frequency, simultaneously, are dislocated from
the ambiguity axes, Fig. 11.4. This property will be used in the definition of the reduced interference
time-frequency distributions.
Figure 11.4 Auto and cross-terms for two-component signal in the ambiguity domain.
The ambiguity function of a four-component signal, consisting of two Gaussian pulses, one sinusoidal and one linear frequency-modulated component, is presented in Fig. 11.5.
In the ambiguity domain (θ, τ ) auto-terms are located around (0, 0) while cross-terms are located
around (2Ω1 , 2t1 ) and (−2Ω1 , −2t1 ) as presented in Fig. 11.4.
(Figure 11.5 The ambiguity function AF(θ, τ) of the four-component signal.)
A list of the properties satisfied by the Wigner distribution follows. The obvious ones will just be stated, while proofs will be given for the more complex ones. When the Wigner distributions of more than one signal are considered, the signal will be added as an index in the Wigner distribution notation. Otherwise, the signal x(t) is assumed as the default signal in the notation.
P1 – Realness
For any signal,

WD*(t, Ω) = WD(t, Ω)

holds.
P2 – Time-shift property
The Wigner distribution of a signal shifted in time
y ( t ) = x ( t − t0 ),
is
WDy (t, Ω) = WDx (t − t0 , Ω).
P3 – Frequency shift property
For a modulated signal
y(t) = x (t)e jΩ0 t ,
we have
WDy (t, Ω) = WDx (t, Ω − Ω0 ).
P4 – Time marginal property
(1/(2π)) ∫_{−∞}^{∞} WD(t, Ω) dΩ = |x(t)|².
P5 – Frequency marginal property
∫_{−∞}^{∞} WD(t, Ω) dt = |X(Ω)|².
This property follows from (1/(2π)) ∫_{−∞}^{∞} WD(t, Ω) dΩ = |x(t)|².
P8 – Scaling
For a scaled version of the signal,

y(t) = √|a| x(at), a ≠ 0,
In order to prove this property, we will use the derivative of the inverse Fourier transform of the
Wigner distribution
d[x(t + τ/2) x*(t − τ/2)]/dτ = (1/(2π)) ∫_{−∞}^{∞} jΩ WD(t, Ω) e^{jΩτ} dΩ.
The proof is the same as in the instantaneous frequency case, using the frequency domain relations.
P11 – Time constraint
The Wigner distribution is a function of x(t + τ/2) x*(t − τ/2). If x(t) = 0 for t outside [t_1, t_2], then x(t + τ/2) x*(t − τ/2) is different from zero only for t within [t_1, t_2], so that WD(t, Ω) = 0 for t outside [t_1, t_2].
P12 – Frequency constraint
If X(Ω) = 0 for Ω outside [Ω_1, Ω_2], then also WD(t, Ω) = 0 for Ω outside [Ω_1, Ω_2].
P13 – Convolution
For

y(t) = ∫_{−∞}^{∞} h(t − τ) x(τ) dτ,

the Wigner distribution is

WD_y(t, Ω) = ∫_{−∞}^{∞} WD_h(t − τ, Ω) WD_x(τ, Ω) dτ.
P14 – Product
For

y(t) = h(t) x(t),

the Wigner distribution is

WD_y(t, Ω) = (1/(2π)) ∫_{−∞}^{∞} WD_h(t, Ω − ν) WD_x(t, ν) dν.
The local autocorrelation of y(t) is h(t + τ/2)h∗ (t − τ/2) x (t + τ/2) x ∗ (t − τ/2). Thus, the Wigner
distribution of y(t) is the Fourier transform of the product of local autocorrelations h(t + τ/2)h∗ (t −
τ/2) and x (t + τ/2) x ∗ (t − τ/2). It is a convolution in frequency of the corresponding Wigner
distributions of h(t) and x (t). Property P13 could be proven in the same way using the Fourier
transforms of signals h(t) and x (t).
P15 – Fourier transform property
For

y(t) = √(|c|/(2π)) X(ct), c ≠ 0,

the signal y(t) is equal to a scaled version of the Fourier transform of the signal x(t). Then

WD_y(t, Ω) = (|c|/(2π)) ∫_{−∞}^{∞} X(ct + cτ/2) X*(ct − cτ/2) e^{−jΩτ} dτ
           = (1/(2π)) ∫_{−∞}^{∞} X(ct + θ/2) X*(ct − θ/2) e^{j(−Ω/c)θ} dθ.    (11.10)
In practical realizations of the Wigner distribution, we are constrained with a finite time lag τ. A pseudo
form of the Wigner distribution is then used. It is defined as
PWD(t, Ω) = ∫_{−∞}^{∞} w(τ/2) w*(−τ/2) x(t + τ/2) x*(t − τ/2) e^{−jΩτ} dτ,    (11.14)
where window w(τ ) localizes the considered lag interval. If w(0) = 1, the pseudo Wigner distribution
satisfies the time marginal property. Note that the pseudo Wigner distribution is smoothed in the
frequency direction with respect to the Wigner distribution,

PWD(t, Ω) = (1/(2π)) ∫_{−∞}^{∞} WD(t, θ) W_e(Ω − θ) dθ,

where W_e(Ω) is the Fourier transform of w(τ/2) w*(−τ/2).
For the sinusoidally frequency-modulated signal x(t) = e^{−j128 cos(πt/64)}, we calculate an approximate value of the pseudo Wigner distribution with a window w(τ) of the width defined by T = 8,

PWD(t, Ω) = ∫_{−8}^{8} e^{−j128 cos(π(t+τ/2)/64)} e^{j128 cos(π(t−τ/2)/64)} w(τ) e^{−jΩτ} dτ.
Expanding the phase difference into a Taylor series in τ, we get

PWD(t, Ω) = ∫_{−8}^{8} e^{j2π sin(πt/64)τ} e^{jΔ(t,τ)} w(τ) e^{−jΩτ} dτ,

where Δ(t, τ) collects the higher-order terms of the expansion. For |τ| ≤ 8 it holds that |Δ(t, τ)| ≤ 0.33. By neglecting this term we may write
PWD(t, Ω) ≅ W(Ω − 2π sin(πt/64)),
where W(Ω) is the Fourier transform of the window w(τ), Fig. 11.7(a) (with a Hann(ing) window). For a wider window this approximation does not hold, and the inner interferences in the Wigner distribution appear, Fig. 11.7(b) (with a four times wider Hann(ing) window).
If the signal in (11.14) is discretized in τ with a sampling interval ∆t, then a sum instead of an integral
is formed. The pseudo Wigner distribution of a discrete-lag signal, for a given time instant t, is given by
PWD(t, Ω) = ∑_{m=−∞}^{∞} w(m∆t/2) w*(−m∆t/2) x(t + m∆t/2) x*(t − m∆t/2) e^{−jmΩ∆t} ∆t.    (11.15)
Sampling in τ with ∆t = π/Ω0 , Ω0 > Ωm corresponds to the sampling of signal x (t + τ/2) in τ/2
with ∆t/2 = π/(2Ω0 ).
The discrete-lag pseudo Wigner distribution is the Fourier transform of signal
R(t, m) = w(m∆t/2) w*(−m∆t/2) x(t + m∆t/2) x*(t − m∆t/2) ∆t.
Figure 11.7 Pseudo Wigner distribution of sinusoidally frequency modulated signal. Narrow Hann(ing) window
(left) and a four times wider window (right).
with ω = Ω∆t. If the sampling interval satisfies the sampling theorem, then the sum in (11.15) is equal
to the integral form (11.14). A discrete form of the pseudo Wigner distribution, with N + 1 samples
and ω = 2πk/( N + 1), for a given time instant t, is
PWD(t, k) = ∑_{m=−N/2}^{N/2} R(t, m) e^{−j2πmk/(N+1)}.
Here, N/2 is an integer. This distribution could be calculated using the standard DFT routines.
For discrete-time instants t = n∆t, introducing the notation
R(n∆t, m∆t) = w(m∆t/2) w*(−m∆t/2) x(n∆t + m∆t/2) x*(n∆t − m∆t/2) ∆t,

that is,

R(n, m) = w(m/2) w*(−m/2) x(n + m/2) x*(n − m/2),
the discrete-time and discrete-lag pseudo Wigner distribution can be written as
PWD(n, ω) = ∑_{m=−∞}^{∞} w(m/2) w*(−m/2) x(n + m/2) x*(n − m/2) e^{−jmω}.    (11.16)
The notation x(n + m/2), for given n and m, should be understood as the signal value at the instant (n + m/2)∆t. In this notation, the discrete-time pseudo Wigner distribution is periodic in ω with the period 2π.
Since various discretization steps are used (here and in open literature), we will provide a relation
of discrete indexes to the continuous time and frequency, for each definition, as
PWD(t, Ω)|_{t=n∆t, Ω=2πk/((N+1)∆t)} = PWD(n∆t, 2πk/((N+1)∆t)) → PWD(n, k).
The sign → could be understood as the equality sign in the sense of sampling theorem (Example 2.13).
Otherwise, it should be considered as a correspondence sign. The discrete form of (11.14), with N + 1 samples, is

PWD(n∆t, 2πk/((N+1)∆t)) → PWD(n, k),

PWD(n, k) = ∑_{m=−N/2}^{N/2} w(m/2) w*(−m/2) x(n + m/2) x*(n − m/2) e^{−j2πkm/(N+1)},
The discrete-time and discrete-lag pseudo Wigner distribution, in this case, is of the form

PWD(n, ω) = 2 ∑_{m=−∞}^{∞} w(m) w*(−m) x(n + m) x*(n − m) e^{−j2mω}.    (11.17)

It corresponds to the continuous-time pseudo Wigner distribution (11.14) with the substitution τ/2 → τ,

PWD(t, Ω) = 2 ∫_{−∞}^{∞} w(τ) w*(−τ) x(t + τ) x*(t − τ) e^{−j2Ωτ} dτ,
for −N/2 ≤ 2k ≤ N/2. Since the standard DFT routines are commonly used for the pseudo Wigner distribution calculation, we may use every other (2k) sample in (11.18) or oversample the pseudo Wigner distribution in frequency (as it has been done in time). Then,

PWD(n∆t/2, 2πk/((N+1)∆t)) → PWD(n, k),

PWD(n, k) = ∑_{m=−N/2}^{N/2} w(m) w*(−m) x(n + m) x*(n − m) e^{−j2πmk/(N+1)}.    (11.19)
This discrete pseudo Wigner distribution, oversampled in both time and frequency by a factor of 2, has a finer time-frequency grid, producing smaller time-frequency estimation errors at the expense of the calculation complexity.
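A minimal MATLAB sketch of (11.19) is given below; the test signal, its sampling, and the Hann lag window are assumptions for the illustration.

% Oversampled discrete pseudo Wigner distribution (11.19) (a sketch)
N  = 64; dt = 1/128; nt = 256;
t  = (0 : nt-1) * dt/2;                      % time grid with step dt/2
x  = exp(1j*31*pi*(t - 1).^2);               % assumed LFM test signal
m  = -N/2 : N/2 - 1;
w2 = (0.5 + 0.5*cos(2*pi*m/N)).^2;           % w(m)w*(-m) for the Hann window
PWD = zeros(N, nt);
for n = N/2 + 1 : nt - N/2                   % instants with full lag support
    R = w2(:) .* (x(n + m).' .* conj(x(n - m)).');
    PWD(:, n) = fftshift(fft(ifftshift(R))); % DFT over the lag index m
end
% Each column of PWD is concentrated along the instantaneous frequency of
% the chirp at the corresponding instant.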
Example 11.7. Signal x (t) = exp( j31πt2 ) is considered within −1 ≤ t ≤ 1. Find the sampling
interval of signal for discrete pseudo Wigner distribution calculation. If the rectangular window of
the width N + 1 = 31 is used in analysis, find the pseudo Wigner distribution values and estimate
the instantaneous frequency at t = 0.5 based on the discrete pseudo Wigner distribution.
⋆ For this signal the instantaneous frequency is Ω_i(t) = 62πt. It is within the range −62π ≤ Ω_i(t) ≤ 62π. Thus, we may approximately assume that the maximum frequency is Ω_m = 62π. The sampling interval for the Fourier transform would be ∆t ≤ 1/62. For the direct pseudo Wigner distribution calculation, it should be twice smaller, ∆t/2 ≤ 1/124. Therefore, the discrete version of the pseudo Wigner distribution, normalized with 2∆t, at t = 0.5, that is, n = 62, is (11.18)
PWD(n, k) = ∑_{m=−15}^{15} e^{j31π((n+m)/124)²} e^{−j31π((n−m)/124)²} e^{−j4πmk/31}
          = ∑_{m=−15}^{15} e^{jπmn/124} e^{−j4πmk/31} = sin(π(n − 16k)/8) / sin(π(n − 16k)/248).
The argument k for which the pseudo Wigner distribution reaches its maximum at n = 62 follows from 62 − 16k ≈ 0 as

k̂ = arg max_k PWD(n, k) = [62/16] = 4,
where [·] stands for the nearest integer. Obviously, the exact instantaneous frequency is not on the discrete frequency grid. The estimated value of the instantaneous frequency at t = 1/2 is Ω̂_i(1/2) = 8πk̂ = 32π. The true value is Ω_i(1/2) = 31π. When the true frequency is not on the grid, the estimation can be improved using the interpolation or displacement bin, as explained in Chapter 1. The frequency sampling interval is ∆Ω = 4π/((N + 1)∆t) = 8π, with the maximum estimation absolute error ∆Ω/2 = 4π.
If we used the standard DFT routine (11.19) with N + 1 = 31 and all available frequency samples, we would get

PWD(n, k) = DFT_31{ e^{j31π((n+m)/124)²} e^{−j31π((n−m)/124)²} }
          = ∑_{m=−15}^{15} e^{j31π((n+m)/124)²} e^{−j31π((n−m)/124)²} e^{−j2πmk/31} = sin(π(n − 8k)/8) / sin(π(n − 8k)/248).
Using an odd number of samples N + 1 in the previous definitions, the symmetry of the product x(n + m) x*(n − m) is preserved in the summation. However, when an even number of samples is used, that is not the case. To illustrate this effect, consider a simple example, for n = 0, with N = 4 samples. Then, the four values of the signal x(m) used in the calculation are x(−2), x(−1), x(0), and x(1).
So, in forming the local autocorrelation function, there are several possibilities. One is to omit the sample x(−2) and to use an odd number of samples in this case as well. Also, it is possible to periodically extend the signal and to form the product based on the extended sequence. Here, we can use four product terms, but with the first one formed as x(−2) x*(−2), that is, as x(−N/2) x*(−N/2). When a lag window with zero ending value is used (for example, a Hann(ing) window), this term does not influence the result. The used lag window must also follow the symmetry, for example, w_e(m) = cos²(πm/N). Then,
PWD(n∆t/2, 2πk/(N∆t)) → PWD(n, k),

PWD(n, k) = ∑_{m=−N/2}^{N/2−1} w_e(m) x(n + m) x*(n − m) e^{−j2πmk/N}
          = ∑_{m=−N/2+1}^{N/2−1} w_e(m) x(n + m) x*(n − m) e^{−j2πmk/N},
since we (− N/2) = 0. However, if the window is nonzero at the ending point m = − N/2, this term
will result in a kind of aliased distribution.
In order to introduce another way of the discrete Wigner distribution calculation, with an even
number of samples, consider again the continuous form of the Wigner distribution of a signal with a
limited duration. Assume that the signal is sampled in such a way that the sampling theorem can be
applied and the equality sign used (Example 2.13). Then, the integral may be replaced by a sum
WD(t, Ω) = ∑_{m=−N}^{N} x(t + m∆t/2) x*(t − m∆t/2) e^{−jmΩ∆t} ∆t
         = ∑_{m=−N/2}^{N/2} x(t + 2m∆t/2) x*(t − 2m∆t/2) e^{−j2mΩ∆t} ∆t
         + ∑_{m=−N/2}^{N/2−1} x(t + (2m+1)∆t/2) x*(t − (2m+1)∆t/2) e^{−j(2m+1)Ω∆t} ∆t.    (11.20)
The initial sum is split into its even and odd terms part. Now, let us assume that the signal is sampled in
such a way that twice wider sampling interval ∆t is also sufficient to obtain the Wigner distribution (by
using every other signal sample). Then, for the first sum (with an odd number of samples) holds,
∑_{m=−N/2}^{N/2} x(t + m∆t) x*(t − m∆t) e^{−j2mΩ∆t} ∆t = (1/2) WD(t, Ω).
The factor 1/2 comes from the sampling interval. Now, from (11.20) follows

∑_{m=−N/2}^{N/2−1} x(t + (2m+1)∆t/2) x*(t − (2m+1)∆t/2) e^{−j(2m+1)Ω∆t} ∆t = (1/2) WD(t, Ω).    (11.21)
This is just the discrete Wigner distribution with an even number of samples. If we denote

x(t + (2m+1)∆t/2) = x(t + m∆t + ∆t/2) = x_e(t + m∆t),
x(n∆t + m∆t + ∆t/2) √(2∆t) = x_e(n + m),

then

x(t − (2m+1)∆t/2) = x(t − m∆t − ∆t/2) = x(t − m∆t + ∆t/2 − ∆t),
x(n∆t − m∆t + ∆t/2 − ∆t) √(2∆t) = x_e(n − m − 1).
They would produce a modulated version of the pseudo Wigner distribution, due to the shift of a half of the sampling interval. However, this shift can be corrected, so that (11.21) becomes

WD(t, Ω) = e^{−jΩ∆t} ∑_{m=−N/2}^{N/2−1} x_e(t + m∆t) x_e*(t − m∆t − ∆t) e^{−j2mΩ∆t} (2∆t)
for any t and Ω (having in mind the sampling theorem). Thus, we may also write

WD(n∆t, πk/(N∆t)) → WD(n, k),

WD(n, k) = e^{−jπk/N} ∑_{m=−N/2}^{N/2−1} x_e(n + m) x_e*(n − m − 1) e^{−j2πmk/N}.    (11.22)
In MATLAB notation, relation (11.22) can be implemented as follows. The signal values are

x_n^+ = [x_e(n − N/2), x_e(n − N/2 + 1), …, x_e(n + N/2 − 1)],
x_n^− = [x_e*(n + N/2 − 1), x_e*(n + N/2 − 2), …, x_e*(n − N/2)].

The vector of the Wigner distribution values, for a given n and k, is

WD(n, k) = e^{−jπk/N} (x_n^+ .* x_n^−) * (e^{−j2πkm/N})^T,

where e^{−j2πkm/N} is the vector with elements e^{−j2πkm/N}, for −N/2 ≤ m ≤ N/2 − 1, * is the matrix multiplication, and .* denotes the element-by-element vector multiplication.
Thus, in the case of an even number of samples, the discrete Wigner distribution of a signal x_e(n), calculated according to (11.22), corresponds to the original signal x(t) related to x_e(n) as

x_e(n) ↔ x(n∆t + ∆t/2) √(2∆t).
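A runnable vectorized version of the above notation may read as follows; it assumes that xe is stored as a MATLAB vector indexed so that all of the used indices are valid (n > N/2).

% Discrete Wigner distribution (11.22) for all k at a given n (a sketch)
m  = (-N/2 : N/2 - 1).';                     % lag index as a column
k  = -N/2 : N/2 - 1;                         % frequency index as a row
r  = xe(n + m) .* conj(xe(n - m - 1));       % the vector x_n^+ .* x_n^-
r  = reshape(r, 1, N);                       % ensure a row vector
WD = exp(-1j*pi*k/N) .* (r * exp(-2j*pi*m*k/N));   % all N values of (11.22)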
To check this statement, consider the time marginal property of this distribution. It is

(1/N) ∑_{k=−N/2}^{N/2−1} WD(n, k) = ∑_{m=−N/2}^{N/2−1} x_e(n + m) x_e*(n − m − 1) ( (1/N) ∑_{k=−N/2}^{N/2−1} e^{−j(2m+1)πk/N} )

= ∑_{m=−N/2}^{N/2−1} x_e(n + m) x_e*(n − m − 1) ( e^{j(2m+1)π/2} (1/N) (1 − e^{−j(2m+1)π}) / (1 − e^{−j(2m+1)π/N}) )

= ∑_{m=−N/2}^{N/2−1} x_e(n + m) x_e*(n − m − 1) δ(2m + 1) = |x_e(n − 1/2)|² = |x(n∆t)|² (2∆t),
With the notation Y(k) = DFT_N{y(n)} for the N-point DFT, the pseudo Wigner distribution (11.22), without frequency oversampling, in the case of an even N, can be calculated as
WD(n∆t, 2πk/(N∆t)) → WD(n, k),

WD(n, k) = e^{−jπk/(N/2)} ∑_{m=−N/4}^{N/4−1} ( R(n, m) + R(n, m + N/2) ) e^{−j2πmk/(N/2)},
where R(n, m) = x_e(n + m) x_e*(n − m − 1). Periodicity in m, for a given n, with period N is assumed in R(n, m), that is, R(n, m + N) = R(n, m) = R(n, m − N). It is needed in order to calculate R(n, m + N/2) for −N/4 ≤ m ≤ N/4 − 1 using R(n, m) for −N/2 ≤ m ≤ N/2 − 1 only.
In the case of real-valued signals, in order to avoid the need for oversampling, as well as to
eliminate cross-terms (that will be discussed later) between positive and negative frequency components,
their analytic part is used in calculations.
Consider the following combination of the STFT values,

SM(t, Ω) = (1/π) ∫_{−L_P}^{L_P} P(θ) STFT(t, Ω + θ) STFT*(t, Ω − θ) dθ,    (11.25)

where P(θ) is a finite frequency domain window (we also assume a rectangular form), P(θ) = 0 for |θ| > L_P. The distribution obtained in this way is referred to as the S-method. Two special cases are: the spectrogram, for P(θ) = πδ(θ), and the pseudo Wigner distribution, for P(θ) = 1.
The S-method can produce a representation of a multi-component signal such that the distribution
of each component is its Wigner distribution, avoiding cross-terms, if the STFTs of the components do
not overlap in time-frequency plane.
Consider a signal

x(t) = ∑_{m=1}^{M} x_m(t),
where x_m(t) are monocomponent signals. Assume that the STFT of each component lies inside the region D_m(t, Ω), m = 1, 2, …, M, and assume that the regions D_m(t, Ω) do not overlap. Denote the length of the m-th region along Ω, for a given t, by 2B_m(t), and its central frequency by Ω_0m(t). Under these assumptions, the S-method of x(t) produces the sum of the pseudo Wigner distributions of each signal component,
SM_x(t, Ω) = ∑_{m=1}^{M} PWD_{x_m}(t, Ω),    (11.26)
if the width of the rectangular window P(θ), for a point (t, Ω), is defined by

L_P(t, Ω) = B_m(t) − |Ω − Ω_0m(t)| for (t, Ω) ∈ D_m(t, Ω), and L_P(t, Ω) = 0 elsewhere.
To prove this, consider a point (t, Ω) inside a region D_m(t, Ω). The integration interval in (11.25), for the m-th signal component, is symmetric with respect to θ = 0. It is defined by the smallest absolute value of θ for which Ω + θ or Ω − θ falls outside D_m(t, Ω).
For Ω > Ω0m (t) and positive θ, the integration limit is reached for θ = Bm (t) − (Ω − Ω0m (t)). For
Ω < Ω0m (t) and positive θ, the limit is reached for θ = Bm (t) + (Ω − Ω0m (t)). Thus, having in
mind the interval symmetry, an integration limit which produces the same value of integral (11.25) as
the value of (11.23), over the region Dm (t, Ω), is given by L P (t, Ω). Therefore, for (t, Ω) ∈ Dm (t, Ω)
we have SMx (t, Ω) = PWDxm (t, Ω). Since regions Dm (t, Ω) do not overlap we have
SM_x(t, Ω) = ∑_{m=1}^{M} PWD_{x_m}(t, Ω).
A rectangular window P(θ) of a constant width also produces SM_x(t, Ω) = ∑_{m=1}^{M} PWD_{x_m}(t, Ω) if the regions D_m(t, Ω), m = 1, 2, …, M, are at least 2L_P apart along the frequency axis, |Ω_0p(t) − Ω_0q(t)| > B_p(t) + B_q(t) + 2L_P, for each p, q, and t. This is the S-method with a constant window width. The best choice of L_P is the value for which P(θ) is wide enough to enable complete integration over the auto-terms, but narrower than the distance between the auto-terms, in order to avoid the cross-terms. If two components overlap at some time instants t, then the cross-term will appear, but only between these two components and only at those time instants.
A discrete form of the S-method (11.25) reads

SM_L(n, k) = ∑_{i=−L}^{L} S_N(n, k + i) S_N*(n, k − i)

for P(i) = 1, −L ≤ i ≤ L (a weighted form, P(i) = 1/(2L + 1), could be used). A recursive relation for the S-method calculation is

SM_L(n, k) = SM_{L−1}(n, k) + 2 Re[S_N(n, k + L) S_N*(n, k − L)].    (11.27)

The spectrogram is the initial distribution, SM_0(n, k) = |S_N(n, k)|², and 2 Re[S_N(n, k + i) S_N*(n, k − i)], i = 1, 2, …, L, are the correction terms. Changing the parameter L, we can start from the spectrogram (L = 0) and gradually make the transition toward the pseudo Wigner distribution by increasing L.
For the S-method realization we have to implement the STFT first, based either on the FFT
routines or recursive approaches suitable for hardware realizations. After we get the STFT we have to
“correct” the obtained values, according to (11.27), by adding a few “correction” terms to the spectrogram
values. Note that S-method is one of the rare quadratic time-frequency distributions allowing easy
hardware realization, based on the hardware realization of the STFT, presented in the first part, and its
“correction” according to (11.27). There is no need for analytic signal since the cross-terms between
negative and positive frequency components are removed in the same way as are the other cross-terms.
If we take that STFT (n, k) = 0 outside the basic period, that is, when k < − N/2 or k > N/2 − 1,
then there is no aliasing when the STFT is alias-free (in this way we can calculate the alias-free Wigner
distribution by taking L = N/2 in (11.27)). The calculation in (11.27) can be performed for the whole
matrix of the S-method and the STFT. This can significantly save time in some matrix based calculation
tools.
There are two ways to implement summation in the S-method. The first one is with a constant
L. Theoretically, in order to get the Wigner distribution for each individual component, the number
of correcting terms L should be such that 2L is equal to the width of the widest auto-term. This will
guarantee cross-terms free distribution for all components which are at least 2L frequency samples
apart.
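A compact MATLAB sketch of the constant-L S-method, computed directly from an STFT matrix, follows; the matrix layout (frequency index along the rows, with values taken as zero outside the basic period, as discussed above) is an assumption of this sketch.

% S-method with a constant number of correction terms L (a sketch)
% STFT is an (Nf x Nt) complex matrix, frequency index k along the rows
L  = 3;
SM = abs(STFT).^2;                           % L = 0: the spectrogram
[Nf, Nt] = size(STFT);
for i = 1:L
    Sp = [STFT(1+i:Nf, :); zeros(i, Nt)];    % S_N(n, k+i), zero outside
    Sm = [zeros(i, Nt); STFT(1:Nf-i, :)];    % S_N(n, k-i), zero outside
    SM = SM + 2*real(Sp .* conj(Sm));        % correction terms of (11.27)
end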
The second way to implement the S-method is with a time-frequency dependent L = L(n, k). The
summation, for each point (n, k), is performed as long as the absolute values of S N (n, k + i ) and
S∗N (n, k − i ) for that (n, k ) are above an assumed reference level (established, for example, as a few
percents of the STFT maximum value). Here, we start with the spectrogram, L = 0. Consider the
correction term S N (n, k + i )S∗N (n, k − i ) with i = 1. If the STFT values are above the reference level
then it is included in the summation. The next term, with i = 2, is considered in the same way, and so on. The summation is stopped when an STFT value in a correction term falls below the reference level. This procedure will guarantee a cross-terms-free distribution for components that do not overlap in the STFT.
Example 11.8. A signal consisting of three linear frequency-modulated components, with the parameters

(a_1, a_2, a_3) = (−21, −1, 20)

and

(b_1, b_2, b_3) = (2, −0.75, −2.8),

is considered at the instant n = 0. The instantaneous frequencies of the signal components are k_i = a_i, while the normalized squared amplitudes of the components are indicated by dotted lines in Fig. 11.8. An ideal time-frequency representation of this signal, at n = 0, would be fully concentrated at these three frequencies.
The starting STFT, with the corresponding spectrogram, obtained using the cosine window of the width N = 64, is shown in Fig.11.8(a),(b). The first correction term is presented in Fig.11.8(c). The result of summing the spectrogram with the first correction term is the S-method with L = 1, Fig.11.8(d). The second correction term (Fig.11.8(e)), when added to SM₁(0,k), produces the S-method with L = 2, Fig.11.8(f). The S-methods for L = 3, 5, 6, 8, and 9, ending with the Wigner distribution (L = 31), are presented in Fig.11.8(g)-(l). Just a few correction terms are sufficient in this case to achieve a high concentration. The cross-terms start appearing at L = 8 and increase as L increases toward the Wigner distribution. They make the Wigner distribution almost useless, since they cover a great part of the frequency range, including some signal components (Fig.11.8(l)).
The optimal number of correction terms L is the one that produces the best S-method
concentration (sparsity), using the ℓ1/2 -norm of the spectrogram and the S-method (corresponding
to the ℓ₁-norm of the STFT). In this case the best concentrated S-method is detected for L = 5. Considering the parameter L as a frame index, we can make a video of the transition from the spectrogram to the Wigner distribution.
Example 11.9. The adaptive S-method realization will be illustrated on a five-component signal x(t), defined for 0 ≤ t < 1 and sampled with ∆t = 1/256. The Hamming window of the width T_w = 1/2 (128 samples) is used for the STFT calculation. The spectrogram is presented in Fig.11.9(a), while the S-method with the constant L_d = 3 is shown in Fig.11.9(b). The concentration improvement with respect to the case L_d = 0, Fig.11.9(a), is evident. Further increasing L_d would improve the concentration, but the cross-terms would also appear. Small changes are noticeable between the components with constant instantaneous frequency and between the quadratic and constant instantaneous frequency components. An improved concentration, without cross-terms, can be achieved using the variable window width L_d(n,k). The regions D_i(n,k), determining the summation limit L_d(n,k) for each point (n,k), are obtained by imposing a reference level R_n corresponding to 0.14% of the spectrogram maximum at that time instant n. They are defined as:
$$D_i(n,k)=\begin{cases}1, & |STFT_{x_i}(n,k)|^2\ge R_n\\ 0, & \text{elsewhere}\end{cases}$$
and presented in Fig.11.9(c). White regions mean that the value of the spectrogram is below 0.14% of its maximum value at that time instant n, meaning that the concentration improvement is not performed at these points. The signal-dependent S-method is given in Fig.11.9(d). The sensitivity of the method with respect to the reference level is low.
Figure 11.8 Analysis of a signal consisting of three LFM components (at the instant n = 0). (a) The STFT with a cosine window of the width N = 64. (b) The spectrogram. (c) The first correction term. (d) The S-method (SM) with one correction term. (e) The second correction term. (f) The S-method with two correction terms. (g) The S-method with three correction terms. (h) The S-method with five correction terms. (i) The S-method with six correction terms. (j) The S-method with eight correction terms. (k) The S-method with nine correction terms. (l) The Wigner distribution (the S-method with L = 31 correction terms).
In order to provide additional insight into the field of joint time-frequency analysis, as well as
to improve concentration of time-frequency representation, energy distributions of signals were
introduced. We have already mentioned the spectrogram which belongs to this class of representations
and is a straightforward extension of the STFT. Here, we will discuss other distributions and their
generalizations.
Figure 11.9 Time-frequency analysis of a multi-component signal: (a) the spectrogram, (b) the S-method with a constant window width, L_d = 3, (c) regions of support for the S-method with a variable window width calculation, corresponding to Q₂ = 725, (d) the S-method with the variable window width calculated using the regions in (c).
The basic condition for the definition of time-frequency energy distributions is that a two-dimensional function of time and frequency, P(t,Ω), represents the energy density of a signal in the
time-frequency plane. Thus, the signal energy associated with the small time and frequency intervals ∆t and ∆Ω, respectively, would be P(t,Ω)∆t∆Ω/(2π). However, a point-by-point definition of time-frequency energy densities in the time-frequency plane is not possible, since the uncertainty principle prevents us from defining the concept of energy at a specific instant and frequency. This is the reason why some more general conditions are considered to derive time-frequency distributions of a signal. Namely, one requires that the integral of P(t,Ω) over Ω, for a particular instant of time, should be equal to the instantaneous power of the signal, |x(t)|², while the integral over time, for a particular frequency, should be equal to the spectral energy density |X(Ω)|². These conditions are known as marginal conditions or marginal properties of time-frequency distributions.
Therefore, it is desirable that an energetic time-frequency distribution of a signal x (t) satisfies:
– Energy property
$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}P(t,\Omega)\,d\Omega\,dt=E_x, \qquad (11.28)$$
– Time marginal property
$$\frac{1}{2\pi}\int_{-\infty}^{\infty}P(t,\Omega)\,d\Omega=|x(t)|^2, \qquad (11.29)$$
– Frequency marginal property
$$\int_{-\infty}^{\infty}P(t,\Omega)\,dt=|X(\Omega)|^2. \qquad (11.30)$$
Figure 11.10 The marginal properties: integration of P(t,Ω) over Ω produces the instantaneous power |x(t)|², while integration over t produces the spectral energy density |X(Ω)|².
Time and frequency marginal properties (11.29) and (11.30) may be considered as the projections of the distribution P(t,Ω) along the time and frequency axes, that is, as the Radon transform of P(t,Ω) along these two directions. It is known that the Fourier transform of the projection of a two-dimensional function on a given line is equal to the value of the two-dimensional Fourier transform of P(t,Ω), denoted by AF(θ,τ), along the same direction (inverse Radon transform property). Therefore, if P(t,Ω) satisfies the marginal properties, then any other function whose two-dimensional Fourier transform is equal to AF(θ,τ) along the axis lines θ = 0 and τ = 0, with arbitrary values elsewhere, will satisfy the marginal properties as well, Fig. 11.11. Assuming that the Wigner distribution is a basic distribution which satisfies the marginal properties (any other distribution satisfying the marginal properties can be used as the basic one), any other distribution whose two-dimensional Fourier transform is of the form c(θ,τ)AF(θ,τ), with c(θ,0) = c(0,τ) = 1, will also satisfy them.
Figure 11.11 Marginal properties and their relation to the ambiguity function.
Various distributions can be obtained by altering the kernel function c(θ, τ ). For example, c(θ, τ ) = 1
produces the Wigner distribution, while for c(θ, τ ) = e jθτ/2 the Rihaczek distribution follows.
The Cohen class of distributions, defined in the ambiguity domain as
$$CD(t,\Omega)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}c(\theta,\tau)\,AF(\theta,\tau)\,e^{j\theta t-j\Omega\tau}\,d\tau\,d\theta, \qquad (11.34)$$
can be written in other domains as well. The time-lag domain form is obtained from (11.32), after integration over θ, as
$$CD(t,\Omega)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}c_T(t-u,\tau)\,x(u+\tau/2)\,x^{*}(u-\tau/2)\,e^{-j\Omega\tau}\,d\tau\,du. \qquad (11.35)$$
The frequency-Doppler frequency domain form follows from (11.33), after integration over τ, as
$$CD(t,\Omega)=\frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}C_{\Omega}(\theta,\Omega-u)\,X(u+\theta/2)\,X^{*}(u-\theta/2)\,e^{j\theta t}\,d\theta\,du. \qquad (11.36)$$
Finally, the time-frequency domain form is obtained as a two-dimensional convolution of the two-dimensional Fourier transforms, from (11.34), as
$$CD(t,\Omega)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Pi(t-u,\Omega-\xi)\,WD(u,\xi)\,du\,d\xi. \qquad (11.37)$$
Kernel functions in the respective time-lag, Doppler frequency-frequency and time-frequency domains
are related to the ambiguity domain kernel c(θ, τ ) as:
$$c_T(t,\tau)=\frac{1}{2\pi}\int_{-\infty}^{\infty}c(\theta,\tau)\,e^{j\theta t}\,d\theta \qquad (11.38)$$
$$C_{\Omega}(\theta,\Omega)=\int_{-\infty}^{\infty}c(\theta,\tau)\,e^{-j\Omega\tau}\,d\tau \qquad (11.39)$$
$$\Pi(t,\Omega)=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}c(\theta,\tau)\,e^{j\theta t-j\Omega\tau}\,d\tau\,d\theta. \qquad (11.40)$$
According to (11.37), all distributions from the Cohen class may be considered as 2D filtered versions of the Wigner distribution. Although any distribution could be taken as a basis for the Cohen class derivation, the form with the Wigner distribution is used because it is the best concentrated distribution from the Cohen class with signal-independent kernels.
The analysis performed on the ambiguity function and the Cohen class of time-frequency distributions leads to the conclusion that the cross-terms may be suppressed or eliminated if the kernel c(θ,τ) is a two-dimensional low-pass type function. In order to preserve the marginal properties, the values of c(θ,τ) along the axes should be c(θ,0) = 1 and c(0,τ) = 1.
Choi and Williams exploited one of the possibilities, defining the distribution with the kernel of the form
$$c(\theta,\tau)=e^{-\theta^2\tau^2/\sigma^2}.$$
The parameter σ controls the slope of the kernel function, which affects the influence of the cross-terms. A small σ causes the elimination of the cross-terms, but it should not be too small because, for the finite width of the auto-terms around the θ and τ coordinate axes, the kernel would distort them as well. Thus, there should be a trade-off in the selection of σ.
Here we will mention some other interesting kernel functions, producing corresponding
distributions, Fig. 11.12:
Born-Jordan distribution
$$c(\theta,\tau)=\frac{\sin(\theta\tau/2)}{\theta\tau/2},$$
Zhao-Atlas-Marks distribution
$$c(\theta,\tau)=w(\tau)\,|\tau|\,\frac{\sin(\theta\tau/2)}{\theta\tau/2},$$
Sinc distribution
$$c(\theta,\tau)=\mathrm{rect}\!\left(\frac{\theta\tau}{\alpha}\right)=\begin{cases}1, & |\theta\tau/\alpha|<1/2\\ 0, & \text{otherwise,}\end{cases}$$
Butterworth distribution
$$c(\theta,\tau)=\frac{1}{1+\left(\dfrac{\theta\tau}{\theta_c\tau_c}\right)^{2N}},$$
where w(τ ) is a function corresponding to a lag window and α, N, θc and τc are constants in the above
kernel definitions.
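For illustration, the listed kernels can be evaluated on a (θ,τ) grid; the values of σ, α, N, and θ_cτ_c below are arbitrary example choices, not values prescribed by the text:

```python
import numpy as np

theta = np.linspace(-3, 3, 256)
tau = np.linspace(-100, 100, 256)
TH, TA = np.meshgrid(theta, tau)

cw = np.exp(-TH ** 2 * TA ** 2 / 10.0 ** 2)            # Choi-Williams, sigma = 10
bj = np.sinc(TH * TA / (2 * np.pi))                    # Born-Jordan: sin(x)/x, x = theta*tau/2
zam = np.abs(TA) * bj                                  # Zhao-Atlas-Marks with w(tau) = 1
snc = (np.abs(TH * TA / 50.0) < 0.5).astype(float)     # Sinc (rect) kernel, alpha = 50
bw = 1.0 / (1.0 + (TH * TA / 50.0) ** (2 * 2))         # Butterworth, N = 2, theta_c*tau_c = 50
```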
Figure 11.12 Kernel functions for: Choi-Williams distribution, Born-Jordan distribution, Sinc distribution and
Zhao-Atlas-Marks distribution.
The spectrogram belongs to this class of distributions. Its kernel in the (θ,τ) domain is the ambiguity function of the window,
$$c(\theta,\tau)=\int_{-\infty}^{\infty}w\!\left(t-\frac{\tau}{2}\right)w\!\left(t+\frac{\tau}{2}\right)e^{-j\theta t}\,dt=AF_w(\theta,\tau).$$
Since the Cohen class is linear with respect to the kernel, it is easy to conclude that a distribution from the Cohen class is positive if its kernel can be written as
$$c(\theta,\tau)=\sum_{i=1}^{M}a_i\,AF_{w_i}(\theta,\tau),$$
where ai ≥ 0, i = 1, 2, . . . , M.
There are several ways to calculate the reduced interference distributions from the Cohen class. The first method is based on the ambiguity function (11.34):
1. Calculation of the ambiguity function,
2. Multiplication with the kernel,
3. Calculation of the inverse two-dimensional Fourier transform of this product.
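A compact sketch of these three steps, for the Choi-Williams kernel and an oversampled (analytic) signal, is given below (the helper name and the value of sigma are assumptions for illustration):

```python
import numpy as np

def choi_williams(x, sigma=10.0):
    """Steps 1-3 sketch: ambiguity function, Choi-Williams kernel, inverse 2D DFT.
    x is assumed to be an oversampled analytic signal (to avoid aliasing)."""
    N = len(x)
    n = np.arange(N)
    m = np.arange(-N // 2, N // 2)
    # local autocorrelation x(n+m)x*(n-m) with circular indexing, as in (11.43)
    R = x[(n[:, None] + m[None, :]) % N] * np.conj(x[(n[:, None] - m[None, :]) % N])
    AF = np.fft.fftshift(np.fft.fft(R, axis=0), axes=0)        # step 1: time -> Doppler
    theta = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(N))
    kernel = np.exp(-(np.outer(theta, m)) ** 2 / sigma ** 2)   # step 2: c(theta, m)
    CD = np.fft.ifft(np.fft.ifftshift(AF * kernel, axes=0), axis=0)  # step 3: back to time
    CD = np.fft.fft(np.fft.ifftshift(CD, axes=1), axis=1)            # and lag -> frequency
    return np.real(CD)
```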
The reduced interference distribution can also be calculated by using (11.35) or (11.37), with the appropriate kernel transformations defined by (11.38) and (11.40). All these methods assume signal oversampling in order to avoid aliasing effects. Figure 11.13 shows the ambiguity function along with the Choi-Williams kernel. Figure 11.14(a) presents the Choi-Williams distribution calculated according to the presented procedure. In order to reduce the high side lobes of the rectangular window, the Choi-Williams distribution is also calculated with the Hann(ing) window in the kernel definition, c(θ,τ)w(τ), and shown in Fig. 11.14(b). The pseudo Wigner distribution with the Hann(ing) window is given in Fig. 11.6.
Figure 11.13 Ambiguity function for the signal from Fig.10.4, with the Choi-Williams kernel.
For discrete-time signals, there are several ways to calculate the reduced interference distributions from the Cohen class, based on (11.34), (11.35), (11.36), or (11.37).
The kernel functions are usually defined in the Doppler-lag domain (θ, τ ). Thus, here we should
use (11.34) with the ambiguity function of a discrete-time signal
Figure 11.14 Choi-Williams distribution: (a) direct calculation, (b) calculation with the kernel multiplied by a
Hann(ing) lag window.
$$AF(\theta,m\Delta t)=\sum_{p=-\infty}^{\infty}x\!\left(p\Delta t+m\frac{\Delta t}{2}\right)x^{*}\!\left(p\Delta t-m\frac{\Delta t}{2}\right)e^{-jp\theta\Delta t}\,\Delta t.$$
The signal should be sampled as in the Wigner distribution case. For a given lag instant m, the ambiguity function can be calculated using the standard DFT routines. Another way to calculate the ambiguity function is to take the inverse two-dimensional transform of the Wigner distribution. Note that the corresponding transformation pairs are time ↔ Doppler and lag ↔ frequency, that is, t ↔ θ and τ ↔ Ω. The relation between the discretization values in the Fourier transform pairs (considered interval, sampling interval in time ∆t, number of samples N, sampling interval in frequency ∆Ω = 2π/(N∆t)) is discussed in Chapter 1.
The generalized ambiguity function is obtained as
$$AF_g(\theta,m\Delta t)=c(\theta,m\Delta t)\,AF(\theta,m\Delta t), \qquad (11.41)$$
while a distribution with the kernel c(θ,τ) is its two-dimensional inverse Fourier transform, in the form
$$CD(n\Delta t,k\Delta\Omega)=\frac{1}{2\pi}\sum_{l=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}AF_g(l\Delta\theta,m\Delta t)\,e^{-jkm\Delta t\Delta\Omega}\,e^{jnl\Delta\theta\Delta t}\,\Delta t\,\Delta\theta.$$
In this notation we can calculate CD(n,k) = IDFT2D_{l,m}{AF_g(l,m)}, where the values of AF_g(l,m) are calculated according to (11.41).
In the time-lag domain, the discrete-time form reads
$$CD(n\Delta t,k\Delta\Omega)=\sum_{p=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}c_T(n\Delta t-p\Delta t,m\Delta t)\,x\!\left(p\Delta t+m\frac{\Delta t}{2}\right)x^{*}\!\left(p\Delta t-m\frac{\Delta t}{2}\right)e^{-jkm\Delta t\Delta\Omega}\,(\Delta t)^2 \qquad (11.42)$$
with
$$c_T(n\Delta t-p\Delta t,m\Delta t)=\frac{1}{2\pi}\sum_{l=-\infty}^{\infty}c(l\Delta\theta,m\Delta t)\,e^{jnl\Delta\theta\Delta t}\,e^{-jlp\Delta\theta\Delta t}\,\Delta\theta.$$
For discrete-time signals, it is common to write and use the Cohen class of distributions in the form
$$CD(n,\omega)=\sum_{p=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}c_T(n-p,m)\,x(p+m)\,x^{*}(p-m)\,e^{-j2m\omega}, \qquad (11.43)$$
where
$$x(p+m)\,x^{*}(p-m)=x\!\left((p+m)\frac{\Delta t}{2}\right)x^{*}\!\left((p-m)\frac{\Delta t}{2}\right)\Delta t$$
$$c_T(n-p,m)=c_T\!\left((n-p)\frac{\Delta t}{2},m\Delta t\right)$$
$$CD(n,\omega)\rightarrow CD\!\left(n\frac{\Delta t}{2},\Omega\Delta t\right).$$
Here we should mention that the presented kernel functions are of infinite duration along the coordinate axes in (θ,τ); thus, they should be limited in calculations. Their transforms exist in a generalized sense only.
Distributions from the Cohen class can be calculated using a decomposition of the kernel function in the time-lag domain. Starting from
$$CD(t,\Omega)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}c_T(t-u,\tau)\,x(u+\tau/2)\,x^{*}(u-\tau/2)\,e^{-j\Omega\tau}\,d\tau\,du,$$
and using the eigenvalue decomposition of the matrix C formed of the kernel samples in the time-lag domain, it is easy to conclude that the Cohen class of distributions can be written as a sum of spectrograms,
$$CD(t,\Omega)=\sum_{i}\lambda_i\,\left|STFT_{q_i}(t,\Omega)\right|^2,$$
where λ_i represents the eigenvalues, while q_i are the corresponding eigenvectors of C, that is, the columns of Q, used as windows in the STFT calculations.
Example 11.10. A four-component real-valued signal with M = 384 samples is considered. Its STFT is calculated with a Hann(ing) window of the width N = 128, with a step of 4 samples. The spectrogram (L = 0) is shown in Fig.11.15(a). The alias-free Wigner distribution (L = N/2) is presented in Fig. 11.15(b). The Choi-Williams distribution of the analytic signal is shown in Fig. 11.15(c). Its cross-terms are smoothed by the kernel, which also spreads the auto-terms of the LFM signal and the chirps. The S-method with L = 10 is shown in Fig. 11.15(d). For graphical presentation, the distributions are interpolated by a factor of 2. In all cases the pure sinusoidal signal is well concentrated. In the Wigner distribution and the SM the same concentration is achieved for the LFM signal.
Figure 11.15 Time-frequency representation of a four-component signal: (a) the spectrogram, (b) the Wigner distribution, (c) the Choi-Williams distribution, and (d) the S-method.
Chapter 12
Wavelet Transform
The first form of functions having the basic property of wavelets was used by Haar at the beginning of the twentieth century. At the beginning of the 1980s, Morlet introduced a form of basis functions for the analysis of seismic signals, naming them "wavelets". The theory of wavelets was linked to image processing by Mallat in the following years. In the late 1980s, Daubechies presented a whole new class of wavelets that can be implemented in a simple way, using digital filtering ideas. The most important applications of wavelets are found in image processing and compression, pattern recognition, and signal denoising. Here, we will only link the basics of the wavelet transform to time-frequency analysis.
The common STFT is characterized by a constant window and constant time and frequency resolutions at both low and high frequencies. The basic idea behind the wavelet transform, as it was originally introduced by Morlet, was to vary the resolution with scale (being related to frequency) in such a way that a high frequency resolution is obtained for signal components at low frequencies, whereas a high time resolution is obtained for signal components at high frequencies. This kind of resolution change could be relevant for some practical applications, like, for example, seismic signals. It is achieved by introducing a frequency-variable window width. The window width is decreased as frequency increases.
The basis functions in the STFT are analyzed starting from
$$STFT_{II}(t,\Omega_0)=\int_{-\infty}^{\infty}x(\tau)\,w^{*}(\tau-t)\,e^{-j\Omega_0\tau}\,d\tau=\left\langle x(\tau),\,w(\tau-t)e^{j\Omega_0\tau}\right\rangle=\left\langle x(\tau),\,h(\tau-t)\right\rangle=\int_{-\infty}^{\infty}x(\tau)\,h^{*}(\tau-t)\,d\tau,$$
where h(τ−t) = w(τ−t)e^{jΩ₀τ} is a band-pass signal, obtained when a real-valued window w(τ−t) is modulated by e^{jΩ₀τ}. Notice that h(τ−t) = w(τ−t)e^{jΩ₀(τ−t)} is also used. This form follows from
$$STFT(t,\Omega_0)=\int_{-\infty}^{\infty}x(t+\tau)\,w^{*}(\tau)\,e^{-j\Omega_0\tau}\,d\tau=\int_{-\infty}^{\infty}x(\tau)\,w^{*}(\tau-t)\,e^{-j\Omega_0(\tau-t)}\,d\tau.$$
When the above idea about the wavelet transform is translated into the mathematical form and
related to the STFT, one gets the definition of a continuous wavelet transform
$$WT(t,a)=\frac{1}{\sqrt{|a|}}\int_{-\infty}^{\infty}x(\tau)\,h^{*}\!\left(\frac{\tau-t}{a}\right)d\tau, \qquad (12.1)$$
where h(t) is a band-pass signal and the parameter a is the scale. This transform produces a time-scale, rather than a time-frequency, signal representation. For the Morlet wavelet the relation between the scale and the frequency is a = Ω₀/Ω. In order to establish a strong formal relationship between the wavelet transform and the STFT, we will choose the basic Morlet wavelet h(t) in the form
$$h(t)=w(t)\,e^{j\Omega_0 t}, \qquad (12.2)$$
where w(t) is a window function and Ω₀ is a constant frequency. For the Morlet wavelet we have a Gaussian function,
$$w(\tau)=\sqrt{\frac{1}{2\pi}}\,e^{-\alpha\tau^2},$$
where the values of α and Ω₀ are chosen such that the ratio of h(0) = 1/√(2π) to the first maximum (of the real part of the Morlet wavelet, w(τ−t)cos(Ω₀τ), which is also used in the analysis) at τ = 2π/Ω₀ is equal to 1/2 = exp(−4απ²/Ω₀²), that is, Ω₀ = 2π√(α/ln 2). Substitution of (12.2) into (12.1) leads to a continuous wavelet transform form suitable for a direct comparison with the STFT,
$$WT(t,a)=\frac{1}{\sqrt{|a|}}\int_{-\infty}^{\infty}x(\tau)\,w^{*}\!\left(\frac{\tau-t}{a}\right)e^{-j\Omega_0(\tau-t)/a}\,d\tau=\frac{1}{\sqrt{|a|}}\int_{-\infty}^{\infty}x(t+\tau)\,w^{*}\!\left(\frac{\tau}{a}\right)e^{-j\Omega_0\tau/a}\,d\tau. \qquad (12.3)$$
From the definition of w(τ/a) it is obvious that small Ω (that is, large a) corresponds to a
wide wavelet, that is, a wide window, and vice versa. The basic idea of the wavelet transform and its
comparison with the STFT is illustrated in Fig. 12.1.
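A direct sketch of (12.3) in Python, with α treated as a free parameter and the scaled Morlet wavelet evaluated on the signal grid (an illustration under these assumptions, not a prescribed implementation):

```python
import numpy as np

def morlet_cwt(x, scales, dt=1.0, alpha=1.0):
    """Morlet continuous wavelet transform sketch: WT(t, a) per (12.3),
    computed as a convolution with the conjugated, time-reversed wavelet."""
    Omega0 = 2 * np.pi * np.sqrt(alpha / np.log(2))    # ratio-of-maxima choice
    t = (np.arange(len(x)) - len(x) // 2) * dt
    WT = np.zeros((len(scales), len(x)), dtype=complex)
    for i, a in enumerate(scales):
        h = np.sqrt(1 / (2 * np.pi)) * np.exp(-alpha * (t / a) ** 2) \
            * np.exp(1j * Omega0 * t / a)              # h(t/a) = w(t/a)exp(j*Omega0*t/a)
        WT[i] = np.convolve(x, np.conj(h[::-1]), mode='same') * dt / np.sqrt(abs(a))
    return WT
```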
From the filter theory point of view, the wavelet transform, for a given scale a, could be considered as the output of a system with impulse response h*(−t/a)/√|a|, that is,
$$WT(t,a)=x(t)*_t \frac{1}{\sqrt{|a|}}\,h^{*}(-t/a),$$
where *_t denotes a convolution in time. Similarly, the STFT, for a given Ω, may be considered as STFT_II(t,Ω) = x(t) *_t [w*(−t)e^{jΩt}]. If we consider these two bandpass filters from the bandwidth point of view, we can see that, in the case of the STFT, the filtering is done by a system whose impulse response w*(−t)e^{jΩt} has a constant bandwidth, equal to the width of the Fourier transform of w(t).
Constant Q-Factor Transform: The quality factor Q of a band-pass filter, as a measure of the filter selectivity, is defined as
$$Q=\frac{\text{Central Frequency}}{\text{Bandwidth}}.$$
In the STFT the bandwidth is constant, equal to the window Fourier transform width B_w. Thus, the factor Q is proportional to the considered frequency,
$$Q=\frac{\Omega}{B_w}.$$
In the case of the wavelet transform, the bandwidth of the impulse response is the width of the Fourier transform of w(t/a). It is equal to B₀/a, where B₀ is the constant bandwidth corresponding to the basic wavelet (a = 1). Since the central frequency at scale a is Ω₀/a, the factor Q = (Ω₀/a)/(B₀/a) = Ω₀/B₀ is the same for all scales; the wavelet transform is a constant-Q transform.
Figure 12.1 Expansion functions for the wavelet transform (left) and the short-time Fourier transform (right). Top
row presents high scale (low frequency), middle row is for medium scale (medium frequency) and bottom row is for
low scale (high frequency).
Figure 12.2 Illustration of the wavelet transform (a) of a sum of two delta pulses and two sinusoids, compared to the STFT (b).
In analogy with the spectrogram, the scalogram is defined as the squared magnitude of the wavelet transform,
$$SCAL(t,a)=\left|WT(t,a)\right|^2. \qquad (12.6)$$
The scalogram obviously loses the linearity property and fits into the category of quadratic transforms.
12.1.1 S-Transform
The S-transform (the Stockwell transform) is conceptually a combination of the STFT analysis and the wavelet analysis. It employs a window, as in the STFT, but with a frequency-variable length, as in the wavelet transform. The frequency-dependent window produces a higher frequency resolution at lower frequencies, while at higher frequencies a sharper time localization can be achieved, the same as in the continuous wavelet case.
For a signal x(t) the S-transform is defined by
$$S_c(t,\Omega)=\frac{|\Omega|}{(2\pi)^{3/2}}\int_{-\infty}^{+\infty}x(\tau)\,e^{-\frac{(\tau-t)^2\Omega^2}{8\pi^2}}\,e^{-j\Omega\tau}\,d\tau. \qquad (12.7)$$
With the frequency-dependent window
$$w(\tau,\Omega)=\frac{|\Omega|}{(2\pi)^{3/2}}\,e^{-\frac{\tau^2\Omega^2}{8\pi^2}}, \qquad (12.9)$$
the definition of the continuous S-transform can be rewritten as follows:
$$S_c(t,\Omega)=e^{-j\Omega t}\int_{-\infty}^{+\infty}x(t+\tau)\,w(\tau,\Omega)\,e^{-j\Omega\tau}\,d\tau. \qquad (12.10)$$
A discretization over τ of (12.10) results in the discrete form of the S-transform,
$$S(n,k)=e^{-j2\pi nk/N}\sum_{m=-N/2}^{N/2-1}x(n+m)\,w(m,k)\,e^{-j2\pi mk/N},$$
where w(m,k) are the samples of the frequency-dependent window (12.9).
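A direct (and deliberately unoptimized) Python sketch of this discretization is given below; FFT-based frequency-domain implementations are commonly used in practice, and the Ω = 0 (DC) sample is omitted here:

```python
import numpy as np

def s_transform(x, dt=1.0):
    """Direct discretization of (12.10) (a sketch under these assumptions)."""
    N = len(x)
    m = np.arange(-N // 2, N // 2)                 # lag samples, tau = m*dt
    S = np.zeros((N, N // 2), dtype=complex)
    for k in range(1, N // 2):                     # frequency Omega = 2*pi*k/(N*dt)
        Om = 2 * np.pi * k / (N * dt)
        w = abs(Om) / (2 * np.pi) ** 1.5 \
            * np.exp(-(m * dt) ** 2 * Om ** 2 / (8 * np.pi ** 2))
        for n in range(N):
            seg = x[(n + m) % N]                   # circular x(t + tau)
            S[n, k] = np.exp(-1j * Om * n * dt) \
                * np.sum(seg * w * np.exp(-1j * Om * m * dt)) * dt
    return S
```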
The spectral domain STFT can be obtained from the corresponding time domain form,
$$STFT(t,\Omega)=\int_{-\infty}^{\infty}x(t+\tau)\,w^{*}(\tau)\,e^{-j\Omega\tau}\,d\tau=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}X(\theta)\,e^{j(t+\tau)\theta}\,w^{*}(\tau)\,e^{-j\Omega\tau}\,d\theta\,d\tau$$
$$=\frac{1}{2\pi}\int_{-\infty}^{\infty}X(\theta)\,W^{*}(\theta-\Omega)\,e^{j\theta t}\,d\theta=\frac{1}{2\pi}\int_{-\infty}^{\infty}X(\theta)\,W_{\Omega}^{*}(\theta)\,e^{j\theta t}\,d\theta,$$
where W(θ) is the Fourier transform of the window function w(τ) and W*_Ω(θ) is its bandpass form centered at the frequency Ω (including the possibility that the form of W_Ω(θ) is frequency-varying and that it changes with Ω as well, as in Section 10.6.1.1). The STFT can be considered as the projection (inner product) of the Fourier transform of the signal, X(θ), onto the kernel W_Ω(θ)e^{−jθt}.
The inversion relation can be derived in the same way as in Section 10.2. Assume that the STFT is calculated (available) for a set of discrete frequency values Ω_i. The Fourier transform of the signal is a projection of the STFT onto the kernel functions,
$$\left\langle STFT(t,\Omega),\,W_{\Omega}^{*}(\theta)e^{j\theta t}\right\rangle_{\Omega,t}=\sum_{\Omega_i}\int_{-\infty}^{\infty}STFT(t,\Omega_i)\,W_{\Omega_i}(\theta)\,e^{-j\theta t}\,dt$$
$$=\sum_{\Omega_i}X(\theta)\,W_{\Omega_i}^{*}(\theta)\,W_{\Omega_i}(\theta)=X(\theta)\sum_{\Omega_i}W_{\Omega_i}^{*}(\theta)\,W_{\Omega_i}(\theta)=X(\theta),$$
if the condition
$$\sum_{\Omega_i}\left|W_{\Omega_i}(\theta)\right|^2=1 \qquad (12.12)$$
holds. The inverse Fourier transform relation,
$$\int_{-\infty}^{\infty}STFT(t,\Omega)\,e^{-j\theta t}\,dt=X(\theta)\,W_{\Omega}^{*}(\theta),$$
is used.
Notice that we have not used a factor of 1/(2π) within the scalar product definition and the summation over Ω_i, in order to simplify the notation. With this factor, the reconstruction condition would be ∑_{Ω_i} |W_{Ω_i}(θ)|²/(2π) = 1.
The spectral wavelet function is defined as a projection of the signal onto a set of the kernel functions W(a_iθ)e^{−jθt} = W_{Ω_i}(θ)e^{−jθt}, where a_i is the scale, which changes the position and the form of the basic bandpass function W(θ).
The Meyer wavelet transfer functions in the spectral domain, at a scale a_i, in the notation W(a_iθ), are defined as in (10.40), (10.42), and (10.43),
$$W_i(\theta)=W(a_i\theta)=\begin{cases}\sin\!\left(\frac{\pi}{2}\,q(a_i\theta-1)\right), & 1<a_i\theta\le M\\[4pt] \cos\!\left(\frac{\pi}{2}\,q\!\left(\frac{a_i\theta}{M}-1\right)\right), & M<a_i\theta\le M^2\end{cases} \qquad (12.13)$$
and 0 elsewhere, for 2 ≤ i ≤ K − 1, where q = 1/(M − 1). The sine and cosine function arguments are such that they are either 0 or π/2 at the interval ending points. The scales a_i for each frequency interval are related through a geometric progression with a factor of M > 1, that is,
$$a_i=M\,a_{i-1}. \qquad (12.14)$$
Figure 12.3 (a) Spectral domain windows (sine type) for the wavelet transform, 0 ≤ θ ≤ 8, that satisfy the reconstruction condition ∑_i W²(a_iθ) = 1, with W(a₀θ) = G(θ), M = 2, θ_max = 8, and K = 7. (b) Spectral domain windows (Meyer spectral domain wavelet) for the wavelet transform, 0 ≤ θ ≤ 8, that satisfy the reconstruction condition ∑_i W²(a_iθ) = 1, with W(a₀θ) = G(θ), M = 2, θ_max = 8, and K = 7.
Since W(a_iθ) are bandpass functions, to handle the lowpass spectral components (the interval for θ which includes θ = 0), the lowpass-type scale function G(θ) is added, in the form
$$G(\theta)=\begin{cases}1, & 0\le a_K\theta\le M \ \ (\text{that is, } 0\le\theta\le M/a_K=\theta_{max}/M^{K-1})\\[4pt] \cos\!\left(\frac{\pi}{2}\,q\!\left(\frac{a_K\theta}{M}-1\right)\right), & M<a_K\theta\le M^2\\[4pt] 0, & \text{elsewhere.}\end{cases} \qquad (12.15)$$
An example of the frequency domain windows (spectral transfer functions) of this wavelet is
shown in Fig. 12.3(a).
The reconstruction condition in (12.12) can be written in the form of a sum of all normalized spectral transfer functions,
$$\sum_{a_i}\left|W(a_i\theta)\right|^2=\left|G(\theta)\right|^2+\sum_{i=1}^{K-1}\left|W(a_i\theta)\right|^2=1. \qquad (12.16)$$
The derivative discontinuity at the frequency band ending points can be avoided by introducing a polynomial argument v_x(·) into the sine and cosine functions, of the form given in (12.17). This polynomial keeps the property that the argument is such that v_x(0) = 0 and v_x(1) = 1, but it makes the derivatives smooth at the transition points. The Meyer wavelet functions are defined by
$$W(a_i\theta)=\begin{cases}\sin\!\left(\frac{\pi}{2}\,v_x\!\left(q(a_i\theta-1)\right)\right), & 1<a_i\theta\le M\\[4pt] \cos\!\left(\frac{\pi}{2}\,v_x\!\left(q\!\left(\frac{a_i\theta}{M}-1\right)\right)\right), & M<a_i\theta\le M^2\\[4pt] 0, & \text{elsewhere.}\end{cases} \qquad (12.18)$$
The same form is used in the first transfer function W(a₁θ) and in the scale function G(θ). The spectral Meyer wavelet functions, with the same parameters as in the previous example, are shown in Fig. 12.3(b). They exhibit smooth transitions and satisfy the reconstruction condition (12.16).
This analysis will start by splitting the signal's spectral content into its high-frequency and low-frequency parts. Within the STFT framework, this can be achieved by a two-sample rectangular window, w(m) = δ(m) + δ(m−1), when
$$STFT(n,0)=\frac{1}{\sqrt{2}}\sum_{m=0}^{1}x(n+m)\,e^{-j0}=\frac{1}{\sqrt{2}}\left(x(n)+x(n+1)\right)=x_L(n), \qquad (12.19)$$
$$STFT(n,1)=\frac{1}{\sqrt{2}}\sum_{m=0}^{1}x(n+m)\,e^{-j\pi m}=\frac{1}{\sqrt{2}}\left(x(n)-x(n+1)\right)=x_H(n). \qquad (12.20)$$
Here, the STFT definition
$$STFT(n,k)=\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1}x(n+m)\,e^{-j2\pi km/N}$$
is used, instead of STFT(n,k) = ∑_{m=−N/2}^{N/2−1} x(n+m)e^{−j2πkm/N}, in order to remain within the common wavelet literature notation. For the same reason the STFT is scaled by 1/√N (a form when the DFT and IDFT have the same factor 1/√N).
This kind of signal analysis leads to the Haar (wavelet) transform. In the Haar wavelet transform the high-frequency part, x_H(n), is not processed anymore. It is kept with this (high) two-sample resolution in time. The resolution in time of x_H(n) is just slightly (two times) lower than the original signal sampling interval allows. The lowpass part, x_L(n) = (x(n) + x(n+1))/√2, will be used in further processing. After the signal samples x(n) and x(n+1) are processed using (12.19) and (12.20), the next two samples, x(n+2) and x(n+3), are analyzed. The highpass part is again calculated, x_H(n+2) = (x(n+2) − x(n+3))/√2, and kept as it is. The lowpass part, x_L(n+2) = (x(n+2) + x(n+3))/√2, is considered as a new signal, along with its corresponding previous sample x_L(n).
The spectral content of the lowpass part of the signal is divided, in the same way, into its low and high frequency parts,
$$x_{LL}(n)=\frac{1}{\sqrt{2}}\left(x_L(n)+x_L(n+2)\right)=\frac{1}{2}\left[x(n)+x(n+1)+x(n+2)+x(n+3)\right]$$
$$x_{LH}(n)=\frac{1}{\sqrt{2}}\left(x_L(n)-x_L(n+2)\right)=\frac{1}{2}\left[x(n)+x(n+1)-x(n+2)-x(n+3)\right].$$
The highpass part x_LH(n) is left with a resolution of four samples in time, while the lowpass part is further processed in the same way, by dividing the spectral content of x_LL(n) and x_LL(n+4) into its low and high frequency parts. This process is continued until the full length of the signal is reached.
wavelet transformation matrix in the case of signal with 8 samples is
√
√ 2W1 (0, H ) 1 −1 0 0 0 0 0 0 x (0)
√2W1 (2, H ) 0 0 1 −1 0 0 0 0
x (1)
√2W1 (4, H ) 0 0 0 0 1 − 1 0 0 x (2)
2W1 (6, H ) 0 0 0 0 0 0 1 −1
x (3) .
= (12.21)
2W2 (0, H ) 1 1 − 1 − 1 0 0 0 0 x ( 4 )
2W2 (4, H ) 0 0 0 0 1 1 −1 −1
x (5)
√
2 2W4 (0, H ) 1 1 1 1 −1 −1 −1 −1 x (6)
√
2 2W4 (0, L) 1 1 1 1 1 1 1 1 x (7)
This kind of signal transformation was introduced by Haar more than a century ago. In this notation, the scale a = 1 values of the wavelet coefficients W₁(2n,H) are equal to the highpass part of the signal calculated using two samples, W₁(2n,H) = x_H(2n). The scale a = 2 wavelet coefficients are W₂(4n,H) = x_LH(4n). In scale a = 4 there is only one highpass and one lowpass coefficient, at n = 0: W₄(8n,H) = x_LLH(8n) and W₄(8n,L) = x_LLL(8n). In this way a signal of any length N = 2^m can be decomposed into Haar wavelet coefficients.
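The complete decomposition described above can be sketched in a few lines (the helper name is an assumption):

```python
import numpy as np

def haar_dwt(x):
    """Full Haar decomposition of a signal of length N = 2**m, as in (12.21):
    the highpass part is kept at every stage, the lowpass part is split again."""
    x = np.asarray(x, dtype=float)
    coeffs = []
    while len(x) > 1:
        xL = (x[0::2] + x[1::2]) / np.sqrt(2)   # lowpass part, see (12.19)
        xH = (x[0::2] - x[1::2]) / np.sqrt(2)   # highpass part, see (12.20)
        coeffs.append(xH)                       # W_a(n, H) for this stage
        x = xL                                  # process the lowpass part further
    coeffs.append(x)                            # the single remaining lowpass value
    return coeffs

# For a constant signal all highpass coefficients are zero:
# haar_dwt(np.ones(8)) leaves only W4(0, L) = 2*sqrt(2), as in (12.21).
```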
The Haar wavelet transform has the property that its highpass coefficients are equal to zero if the analyzed signal is constant within the analyzed time interval, for the considered scale. If a signal has a large number of constant-valued samples within the analyzed time intervals, then many Haar wavelet transform coefficients are zero-valued. They can be omitted in signal storage or transmission. In recovery, their values are assumed to be zeros and the original signal is obtained. The same can be done in the case of noisy signals, when all coefficients below an assumed level of noise can be zero-valued and the signal-to-noise ratio in the reconstructed signal improved.
Although the presented Haar wavelet analysis is quite simple, we will use it as an example to introduce the filter bank framework of the wavelet transform. The obvious results for the Haar wavelet will be used to introduce other wavelet forms. For the Haar wavelet calculation, two signals, x_L(n) and x_H(n), are formed according to (12.19) and (12.20), based on the input signal x(n). Transfer functions of the corresponding (anticausal) lowpass and highpass filters are
$$H_L(z)=\frac{1}{\sqrt{2}}(1+z), \qquad H_H(z)=\frac{1}{\sqrt{2}}(1-z), \qquad (12.22)$$
with the impulse responses h_L(n) = [δ(n) + δ(n+1)]/√2 and h_H(n) = [δ(n) − δ(n+1)]/√2.
Figure 12.4 Amplitude of the Fourier transform of the basic Haar wavelet and scale function, divided by √2.
In the filter bank implementation, the filter outputs are downsampled,
$$s_L(n)=x_L(2n)$$
$$s_H(n)=x_H(2n). \qquad (12.23)$$
Downsampling of a signal x(n), to get the signal y(n) = x(2n), is described in the z-transform domain by
$$Y(z)=\frac{1}{2}X(z^{1/2})+\frac{1}{2}X(-z^{1/2}). \qquad (12.24)$$
Figure 12.5 Signal filtering by a lowpass and a highpass filter, followed by downsampling by 2.
If the signals s_L(n) and s_H(n) are obtained by passing x(n) through the lowpass and highpass filters H_L(z) and H_H(z), followed by downsampling, then
$$S_L(z)=\frac{1}{2}H_L(z^{1/2})X(z^{1/2})+\frac{1}{2}H_L(-z^{1/2})X(-z^{1/2})$$
$$S_H(z)=\frac{1}{2}H_H(z^{1/2})X(z^{1/2})+\frac{1}{2}H_H(-z^{1/2})X(-z^{1/2})$$
hold.
12.2.2 Upsampling
Let us assume that we are not going to transform the signals s_L(n) and s_H(n) any more. The only goal is to reconstruct the signal x(n) based on its downsampled lowpass and highpass part signals s_L(n) and s_H(n). The first step in the signal reconstruction is to restore the original sampling interval of the discrete-time signal. This is done by upsampling the signals s_L(n) and s_H(n). Upsampling of a signal x(n) is described by Y(z) = X(z²), since
$$X(z^2)=\sum_{n=-\infty}^{\infty}x(n)z^{-2n}=\ldots+x(-1)z^{2}+0\cdot z^{1}+x(0)+0\cdot z^{-1}+x(1)z^{-2}+\ldots \qquad (12.25)$$
If a signal x(n) is downsampled first and then upsampled, the resulting signal transform is
$$Y(z)=\frac{1}{2}X\!\left((z^2)^{1/2}\right)+\frac{1}{2}X\!\left(-(z^2)^{1/2}\right)$$
$$Y(z)=\frac{1}{2}X(z)+\frac{1}{2}X(-z). \qquad (12.26)$$
In the Fourier domain this means Y(e^{jω}) = (X(e^{jω}) + X(e^{j(ω+π)}))/2. This form indicates that an aliasing component, X(e^{j(ω+π)}), appeared in this process. In general, when the signal is downsampled and upsampled, aliasing appears, since the component X(−z) exists in addition to the original signal X(z) in (12.26). The upsampled versions of the signals s_L(n) and s_H(n) should be appropriately filtered and combined in order to eliminate the aliasing. The conditions to avoid the aliasing in the reconstructed signal will be studied next.
Figure 12.6 One stage of the filter bank with reconstruction, corresponding to one stage of the wavelet transform realization.
In the reconstruction process the signals are upsampled (S_L(z) → S_L(z²) and S_H(z) → S_H(z²)) and passed through the reconstruction filters G_L(z) and G_H(z) before being added up to form the output signal, Fig.12.6. The output signal transforms are
$$Y_L(z)=S_L(z^2)\,G_L(z)=\frac{1}{2}\left[H_L(z)X(z)+H_L(-z)X(-z)\right]G_L(z)$$
$$Y_H(z)=S_H(z^2)\,G_H(z)=\frac{1}{2}\left[H_H(z)X(z)+H_H(-z)X(-z)\right]G_H(z).$$
The output signal, Y(z) = Y_L(z) + Y_H(z), is equal to the input signal,
$$Y(z)=X(z),$$
if the reconstruction conditions
$$H_L(z)G_L(z)+H_H(z)G_H(z)=2 \qquad (12.27)$$
$$H_L(-z)G_L(z)+H_H(-z)G_H(z)=0 \qquad (12.28)$$
are satisfied. From (12.28) follows
$$G_H(z)=-\frac{H_L(-z)\,G_L(z)}{H_H(-z)},$$
while the expression
$$H_H(z)=-\frac{H_L(z)\,G_L(-z)}{G_H(-z)}$$
is obtained from (12.28) with z being replaced by −z, when H_L(z)G_L(−z) + H_H(z)G_H(−z) = 0. Substituting these values into (12.27) we get
$$H_L(e^{j\omega})G_L(e^{j\omega})+H_H(e^{j\omega})G_H(e^{j\omega})=2 \qquad (12.32)$$
$$H_L(-e^{j\omega})G_L(e^{j\omega})+H_H(-e^{j\omega})G_H(e^{j\omega})=0.$$
The wavelet transform is calculated using downsampling by a factor of 2. One of the basic requirements imposed on the filter impulse response, for an efficient signal reconstruction, is that it is orthogonal to its shifted versions with step 2 (and its multiples). In addition, the wavelet functions in different scales should be orthogonal. Orthogonality of the wavelet functions in different scales will be discussed later. The orthogonality condition for the impulse response is
$$\sum_{m}h_L(m)\,h_L(m-2n)=\delta(n). \qquad (12.33)$$
For the Haar wavelet transform this condition is obviously satisfied. In general, for wavelet transforms where the duration of the impulse response h_L(n) is greater than two, the previous relation can be understood as a downsampled convolution of h_L(n) and h_L(−n), r(2n) = (h_L(n) ∗ h_L(−n))|₂ₙ = δ(n).
The Fourier transform of the downsampled convolution, for a real-valued h_L(n), is, according to (12.24),
$$\mathrm{FT}\{r(2n)\}=\frac{1}{2}\left|H_L(e^{j\omega/2})\right|^2+\frac{1}{2}\left|H_L(-e^{j\omega/2})\right|^2.$$
From r(2n) = δ(n) follows
$$\left|H_L(e^{j\omega})\right|^2+\left|H_L(-e^{j\omega})\right|^2=2.$$
The impulse response is orthogonal, in the sense of (12.33), if the frequency response satisfies
$$\left|H_L(e^{j\omega})\right|^2+\left|H_L(e^{j(\omega+\pi)})\right|^2=2.$$
If the impulse response h_L(n) is orthogonal, as in (12.33), then the last relation is satisfied for g_L(n) = h_L(−n), or
$$P(z)+P(-z)=2, \qquad (12.34)$$
with P(z) = G_L(z)G_L(z^{−1}). Relation (12.34) may also be written for H_L(z).
If the highpass filters are obtained from the corresponding lowpass filters by reversal, in addition to the common multiplication by (−1)ⁿ, then g_H(n) = (−1)ⁿ g_L(K − n) and
$$G_H(e^{j\omega})=\sum_{n=0}^{K}g_H(n)e^{-j\omega n}=\sum_{n=0}^{K}(-1)^n g_L(K-n)e^{-j\omega n}$$
$$=\sum_{m=0}^{K}(-1)^{K-m}g_L(m)e^{-j\omega(K-m)}=(-1)^K e^{-j\omega K}\sum_{m=0}^{K}e^{j\pi m}g_L(m)e^{-j(-\omega)m}$$
$$=-e^{-j\omega K}G_L(e^{-j(\omega-\pi)})=-e^{-j\omega K}G_L(-e^{-j\omega})$$
for odd K, or
$$G_H(e^{j\omega})=-e^{-j\omega K}G_L(-e^{-j\omega})=-e^{-j\omega K}H_L(-e^{j\omega})$$
for G_L(e^{jω}) = H_L(e^{−jω}). A similar relation holds for the anticausal impulse response h_H(n).
The reconstruction conditions are satisfied since, according to (12.27) and (12.31), a relation corresponding to
$$H_H(z)G_H(z)=H_L(-z)G_L(-z)$$
holds in the Fourier domain,
$$H_H(e^{j\omega})G_H(e^{j\omega})=\left[-e^{j\omega K}H_L(-e^{-j\omega})\right]\left[-e^{-j\omega K}H_L(-e^{j\omega})\right]=H_L(-e^{-j\omega})H_L(-e^{j\omega}).$$
In summary,
$$H_L(e^{j\omega})=G_L(e^{-j\omega})$$
$$G_H(e^{j\omega})=-e^{-j\omega K}G_L(-e^{-j\omega})$$
$$H_H(e^{j\omega})=-e^{j\omega K}G_L(-e^{j\omega}). \qquad (12.35)$$
Note that the following symmetry of the frequency response amplitude functions holds:
$$\left|H_L(e^{j\omega})\right|=\left|G_L(e^{-j\omega})\right|=\left|H_H(e^{j(\omega+\pi)})\right|=\left|H_H(e^{-j(\omega+\pi)})\right|.$$
The orthogonality condition
$$\sum_{m}h_L(m)\,h_H(m-2n)=0 \qquad (12.36)$$
is also satisfied with these forms of the transfer functions, for any n. Since this relation can be understood as a downsampled convolution, Z{h_L(2n) ∗ h_H(−2n)} = 0, in the Fourier domain it assumes the form
$$H_L(-e^{j\omega})G_L(e^{j\omega})+H_H(-e^{j\omega})G_H(e^{j\omega})=0.$$
The condition that the reconstruction filter G_L(z) has zero value at z = e^{jπ} = −1 means that its form is G_L(z) = a(1 + z^{−1}). Without additional requirements, this form would produce a = 1/√2 from the reconstruction relation G_L(z)G_L(z^{−1}) + G_L(−z)G_L(−z^{−1}) = 2. The time domain filter form is
$$g_L(n)=\frac{1}{\sqrt{2}}\left[\delta(n)+\delta(n-1)\right].$$
It corresponds to the Haar wavelet. All other filter functions can be defined using g_L(n) or G_L(e^{jω}).
The same result would be obtained starting from the filter transfer functions for the Haar wavelet, already introduced as
$$H_L(z)=\frac{1}{\sqrt{2}}(1+z)$$
$$H_H(z)=\frac{1}{\sqrt{2}}(1-z).$$
The reconstruction filters are obtained from (12.27)-(12.28),
$$\frac{1}{\sqrt{2}}(1+z)G_L(z)+\frac{1}{\sqrt{2}}(1-z)G_H(z)=2$$
$$\frac{1}{\sqrt{2}}(1-z)G_L(z)+\frac{1}{\sqrt{2}}(1+z)G_H(z)=0,$$
as
$$G_L(z)=\frac{1}{\sqrt{2}}\left(1+z^{-1}\right) \qquad (12.37)$$
$$G_H(z)=\frac{1}{\sqrt{2}}\left(1-z^{-1}\right)$$
with
$$g_L(n)=\frac{1}{\sqrt{2}}\delta(n)+\frac{1}{\sqrt{2}}\delta(n-1) \qquad (12.38)$$
$$g_H(n)=\frac{1}{\sqrt{2}}\delta(n)-\frac{1}{\sqrt{2}}\delta(n-1).$$
The values of the impulse responses in the Haar wavelet transform (relations (12.22) and (12.38)) are:

    n     √2·h_L(n)   √2·h_H(n)        n    √2·g_L(n)   √2·g_H(n)
    0         1           1            0        1           1
   −1         1          −1            1        1          −1
A detailed time domain filter bank implementation of the reconstruction process in the Haar wavelet case is described next. The reconstruction is implemented in two steps:
1) The signals s_L(n) and s_H(n) from (12.23) are upsampled, according to (12.25), by inserting zeros between their samples.
2) These signals are then passed through the reconstruction filters. The sum of the outputs from these filters is
$$y(0)=\frac{1}{\sqrt{2}}\left[x_L(0)+x_H(0)\right]=x(0)$$
$$y(1)=\frac{1}{\sqrt{2}}\left[x_L(0)-x_H(0)\right]=x(1)$$
$$\ldots$$
$$y(2n)=\frac{1}{\sqrt{2}}\left[x_L(2n)+x_H(2n)\right]=x(2n)$$
$$y(2n+1)=\frac{1}{\sqrt{2}}\left[x_L(2n)-x_H(2n)\right]=x(2n+1).$$
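These two reconstruction steps can be verified numerically for the Haar filters (a minimal sketch with a random test signal):

```python
import numpy as np

# Haar analysis per (12.19)-(12.20) and synthesis per the relations above.
x = np.random.randn(16)
sL = (x[0::2] + x[1::2]) / np.sqrt(2)    # x_L(2n), lowpass branch
sH = (x[0::2] - x[1::2]) / np.sqrt(2)    # x_H(2n), highpass branch
y = np.empty_like(x)
y[0::2] = (sL + sH) / np.sqrt(2)         # y(2n)   = x(2n)
y[1::2] = (sL - sH) / np.sqrt(2)         # y(2n+1) = x(2n+1)
print(np.allclose(y, x))                 # True: perfect reconstruction
```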
A system for the implementation of the Haar wavelet transform of a signal with eight samples is
presented in Fig.12.7. It corresponds to the matrix form realization (12.21).
⋆ The wavelet transform of a signal with M = 16 samples after the stage a = 1 is shown in Fig.12.8(a). The whole frequency range is divided into two subregions, denoted by L and H, with the coefficients W₁(n,L) = [x(n) + x(n+1)]/√2 and W₁(n,H) = [x(n) − x(n+1)]/√2 calculated at the instants n = 0, 2, 4, 6, 8, 10, 12, 14. In the second stage (a = 2) the highpass region is not transformed, while the lowpass part s₂(n) = W₁(2n,L) is divided into its lowpass and highpass regions, W₂(n,L) = [s₂(n) + s₂(n+1)]/√2 and W₂(n,H) = [s₂(n) − s₂(n+1)]/√2, respectively, Fig.12.8(b). The same calculation is performed in the third and fourth stage, Fig.12.8(c)-(d).
The Haar wavelet has an impulse response duration equal to two. In one stage, it corresponds to a two-sample STFT calculated using a rectangular window. Its Fourier transform, presented in Fig.12.4, is a quite rough approximation of a lowpass and a highpass filter. In order to improve the filter performance, the number of filter coefficients should be increased. A fourth order FIR system will be considered next. The impulse response of the anticausal fourth order FIR filter is
$$h_L(n)=[h_L(0),h_L(-1),h_L(-2),h_L(-3)]=[h_0,h_1,h_2,h_3]. \qquad (12.39)$$
Figure 12.8 Wavelet transform of a signal with M = 16 samples at the output of stages 1, 2, 3, and 4, respectively. The notation W_a(n,H) is used for the highpass coefficient value after stage (scale) a at an instant n, and W_a(n,L) for the corresponding lowpass coefficient value.
If the highpass and reconstruction filter coefficients are chosen as g_L(n) = h_L(−n), g_H(n) = (−1)ⁿ g_L(3 − n), and h_H(n) = (−1)ⁿ g_L(n + 3), then relation (12.35) is satisfied with K = 3. The reconstruction conditions are satisfied if
$$h_0^2+h_1^2+h_2^2+h_3^2=1$$
and
$$h_0h_2+h_1h_3=0.$$
For the calculation of the impulse response values h₀, h₁, h₂, h₃ of the fourth order system (12.39), four independent equations (conditions) are needed. We already have three conditions. The filter has to satisfy the zero-frequency condition H_L(e^{j0}) = √2, the high-frequency condition H_L(e^{jπ}) = 0, and the reconstruction condition h₀² + h₁² + h₂² + h₃² = 1. Therefore one more condition is needed. In the Daubechies D4 wavelet derivation the fourth condition is imposed so that the derivative of the filter transfer function at ω = π is equal to zero,
$$\left.\frac{dH_L(e^{j\omega})}{d\omega}\right|_{\omega=\pi}=0.$$
This condition, meaning a smooth approach to the zero value at ω = π, also guarantees that the output of the highpass filter H_H(−z) to a linear input signal, x(n) = an + b, will be zero. This will be illustrated later. Now we have a system of four equations:
$$h_0+h_1+h_2+h_3=\sqrt{2} \qquad \text{from } H_L(e^{j0})=\sqrt{2}$$
$$h_0^2+h_1^2+h_2^2+h_3^2=1 \qquad \text{reconstruction condition}$$
$$h_0-h_1+h_2-h_3=0 \qquad \text{from } H_L(e^{j\pi})=0$$
$$-h_1+2h_2-3h_3=0 \qquad \text{from } \left.dH_L(e^{j\omega})/d\omega\right|_{\omega=\pi}=0.$$
Its solution produces the fourth order Daubechies wavelet coefficients (D4),
$$(h_0,h_1,h_2,h_3)=\left(\frac{1+\sqrt{3}}{4\sqrt{2}},\,\frac{3+\sqrt{3}}{4\sqrt{2}},\,\frac{3-\sqrt{3}}{4\sqrt{2}},\,\frac{1-\sqrt{3}}{4\sqrt{2}}\right).$$
Note that this is just one of the possible symmetric solutions of the previous system of equations, Fig.12.9.
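The four conditions can be checked numerically for this solution (a short verification sketch):

```python
import numpy as np

s3 = np.sqrt(3)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))  # h0..h3
print(np.isclose(h.sum(), np.sqrt(2)))                 # H_L(e^{j0}) = sqrt(2)
print(np.isclose((h ** 2).sum(), 1))                   # reconstruction condition
print(np.isclose(h[0] - h[1] + h[2] - h[3], 0))        # H_L(e^{j*pi}) = 0
print(np.isclose(-h[1] + 2 * h[2] - 3 * h[3], 0))      # dH_L/dw = 0 at w = pi
```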
Figure 12.9 shows the reconstruction and analysis filter impulse responses g_L(n), g_H(n), h_L(n), and h_H(n) for the Daubechies D4 wavelet transform.
The reconstruction conditions for the fourth order FIR filter
$$H_L(e^{j\omega})=h_0+h_1e^{j\omega}+h_2e^{j2\omega}+h_3e^{j3\omega}$$
with the Daubechies wavelet coefficients (D4) can also be checked in a graphical way, by calculating
$$\left|H_L(e^{j\omega})\right|^2+\left|H_L(e^{j(\omega+\pi)})\right|^2=2.$$
From Fig.12.10 we can see that this is a much better approximation of the lowpass and highpass filters than in the Haar wavelet case, Fig.12.4.
Figure 12.10 Amplitude of the Fourier transform of basic Daubechies D4 wavelet and scale function.
Another way to derive the Daubechies wavelet coefficients (D4) is by using relation (12.34),
$$P(z)+P(-z)=2,$$
with
$$P(z)=G_L(z)H_L(z)=G_L(z)G_L(z^{-1}).$$
The condition imposed on the transfer function G_L(z) in the D4 wavelet is that its value and the value of its first derivative at z = −1 are zero-valued (a smooth approach to the highpass zero value),
$$\left.G_L(e^{j\omega})\right|_{\omega=\pi}=0$$
$$\left.\frac{dG_L(e^{j\omega})}{d\omega}\right|_{\omega=\pi}=0.$$
Then G_L(z) must contain a factor of the form (1 + z^{−1})². Since the filter order must be even (K must be odd), and taking into account that (1 + z^{−1})² alone would produce a FIR system with 3 nonzero coefficients, we have to add at least one factor of the form a(1 + z₁z^{−1}) to G_L(z). Thus, the lowest order FIR filter with an even number of (nonzero) impulse response values is
$$G_L(z)=\left(1+z^{-1}\right)^2 a\left(1+z_1z^{-1}\right)$$
with
$$P(z)=\left(1+z^{-1}\right)^2\left(1+z\right)^2 R(z),$$
where
$$R(z)=\left[a(1+z_1z^{-1})\right]\left[a(1+z_1z)\right]=z_0z^{-1}+b+z_0z.$$
Using
$$P(z)+P(-z)=2,$$
only the terms with even exponents of z will remain in P(z) + P(−z), producing the conditions b + 4z₀ = 0 and 6b + 8z₀ = 1. The solution is z₀ = −1/16 and b = 1/4. It produces a²z₁ = z₀ = −1/16 and a²(1 + z₁²) = b = 1/4, with
$$a=\frac{1}{4\sqrt{2}}\left(1+\sqrt{3}\right) \quad \text{and} \quad z_1=\frac{1-\sqrt{3}}{1+\sqrt{3}},$$
and
$$R(z)=\left(\frac{1}{4\sqrt{2}}\right)^2\left[\left(1+\sqrt{3}\right)+\left(1-\sqrt{3}\right)z^{-1}\right]\left[\left(1+\sqrt{3}\right)+\left(1-\sqrt{3}\right)z\right].$$
The reconstruction filter transfer function is
$$G_L(z)=\frac{1}{4\sqrt{2}}\left(1+z^{-1}\right)^2\left[\left(1+\sqrt{3}\right)+\left(1-\sqrt{3}\right)z^{-1}\right]$$
with
$$g_L(n)=\frac{1}{4\sqrt{2}}\left[\left(1+\sqrt{3}\right)\delta(n)+\left(3+\sqrt{3}\right)\delta(n-1)+\left(3-\sqrt{3}\right)\delta(n-2)+\left(1-\sqrt{3}\right)\delta(n-3)\right].$$
All other impulse responses follow from this one (as in the presented table). For a linear input signal, x(n) = an + b, the smooth approach of the transfer function to zero at ω = π is equivalent to the condition that the highpass coefficients (the output from H_H(e^{jω})) are zero-valued, Fig.12.10. It can also be shown that the lowpass coefficients remain a linear function of time.
⋆ The highpass coefficients after the first stage, W₁(2n,H), are obtained by downsampling W₁(n,H), whose form is a linear combination of four signal samples weighted by the highpass impulse response values. For x(n) = an + b, all of these coefficients are zero-valued if
$$-h_0+h_1-h_2+h_3=0 \quad \text{and}$$
$$h_1-2h_2+3h_3=0.$$
Thus, we may consider that the highpass D4 coefficients indicate the deviation of the signal from a linear function x(n) = an + b. In the first stage, the coefficients indicate the deviation from the linear function within four samples. In the next stage, the equivalent length of the wavelet is doubled. The highpass coefficients in this stage indicate the deviation of the signal from the linear function within a doubled number of signal samples, and so on. This is a significant difference from the nature of the STFT, which is derived from the Fourier transform and decomposes the signal by tracking its frequency content.
The matrix for the D4 wavelet transform calculation in the first stage is of the form
$$\begin{bmatrix}W_1(0,L)\\ W_1(0,H)\\ W_1(2,L)\\ W_1(2,H)\\ W_1(4,L)\\ W_1(4,H)\\ W_1(6,L)\\ W_1(6,H)\end{bmatrix}=\begin{bmatrix}h_0&h_1&h_2&h_3&0&0&0&0\\ h_3&-h_2&h_1&-h_0&0&0&0&0\\ 0&0&h_0&h_1&h_2&h_3&0&0\\ 0&0&h_3&-h_2&h_1&-h_0&0&0\\ 0&0&0&0&h_0&h_1&h_2&h_3\\ 0&0&0&0&h_3&-h_2&h_1&-h_0\\ h_2&h_3&0&0&0&0&h_0&h_1\\ h_1&-h_0&0&0&0&0&h_3&-h_2\end{bmatrix}\begin{bmatrix}x(0)\\ x(1)\\ x(2)\\ x(3)\\ x(4)\\ x(5)\\ x(6)\\ x(7)\end{bmatrix}. \qquad (12.40)$$
In the first row of the transformation matrix the coefficients correspond to h_L(n), while the second row corresponds to h_H(n). The first row produces the D4 scaling function, while the second row produces the D4 wavelet function. The coefficients are shifted by 2 in the subsequent rows. As described in the Hann(ing) window reconstruction case, the calculation should be performed in a circular manner, assuming signal periodicity. That is why the coefficients are circularly shifted in the last two rows.
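The circular structure of (12.40) can be built and checked for orthonormality as follows (the helper name is an assumption; the row ordering matches (12.40)):

```python
import numpy as np

def d4_analysis_matrix(N):
    """Build the N x N D4 transformation matrix of (12.40) with
    circularly shifted rows (N even)."""
    s3, s2 = np.sqrt(3), np.sqrt(2)
    h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
    rowL = np.zeros(N); rowL[:4] = h                           # h_L row
    rowH = np.zeros(N); rowH[:4] = [h[3], -h[2], h[1], -h[0]]  # h_H row
    T = np.zeros((N, N))
    for i in range(N // 2):
        T[2 * i] = np.roll(rowL, 2 * i)
        T[2 * i + 1] = np.roll(rowH, 2 * i)
    return T

T = d4_analysis_matrix(8)
print(np.allclose(T @ T.T, np.eye(8)))   # True: the matrix is orthonormal
```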
Example 12.5. Consider the signal x(n) = 64 − |n − 64| within 0 ≤ n ≤ 128. How many nonzero coefficients will there be in the first stage of the wavelet transform calculation using the D4 wavelet functions? Assume that the signal can be appropriately extended, so that the boundary effects can be neglected.
⋆ In the first stage, all highpass coefficients corresponding to linear four-sample intervals will be zero. It means that out of 64 highpass coefficients (calculated with a step of two in time) only one nonzero coefficient will exist, calculated for n = 62, including the nonlinear interval 62 ≤ n ≤ 65. It means that almost a half of the coefficients can be omitted in transmission or storage, corresponding to a 50% compression ratio. In the DFT analysis this would correspond to a signal with a half of (the high frequency) spectrum being equal to zero. In the wavelet analysis this process would be continued, with additional savings in the next stages of the wavelet transform coefficients calculation. It also means that if there is some noise in the signal, we can filter out all zero-valued coefficients using an appropriate threshold. For this kind of signal (a piecewise linear function of time) we will be able to improve the signal-to-noise ratio by about 3 dB in just one wavelet stage.
Example 12.6. For the signal x(n) = δ(n − 7), defined within 0 ≤ n ≤ 15, calculate the wavelet transform coefficients using the D4 wavelet/scale functions. Repeat the same calculation for the signal x(n) = 2cos(16πn/N) + 1, 0 ≤ n ≤ N − 1, with N = 16.
⋆ The wavelet coefficients in the first stage (scale a = 1, see also Fig.12.7) are calculated using
$$[h_3,h_2,h_1,h_0]=\left[\frac{1-\sqrt{3}}{4\sqrt{2}},\,\frac{3-\sqrt{3}}{4\sqrt{2}},\,\frac{3+\sqrt{3}}{4\sqrt{2}},\,\frac{1+\sqrt{3}}{4\sqrt{2}}\right].$$
Specifically, W₁(0,H) = 0, W₁(2,H) = 0, W₁(4,H) = −0.4830, W₁(6,H) = −0.2241, W₁(8,H) = 0, W₁(10,H) = 0, W₁(12,H) = 0, and W₁(14,H) = 0.
The lowpass part of the first stage values is used as the input to the second stage (a = 2). The values of W₂(4n,H) are: W₂(0,H) = −0.5123, W₂(4,H) = −0.1708, W₂(8,H) = 0, and W₂(12,H) = 0. The lowpass values at this stage are the input to the next stage (a = 3) calculation.
Figure 12.11 Daubechies D4 wavelet transform (absolute value) of the signal x (n) = δ(n − 7) using N = 16
signal samples, 0 ≤ n ≤ N − 1 (left). The Daubechies D4 wavelet transform (absolute value) of the signal
x (n) = 2 cos(2π8n/N ) + 1, 0 ≤ n ≤ N − 1, with N = 16 (right).
The inverse matrix for the D4 wavelet transform for a signal with N = 8 samples would be calculated from the lowest level, in this case for a = 2, with the coefficients W₂(0,L), W₂(0,H), W₂(4,L), and W₂(4,H). The lowpass part of the signal at level a = 1 would be reconstructed using
$$\begin{bmatrix}W_1(0,L)\\ W_1(2,L)\\ W_1(4,L)\\ W_1(6,L)\end{bmatrix}=\begin{bmatrix}h_0&h_3&h_2&h_1\\ h_1&-h_2&h_3&-h_0\\ h_2&h_1&h_0&h_3\\ h_3&-h_0&h_1&-h_2\end{bmatrix}\begin{bmatrix}W_2(0,L)\\ W_2(0,H)\\ W_2(4,L)\\ W_2(4,H)\end{bmatrix}.$$
After the lowpass part W₁(0,L), W₁(2,L), W₁(4,L), and W₁(6,L) is reconstructed, it is used with the wavelet coefficients from this stage, W₁(0,H), W₁(2,H), W₁(4,H), and W₁(6,H), to reconstruct the signal as
$$\begin{bmatrix}x(0)\\ x(1)\\ x(2)\\ x(3)\\ x(4)\\ x(5)\\ x(6)\\ x(7)\end{bmatrix}=\begin{bmatrix}h_0&h_3&0&0&0&0&h_2&h_1\\ h_1&-h_2&0&0&0&0&h_3&-h_0\\ h_2&h_1&h_0&h_3&0&0&0&0\\ h_3&-h_0&h_1&-h_2&0&0&0&0\\ 0&0&h_2&h_1&h_0&h_3&0&0\\ 0&0&h_3&-h_0&h_1&-h_2&0&0\\ 0&0&0&0&h_2&h_1&h_0&h_3\\ 0&0&0&0&h_3&-h_0&h_1&-h_2\end{bmatrix}\begin{bmatrix}W_1(0,L)\\ W_1(0,H)\\ W_1(2,L)\\ W_1(2,H)\\ W_1(4,L)\\ W_1(4,H)\\ W_1(6,L)\\ W_1(6,H)\end{bmatrix}. \qquad (12.41)$$
This procedure can be continued for a signal of length N = 16 with one more stage. An additional stage would be added for N = 32, and so on.
Example 12.7. For the wavelet transform from the previous example, find its inverse (reconstruct the signal).
⋆ The inversion is done backwards. From W₃(0,H), W₃(0,L), W₃(8,H), and W₃(8,L) we get the signal s₃(n), that is, W₂(4n,L), as
$$\begin{bmatrix}W_2(0,L)\\ W_2(4,L)\\ W_2(8,L)\\ W_2(12,L)\end{bmatrix}=\begin{bmatrix}h_0&h_3&h_2&h_1\\ h_1&-h_2&h_3&-h_0\\ h_2&h_1&h_0&h_3\\ h_3&-h_0&h_1&-h_2\end{bmatrix}\begin{bmatrix}W_3(0,L)\\ W_3(0,H)\\ W_3(8,L)\\ W_3(8,H)\end{bmatrix}=\begin{bmatrix}h_0&h_3&h_2&h_1\\ h_1&-h_2&h_3&-h_0\\ h_2&h_1&h_0&h_3\\ h_3&-h_0&h_1&-h_2\end{bmatrix}\begin{bmatrix}0.4668\\ -0.1251\\ -0.1132\\ -0.4226\end{bmatrix}=\begin{bmatrix}-0.1373\\ 0.6373\\ 0\\ 0\end{bmatrix}.$$
Then W₂(4n,L) = s₃(n) are used with the wavelet coefficients W₂(4n,H) to reconstruct W₁(2n,L), or s₂(n), using
$$\begin{bmatrix}W_1(0,L)\\ W_1(2,L)\\ W_1(4,L)\\ W_1(6,L)\\ W_1(8,L)\\ W_1(10,L)\\ W_1(12,L)\\ W_1(14,L)\end{bmatrix}=\begin{bmatrix}h_0&h_3&0&0&0&0&h_2&h_1\\ h_1&-h_2&0&0&0&0&h_3&-h_0\\ h_2&h_1&h_0&h_3&0&0&0&0\\ h_3&-h_0&h_1&-h_2&0&0&0&0\\ 0&0&h_2&h_1&h_0&h_3&0&0\\ 0&0&h_3&-h_0&h_1&-h_2&0&0\\ 0&0&0&0&h_2&h_1&h_0&h_3\\ 0&0&0&0&h_3&-h_0&h_1&-h_2\end{bmatrix}\begin{bmatrix}W_2(0,L)\\ W_2(0,H)\\ W_2(4,L)\\ W_2(4,H)\\ W_2(8,L)\\ W_2(8,H)\\ W_2(12,L)\\ W_2(12,H)\end{bmatrix}.$$
The obtained values W₁(n,L), together with the wavelet coefficients W₁(n,H), are used to reconstruct the original signal x(n). The transformation matrix in this case is of order 16 × 16 and is formed using the same structure as the previous transformation matrix.
Although the wavelet realization can be performed using the same basic functions presented in the previous section, here we will consider the equivalent wavelet function h_H(n) and the equivalent scale function h_L(n) in different scales. To this aim, we will analyze the reconstruction part of the system. Assume that in the wavelet analysis of a signal only one coefficient is nonzero. Also assume that this nonzero coefficient is at the output of the all-lowpass branch of the filter structure. It means that the signal is equal to the basic scale function of the wavelet analysis. The scale function can be found in an inverse way, by reconstructing the signal corresponding to this delta-pulse-like transform. The system of reconstruction filters is shown in Fig.12.12. Note that in the Haar transform this case would correspond to the coefficient W₄(0,L) = 1 in (12.21), or in Fig.12.7. The reconstruction process consists of upsampling the signal and passing it through the reconstruction stages. For example, the output of the third reconstruction stage has the z-transform
$$\Phi_2(z)=G_L(z)\,G_L(z^2)\,G_L(z^4),$$
where g_L(n) is the four-sample impulse response (Daubechies D4 coefficients). The duration of the scale function φ₁(n) is (4+3) + 4 − 1 = 10 samples, while the duration of φ₂(n) is 19 + 4 − 1 = 22 samples. The scale functions for different scales a (outputs of different numbers of reconstruction stages) are presented in Fig.12.14. The normalized values φ_a(n)2^{(a+1)/2} are shown; the amplitudes are scaled by 2^{(a+1)/2} in order to keep their values within the same range for various a.
Figure 12.12 The system of reconstruction filters used for the calculation of the scale function: δ(n) is fed to the all-lowpass branch, producing φ₀(n) = h_L(n), φ₁(n), φ₂(n), . . . after the successive upsampling and G_L(z) stages.
In a similar way, the wavelet function ψ(n) is calculated. The mother wavelet is obtained in the wavelet analysis of a signal when only one nonzero coefficient exists, at the highpass output of the lowest level of the signal analysis. To reconstruct the mother wavelet, the reconstruction system shown in Fig.12.13 is used. The values of ψ(n) are calculated using the values of g_H(n) at the first input, upsampling and passing them through the reconstruction stages with g_L(n) to obtain ψ₁(n), and repeating this procedure for the next steps. The resulting z-transform is
$$\Psi(z)=G_H(z)\,G_L(z^2)\,G_L(z^4).$$
In the Haar transform, (12.21) and Fig.12.7, this case would correspond to W₄(0,H) = 1.
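The cascade described in the prose (Figs. 12.12 and 12.13) can be sketched as follows, with the D4 reconstruction filters; each further stage upsamples the current function and filters it with g_L(n):

```python
import numpy as np

def upsample(c):
    """Insert zeros between samples: C(z) -> C(z^2)."""
    u = np.zeros(2 * len(c) - 1)
    u[::2] = c
    return u

s3, s2 = np.sqrt(3), np.sqrt(2)
gL = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
gH = np.array([gL[3], -gL[2], gL[1], -gL[0]])   # g_H(n) = (-1)^n g_L(3 - n)

phi, psi = gL.copy(), gH.copy()                 # phi_0 = h_L, psi_0 = h_H
for _ in range(3):                              # further stages a = 1, 2, 3
    phi = np.convolve(upsample(phi), gL)        # len(phi): 4 -> 10 -> 22 -> ...
    psi = np.convolve(upsample(psi), gL)
```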
Figure 12.13 The system of reconstruction filters used for the calculation of the wavelet function: δ(n) is fed to the highpass input, producing ψ₀(n) = h_H(n), ψ₁(n), ψ₂(n), . . . after the successive upsampling and G_L(z) stages.
The time-domain calculation of the wavelet function in different scales is done in the same way. Different scales of the wavelet function are presented in Fig.12.14, normalized using ψ_a(n)2^{(a+1)/2}. The wavelet functions are orthogonal in different scales, with the corresponding steps, as well. For example, it is easy to show that
$$\left\langle\psi_0(n-2m),\,\psi_1(n)\right\rangle=0,$$
since
$$\left\langle\psi_0(n-2m),\,\psi_1(n)\right\rangle=\sum_{p}g_H(p)\left(\sum_{n}g_H(n-2m)\,g_L(n-2p)\right)=0$$
for any p and m, according to (12.36).
Note that the wavelet and scale functions in the last row are plotted as continuous functions. The continuous wavelet transform (CWT) is calculated by using the discretized versions of the continuous functions. However, in contrast to the discrete wavelet transform, whose steps in time and scale change are strictly defined, the continuous wavelet transform can be used with various steps and scale functions.
Example 12.8. In order to illustrate the procedure, it has been repeated for the Haar wavelet, when g_L(n) = [1 1] and g_H(n) = [1 −1]. The results are presented in Fig.12.15.
Figure 12.14 The Daubechies D4 wavelet scale function and wavelet calculated using the filter bank relation
in different scales: a = 0 (first row), a = 1 (second row), a = 2 (third row), a = 3 (fourth row), a = 10 (fifth
row-approximation of a continuous domain). The amplitudes are scaled by 2(a+1)/2 to keep them within the same
range. Values ψa (n)2(a+1)/2 and φa (n)2(a+1)/2 are presented.
Figure 12.15 The Haar wavelet scale function and wavelet calculated using the filter bank relation in different scales. The values are normalized by 2^{(a+1)/2}.
The results derived for the Daubechies D4 wavelet transform can be extended to higher order polynomial functions. Consider a six-tap FIR system with coefficients h0, h1, . . . , h5. Since the filter length is now 6, two orthogonality conditions must be used, one for the shift of 2 and the other for the shift of 4,

h0h2 + h1h3 + h2h4 + h3h5 = 0
h0h4 + h1h5 = 0.
The linear signal cancellation condition is again used, as in the D4 case (the output of the highpass filter must be zero for a signal with a linear change in time, x(n) = an + b). The final condition in the Daubechies D6 wavelet transform is that the quadratic signal cancellation is achieved for the highpass filter, meaning
d²HL(e^{jω})/dω² |_{ω=π} = d²( ∑_{n=0}^{5} hn e^{jωn} )/dω² |_{ω=π} = −∑_{n=0}^{5} n² hn e^{jωn} |_{ω=π} = 0,

that is, since e^{jπn} = (−1)ⁿ,

−h1 + 2²h2 − 3²h3 + 4²h4 − 5²h5 = 0.
From this set of six equations the Daubechies D6 wavelet transform coefficients are obtained, as one of the possible symmetric solutions of the system. From the definition it is obvious that the highpass coefficients will be zero as long as the signal is of a quadratic nature within the considered interval. These coefficients can therefore be used as a measure of the signal deviation from the quadratic form in each scale.
The implementation is the same as in the case of the Haar or D4 wavelet transform; the only difference is in the filter coefficients.
This form can also be derived from the reconstruction conditions and the fact that the transfer function GL(z) contains a factor of the form (1 + z⁻¹)³, since z = −1 is its third-order zero according to the assumptions.
In the Daubechies D6 wavelet transform the last condition is introduced so that the output of the highpass filter is zero when the input signal is quadratic. Another way to form the filter coefficients for a six-sample wavelet is to introduce the condition that the first moment of the scale function is zero, instead of the second-order moment of the wavelet function. In this case a symmetric form of the coefficients should be used in the definition:
hL(−2) + hL(−1) + hL(0) + hL(1) + hL(2) + hL(3) = √2
hL²(−2) + hL²(−1) + hL²(0) + hL²(1) + hL²(2) + hL²(3) = 1
−2hL(−2) + hL(−1) − hL(1) + 2hL(2) − 3hL(3) = 0
hL(−2)hL(0) + hL(−1)hL(1) + hL(0)hL(2) + hL(1)hL(3) = 0
hL(−2)hL(2) + hL(−1)hL(3) = 0.
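The stated conditions are easy to check numerically against tabulated Daubechies D6 (db3) coefficients. The sketch below is an added illustration; the coefficient values are the standard tabulated ones (normalized so that their sum is √2), and every printed value is zero to within rounding.

```python
import numpy as np

# Tabulated Daubechies D6 (db3) lowpass coefficients, normalized so that their sum is sqrt(2)
h = np.array([0.3326705529509569, 0.8068915093133388, 0.4598775021193313,
              -0.1350110200103908, -0.0854412738822415, 0.0352262918821007])
n = np.arange(6)

print(np.sum(h) - np.sqrt(2))        # normalization: ~0
print(np.sum(h**2) - 1)              # unit energy: ~0
print(np.sum(h[:-2] * h[2:]))        # orthogonality for shift 2: h0h2 + h1h3 + h2h4 + h3h5 ~ 0
print(np.sum(h[:-4] * h[4:]))        # orthogonality for shift 4: h0h4 + h1h5 ~ 0
print(np.sum((-1)**n * h))           # constant signal cancellation at the highpass output: ~0
print(np.sum((-1)**n * n * h))       # linear signal cancellation: ~0
print(np.sum((-1)**n * n**2 * h))    # quadratic signal cancellation: ~0
```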
Originally, the wavelet transform was introduced by Morlet as a frequency-varying STFT. Its aim was to analyze the spectrum of a signal with varying resolution in time and frequency. For the specific seismic signals analyzed, higher resolution in frequency was required at low frequencies, while at high frequencies high resolution in time was the aim.
The Daubechies D4 wavelet/scale function is derived from the condition that the highpass coefficients of a signal with a linear change in time (x(n) = an + b) are zero-valued. Higher order Daubechies wavelet/scale functions are derived by increasing the order of the polynomial signal changes. The frequency of a signal does not play any direct role in the definition of the discrete wavelet transform using Daubechies functions. In this sense, it is easier to relate the wavelet transform to the linear (D4) and higher order interpolations of functions (signals) within intervals of various lengths (corresponding to various wavelet transform scales) than to spectral analysis, where the harmonic basis functions play the central role.
Example 12.9. Consider a signal x (n) with M = 16 samples, 0 ≤ n ≤ M − 1. Write the Daubechies
D4 wavelet transform based decomposition of this signal that will divide the frequency axis into
four equal regions.
⋆ In the STFT, a 4-point (N-point) signal would be used to calculate 4 (or N) coefficients of the frequency plane. The wavelet transform divides the time-frequency plane into two regions (high and low) regardless of the number of the signal values (wavelet transform coefficients) being used. If the Haar wavelet were used in Fig. 12.16, then by dividing both the highpass and the lowpass bands in the same way, the short-time Walsh-Hadamard transform with 4-sample nonoverlapping calculation would be obtained. In the case of the Daubechies D4 wavelet transform, a kind of short-time analysis with the Daubechies functions is obtained. For the Daubechies D4 function, the scale-2 functions shown in Fig. 12.17 would be used to calculate W(4n, 0), W(4n, 1), W(4n, 2), and W(4n, 3). The asymmetry of the frequency regions is visible in Fig. 12.17.
Figure 12.16 Full coverage of the time-frequency plane using the filter bank calculation and systems with impulse
responses corresponding to the wavelet transformation.
Note that the STFT analysis of this case, with a Hann(ing) window of N = 8 and a calculation step R = 4, will result in the same number of time instants; however, the frequency range will be divided into 8 regions, giving a finer grid. This grid is redundant with respect to the signal and to the wavelet transform, since both the signal and the wavelet transform have 16 values (coefficients).
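As an added illustration of this full-tree splitting (not part of the original example), the following Python sketch filters an M = 16 sample test signal with the D4 lowpass and highpass filters, downsamples by 2, and splits both resulting bands once more, dividing the frequency axis into four regions with four coefficients each; circular convolution is assumed for the finite-length signal, and the helper analysis_stage is introduced here for the sketch.

```python
import numpy as np

s3 = np.sqrt(3)
gL = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2))   # D4 lowpass
gH = gL[::-1] * np.array([1, -1, 1, -1])                             # D4 highpass

def analysis_stage(x, g):
    """Circular convolution with g followed by downsampling by 2."""
    N = len(x)
    y = np.array([sum(g[m] * x[(n - m) % N] for m in range(len(g))) for n in range(N)])
    return y[::2]

rng = np.random.default_rng(0)
x = rng.standard_normal(16)            # a test signal with M = 16 samples

low = analysis_stage(x, gL)            # 8 lowpass coefficients
high = analysis_stage(x, gH)           # 8 highpass coefficients
# Splitting both bands once more divides the frequency axis into four regions,
# giving the coefficients W(4n, 0), ..., W(4n, 3), four in each band
bands = [analysis_stage(low, gL), analysis_stage(low, gH),
         analysis_stage(high, gL), analysis_stage(high, gH)]
print([len(b) for b in bands])         # [4, 4, 4, 4] -> 16 coefficients in total
```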
Figure 12.17 Daubechies functions: scaling function (first row), mother wavelet function (second row), function producing the low-frequency part in the second stage of the high-frequency part in the first stage (third row), function producing the high-frequency part in the second stage of the high-frequency part in the first stage (fourth row). Time-domain forms of the functions are shown on the left, while their spectral content is shown on the right.
Part VI
Chapter 13
Sensing of Sparse Signals
Authors: Ljubiša Stanković, Miloš Daković, Srdjan Stanković, Irena Orović
A discrete-time signal can be transformed into various domains using different signal trans-
formations. Some signals that cover the whole considered interval in one domain could have only
a few nonzero coefficients in a transformation domain. These signals are sparse in the considered
transformation domain. An observation or measurement of a sparse signal is a linear combination
of the sparsity domain coefficients. Since the signal samples are linear combinations of the signal
transform coefficients they can be considered as the measurements of a sparse signal in the respective
transformation domain.
Compressive sensing is a field dealing with a model for data acquisition, including the problem of sparse signal recovery from a reduced set of measurements. A reduced set of measurements can be a result of a desire to sense a sparse signal with the lowest possible number of measurements/observations (compressive sensing). It can also be a result of physical or measurement constraints that make the complete set of measurements unavailable. In applications it could also happen that some arbitrarily positioned samples of a signal are so heavily corrupted by disturbances that it is better to omit them, consider them as unavailable in the analysis, and try to reconstruct the signal from the reduced set of samples. Although in the first case the reduced set of measurements is a result of the user's strategy to compress the information, while in the other two cases it is not a result of the user's intention, all of these cases can be considered within a unified framework. Under some conditions, a full reconstruction of a sparse signal can be obtained from a reduced set of measurements/samples, as if the complete set of measurements/samples were available. A priori information about the sparse nature of the analyzed signal in a known transformation domain must be used in this analysis. Sparsity is the main requirement that should be satisfied in order to efficiently apply the compressive sensing methods for sparse signal reconstruction.
The topic of this chapter is the analysis of signals that are sparse in one of their transformation domains. The DFT will be used as a case study. The compressive sensing results and algorithms are presented and used as a tool to solve engineering problems involving sparse signals.
Before we start the analysis, we will describe two simple examples that can be interpreted and solved within the context of sparse signal processing and compressive sensing.
Consider a large set of real numbers X (0), X (1), . . . , X ( N − 1). Assume that only one of them
is nonzero (or different from a common and known expected value). We do not know either its position
or its value. The aim is to find the position and the value of this nonzero number. The nonzero-valued sample will be denoted by X(i). A direct way to find the position of the nonzero sample would be to perform up to N measurements and to check which sample assumes a nonzero value. However, if N is very large and there is only one nonzero sample, we can get the result using just a few measurements. A procedure that solves the problem with a reduced number of measurements is described next.
Take random numbers as weighting coefficients ak(0), k = 0, 1, 2, . . . , N − 1, one for each sample. Measure the total value of all N weighted samples. Since only one of the samples is different from zero, we will get the measurement

y(0) = ∑_{k=0}^{N−1} ak(0) X(k) = ai(0) X(i).    (13.1)
The same value would be obtained if there were only one sample different from the common and known expected value m of all other samples. Then the total measured value is

M = a1 m + a2 m + · · · + ai (m + X(i)) + · · · + aN m.    (13.2)
In the space of unknowns (variables) X(0), X(1), . . . , X(N − 1), this equation represents a hyperplane. We know that only one unknown X(k) is nonzero, at an unknown position k = i. The cross-section of the hyperplane (13.2) with any of the coordinate axes could be a solution to our problem, Fig. 13.2(a). Assuming that a single X(k) is nonzero, a solution will exist for any k. Thus, one measurement produces a set of N possible single nonzero values,

X(k) = y(0)/ak(0), k = 0, 1, . . . , N − 1.

As expected, from one measurement we are not able to solve the problem (to find the position and the value of the one nonzero sample).
If we perform one more measurement, with another set of weighting coefficients ak(1), k = 0, 1, . . . , N − 1, and get the measurement y(1) = X(i) ai(1), the result will be another hyperplane, Fig. 13.2(b),

y(1) = ∑_{k=0}^{N−1} X(k) ak(1).    (13.3)
This measurement produces a new set of possible solutions for each X(k), defined by X(k) = y(1)/ak(1), k = 0, 1, . . . , N − 1. If these two hyperplanes (sets of possible solutions) produce only one common value, at k = i, then this value is the unique solution of our problem.
Figure 13.1 There are N bags with coins. One of them, at an unknown position, contains false coins. False coins differ from the true ones in mass by an unknown X(i) = ∆m. The mass of the true coins is m. A set of coins for the measurement is formed using a1 coins from the first bag, a2 coins from the second bag, and so on. The total measured value is M = a1 m + · · · + ai (m + X(i)) + · · · + aN m. The difference of this value from the total mass if all coins were true is M − MT. The equations for the cases with one and two bags of false coins are shown. The notation ak(0) = ak+1, for k = 0, 1, . . . , N − 1, is used in this illustration.
Figure 13.2 The solution illustration for N = 3, K = 1, and various possible cases: (a) Three possible solutions
for one measurement plane. (b) Unique solution for two measurement planes. (c) Two possible solutions for two
measurement planes.
Example 13.1. Consider a set of N = 5 bags of coins. In one of them all coins are false. The weight of the true coins is m = 2.
In the first measurement we use ak(0) = k coins from the kth bag. The total weight of the coins in this measurement is M = 31. This weight satisfies (1 + 2 + 3 + 4 + 5) · 2 + iX(i) = M, where X(i) is the unknown weight difference of the false coins. It means that iX(i) = 1, since all true coins would produce M = (1 + 2 + 3 + 4 + 5) · 2 = 30. If the false coins were in the first bag, their weight difference would be X(1) = 1/1 = 1; if they were in the second bag, then X(2) = 1/2; and so on, X(3) = 1/3, X(4) = 1/4, X(5) = 1/5. The false coins can be in any of the five bags.
Let us perform one more measurement, with ak(1) = k² coins from each bag. Assume that we get the total measured weight M = 113. It is equal to M = 2(1² + 2² + 3² + 4² + 5²) + i²X(i) = 113. Obviously, i²X(i) = 3. Again, if the false coins were in the first bag then X(1) = 3/1 = 3, the second bag would produce X(2) = 3/2² = 3/4, and so on, X(3) = 3/3² = 1/3, X(4) = 3/4² = 3/16, X(5) = 3/5² = 3/25.
The common solution for both sets is X(3) = 1/3. Thus, the false coins are in the third bag. Their weight difference from the true coins is 1/3.
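The two weighings of Example 13.1 can be reproduced with a short Python sketch (added here for illustration); the solver sees only the two measured totals and intersects the two candidate sets.

```python
import numpy as np

N, m = 5, 2                  # five bags, true coin weight m = 2
i_true, dm = 3, 1/3          # false coins in bag 3 with weight difference 1/3 (unknown to the solver)

k = np.arange(1, N + 1)
a0, a1 = k, k**2             # coins taken per bag in the first and second weighing

M0 = np.sum(a0 * m) + a0[i_true - 1] * dm    # measured total: 31
M1 = np.sum(a1 * m) + a1[i_true - 1] * dm    # measured total: 113

# Candidate X(i) from each weighing, assuming the false coins are in bag i
X0 = (M0 - np.sum(a0 * m)) / a0              # [1, 1/2, 1/3, 1/4, 1/5]
X1 = (M1 - np.sum(a1 * m)) / a1              # [3, 3/4, 1/3, 3/16, 3/25]

common = np.isclose(X0, X1)                  # the only common candidate
print("false coins in bag", k[common][0], "with weight difference", X0[common][0])
```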
The solution is unique if

ai(0) ak(1) − ai(1) ak(0) ≠ 0

for all i ≠ k. It also means that rank(A2) = 2 for all 2 × 2 submatrices, denoted by A2, of the measurement matrix A defined by (13.4).
In order to prove this statement, assume that two different solutions X(i) and X(k), for the case of one nonzero coefficient, satisfy the same measurement hyperplane equations (proof by contradiction). Then
ai(0) X(i) = ak(0) X(k)

and

ai(1) X(i) = ak(1) X(k).

Dividing these two equations results in

ai(0)/ai(1) = ak(0)/ak(1),

or ai(0) ak(1) − ai(1) ak(0) = 0. This is contrary to the assumption that ai(0) ak(1) − ai(1) ak(0) ≠ 0.
The same conclusion can be made considering the matrix form of the relations for X(i) and X(k). If both of them were to satisfy the same two measurements, then

[ y(0) ]   [ ai(0)  ak(0) ] [ X(i) ]
[ y(1) ] = [ ai(1)  ak(1) ] [  0   ]

and

[ y(0) ]   [ ai(0)  ak(0) ] [  0   ]
[ y(1) ] = [ ai(1)  ak(1) ] [ X(k) ].    (13.5)
Subtraction of the previous matrix equations results in

[ ai(0)  ak(0) ] [  X(i)  ]
[ ai(1)  ak(1) ] [ −X(k) ] = 0.
If ai(0) ak(1) − ai(1) ak(0) ≠ 0 is satisfied, then the trivial solution to the problem, X(i) = X(k) = 0, follows. Therefore, two different nonzero solutions X(i) and X(k) cannot exist in this case.
The previous experiment can be repeated assuming two nonzero values X(i) and X(k), Fig. 13.1 (second option). In the case of two nonzero elements in vector X, two measurements,

y(0) = ∑_{l=0}^{N−1} X(l) al(0) = X(i) ai(0) + X(k) ak(0)    (13.6)
y(1) = ∑_{l=0}^{N−1} X(l) al(1) = X(i) ai(1) + X(k) ak(1),
will result in X(i) and X(k) for any assumed i and k, i ≠ k, since they are the solution of a system of two equations with two unknowns. Therefore, with two measurements we cannot solve the problem and find the positions and the values of two nonzero coefficients. If two more measurements are performed, then an additional system of two equations,

y(2) = X(i) ai(2) + X(k) ak(2)    (13.7)
y(3) = X(i) ai(3) + X(k) ak(3),

is formed. The two systems of two equations, (13.6) and (13.7), can be solved to find X(i) and X(k) for each combination of i and k. If these two systems produce only one common solution pair X(i) and X(k), then this pair is the unique solution to our problem. As in the case of one nonzero coefficient, we may show that a sufficient condition for the unique solution is
det
[ ak1(0)  ak2(0)  ak3(0)  ak4(0) ]
[ ak1(1)  ak2(1)  ak3(1)  ak4(1) ]
[ ak1(2)  ak2(2)  ak3(2)  ak4(2) ]
[ ak1(3)  ak2(3)  ak3(3)  ak4(3) ]
≠ 0    (13.8)
for all combinations of k1, k2, k3, and k4, or rank(A4) = 4 for all A4, where A4 is a 4 × 4 submatrix of the measurement matrix A defined, in this case, as

A =
[ a0(0)  a1(0)  . . .  aN−1(0) ]
[ a0(1)  a1(1)  . . .  aN−1(1) ]
[ a0(2)  a1(2)  . . .  aN−1(2) ]
[ a0(3)  a1(3)  . . .  aN−1(3) ].    (13.9)
Suppose that (13.8) holds and that two pairs of solutions of the problem, X(k1), X(k2) and X(k3), X(k4), exist. Then

[ y(0) ]   [ ak1(0)  ak2(0)  ak3(0)  ak4(0) ] [ X(k1) ]
[ y(1) ] = [ ak1(1)  ak2(1)  ak3(1)  ak4(1) ] [ X(k2) ]
[ y(2) ]   [ ak1(2)  ak2(2)  ak3(2)  ak4(2) ] [   0   ]
[ y(3) ]   [ ak1(3)  ak2(3)  ak3(3)  ak4(3) ] [   0   ]

and

[ y(0) ]   [ ak1(0)  ak2(0)  ak3(0)  ak4(0) ] [   0   ]
[ y(1) ] = [ ak1(1)  ak2(1)  ak3(1)  ak4(1) ] [   0   ]
[ y(2) ]   [ ak1(2)  ak2(2)  ak3(2)  ak4(2) ] [ X(k3) ]
[ y(3) ]   [ ak1(3)  ak2(3)  ak3(3)  ak4(3) ] [ X(k4) ].
Subtracting these two systems we get

[ ak1(0)  ak2(0)  ak3(0)  ak4(0) ] [  X(k1) ]
[ ak1(1)  ak2(1)  ak3(1)  ak4(1) ] [  X(k2) ] = 0.
[ ak1(2)  ak2(2)  ak3(2)  ak4(2) ] [ −X(k3) ]
[ ak1(3)  ak2(3)  ak3(3)  ak4(3) ] [ −X(k4) ]
Since (13.8) holds, it follows that X(k1) = X(k2) = X(k3) = X(k4) = 0, meaning that two independent pairs of solutions with two nonzero coefficients cannot exist if (13.8) holds.
The presented approach to solving the problem (and to checking the solution uniqueness) is illustrative, but computationally infeasible. For example, in a simple case with N = 1024 and just two nonzero coefficients, we would have to solve the systems of equations (13.6) and (13.7) for each possible combination of i and k and to compare their solutions. The total number of combinations of two out of N indices is

(N choose 2) = N(N − 1)/2 ∼ 5 × 10⁵.

In order to check the solution uniqueness we should calculate a determinant value for all combinations of four indices k1, k2, k3, k4 out of the set of N indices. The number of determinants is (N choose 4) ∼ 10¹⁰. If one determinant of the fourth order is calculated in 10⁻⁵ seconds, then more than 5 days are needed to calculate all the determinants for this quite simple case of two nonzero coefficients.
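These counts are straightforward to reproduce (an added sketch; the 10⁻⁵-second figure per determinant is the assumption used in the text):

```python
from math import comb

N = 1024
print(comb(N, 2))                            # 523776, about 5e5 candidate index pairs
print(comb(N, 4))                            # about 4.6e10 fourth-order determinants to check
print(comb(N, 4) * 1e-5 / 86400, "days")     # more than 5 days at 1e-5 s per determinant
```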
As the second illustrative example, consider a signal described by a weighted sum of K harmonics from the set of possible discrete oscillatory functions e^{j2πkn/N}, k = 0, 1, 2, . . . , N − 1,

x(n) = ∑_{i=1}^{K} Ai e^{j2πki n/N},

with K ≪ N. This signal is sparse in the DFT domain. Its DFT X(k) assumes only a few nonzero values, at k = ki, i = 1, 2, . . . , K.
In classical signal processing, this signal is described by a full set of N signal samples/measurements x(n) at n = 0, 1, 2, . . . , N − 1.
However, if we know that the signal consists of only K ≪ N discrete oscillatory functions with
unknown amplitudes and frequency indices k i , then regardless of their frequencies, the signal can be
fully reconstructed from a reduced set of signal samples. As in the first illustrative example, a signal sample at an arbitrary instant n1 can be considered as a weighted measurement of the sparse coefficients X(k),

y(0) = x(n1) = ∑_{k=0}^{N−1} X(k) ψk(n1) = ∑_{k=0}^{N−1} X(k) ak(0),

with the weighting factors ψk(n1) = e^{j2πn1k/N}/N = ak(0). The previous relation is the inverse DFT.
Now an analysis similar to that in the first illustrative example can be performed, assuming, for example, K = 1 or K = 2. We can find the positions and the values of the nonzero coefficients X(k) using just a few signal samples/measurements y(i). When the nonzero coefficient positions and their values are recovered, the whole DFT, and hence the signal, is recovered.
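For K = 1, the search described above reduces to a few lines. The sketch below is an added illustration with arbitrarily chosen N, k, and sample positions: it recovers the position and the value of the single nonzero DFT coefficient from only two signal samples. The two sample positions are chosen so that their difference is coprime with N, which guarantees a unique consistent candidate.

```python
import numpy as np

N = 64
k_true, X_true = 17, 2.5                     # single nonzero DFT coefficient (unknown to the solver)
n = np.arange(N)
x = X_true * np.exp(1j * 2 * np.pi * k_true * n / N) / N   # inverse DFT of the 1-sparse X

n1, n2 = 3, 10                               # the only two available samples (measurements)
y0, y1 = x[n1], x[n2]

k = np.arange(N)
a0 = np.exp(1j * 2 * np.pi * n1 * k / N) / N # weights of the first measurement
a1 = np.exp(1j * 2 * np.pi * n2 * k / N) / N # weights of the second measurement

X_cand = y0 / a0                             # candidate X(k) consistent with the first sample
k_hat = np.argmin(np.abs(X_cand * a1 - y1))  # keep the one consistent with the second sample
print(k_hat, np.round(X_cand[k_hat], 10))    # 17 (2.5+0j)
```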
This model corresponds to many signals in real life. For example, in Doppler-radar systems the speed of a radar target is transformed into the frequency of a sinusoidal signal. Since the returned signal contains only one or just a few targets, the signal representing the target velocity is sparse in the DFT domain. It can be reconstructed from fewer samples than the total number N of radar return signal samples.
After the basic notions have been introduced through the illustrative examples in the previous section, here we provide formal definitions of the key concepts in sparse signal processing and compressive sensing.
13.2.1 Sparsity
A signal is sparse in a transformation domain if its transform X(k) has only a few nonzero values, that is, if

X(k) = 0

for k ∉ {k1, k2, . . . , kK} = K, where the sparsity support set K is a subset of all possible indices, and

‖X‖₀ = card{K} = K,

where card{K} is the notation for the number of elements in K. Counting the nonzero elements in a signal representation X can be achieved using the so-called ℓ0-norm

‖X‖₀ = ∑_{k=0}^{N−1} |X(k)|⁰.

This function is referred to as the ℓ0-norm (norm-zero) although it does not satisfy the norm properties (‖cX‖₀ = ‖X‖₀ ≠ c‖X‖₀ for an arbitrary constant c). By definition, |X(k)|⁰ = 0 for |X(k)| = 0 and |X(k)|⁰ = 1 for |X(k)| ≠ 0.
A signal is sparse in the considered transformation domain if card{K} = K ≪ N.
Example 13.2. Consider two sets of sparse numbers, X(k) and H(k), k = 0, 1, . . . , N − 1, written in vector notation as X and H. Show that the sparsity of the sum of these numbers is not greater than the sum of their individual sparsities,

‖H + X‖₀ ≤ ‖H‖₀ + ‖X‖₀.    (13.10)

⋆ Assume that the sparsity support of X is KX and the sparsity support of H is KH. We can distinguish the following cases:
- If KX ∩ KH = ∅, then the number of nonzero numbers in X(k) + H(k) is equal to the sum of the numbers of nonzero elements in X(k) and H(k), and ‖H + X‖₀ = ‖H‖₀ + ‖X‖₀.
- If KX ∩ KH ≠ ∅, then the number of nonzero numbers in X(k) + H(k) is always smaller than the sum of the numbers of nonzero elements in X(k) and H(k), because overlapping indices are counted once and some elements may cancel out. Then ‖H + X‖₀ < ‖H‖₀ + ‖X‖₀.
Inequality (13.10) follows from these two cases.
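A quick numerical confirmation of (13.10), an added sketch with arbitrary example vectors:

```python
import numpy as np

def l0(X):
    """Number of nonzero elements (the so-called l0-norm)."""
    return np.count_nonzero(X)

X = np.array([0, 0, 3.0, 0, -1.0, 0, 0, 0])
H = np.array([0, 2.0, -3.0, 0, 0, 0, 0, 0])   # overlaps X at k = 2 and cancels it there

print(l0(X), l0(H), l0(X + H))                # 2, 2, 2
print(l0(X + H) <= l0(X) + l0(H))             # True, as in (13.10)
```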
13.2.2 Measurements
A linear combination of the sparsity domain coefficients X(k),

y(n) = ∑_{k=0}^{N−1} ak(n) X(k),

is called a measurement, with the weighting coefficients (weights) denoted by ak(n). The measurements can be written in the form of a system of M equations,

[ y(0)   ]   [ a0(0)     a1(0)     . . .  aN−1(0)   ] [ X(0)   ]
[ y(1)   ] = [ a0(1)     a1(1)     . . .  aN−1(1)   ] [ X(1)   ]
[  ...   ]   [  ...       ...              ...      ] [  ...   ]
[ y(M−1) ]   [ a0(M−1)   a1(M−1)   . . .  aN−1(M−1) ] [ X(N−1) ]    (13.12)

or y = AX, where A is the M × N measurement matrix.
Figure 13.3 Principle of compressive sensing. The short and wide measurement matrix A maps the original
N-dimensional K-sparse vector, X, to an M-dimensional dense vector of measurements, y, with M < N and
K ≪ N. In our case N = 14, M = 7, and K = 2.
The fact that the signal is sparse, with X(k) = 0 for k ∉ {k1, k2, . . . , kK} = K, is not included in the measurement matrix A, since the positions of the nonzero values are unknown. If the knowledge that X(k) = 0 for k ∉ {k1, k2, . . . , kK} = K were included, then a reduced measurement matrix would be obtained as

[ y(0)   ]   [ ak1(0)     ak2(0)     . . .  akK(0)   ] [ X(k1) ]
[ y(1)   ] = [ ak1(1)     ak2(1)     . . .  akK(1)   ] [ X(k2) ]
[  ...   ]   [  ...        ...               ...     ] [  ...  ]
[ y(M−1) ]   [ ak1(M−1)   ak2(M−1)   . . .  akK(M−1) ] [ X(kK) ]    (13.13)

or

y = AK XK.
The M × K matrix AK would be formed if we knew the positions of the nonzero samples, k ∈ {k1, k2, . . . , kK} = K. It follows from the measurement matrix A by omitting the columns corresponding to the zero-valued elements X(k). The vector XK consists of the assumed nonzero elements X(k).
Assuming that there are K nonzero elements X(k), the total number of possible different matrices AK is equal to the number of combinations of K out of N positions, that is, (N choose K). This matrix will play an important role in the analysis that follows.
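The measurement relation y = AX and the reduced system y = AK XK can be illustrated with the dimensions of Fig. 13.3 (N = 14, M = 7, K = 2). The sketch below is an added illustration that assumes a random Gaussian measurement matrix; when the support is known, the K unknowns follow from the overdetermined system in the least-squares sense.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 14, 7, 2
support = np.array([3, 11])               # positions of the nonzero coefficients
X = np.zeros(N)
X[support] = [1.5, -0.8]

A = rng.standard_normal((M, N))           # M x N measurement matrix (random weights assumed)
y = A @ X                                 # M measurements of the K-sparse vector

# If the support were known, keep only the corresponding columns of A ...
AK = A[:, support]                        # M x K reduced measurement matrix
# ... and solve the overdetermined system y = AK XK in the least-squares sense
XK, *_ = np.linalg.lstsq(AK, y, rcond=None)
print(XK)                                 # [1.5, -0.8] recovered exactly (noise-free case)
```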
In signal processing, the sparsity domain is commonly one of the signal transformation domains. For a
linear signal transform X = Φx and its inverse transform x = ΨX, the signal samples are

x(n) = ∑_{k=0}^{N−1} X(k) ψk(n),
σ = 0.1 are assumed. The assumed threshold for considering hyperparameters extremely large is Th = 100. Hyperparameters above this threshold are omitted from the calculation (along with the corresponding values in X, A, D, and V). The results for the estimated mean value V in the first iteration are shown in Fig. 14.38(c), along with the values of the hyperparameters in Fig. 14.38(d). The hyperparameters whose value is above Th are omitted (pruned), along with the corresponding values at the same positions in all other matrices. The values of the remaining hyperparameters in the second iteration are shown in Fig. 14.38(e). After the elimination of the hyperparameters above the threshold, the third iteration is calculated with the remaining positions of the hyperparameters. In this iteration all hyperparameters, except those whose values are close to one, are eliminated, Fig. 14.38(f). The remaining positions after this iteration correspond to the positions of the nonzero elements X(ki), i = 1, 2, . . . , K, with the corresponding pruned matrices ΣK, AK, DK. The values of X(ki) are estimated using VK from

VK = ΣK AK^T y/σ² = (AK^T AK + σ² DK)⁻¹ AK^T y

in the final iteration. If the measurements were noise-free, this would be an exact recovery. The values of the estimated X(ki), i = 1, 2, . . . , K, are shown in Fig. 14.38(g). The diagonal values of ΣK are the variances of the estimates of X(ki).
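The final-iteration estimate can be illustrated with a small sketch (added here; the matrices are filled with arbitrary illustrative values rather than those of Fig. 14.38):

```python
import numpy as np

rng = np.random.default_rng(2)
M, K = 32, 3
sigma = 0.1

AK = rng.standard_normal((M, K))         # pruned measurement matrix (illustrative values)
X_true = np.array([1.0, -0.6, 0.4])      # remaining nonzero coefficients
y = AK @ X_true + sigma * rng.standard_normal(M)
DK = np.eye(K)                           # remaining hyperparameters, close to one after pruning

SigmaK = np.linalg.inv(AK.T @ AK / sigma**2 + DK)   # posterior covariance
VK = SigmaK @ AK.T @ y / sigma**2                   # equals (AK^T AK + sigma^2 DK)^(-1) AK^T y
print(VK)                                # close to X_true; diag(SigmaK) gives the variances
```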
About the Author
Ljubiša Stanković was born in Montenegro on June 1, 1960. He received a BSc degree in electrical engineering from the University of Montenegro in 1982, with an award as the best student at the University. As a student, he won several competitions in mathematics in Montenegro and the former Yugoslavia. He received an MSc degree in communications from the University of Belgrade and a PhD degree in the theory of electromagnetic waves from the University of Montenegro in 1988. As a Fulbright grantee, he spent the 1984-1985 academic year at the Worcester Polytechnic Institute, Worcester, Massachusetts. Since 1982, he has been on the faculty at the University of Montenegro, where he has been a full professor since 1995.
In 1997-1999, he was on leave at the Ruhr University Bochum, Germany, supported by the
Alexander von Humboldt Foundation. At the beginning of 2001, he was at the Technische Universiteit
Eindhoven.
From 2003 to 2008, Stanković was the rector of the University of Montenegro. He was the ambassador of Montenegro to the United Kingdom, Iceland, and Ireland from 2011 to 2015. During his stay in the United Kingdom he was a visiting academic at Imperial College London in 2013-2014.
His current interests are in signal processing. He has published about 500 technical papers, more than 170 of them in leading journals.
Stanković received the highest state award of Montenegro in 1997 for scientific achievements.
Stanković was an associate editor of the IEEE Transactions on Image Processing, an associate editor of
the IEEE Signal Processing Letters, an associate editor of the IEEE Transactions on Signal Processing,
and an associate editor of the IET Signal Processing.
Stanković is a member of the Editorial Board of Signal Processing (Elsevier), associate editor of
the CN Computer Sciences (Springer Nature), deputy editor of the IET Signal Processing, and a senior
area editor of the IEEE Transactions on Image Processing.
He has been a member of the National Academy of Sciences and Arts of Montenegro (CANU) since 1996 and its vice-president since 2015; he is also a member of the Academia Europaea and of the European Academy of Sciences and Arts. Stanković is a Fellow of the IEEE for contributions to time-frequency signal analysis.
Stanković (with coauthors) won the Best Paper Award of the European Association for Signal Processing (EURASIP) for 2017, for a paper published in Signal Processing.
For bibliographic data and copies of the published papers, see www.tfsa.ac.me.