Module 2
Module 2: Review of Random Variables and Random Processes Review of random variables –
both discrete and continuous. CDF and PDF, statistical averages. (Only definitions, computations
and significance) Entropy, differential entropy. Differential entropy of a Gaussian RV.
Conditional entropy, mutual information. Stochastic processes, Stationarity. Conditions for WSS
and SSS. Autocorrelation and power spectral density. LTI systems with WSS as input.
Review of random variables – both discrete and continuous.
The word random effectively means unpredictable.
An experiment is called a random experiment if its outcome cannot be predicted precisely in
advance. In engineering practice we may treat some signals as random to simplify the analysis.
The result of a random experiment is called an outcome or sample point.
Sample space: The domain of a random variable is a sample space, which is the collection of
all possible outcomes of a random experiment.
Random variable definition
A random variable is a real-valued function that maps every possible outcome of a random
experiment to a real number; it is defined over the sample space of the random experiment. We
use a capital letter X to denote the RV and a small letter s to denote a sample point, so the
value taken by the RV is written X(s).
It is a function which associates a unique numerical value with every outcome of an experiment.
Examples:
1. Transmission time of a message in a communication system.
2. Number of 1s arriving at a receiving station in a particular interval of time.
Some random experiments produce the outcome as a numerical quantity, e.g. (1) throwing a die,
(2) generating an NRZ digital signal.
Some random experiments produce the outcome as a non-numerical quantity, e.g. (1) tossing a coin.
Cumulative distribution function (CDF)
The cumulative distribution function of a random variable X is defined as
F_X(x) = P(X ≤ x) for all x ∈ ℝ
Properties of CDF
Let X be a random variable (either continuous or discrete), then the CDF of X has the following
properties:
(i) 0 ≤ F_X(x) ≤ 1
The maximum of the CDF is reached as x → ∞: F_X(∞) = 1.
The minimum of the CDF is reached as x → −∞: F_X(−∞) = 0.
(ii) The CDF is non-decreasing: if x₁ ≤ x₂, then F_X(x₁) ≤ F_X(x₂).
The advantage of the CDF is that it can be defined for any kind of random variable (discrete,
continuous, and mixed).
Example
• Tossing three coins simultaneously: define the RV X as the number of heads obtained
in each trial. Find the CDF.
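As a quick check, the PMF and CDF for this example can be computed by enumerating the
sample space. A minimal Python sketch (variable names are illustrative):

```python
from itertools import product

# All 2^3 equally likely outcomes of tossing three fair coins.
outcomes = list(product("HT", repeat=3))

# RV X maps each outcome to the number of heads it contains.
pmf = {}
for s in outcomes:
    x = s.count("H")
    pmf[x] = pmf.get(x, 0) + 1 / len(outcomes)

# CDF: F_X(x) = P(X <= x), accumulated over the sorted support.
cdf, running = {}, 0.0
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running

print(dict(sorted(pmf.items())))  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
print(cdf)                        # {0: 0.125, 1: 0.5, 2: 0.875, 3: 1.0}
```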
-----------------------------------------------------------------------------------------------------------
When X is continuous, we can ignore the endpoints of intervals while finding probabilities of
continuous random variables, so P(x₁ ≤ X ≤ x₂) = P(x₁ < X < x₂).
Definition 2:
The probability density function of the random variable X is defined as the derivative of the
distribution function:
f_X(x) = dF_X(x)/dx
Since P(x < X ≤ x + Δx) = F_X(x + Δx) − F_X(x),
f_X(x) = lim_{Δx→0} [F_X(x + Δx) − F_X(x)] / Δx = dF_X(x)/dx
Also F_X(x) = ∫_{−∞}^{x} f_X(u) du
∫_{−∞}^{∞} f_X(x) dx = 1
P(x < X ≤ x + dx) = f_X(x) dx
Properties of PDF
i) f_X(x) ≥ 0: f(x) should be non-negative for all values of the random variable.
ii) ∫_{−∞}^{∞} f_X(x) dx = 1: the area underneath f(x) should be equal to 1.
iii) P(x₁ ≤ X ≤ x₂) = ∫_{x₁}^{x₂} f_X(x) dx
-----------------------------------------------------------------------------------------------------------
Statistical averages: Statistical averages are numbers that summarize a group of numbers into
a single value. Here we study the four statistical averages and their definitions.
1. Mean
2. Mean square
3. Variance
4. Standard deviation
1. Mean
The first moment of the probability distribution of a RV X is called the mean or expected
value of the random variable, represented by μ or m₁.
The mean, or expected value, of a discrete random variable is
E[X] = Σᵢ xᵢ p(xᵢ)
For a continuous random variable, the n-th moment is
E[Xⁿ] = ∫_{−∞}^{∞} xⁿ f_X(x) dx
When n = 1, the first moment E[X] = ∫_{−∞}^{∞} x f_X(x) dx is the mean.
The mean may be negative or positive. It is also called the expected value of the random
variable, and it is the first moment (first-order moment) of the probability distribution of the RV.
2. Mean square
The mean square value of a RV is its second moment: E[X²] = Σᵢ xᵢ² p(xᵢ) for a discrete RV,
or E[X²] = ∫ x² f_X(x) dx for a continuous RV.
3. Variance
• Variance is the expected squared deviation of the RV from its mean:
Var(X) = σ² = E[(X − μ)²] = E[X²] − μ². The standard deviation is obtained from the variance
(it is its square root). The larger the variance, the more the data are scattered from the
mean; if the variance is small, the data are less scattered from the mean.
• It is a positive quantity that measures the spread of the distribution of the random variable
about its mean value.
• Larger values of the variance indicate that the distribution is more spread out.
Variance measures how far the data values are dispersed from the mean; standard deviation
expresses the amount of dispersion in the same units as the data values.
4. Standard deviation
Standard deviation measures the dispersion (how spread out the data are) of a dataset relative
to its mean. It is calculated as the square root of the variance.
A lower standard deviation means that most of the numbers are close to the average value
A higher standard deviation means that the numbers are spread out from the average value.
Standard deviation: σ = √(Variance) = √(σ²)
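The four averages above can be computed directly from a PMF. A minimal Python sketch
(the distribution itself is illustrative, not from the notes):

```python
import numpy as np

# Discrete RV: values and their probabilities (illustrative numbers).
x = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([0.4, 0.3, 0.2, 0.1])

mean        = np.sum(x * p)          # first moment  E[X]
mean_square = np.sum(x**2 * p)       # second moment E[X^2]
variance    = mean_square - mean**2  # Var(X) = E[X^2] - (E[X])^2
std_dev     = np.sqrt(variance)      # sigma = sqrt(Var(X))

print(mean, mean_square, variance, std_dev)  # 2.0  5.0  1.0  1.0
```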
-----------------------------------------------------------------------------------------------------
Entropy
Entropy can be defined as a measure of the average information content per source symbol.
H(X) = Σ_{i=1}^{K} pᵢ log(1/pᵢ) bits/symbol
The minimum average number of binary digits needed to specify a source output (message)
uniquely is called “SOURCE ENTROPY”
Entropy derivation
A communication system does not deal with a single message; it deals with all possible messages.
The discrete source generates messages:
Binary source: x ∈ {0, 1} with P(x = 0) = p and P(x = 1) = 1 − p.
M-ary source: x ∈ {1, 2, …, M} with ΣPᵢ = 1.
Example:
Discrete finite ensemble: a, b, c, d → 00, 01, 10, 11; there are four messages (symbols).
k binary digits specify 2ᵏ messages, so M messages need log₂M bits.
• The source generates a long sequence of N symbols.
• Assume an information source which emits K distinct symbols m₁, m₂, …, m_k, …, m_K with
probabilities p₁, p₂, …, p_K, respectively.
• Assume the source has delivered a statistically independent sequence of N symbols (N → ∞);
then symbol m_k occurs n_k = Np_k times.
• The occurrence of each m_k conveys information I(m_k) = −log p_k bits; likewise
I(m₁), I(m₂), …, I(m_K).
• n₁ symbols are of type m₁, and the information corresponding to them is n₁I(m₁).
• n₂ symbols are of type m₂, and the information corresponding to them is n₂I(m₂).
• In general, n_k symbols are of type m_k, contributing n_k I(m_k).
The total information associated with the generation of N symbols is
n₁I(m₁) + n₂I(m₂) + … + n_K I(m_K)
The entropy H of the source is defined as the average information per symbol:
H(X) = [n₁I(m₁) + n₂I(m₂) + … + n_K I(m_K)] / N
H(X) = (n₁/N) I(m₁) + (n₂/N) I(m₂) + … + (n_K/N) I(m_K)
H(X) = p₁I(m₁) + p₂I(m₂) + … + p_K I(m_K)
H(X) = p₁ log(1/p₁) + p₂ log(1/p₂) + … + p_K log(1/p_K)
H(X) = Σ_{i=1}^{K} pᵢ log(1/pᵢ) bits/symbol
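The formula translates directly into code. A small Python helper (the test distributions are
made up for illustration):

```python
import numpy as np

def entropy(p):
    """H(X) = sum_i p_i log2(1/p_i) in bits/symbol; zero-probability terms contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.sum(p * np.log2(1.0 / p)))

print(entropy([0.5, 0.5]))    # fair binary source: 1.0 bit/symbol
print(entropy([0.25] * 4))    # four equally likely symbols: 2.0 bits/symbol
print(entropy([1.0]))         # deterministic source: 0.0 bits/symbol
```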
---------------------------------------------------------------------------------------------------------
Differential entropy
The differential entropy of a continuous RV X with PDF f_X(x) is h(X) = −∫ f_X(x) log₂ f_X(x) dx.
For a Gaussian RV the PDF is
f_X(x) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}, where μ is the mean and σ² is the variance.
Step 1
log₂(1/f_X(x)) = −log₂ f_X(x) = −log₂[(1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}]   (1)
We know that log₂x = ln x · log₂e, so equation (1) becomes
log₂(1/f_X(x)) = log₂e · [ln √(2πσ²) + (x−μ)²/(2σ²)]
log₂(1/f_X(x)) = (1/2) log₂(2πσ²) + [(x−μ)²/(2σ²)] log₂e
Step 2
Substitute this in h(X):
h(X) = −∫ f_X(x) log₂ f_X(x) dx = ∫ f_X(x) log₂(1/f_X(x)) dx
h(X) = ∫ f_X(x) [(1/2) log₂(2πσ²) + ((x−μ)²/(2σ²)) log₂e] dx
h(X) = (1/2) log₂(2πσ²) ∫ f_X(x) dx + (log₂e / (2σ²)) ∫ (x−μ)² f_X(x) dx
where ∫ f_X(x) dx = 1 and Var(X) = E[(X−μ)²] = ∫ (x−μ)² f_X(x) dx = σ²
h(X) = (1/2) log₂(2πσ²) + (log₂e / (2σ²)) σ² = (1/2) log₂(2πσ²) + (1/2) log₂e
h(X) = (1/2) log₂(2πeσ²)
So the differential entropy of a Gaussian random variable is h(X) = (1/2) log₂(2πeσ²).
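The closed form can be checked numerically: since h(X) = E[log₂(1/f_X(X))], a Monte Carlo
average over samples of X should approach (1/2) log₂(2πeσ²). A Python sketch (σ = 2 is an
arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 2.0

# Closed form: h(X) = 0.5 * log2(2*pi*e*sigma^2)
h_closed = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)

# Monte Carlo: h(X) = E[log2(1/f_X(X))], averaged over samples of X.
x = rng.normal(mu, sigma, 1_000_000)
f = np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
h_mc = np.mean(np.log2(1.0 / f))

print(h_closed, h_mc)  # both approximately 3.047 bits for sigma = 2
```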
-------------------------------------------------------------------------------------------------
Conditional Entropy
Conditional entropy measures the uncertainty of a variable Y when another variable X is
known. Conditional entropy quantifies the amount of information needed to describe the
outcome of a random variable Y given that the value of another random variable X is known.
A low conditional entropy indicates a strong dependency between the two variables, meaning
knowing the value of one significantly reduces the uncertainty about the other.
A discrete memoryless channel (DMC) is a statistical model that has discrete inputs and
outputs, and is characterized by a transition probability distribution from a single input symbol
to a single output symbol. The channel's current output symbol is dependent only on the current
input symbol, and not on any previous input symbols.
Note that
H(X) is the amount of uncertainty about the channel input before observing the channel output,
and
H(X|Y) is the amount of uncertainty about the channel input after observing the channel output.
The entropy of the channel input symbol X is
H(X) = Σ_{i=1}^{n} p(xᵢ) log(1/p(xᵢ)) bits/symbol   (1)
The entropy of the channel output symbol Y is
H(Y) = Σ_{j=1}^{m} p(y_j) log(1/p(y_j)) bits/symbol   (2)
The marginals satisfy
Σ_{j=1}^{m} p(xᵢ, y_j) = p(xᵢ)   (3)
Σ_{i=1}^{n} p(xᵢ, y_j) = p(y_j)   (4)
so that
H(X) = Σ_{i=1}^{n} Σ_{j=1}^{m} p(xᵢ, y_j) log(1/p(xᵢ))   (5)
H(Y) = Σ_{j=1}^{m} Σ_{i=1}^{n} p(xᵢ, y_j) log(1/p(y_j))   (6)
Conditional Entropy
We find the entropy of the input symbols after observing the channel output, i.e. Y = y_j,
where Y can take any value from the symbols y₁, y₂, …, y_m:
H(X | Y = y_j) = Σ_{i=1}^{n} p(xᵢ | y_j) log(1/p(xᵢ | y_j))   (7)
This gives one value for each output symbol: H(X | Y = y₁), H(X | Y = y₂), …, H(X | Y = y_m).
The average amount of uncertainty about the input X given the output Y is their weighted average:
H(X|Y) = Σ_{j=1}^{m} p(y_j) H(X | Y = y_j) = Σ_{i=1}^{n} Σ_{j=1}^{m} p(xᵢ | y_j) p(y_j) log(1/p(xᵢ | y_j))
where p(xᵢ | y_j) p(y_j) = p(xᵢ, y_j), so
H(X|Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} p(xᵢ, y_j) log(1/p(xᵢ | y_j))   (8)
Similarly,
H(Y|X) = Σ_{i=1}^{n} Σ_{j=1}^{m} p(xᵢ, y_j) log(1/p(y_j | xᵢ))
In both cases, the conditional entropy is a double summation of the joint probability times the
logarithm of the reciprocal of the conditional probability of interest.
----------------------------------------------------------------------------------------------------
Mutual Information
• It is the measure of the amount of information that one RV contains about another RV.
Reduction in the uncertainty of one random variable due to the knowledge of the other.
I(X;Y) = Σ_{i=1}^{n} Σ_{j=1}^{m} p(xᵢ, y_j) log [ p(xᵢ, y_j) / (p(xᵢ) p(y_j)) ]
• Relationship between entropy and mutual information
I(X;Y) = H(Y) - H(Y|X) = H(X) - H(X|Y)
• Mutual information is also used to measure the similarity of two different clusterings of a
dataset (here, of the channel input and output).
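These relationships are easy to verify numerically from a joint distribution. The sketch below
uses a made-up joint PMF (a binary symmetric channel with crossover probability 0.1 and
equiprobable inputs) and checks that I(X;Y) = H(X) − H(X|Y):

```python
import numpy as np

# Illustrative joint pmf p(x_i, y_j): rows index x, columns index y.
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])

p_x = p_xy.sum(axis=1)  # marginal p(x_i), eq. (3)
p_y = p_xy.sum(axis=0)  # marginal p(y_j), eq. (4)

H_x = -np.sum(p_x * np.log2(p_x))                      # H(X)
H_x_given_y = -np.sum(p_xy * np.log2(p_xy / p_y))      # H(X|Y), eq. (8)
I = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))  # I(X;Y)

print(H_x, H_x_given_y, I)               # 1.0, ~0.469, ~0.531
print(np.isclose(I, H_x - H_x_given_y))  # True
```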
Stochastic processes, Stationarity. Conditions for WSS and SSS. Autocorrelation and power
spectral density. LTI systems with WSS as input.
Classification of random processes: Random processes are mainly classified into four types
based on whether time t and the random variable X are continuous or discrete, as follows.
1. Continuous Random Process: A random process is said to be continuous if both the
random variable X and time t are continuous. The fluctuation of noise voltage in any network
is a continuous random process.
2. Discrete Random Process: In a discrete random process, the random variable X takes only
discrete values while time t is continuous. A digitally encoded signal has only two discrete
values, a positive level and a negative level, but time is continuous, so it is a discrete
random process.
3. Continuous Random Sequence: In a continuous random sequence, the random variable X is
continuous but time t takes only discrete values; it can be obtained by sampling a continuous
random process at discrete instants.
4. Discrete Random Sequence: In a discrete random sequence, both the random variable X and
time t are discrete. It can be obtained by sampling and quantizing a random signal, and it is
the form mostly used in digital signal processing applications. The amplitude of the sequence
can be quantized into two levels or multiple levels.
Stationary Processes: A random process is said to be stationary if all its statistical properties
such as mean, moments, variances etc… do not change with time. The stationarity which
depends on the density functions has different levels or orders.
1. First order stationary process: A random process is said to be stationary to order one or
first order stationary if its first order density function does not change with time or shift in
time value.
If X(t) is a first order stationary process, then f_X(x₁; t₁) = f_X(x₁; t₁ + Δt) for any time t₁,
where Δt is a shift in time. Therefore the condition for a process to be a first order
stationary random process is that its mean value must be constant at every time instant,
i.e. E[X(t)] = X̄ = constant.
2. Second order stationary process: A random process is said to be stationary to order two
or second order stationary if its second order joint density function does not change with
time or shift in time value i.e.
fX(x1, x2 ; t1, t2) = fX(x1, x2;t1+∆t, t2+∆t) for all t1,t2 and ∆t.
It is a function of the time difference (t₂ − t₁) and not of the absolute time t. Note that a
second order stationary process is also a first order stationary process.
The condition for a process to be a second order stationary is that
i) Its autocorrelation should depend only on the time difference and not on absolute time,
i.e. if R_XX(t₁, t₂) = E[X(t₁) X(t₂)] is the autocorrelation function and τ = t₂ − t₁, then
R_XX(t₁, t₁ + τ) = E[X(t₁) X(t₁ + τ)] = R_XX(τ); R_XX(τ) should be independent of absolute time.
In other words, for samples X(t₁) and X(t₂) the joint statistics are unchanged by an arbitrary
time shift Δt: absolute time does not affect these functions; they depend only on the time
difference between the two sampling instants.
Autocorrelation:
Definition.
The autocorrelation function provides a measure of similarity between two observations of the
random process X(t): a signal and its time-delayed version. The autocorrelation function is
denoted R_XX(t₁, t₂). It is the correlation of a signal with a delayed copy of itself, as a
function of the delay τ. The "auto" in autocorrelation refers to the correlation of the process
with itself.
For a stationary process, the autocorrelation function of X(t) depends solely on the difference
between the two times at which the process is sampled.
We define the autocorrelation function of the stochastic process X(t) as the expectation of the
product of two random variables, X(t₁) and X(t₂), obtained by sampling the process X(t) at
times t₁ and t₂ respectively: R_XX(t₁, t₂) = E[X(t₁) X(t₂)].
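In practice, the ensemble average E[X(t) X(t+τ)] is often estimated from data. A minimal Python
estimator (the biased sample ACF; white noise is used only as an easy test case):

```python
import numpy as np

def sample_acf(x, max_lag):
    # Biased estimate of R_X(tau) = E[X(t) X(t+tau)] from one realization.
    n = len(x)
    return np.array([np.sum(x[:n - k] * x[k:]) / n for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)  # white noise: R_X(0) = 1, R_X(tau) ~ 0 for tau != 0

print(sample_acf(x, 3))  # approximately [1, 0, 0, 0]
```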
Problem 1
A discrete memoryless source X has four symbols x₁, x₂, x₃, x₄ with probabilities
p(x₁) = 0.4, p(x₂) = 0.3, p(x₃) = 0.2, p(x₄) = 0.1. Calculate H(X).
H(X) = Σ_{i=1}^{4} pᵢ log(1/pᵢ) = −Σ_{i=1}^{4} pᵢ log₂ pᵢ
H(X) = −[0.4 log₂0.4 + 0.3 log₂0.3 + 0.2 log₂0.2 + 0.1 log₂0.1]
H(X) = 1.85 bits/symbol
Problem 2
Find the differential entropy h(X) of a continuous random variable X that is uniformly
distributed in the interval [0, 8].
The PDF of X is
f_X(x) = C for 0 ≤ x ≤ 8, and 0 elsewhere.
Since X is uniformly distributed over an interval of length 8 − 0 = 8, C = 1/8, so
f_X(x) = 1/8 on [0, 8].
h(X) = ∫ f_X(x) log₂(1/f_X(x)) dx = ∫₀⁸ (1/8) log₂8 dx = (1/8) log₂8 [x]₀⁸ = log₂8 = 3 bits
Problem 3
Find the value of a for which
f_X(x) = a e^{−0.2x} for x ≥ 0, and 0 elsewhere
is a valid probability density function of X.
We know ∫ f_X(x) dx = 1, so
a ∫₀^∞ e^{−0.2x} dx = a [−e^{−0.2x}/0.2]₀^∞ = a (0 + 1/0.2) = a/0.2 = 1
a = 0.2
Problem 4
Find c for which
f_X(x) = c x e^{−x} for 0 ≤ x < ∞
is a valid probability density function of a continuous random variable X.
We know ∫ f_X(x) dx = 1, so
c ∫₀^∞ x e^{−x} dx = c [−x e^{−x} − e^{−x}]₀^∞ = c (0 + 1) = c = 1
c = 1
Problem 5
A continuous RV has a PDF f_X(x) = k x² e^{−x}, 0 ≤ x < ∞. Find the value of k, the mean and
the variance.
We know ∫ f_X(x) dx = 1, so
k ∫₀^∞ x² e^{−x} dx = 2k = 1, so k = 1/2
Mean:
X̄ = ∫ x f_X(x) dx = (1/2) ∫₀^∞ x³ e^{−x} dx = (1/2) [−(x³ + 3x² + 6x + 6) e^{−x}]₀^∞ = (1/2)(6) = 3
Mean square:
E[X²] = (1/2) ∫₀^∞ x⁴ e^{−x} dx = (1/2)(24) = 12
Variance:
σ² = E[(X − X̄)²] = E[X²] − X̄² = 12 − 3² = 3
Problem 6
The PDF of a random variable is uniformly distributed over the interval (2, 10). Calculate the
variance.
f_X(x) = 1/8 for 2 ≤ x ≤ 10, and 0 elsewhere
Var(X) = E[X²] − (E[X])²
where E[X] = ∫ x f(x) dx = (1/8) ∫₂¹⁰ x dx = (1/8) [x²/2]₂¹⁰ = (100 − 4)/16 = 6
and E[X²] = ∫ x² f(x) dx = (1/8) ∫₂¹⁰ x² dx = (1/8) [x³/3]₂¹⁰ = (1000 − 8)/24 = 124/3
So Var(X) = E[X²] − (E[X])² = 124/3 − 36 = 16/3
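A quick Monte Carlo check of this result (and of the general uniform-variance formula
(b − a)²/12, which gives the same 16/3 ≈ 5.33):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(2, 10, 1_000_000)  # X ~ Uniform(2, 10)

print(x.mean(), x.var())  # approximately 6 and 16/3 = 5.333...
print((10 - 2)**2 / 12)   # closed form (b - a)^2 / 12 = 5.333...
```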
Problem 7
Let the random variable X be the output of a source that is uniformly distributed with size N.
Find its entropy.
If the source is uniformly distributed with size N, then p_k = 1/N for k = 1, 2, …, N.
H(X) = −Σ_{i=1}^{N} pᵢ log₂ pᵢ = −Σ_{i=1}^{N} (1/N) log₂(1/N) = −N · (1/N) log₂(1/N) = log₂N
Problem 8
f_X(u) = sin(u) for 0 ≤ u ≤ A, and 0 elsewhere.
What is the value of the constant A? What is the corresponding CDF F_X(c)? What is E[X]?
What is Var(X)?
We know ∫ f_X(u) du = 1:
∫₀^A sin(u) du = [−cos(u)]₀^A = 1 − cos(A) = 1, so cos(A) = 0 and A = π/2
CDF: F_X(c) = ∫₀^c sin(u) du = 1 − cos(c) for 0 ≤ c ≤ π/2 (F_X(c) = 0 for c < 0 and 1 for c > π/2)
E[X] = ∫₀^{π/2} u sin(u) du = [−u cos(u)]₀^{π/2} + ∫₀^{π/2} cos(u) du = 0 + [sin(u)]₀^{π/2} = 1
E[X²] = ∫₀^{π/2} u² sin(u) du = [−u² cos(u)]₀^{π/2} + 2 ∫₀^{π/2} u cos(u) du
= 0 + 2([u sin(u)]₀^{π/2} − ∫₀^{π/2} sin(u) du) = 2(π/2 − 1) = π − 2
Var(X) = E[X²] − (E[X])² = (π − 2) − 1 = π − 3
Problem 9
Determine the ACF R_X(t₁, t₂) for the random process X(t) = A cos(2πf_c t + Θ), where A and f_c
are constants and Θ is uniformly distributed over the interval (−π, π), that is
f(θ) = 1/(2π) for −π ≤ θ ≤ π, and 0 elsewhere.
Check whether the given random process is WSS.
R_X(t₁, t₂) = E[X(t₁) X(t₂)]
R_X(t₁, t₂) = E[A cos(2πf_c t₁ + Θ) · A cos(2πf_c t₂ + Θ)]
= (A²/2) E[cos(2πf_c(t₁ + t₂) + 2Θ)] + (A²/2) E[cos(2πf_c(t₁ − t₂))]
The first term vanishes, since
(1/2π) ∫_{−π}^{π} cos(2πf_c(t₁ + t₂) + 2θ) dθ = 0
so
R_X(τ) = (A²/2) cos(2πf_c(t₁ − t₂)) = (A²/2) cos(2πf_c τ), where τ = t₁ − t₂   (result 1)
Mean:
X̄ = E[X(t)] = ∫ x f_x(x) dx = (1/2π) ∫_{−π}^{π} A cos(2πf_c t + θ) dθ = 0   (result 2)
From results 1 and 2 (a constant mean and an ACF that depends only on the time difference τ),
the given random process X(t) is WSS.
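Result 1 can also be checked by simulation: average X(t₁) X(t₂) over an ensemble of
realizations, each with its own random phase. A Python sketch (amplitude, carrier and sampling
rate are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
A, fc, fs = 2.0, 5.0, 1000.0  # amplitude, carrier (Hz), sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)

# Ensemble of realizations, each with an independent phase Theta ~ Uniform(-pi, pi).
theta = rng.uniform(-np.pi, np.pi, size=(2000, 1))
X = A * np.cos(2 * np.pi * fc * t + theta)

# Ensemble/time average of X(t) X(t + tau) at a fixed lag tau = k / fs.
k = 25  # tau = 0.025 s
R_est = np.mean(X[:, :-k] * X[:, k:])
R_theory = (A**2 / 2) * np.cos(2 * np.pi * fc * k / fs)

print(R_est, R_theory)  # both approximately 1.414
```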
Recall that any function x(t) can be represented in terms of the Dirac delta function as follows:
x(t) = ∫_{−∞}^{∞} x(s) δ(t − s) ds
For an LTI system, Y(t) = X(t) * h(t) = h(t) * X(t).
Taking the Fourier transform, Y(ω) = X(ω) × H(ω), where
H(ω) = FT{h(t)} = ∫_{−∞}^{∞} h(t) e^{−jωt} dt
The response of a linear time-invariant (LTI) system to a wide-sense stationary (WSS) input
is also a WSS random process. The output of the system is the convolution of the input and
the impulse response of the system
Consider an LTI system with impulse response h(t). Suppose {X(t)} is a WSS process
input to the system. The output {Y(t)} of the system is given by the convolution
Y(t) = ∫_{−∞}^{∞} h(s) X(t − s) ds
Introduction: Power Spectral Density also known as PSD is a fundamental concept used
in signal processing to measure how the average power or the strength of the signal is
distributed across different frequency components.
Definition:
The distribution of the average power of a signal x(t) in the frequency domain is called the
power spectral density (PSD), power density (PD) or power density spectrum. The PSD is denoted
S_X(f) and is given by
S_X(f) = ∫_{−∞}^{∞} R_X(τ) e^{−j2πfτ} dτ
That is, the power spectral density S_X(f) of a stationary random process {X(t)} is the Fourier
transform of the autocorrelation function R_X(τ); the autocorrelation function and the power
spectral density are a Fourier pair:
R_X(τ) ⟷ S_X(f)
In an LTI system, X(t) is the input, Y(t) is the output and h(t) is the impulse response of the
LTI system:
Y(t) = X(t) * h(t)
Mean of the output: μ_Y = μ_X H(0)
Autocorrelation of the output:
R_Y(τ) = ∫∫ h(τ₁) h(τ₂) R_X(τ + τ₁ − τ₂) dτ₁ dτ₂
We know that
the autocorrelation function and the power spectral density are a Fourier pair, and
the impulse response and the frequency response are a Fourier pair:
R_X(τ) ⟷ S_X(f)
h(t) ⟷ H(f)
Setting τ = 0 (so that the argument of R_X becomes τ₁ − τ₂), the average output power is
E[Y²(t)] = R_Y(0) = ∫∫ h(τ₁) h(τ₂) R_X(τ₁ − τ₂) dτ₁ dτ₂
Expressing R_X through its Fourier transform and separating the integrals over τ₁ and τ₂ into
H(f) and its conjugate gives
E[Y²(t)] = R_Y(0) = ∫_{−∞}^{∞} |H(f)|² S_X(f) df
where S_X(f) = ∫ R_X(τ) e^{−j2πfτ} dτ is the power spectral density. Equivalently,
S_Y(f) = |H(f)|² S_X(f).
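The relation S_Y(f) = |H(f)|² S_X(f) can be verified numerically by filtering white noise
(flat S_X) through a short FIR filter and averaging periodograms of the output. A Python sketch
(the impulse response is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(4)

# White WSS input: S_X(f) = sigma^2 = 1 (flat PSD).
n_seg, seg_len = 4000, 256
x = rng.standard_normal((n_seg, seg_len))

h = np.array([1.0, 0.5, 0.25])  # illustrative FIR impulse response
y = np.array([np.convolve(s, h)[:seg_len] for s in x])  # Y = X * h

# Averaged periodogram as a PSD estimate of the output.
S_y = np.mean(np.abs(np.fft.rfft(y, axis=1))**2, axis=0) / seg_len

# Theory: S_Y(f) = |H(f)|^2 S_X(f), with S_X(f) = 1 here.
H = np.fft.rfft(h, seg_len)
print(np.max(np.abs(S_y - np.abs(H)**2) / np.abs(H)**2))  # small (a few percent)
```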
--------------------------------------------------------------------------------------------------------
Problem
Given the random process X(t) = A e^{−at}, where a is a constant and A is a uniform random
variable over 0 to 1, find the mean and autocorrelation of X(t).
Since A is uniformly distributed on (0, 1),
f_A(α) = 1 for 0 ≤ α ≤ 1, and 0 otherwise
Mean = E[X(t)] = E[A e^{−at}]
Here A is the only random quantity, so
Mean = e^{−at} E[A], where E[A] = ∫₀¹ α · 1 dα = [α²/2]₀¹ = 1/2
Mean = e^{−at}/2
Autocorrelation:
R_XX(t₁, t₂) = E[X(t₁) X(t₂)]; put t₁ = t and t₂ = t + τ:
R_XX(t, t + τ) = E[A e^{−at} · A e^{−a(t+τ)}] = e^{−2at} e^{−aτ} E[A²]
E[A²] = ∫₀¹ α² · 1 dα = [α³/3]₀¹ = 1/3
R_XX(t, t + τ) = (1/3) e^{−2at} e^{−aτ}
--------------------------------------------------------------------------------------------------------------------------------------------