Comm 05 Random Variables and Processes
◊ 5.1 Introduction
◊ 5.2 Probability
◊ 5.3 Random Variables
◊ 5.4 Statistical Averages
◊ 5.5 Random Processes
◊ 5.6 Mean, Correlation and Covariance Functions
◊ 5.7 Transmission of a Random Process through a Linear Filter
◊ 5.8 Power Spectral Density
◊ 5.9 Gaussian Process
◊ 5.10 Noise
◊ 5.11 Narrowband Noise
5.1 Introduction
5.2 Probability
◊ Probability theory is rooted in phenomena that, explicitly or
implicitly, can be modeled by an experiment with an outcome that is
subject to chance.
◊ Example: Experiment may be the observation of the result of
tossing a fair coin. In this experiment, the possible outcomes of a
trial are “heads” or “tails”.
◊ If an experiment has K possible outcomes, then for the kth possible
outcome we have a point called the sample point, which we denote
by s_k. With this basic framework, we make the following definitions:
◊ The set of all possible outcomes of the experiment is called the
sample space, which we denote by S.
5.2 Probability
◊ A single sample point is called an elementary event.
◊ The entire sample space S is called the sure event; and the null set ∅
is called the null or impossible event.
5.2 Probability
◊ A probability measure P assigns a probability P[A] to every event A and
satisfies the following axioms:
1. $0 \le P[A] \le 1$
2. $P[S] = 1$
3. If A and B are two mutually exclusive events, then $P[A \cup B] = P[A] + P[B]$
◊ The following properties of probability measure P may be derived
from the above axioms:
1. $P[\bar{A}] = 1 - P[A]$  (5.4)
2. When events A and B are not mutually exclusive:
   $P[A \cup B] = P[A] + P[B] - P[A \cap B]$  (5.5)
3. If $A_1, A_2, \ldots, A_m$ are mutually exclusive events that include all
   possible outcomes of the random experiment, then
   $P[A_1] + P[A_2] + \cdots + P[A_m] = 1$  (5.6)
5.2 Probability
◊ Let P[B|A] denote the probability of event B, given that event A has
occurred. The probability P[B|A] is called the conditional
probability of B given A.
◊ P[B|A] is defined by
  $P[B|A] = \dfrac{P[A \cap B]}{P[A]}$  (5.7)
◊ Bayes’ rule
◊ We may write Eq.(5.7) as P[A∩B] = P[B|A]P[A] (5.8)
◊ It is apparent that we may also write P[A∩B] = P[A|B]P[B] (5.9)
◊ From Eqs.(5.8) and (5.9), provided P[A] ≠ 0, we may determine P[B|A] by
using the relation
  $P[B|A] = \dfrac{P[A|B]\,P[B]}{P[A]}$  (5.10)
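◊ A minimal numerical sketch of Eq. (5.10); the values of P[A|B], P[B], and P[A] below are assumptions chosen only for illustration, not taken from the text.

```python
# Hypothetical numbers to illustrate Bayes' rule, Eq. (5.10):
# P[B|A] = P[A|B] P[B] / P[A]
p_A_given_B = 0.9   # assumed P[A|B]
p_B = 0.2           # assumed prior P[B]
p_A = 0.25          # assumed P[A] (must be nonzero)

p_B_given_A = p_A_given_B * p_B / p_A
print(p_B_given_A)  # 0.72
```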
5.2 Conditional Probability
◊ Suppose that the conditional probability P[B|A] is simply equal to the
elementary probability of occurrence of event B, that is,
  $P[B|A] = P[B]$
so that, by (5.8), $P[A \cap B] = P[A]\,P[B]$ and
  $P[A|B] = \dfrac{P[A \cap B]}{P[B]} = \dfrac{P[A]\,P[B]}{P[B]} = P[A]$  (5.13)
◊ Events A and B that satisfy this condition are said to be
statistically independent.
5.2 Conditional Probability
◊ Example 5.1 Binary Symmetric Channel
◊ This channel is said to be discrete in that it is designed to handle
discrete messages.
◊ The channel is memoryless in the sense that the channel output at
any time depends only on the channel input at that time.
◊ The channel is symmetric, which means that the probability of
receiving symbol 1 when symbol 0 is sent is the same as the probability
of receiving symbol 0 when symbol 1 is sent.
5.2 Conditional Probability
◊ Let $A_0$ and $A_1$ denote the events that symbol 0 or symbol 1 is sent (with a
priori probabilities $p_0$ and $p_1$), and let $B_0$ and $B_1$ denote the events that
symbol 0 or symbol 1 is received. The crossover (error) probability is
  $P[B_1|A_0] = P[B_0|A_1] = p$
◊ The probability of receiving symbol 0 is given by:
  $P[B_0] = P[B_0|A_0]P[A_0] + P[B_0|A_1]P[A_1] = (1-p)\,p_0 + p\,p_1$
◊ The probability of receiving symbol 1 is given by:
  $P[B_1] = P[B_1|A_0]P[A_0] + P[B_1|A_1]P[A_1] = p\,p_0 + (1-p)\,p_1$
5.2 Conditional Probability
◊ Applying Bayes' rule, the a posteriori probabilities are:
  $P[A_0|B_0] = \dfrac{P[B_0|A_0]P[A_0]}{P[B_0]} = \dfrac{(1-p)\,p_0}{(1-p)\,p_0 + p\,p_1}$
  $P[A_1|B_1] = \dfrac{P[B_1|A_1]P[A_1]}{P[B_1]} = \dfrac{(1-p)\,p_1}{p\,p_0 + (1-p)\,p_1}$
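◊ A short sketch evaluating the receive probabilities and a posteriori probabilities above; the values p = 0.1 and p0 = p1 = 0.5 are assumed for illustration only.

```python
# Binary symmetric channel: crossover probability p, priors p0 (send 0) and p1 (send 1).
p, p0, p1 = 0.1, 0.5, 0.5              # assumed example values

P_B0 = (1 - p) * p0 + p * p1           # probability of receiving 0
P_B1 = p * p0 + (1 - p) * p1           # probability of receiving 1

P_A0_given_B0 = (1 - p) * p0 / P_B0    # posterior that 0 was sent, given 0 received
P_A1_given_B1 = (1 - p) * p1 / P_B1    # posterior that 1 was sent, given 1 received

print(P_B0, P_B1, P_A0_given_B0, P_A1_given_B1)
```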
5.3 Random Variables
◊ We denote the random variable as X(s) or just X.
◊ X is a function, s is the outcome of the experiment.
◊ Random variable may be discrete or continuous.
◊ Consider the random variable X and the probability of the event
X ≤ x. We denote this probability by P[X ≤ x].
◊ To simplify our notation, we write
  $F_X(x) = P[X \le x]$  (5.15)
◊ The function FX(x) is called the cumulative distribution
function (cdf) or simply the distribution function of the random
variable X.
◊ The distribution function FX(x) has the following properties:
1. $0 \le F_X(x) \le 1$
2. $F_X(x_1) \le F_X(x_2)$ if $x_1 \le x_2$
5.3 Random Variables
◊ If the distribution function is continuously differentiable, then
  $f_X(x) = \dfrac{d}{dx}F_X(x)$  (5.17)
◊ fX(x) is called the probability density function (pdf) of the random
variable X.
◊ Probability of the event $x_1 < X \le x_2$ equals
  $P[x_1 < X \le x_2] = P[X \le x_2] - P[X \le x_1] = F_X(x_2) - F_X(x_1)$
◊ Since $F_X(x) = \int_{-\infty}^{x} f_X(\xi)\,d\xi$, it follows that
  $P[x_1 < X \le x_2] = \int_{x_1}^{x_2} f_X(x)\,dx$  (5.19)
◊ Example: a random variable X uniformly distributed over the interval (a, b] has the probability density function
  $f_X(x) = \begin{cases} 0, & x \le a \\ \dfrac{1}{b-a}, & a < x \le b \\ 0, & x > b \end{cases}$
and the corresponding distribution function
  $F_X(x) = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a < x \le b \\ 1, & x > b \end{cases}$
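◊ A minimal numerical check of this pdf/cdf pair, assuming example endpoints a = 1 and b = 3 (the values are not from the text): integrating the pdf reproduces the closed-form cdf.

```python
import numpy as np

a, b = 1.0, 3.0                                   # assumed example endpoints
x = np.linspace(a - 1, b + 1, 2001)

pdf = np.where((x > a) & (x <= b), 1.0 / (b - a), 0.0)
cdf_numeric = np.cumsum(pdf) * (x[1] - x[0])      # F_X(x) as the running integral of f_X
cdf_closed = np.clip((x - a) / (b - a), 0.0, 1.0)

print(np.max(np.abs(cdf_numeric - cdf_closed)))   # small (discretization error only)
```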
5.3 Random Variables
◊ The conditional probability density function of Y given that X = x is defined by
  $f_Y(y|x) = \dfrac{f_{X,Y}(x,y)}{f_X(x)}$  (5.28)
5.3 Random Variables
◊ If the random variables X and Y are statistically independent, then $f_Y(y|x) = f_Y(y)$ and, by (5.28),
  $f_{X,Y}(x,y) = f_X(x)\,f_Y(y)$  (5.32)
  $P[X \in A,\ Y \in B] = P[X \in A]\,P[Y \in B]$  (5.33)
5.3 Random Variables
◊ Example 5.3 Binomial Random Variable
◊ Consider a sequence of coin-tossing experiments where the
probability of a head is p and let Xn be the Bernoulli random
variable representing the outcome of the nth toss.
◊ Let Y be the number of heads that occur in N tosses of the coin:
  $Y = \sum_{n=1}^{N} X_n$
  $P[Y = y] = \binom{N}{y} p^{y} (1-p)^{N-y}$
  where
  $\binom{N}{y} = \dfrac{N!}{y!\,(N-y)!}$
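◊ A quick simulation comparing the relative frequency of Y with the binomial probability above; the parameters N = 10, p = 0.3, and the trial count are assumptions for illustration only.

```python
import numpy as np
from math import comb

N, p, trials = 10, 0.3, 100_000      # assumed example parameters
rng = np.random.default_rng(0)

# Y = number of heads in N Bernoulli(p) tosses, repeated many times
Y = rng.binomial(1, p, size=(trials, N)).sum(axis=1)

y = 3
empirical = np.mean(Y == y)
theoretical = comb(N, y) * p**y * (1 - p)**(N - y)
print(empirical, theoretical)        # the two values should be close
```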
5.4 Statistical Averages
◊ The expected value, or mean, of a random variable X is defined by
  $\mu_X = E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\,dx$
◊ More generally, for a function g(X) of the random variable X,
  $E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx$
◊ Example: Let $Y = \cos X$, where X is a random variable uniformly distributed over $(-\pi, \pi)$:
  $f_X(x) = \begin{cases} \dfrac{1}{2\pi}, & -\pi < x \le \pi \\ 0, & \text{otherwise} \end{cases}$
  $E[Y] = \int_{-\pi}^{\pi} \cos x \cdot \dfrac{1}{2\pi}\,dx = \dfrac{1}{2\pi}\left[\sin x\right]_{-\pi}^{\pi} = 0$
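◊ A Monte Carlo check of this result (the sample size is an assumption): averaging cos X over draws of X uniform on (−π, π) gives a value close to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-np.pi, np.pi, size=1_000_000)   # X uniform on (-pi, pi)
print(np.mean(np.cos(X)))                        # close to E[Y] = 0
```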
5.4 Statistical Averages
◊ Moments
◊ For the special case of g(X) = X n, we obtain the nth moment of
the probability distribution of the random variable X; that is
  $E[X^n] = \int_{-\infty}^{\infty} x^n f_X(x)\,dx$  (5.39)
◊ Mean-square value of X:
  $E[X^2] = \int_{-\infty}^{\infty} x^2 f_X(x)\,dx$  (5.40)
5.4 Statistical Averages
◊ For n = 2 the second central moment is referred to as the variance of
the random variable X, written as
  $\mathrm{Var}[X] = E\left[(X - \mu_X)^2\right] = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx$  (5.42)
◊ The variance of a random variable X is commonly denoted as $\sigma_X^2$.
◊ The square root of the variance is called the standard deviation of
the random variable X.
◊ $\sigma_X^2 = \mathrm{Var}[X] = E\left[(X - \mu_X)^2\right]$
  $= E\left[X^2 - 2\mu_X X + \mu_X^2\right]$
  $= E[X^2] - 2\mu_X E[X] + \mu_X^2$
  $= E[X^2] - \mu_X^2$  (5.44)
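◊ A one-line numerical check of Eq. (5.44), $\mathrm{Var}[X] = E[X^2] - \mu_X^2$, on an arbitrary sample (the data are just an assumption for illustration).

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0, 5.0, 7.0])                  # arbitrary example data
mean = x.mean()
print(np.mean((x - mean)**2), np.mean(x**2) - mean**2)   # identical values
```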
5.4 Statistical Averages
◊ The characteristic function $\psi_X(jv)$ is defined as the expectation of the
complex exponential function $\exp(jvX)$, as shown by
  $\psi_X(jv) = E[\exp(jvX)] = \int_{-\infty}^{\infty} f_X(x)\exp(jvx)\,dx$  (5.45)
  $f_X(x) = \dfrac{1}{2\pi}\int_{-\infty}^{\infty} \psi_X(jv)\exp(-jvx)\,dv$  (5.46)
5.4 Statistical Averages
◊ Characteristic functions
◊ First moment (mean) can be obtained by:
  $E[X] = m_X = -j\,\dfrac{d\psi_X(jv)}{dv}\bigg|_{v=0}$
◊ Since the differentiation process can be repeated, the n-th
moment can be calculated by:
  $E[X^n] = (-j)^n\,\dfrac{d^n\psi_X(jv)}{dv^n}\bigg|_{v=0}$
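◊ A symbolic check of these relations, assuming (not from the text) an exponential random variable with rate λ, whose characteristic function is $\psi_X(jv) = \lambda/(\lambda - jv)$.

```python
# Moments of an assumed exponential random variable with rate lam,
# computed from its characteristic function psi(jv) = lam / (lam - j*v).
import sympy as sp

v, lam = sp.symbols('v lam', positive=True)
psi = lam / (lam - sp.I * v)          # characteristic function E[exp(jvX)]

def moment(n):
    """n-th moment via E[X^n] = (-j)^n d^n psi / dv^n at v = 0."""
    return sp.simplify((-sp.I)**n * sp.diff(psi, v, n).subs(v, 0))

print(moment(1))   # 1/lam      (mean)
print(moment(2))   # 2/lam**2   (mean-square value)
```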
5.4 Statistical Averages
◊ Characteristic functions
◊ Determining the PDF of a sum of statistically independent
random variables:
  $Y = \sum_{i=1}^{n} X_i$
  $\psi_Y(jv) = E\left[e^{jvY}\right] = E\left[\exp\left(jv\sum_{i=1}^{n} X_i\right)\right] = E\left[\prod_{i=1}^{n} e^{jvX_i}\right]$
  $= \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} \left(\prod_{i=1}^{n} e^{jvx_i}\right) f_{X_1,X_2,\ldots,X_n}(x_1,x_2,\ldots,x_n)\,dx_1\,dx_2\cdots dx_n$
  Since the random variables are statistically independent,
  $f_{X_1,X_2,\ldots,X_n}(x_1,x_2,\ldots,x_n) = f_{X_1}(x_1)\,f_{X_2}(x_2)\cdots f_{X_n}(x_n)$
  and therefore
  $\psi_Y(jv) = \prod_{i=1}^{n} \psi_{X_i}(jv)$
5.4 Statistical Averages
◊ Characteristic functions
◊ The PDF of Y is determined from the inverse Fourier
transform of ΨY(jv).
◊ Since the characteristic function of the sum of n statistically
independent random variables is equal to the product of the
characteristic functions of the individual random variables, it
follows that the PDF of Y is the n-fold convolution of the PDFs of
the Xi (multiplication in the transform domain corresponds to
convolution in the original domain).
◊ Usually, the n-fold convolution is more difficult to perform
than the characteristic function method in determining the PDF
of Y.
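◊ To make the convolution statement concrete, a small sketch (assuming, for illustration, two independent random variables uniform on (0, 1)) compares a histogram of their sum with the convolution of their pdfs, which gives the familiar triangular density.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.uniform(0, 1, size=(1_000_000, 2)).sum(axis=1)   # Y = X1 + X2, X_i independent U(0, 1)

# Convolution of the two uniform pdfs gives the triangular density of Y on (0, 2).
dx = 0.01
grid = np.arange(0, 1, dx)
pdf_conv = np.convolve(np.ones_like(grid), np.ones_like(grid), mode='full') * dx

# Histogram estimate of the pdf of Y on the same spacing.
hist, _ = np.histogram(y, bins=np.arange(0, 2 + dx, dx), density=True)

for yy in (0.5, 1.0, 1.5):
    k = int(round(yy / dx))
    print(yy, pdf_conv[k], hist[k])   # convolution and simulation agree (triangular pdf)
```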
5.4 Statistical Averages
◊ Joint Moments
◊ Consider next a pair of random variables X and Y. A set of
statistical averages of importance in this case are the joint
moments, namely, the expected value of Xi Y k, where i and k
may assume any positive integer values. We may thus write
  $E[X^i Y^k] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^i y^k f_{X,Y}(x,y)\,dx\,dy$  (5.51)
◊ Covariance of X and Y:
  $\mathrm{Cov}[XY] = E\left[(X - E[X])(Y - E[Y])\right] = E[XY] - \mu_X\,\mu_Y$  (5.53)
5.4 Statistical Averages
◊ Correlation coefficient of X and Y:
  $\rho = \dfrac{\mathrm{Cov}[XY]}{\sigma_X\,\sigma_Y}$  (5.54)
◊ We say that X and Y are uncorrelated if and only if $\mathrm{Cov}[XY] = 0$. In
particular, if X and Y are statistically independent, then they are
uncorrelated.
◊ The converse of the above statement is not necessarily true.
5.5 Random Processes
◊ A random process may be characterized statistically by the joint PDF
$f_X(x_{t_1}, x_{t_2}, \ldots, x_{t_n})$ of the random variables $X_{t_1}, X_{t_2}, \ldots, X_{t_n}$
obtained by sampling the process at times $t_1, t_2, \ldots, t_n$.
5.5 Random Processes
◊ The n-th moment of the random variable $X_{t_i}$ is
  $E\left[X_{t_i}^n\right] = \int_{-\infty}^{\infty} x_{t_i}^n\, f_X(x_{t_i})\,dx_{t_i}$
5.5 Random Processes
◊ Auto-covariance function
◊ The auto-covariance function of a stochastic process is
defined as:
  $\mathrm{Cov}(X_{t_1}, X_{t_2}) = E\left[(X_{t_1} - m(t_1))(X_{t_2} - m(t_2))\right] = R_X(t_1, t_2) - m(t_1)\,m(t_2)$
◊ When the process is stationary, the auto-covariance
function simplifies to:
  $\mathrm{Cov}(X_{t_1}, X_{t_2}) = C_X(t_1 - t_2) = C_X(\tau) = R_X(\tau) - m^2$
5.6 Mean, Correlation and Covariance Functions
◊ For a stationary process X(t), the autocorrelation function depends only
on the time difference τ:
  $R_X(\tau) = E\left[X(t+\tau)\,X(t)\right]$  (5.63)
◊ The autocorrelation function has the following properties:
1. $R_X(0) = E\left[X^2(t)\right]$  (5.64)
2. $R_X(\tau) = R_X(-\tau)$  (5.65)
3. $\left|R_X(\tau)\right| \le R_X(0)$  (5.67)
◊ Proof of (5.64) can be obtained from (5.63) by putting τ = 0.
5.6 Mean, Correlation and Covariance Functions
◊ Proof of (5.65):
  $R_X(-\tau) = E\left[X(t-\tau)\,X(t)\right] = E\left[X(t)\,X(t+\tau)\right] = R_X(\tau)$
◊ Proof of (5.67):
  $E\left[\left(X(t+\tau) \pm X(t)\right)^2\right] \ge 0$
  $E\left[X^2(t+\tau)\right] \pm 2E\left[X(t+\tau)X(t)\right] + E\left[X^2(t)\right] \ge 0$
  $2R_X(0) \pm 2R_X(\tau) \ge 0$
  $-R_X(0) \le R_X(\tau) \le R_X(0)$
  $\left|R_X(\tau)\right| \le R_X(0)$
5.6 Mean, Correlation and Covariance Functions
◊ Two random processes X(t) and Y(t) are said to be statistically independent if
  $f_{XY}(x_{t_1}, \ldots, x_{t_n}, y_{t'_1}, \ldots, y_{t'_m}) = f_X(x_{t_1}, \ldots, x_{t_n})\, f_Y(y_{t'_1}, \ldots, y_{t'_m})$
  for all choices of $t_i$ and $t'_i$ and for all positive integers n and m.
◊ The processes are said to be uncorrelated if
  $R_{XY}(t_1, t_2) = E(X_{t_1})\,E(Y_{t_2})$, that is, $\mathrm{Cov}(X_{t_1}, Y_{t_2}) = 0$
5.6 Mean, Correlation and Covariance Functions
◊ Example: consider the pair of quadrature-modulated processes
  $X_1(t) = X(t)\cos(2\pi f_c t + \Theta)$, $\quad X_2(t) = X(t)\sin(2\pi f_c t + \Theta)$
  where X(t) is a stationary process and Θ is uniformly distributed over $(0, 2\pi)$ and independent of X(t). The cross-correlation of $X_1(t)$ and $X_2(t)$ is
  $R_{12}(\tau) = E\left[X_1(t)\,X_2(t-\tau)\right]$
  $= E\left[X(t)X(t-\tau)\cos(2\pi f_c t + \Theta)\sin(2\pi f_c t - 2\pi f_c\tau + \Theta)\right]$
  $= E\left[X(t)X(t-\tau)\right]\,E\left[\cos(2\pi f_c t + \Theta)\sin(2\pi f_c t - 2\pi f_c\tau + \Theta)\right]$
  $= \dfrac{1}{2}R_X(\tau)\,E\left[\sin(4\pi f_c t - 2\pi f_c\tau + 2\Theta) - \sin(2\pi f_c\tau)\right]$
  $= -\dfrac{1}{2}R_X(\tau)\sin(2\pi f_c\tau)$
◊ In particular, $R_{12}(0) = E\left[X_1(t)\,X_2(t)\right] = 0$.
5.6 Mean, Correlation and Covariance Functions
◊ Ergodic Processes
◊ In many instances, it is difficult or impossible to observe all sample
functions of a random process at a given time.
◊ It is often more convenient to observe a single sample function for a
long period of time.
◊ For a sample function x(t), the time average of the mean value over
an observation period 2T is
  $\mu_x(T) = \dfrac{1}{2T}\int_{-T}^{T} x(t)\,dt$  (5.84)
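◊ A small illustration of Eq. (5.84); the process and its parameters below (a random-phase sinusoid plus white Gaussian noise) are assumptions, not from the text. The time average of one long realization is close to the ensemble mean, which is zero.

```python
import numpy as np

# One long sample function of an assumed stationary process: a random-phase
# sinusoid plus white Gaussian noise (parameters chosen only for illustration).
rng = np.random.default_rng(3)
fs, T = 1000.0, 100.0                       # sample rate [Hz], observation length [s]
t = np.arange(0, T, 1 / fs)
theta = rng.uniform(0, 2 * np.pi)           # single random phase for this realization
x = np.cos(2 * np.pi * 5.0 * t + theta) + rng.normal(0, 0.5, t.size)

# Time average over the observation period, Eq. (5.84); the ensemble mean is 0.
print(np.mean(x))                           # close to 0
```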
5.7 Transmission of a Random Process Through a Linear Filter
◊ Suppose a random process X(t), with mean $m_X(t)$, is applied to a linear
time-invariant filter with impulse response h(t), producing the output
  $Y(t) = \int_{-\infty}^{\infty} h(\tau_1)\,X(t-\tau_1)\,d\tau_1$
◊ The mean of the output process is
  $m_Y(t) = E[Y(t)] = E\left[\int_{-\infty}^{\infty} h(\tau_1)\,X(t-\tau_1)\,d\tau_1\right]$
  $= \int_{-\infty}^{\infty} h(\tau_1)\,E[X(t-\tau_1)]\,d\tau_1$
  $= \int_{-\infty}^{\infty} h(\tau_1)\,m_X(t-\tau_1)\,d\tau_1$  (5.86)
5.7 Transmission of a Random Process Through a Linear Filter
◊ The autocorrelation function of the output is
  $R_Y(t, u) = E[Y(t)Y(u)] = E\left[\int_{-\infty}^{\infty} h(\tau_1)X(t-\tau_1)\,d\tau_1 \int_{-\infty}^{\infty} h(\tau_2)X(u-\tau_2)\,d\tau_2\right]$
  $= \int_{-\infty}^{\infty} d\tau_1\, h(\tau_1) \int_{-\infty}^{\infty} d\tau_2\, h(\tau_2)\, E\left[X(t-\tau_1)X(u-\tau_2)\right]$
  $= \int_{-\infty}^{\infty} d\tau_1\, h(\tau_1) \int_{-\infty}^{\infty} d\tau_2\, h(\tau_2)\, R_X(t-\tau_1, u-\tau_2)$
5.7 Transmission of a Random Process Through a Linear Filter
◊ When the input X(t) is stationary, with $\tau = t - u$, the output autocorrelation becomes
  $R_Y(\tau) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\tau_1)\,h(\tau_2)\,R_X(\tau - \tau_1 + \tau_2)\,d\tau_1\,d\tau_2$  (5.90)
5.8 Power Spectral Density
◊ The Fourier transform of the autocorrelation function RX(τ) is called
the power spectral density SX( f ) of the random process
X(t).
  $S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\exp(-j2\pi f\tau)\,d\tau$  (5.91)
  $R_X(\tau) = \int_{-\infty}^{\infty} S_X(f)\exp(j2\pi f\tau)\,df$  (5.92)
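◊ A numerical sketch of the relation (5.91), assuming (not from the text) an exponential autocorrelation $R_X(\tau) = e^{-a|\tau|}$; its Fourier transform should match the known closed form $2a/(a^2 + (2\pi f)^2)$.

```python
import numpy as np

a = 2.0                                  # assumed decay rate of R_X(tau) = exp(-a|tau|)
dt = 0.001
tau = np.arange(-50, 50, dt)
R = np.exp(-a * np.abs(tau))

# Eq. (5.91): S_X(f) as the Fourier transform of R_X(tau), evaluated on an FFT grid.
S = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(R))) * dt
f = np.fft.fftshift(np.fft.fftfreq(tau.size, dt))

S_closed = 2 * a / (a**2 + (2 * np.pi * f)**2)   # known transform of exp(-a|tau|)
print(np.max(np.abs(S.real - S_closed)))         # small numerical error
```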
5.8 Power Spectral Density
Proof of Eq. (5.95): $S_X(f) \ge 0$ for all f
◊ It can be shown (see Eq. (5.106)) that $S_Y(f) = |H(f)|^2 S_X(f)$, where Y(t) is the output of a filter H(f) driven by X(t). Hence
  $R_Y(\tau) = \int_{-\infty}^{\infty} S_Y(f)\exp(j2\pi f\tau)\,df = \int_{-\infty}^{\infty} S_X(f)\,|H(f)|^2 \exp(j2\pi f\tau)\,df$
◊ By (5.64),
  $R_Y(0) = E\left[Y^2(t)\right] = \int_{-\infty}^{\infty} S_X(f)\,|H(f)|^2\,df \ge 0$ for any $H(f)$
◊ Suppose we let $|H(f)|^2 = 1$ for an arbitrarily small interval $f_1 \le f \le f_2$,
and $H(f) = 0$ outside this interval. Then, we have:
  $\int_{f_1}^{f_2} S_X(f)\,df \ge 0$
◊ Since the interval can be placed anywhere and made arbitrarily small, this is
possible only if $S_X(f) \ge 0$ for all f.
5.8 Power Spectral Density
◊ Example: a random process X(t) is mixed with a sinusoidal carrier of random phase Θ, uniformly distributed over $(0, 2\pi)$:
  $Y(t) = X(t)\cos(2\pi f_c t + \Theta)$  (5.101)
5.8 Power Spectral Density
◊ To relate the output spectral density to the input, take the Fourier transform of $R_Y(\tau)$ in (5.90):
  $S_Y(f) = \int_{-\infty}^{\infty} R_Y(\tau)\,e^{-j2\pi f\tau}\,d\tau = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} h(\tau_1)\,h(\tau_2)\,R_X(\tau - \tau_1 + \tau_2)\,e^{-j2\pi f\tau}\,d\tau_1\,d\tau_2\,d\tau$
◊ Let $\tau_0 = \tau - \tau_1 + \tau_2$, i.e., $\tau = \tau_0 + \tau_1 - \tau_2$:
  $S_Y(f) = \int\!\!\int\!\!\int h(\tau_1)\,h(\tau_2)\,R_X(\tau_0)\,e^{-j2\pi f(\tau_0 + \tau_1 - \tau_2)}\,d\tau_1\,d\tau_2\,d\tau_0$
  $= \int_{-\infty}^{\infty} h(\tau_1)e^{-j2\pi f\tau_1}\,d\tau_1 \int_{-\infty}^{\infty} h(\tau_2)e^{j2\pi f\tau_2}\,d\tau_2 \int_{-\infty}^{\infty} R_X(\tau_0)e^{-j2\pi f\tau_0}\,d\tau_0$
  $= H(f)\,H^*(f)\,S_X(f) = |H(f)|^2\,S_X(f)$  (5.106)
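◊ A quick empirical check of Eq. (5.106); the input variance, filter taps, and estimator settings below are assumptions for illustration. White Gaussian noise is passed through a short FIR filter, and the Welch PSD estimate of the output is compared with $|H(f)|^2$ times the flat input PSD.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(4)
sigma2 = 1.0
x = rng.normal(0, np.sqrt(sigma2), 200_000)          # white input, S_X(f) = sigma2 at fs = 1

h = np.array([0.25, 0.5, 0.25])                      # assumed example FIR impulse response
y = signal.lfilter(h, 1.0, x)

# Two-sided PSD estimate of the output, and the filter response on the same grid.
f, S_y_est = signal.welch(y, fs=1.0, nperseg=1024, return_onesided=False)
_, H = signal.freqz(h, 1.0, worN=f, fs=1.0)

S_y_theory = np.abs(H)**2 * sigma2                   # Eq. (5.106) with flat S_X(f)
print(np.mean(np.abs(S_y_est - S_y_theory)))         # small: estimate tracks |H(f)|^2 S_X(f)
```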
5.10 Noise
◊ The sources of noise may be external to the system (e.g.,
atmospheric noise, galactic noise, man-made noise), or internal to
the system.
5.10 Noise
◊ Thermal Noise
◊ Thermal noise is the name given to the electrical noise arising
from the random motion of electrons in a conductor.
5.10 Noise
◊ White Noise
◊ The noise analysis is customarily based on an idealized form of
noise called white noise, the power spectral density of which is
independent of the operating frequency.
◊ White is used in the sense that white light contains equal
amounts of all frequencies within the visible band of
electromagnetic radiation.
◊ We express the power spectral density of white noise, with a
sample function denoted by w(t), as
  $S_W(f) = \dfrac{N_0}{2}$
  where $N_0 = kT_e$.
  The dimensions of $N_0$ are watts per hertz; k is Boltzmann's
  constant and $T_e$ is the equivalent noise temperature of the receiver.
5.10 Noise
◊ White Noise
◊ The equivalent noise temperature of a system is defined as the
temperature at which a noisy resistor has to be maintained such
that, by connecting the resistor to the input of a noiseless
version of the system, it produces the same available noise
power at the output of the system as that produced by all the
sources of noise in the actual system.
◊ The autocorrelation function is the inverse Fourier transform of
the power spectral density:
  $R_W(\tau) = \dfrac{N_0}{2}\,\delta(\tau)$
◊ Any two different samples of white noise, no matter how
closely together in time they are taken, are uncorrelated.
◊ If the white noise w(t) is also Gaussian, then the two samples
are statistically independent.
5.10 Noise
◊ Example 5.14 Ideal Low-Pass Filtered White Noise
◊ Suppose that a white Gaussian noise w(t) of zero mean and
power spectral density N0/2 is applied to an ideal low-pass filter
of bandwidth B and passband amplitude response of one.
◊ The power spectral density of the noise n(t) is
  $S_N(f) = \begin{cases} \dfrac{N_0}{2}, & -B \le f \le B \\ 0, & |f| > B \end{cases}$
◊ The corresponding autocorrelation function of n(t) is
  $R_N(\tau) = \int_{-B}^{B} \dfrac{N_0}{2}\exp(j2\pi f\tau)\,df = N_0 B\,\mathrm{sinc}(2B\tau)$
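◊ A short check of this result (the values of $N_0$ and B below are assumed examples): numerically integrating the flat PSD over (−B, B) reproduces $N_0 B\,\mathrm{sinc}(2B\tau)$. Note that numpy's sinc is the normalized sinc $\sin(\pi x)/(\pi x)$, matching the definition used here.

```python
import numpy as np

N0, B = 2.0, 5.0                              # assumed example values
tau = np.linspace(-1.0, 1.0, 11)

# R_N(tau) by numerically integrating S_N(f) = N0/2 over -B <= f <= B.
f = np.linspace(-B, B, 20001)
df = f[1] - f[0]
R_numeric = np.array([(N0 / 2) * np.sum(np.cos(2 * np.pi * f * t)) * df for t in tau])

R_closed = N0 * B * np.sinc(2 * B * tau)      # N0*B*sinc(2B*tau), sinc(x) = sin(pi x)/(pi x)
print(np.max(np.abs(R_numeric - R_closed)))   # small numerical error
```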