
ECE531: Principles of Detection and Estimation


Course Introduction
D. Richard Brown III
WPI

19-January-2011

Lecture 1 Major Topics


1. Administrative details:
   - Course web page.
   - Syllabus and textbook.
   - Academic honesty policy.
   - Students with disabilities statement.
2. Mathematical notation.
3. Course introduction.
4. Review of essential probability concepts.

Some Notation

A set with discrete elements, e.g. S = {a, b, c}.
The cardinality of a set: |S| = 3.
The set of all integers: Z = {. . . , −1, 0, 1, . . . }.
The set of all real numbers: R = (−∞, ∞).
Intervals on the real line: [−3, 1], (0, 1], (−1, 1), [10, ∞).
Multidimensional sets:
   {a, b, c}² is shorthand for the set {aa, ab, ac, ba, bb, bc, ca, cb, cc}.
   R² is the two-dimensional real plane.
   R³ is the three-dimensional real volume.
An element of a set: s ∈ S.
A subset: W ⊆ S.
The probability of an event A: Prob[A] ∈ [0, 1].
The joint probability of events A and B: Prob[A, B] ∈ [0, 1].
The probability of event A conditioned on event B: Prob[A | B] ∈ [0, 1].

Typical Detection Problems


[Figure: observed waveform y(t) plotted versus time, roughly 0 to 1.8 seconds.]

Is this a sine wave plus noise, or just noise?


Is the frequency of the sine wave 1 Hz or 2 Hz?
Detection is about making smart choices (and the consequences).

Typical Estimation Problems


[Figure: observed waveform y(t) plotted versus time, roughly 0 to 1.8 seconds.]

What is the frequency, phase, and/or amplitude of the sine wave?


What is the mean and/or variance of the noise?
Estimation is about guessing values (and the consequences).

Joint Estimation and Detection


Suppose we have a binary communication system with an intersymbol
interference channel. M symbols are sent through the channel and we observe

    y_k = Σ_{ℓ=0}^{L−1} h_ℓ s_{k−ℓ} + w_k

for k ∈ {0, . . . , L + M − 2}, where

- the binary symbols [s_0, . . . , s_{M−1}] ∈ {−1, +1}^M are unknown,
- the discrete-time impulse response of the channel [h_0, . . . , h_{L−1}] ∈ R^L is unknown, and
- the noise [w_0, . . . , w_{L+M−2}] ∈ R^{L+M−1} is unknown.

In some scenarios, we may want to know both the bits that were sent and the
channel coefficients. This is a joint estimation and detection problem.
Why?
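To make the observation model concrete, here is a minimal simulation sketch (ours; the symbol count, channel taps, and noise level are illustrative assumptions, not course values):

```python
import numpy as np

rng = np.random.default_rng(0)

M, L = 8, 3                               # symbols sent, channel length
s = rng.choice([-1, +1], size=M)          # unknown binary symbols
h = rng.standard_normal(L)                # unknown channel taps h_0, ..., h_{L-1}
w = 0.1 * rng.standard_normal(M + L - 1)  # unknown noise w_0, ..., w_{L+M-2}

# y_k = sum_{l=0}^{L-1} h_l s_{k-l} + w_k for k = 0, ..., L+M-2
y = np.convolve(h, s) + w                 # full convolution has length L+M-1
print(y)
```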

Consequences
To develop optimal decision rules or estimators, we need to quantify the
consequences of incorrect decisions or inaccurate estimates.

Simple Example
It is not known if a coin is fair (HT) or double headed (HH). We are given
one observation of the coin flip. Based on this observation, how do you
decide if the coin is HT or HH?
Observation   Rule 1   Rule 2   Rule 3   Rule 4
H             HH       HT       HH       HT
T             HH       HT       HT       HH

Suppose you have to pay $100 if you are wrong. Which decision rule is
optimum?


Rule 1: Always decide HH


Note that the observation is ignored here.

If the coin is HT (fair), the decision was wrong and you must pay
$100.

If the coin is HH (double headed), the decision was right and you pay
nothing.

The maximum cost (between HH and HT) for Rule 1 is $100.

The average cost for Rule 1 is

    C1 = Prob[HT] · $100 + Prob[HH] · $0

where Prob[HT] and Prob[HH] are the prior probabilities (the probabilities
before any observations) that the coin is fair or double headed, respectively.

For purposes of illustration, let's assume Prob[HT] = Prob[HH] = 0.5, so
that C1 = $50.

Rule 2: Always decide HT


Again, the observation is being ignored. Same analysis as for Rule 1...

If the coin is HT (fair), the decision was right and you pay nothing.

If the coin is HH (double headed), the decision was wrong and you
must pay $100.

The maximum cost for Rule 2 is $100.


The average cost for Rule 2 is

    C2 = Prob[HT] · $0 + Prob[HH] · $100

If Prob[HT] = Prob[HH] = 0.5, then C2 = $50.


Rule 3: Decide HH if H observed, HT if T observed

If the coin is HT (fair), there is a 50% chance the observation will be
H and you will decide HH. This will cost you $100. There is also a
50% chance that the observation will be T and you will decide HT. In
this case, you made the correct decision and pay nothing.

    C_HT = Prob[H | HT] · $100 + Prob[T | HT] · $0 = $50

If the coin is HH (double headed), what is our cost? $0

The maximum cost for Rule 3 is $50.

The average cost for Rule 3 is

    C3 = Prob[HT] · $50 + Prob[HH] · $0

If Prob[HT] = Prob[HH] = 0.5, then C3 = $25.


Rule 4: Decide HT if H observed, HH if T observed


Obviously, this is a bad rule.

If the coin is HT (fair), there is a 50% chance the observation will be
T and you will decide HH. This will cost you $100. There is also a
50% chance that the observation will be H and you will decide HT. In
this case, you made the correct decision and pay nothing.

    C_HT = Prob[T | HT] · $100 + Prob[H | HT] · $0 = $50

If the coin is HH (double headed), what is our cost? $100

The maximum cost for Rule 4 is $100.

The average cost for Rule 4 is

    C4 = Prob[HT] · $50 + Prob[HH] · $100

If Prob[HT] = Prob[HH] = 0.5, then C4 = $75.
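The cost analysis of all four rules is easy to reproduce numerically. Here is a small sketch (ours), using the $100 penalty and the 0.5 priors assumed in the example:

```python
# Observation probabilities under each state of nature.
p_obs = {"HT": {"H": 0.5, "T": 0.5}, "HH": {"H": 1.0, "T": 0.0}}
rules = {  # each rule maps the observation to a decision
    "Rule 1": {"H": "HH", "T": "HH"},
    "Rule 2": {"H": "HT", "T": "HT"},
    "Rule 3": {"H": "HH", "T": "HT"},
    "Rule 4": {"H": "HT", "T": "HH"},
}
prior = {"HT": 0.5, "HH": 0.5}  # assumed prior probabilities

for name, rule in rules.items():
    # expected cost conditioned on each state of nature
    cond = {s: sum(p * (100 if rule[y] != s else 0)
                   for y, p in p_obs[s].items())
            for s in ("HT", "HH")}
    avg = sum(prior[s] * cond[s] for s in cond)
    print(name, "max cost:", max(cond.values()), "average cost:", avg)
# Rule 3 attains both the smallest maximum cost ($50) and average cost ($25).
```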

Remarks

The notion of "maximum cost" is the maximum over the possible states of
nature (HH and HT in our example), but averaged over the probabilities
of the observation.

[Diagram: states of nature θ ∈ {HT, HH} mapped through p(y | θ) to observations y.]

In our example, we could always lose $100, irrespective of the decision
rule. But the maximum cost of Rule 3 was $50.

Is Rule 3 optimal?

Probability Basics: Events


Let A be a possible (or impossible) outcome of a random experiment. We
call A an event, and Prob[A] ∈ [0, 1] is the probability that A happens.

Examples:

- A = "tomorrow will be sunny in Worcester", Prob[A] = 0.4.
- A = "a 9 is rolled with two fair 6-sided dice", Prob[A] = 4/36.
- A = "a 13 is rolled with two fair 6-sided dice", Prob[A] = 0.
- A = "an odd number is rolled with two fair 6-sided dice",
  Prob[A] = 2/36 + 4/36 + 6/36 + 4/36 + 2/36 = 1/2.
- A = "any number but 9 is rolled with two fair 6-sided dice",
  Prob[A] = 1 − 4/36 = 32/36.

The last result used the fact that Prob[A] + Prob[Ā] = 1, where Ā means
"not event A" and Prob[Ā] is the probability that A doesn't happen.
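The dice probabilities above are easy to verify by brute-force enumeration; a quick sketch (ours):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair 6-sided dice.
sums = [a + b for a, b in product(range(1, 7), repeat=2)]

def prob(event):
    """Probability of an event, as a count over equally likely outcomes."""
    return Fraction(sum(event(s) for s in sums), len(sums))

print(prob(lambda s: s == 9))      # 1/9  (= 4/36)
print(prob(lambda s: s == 13))     # 0
print(prob(lambda s: s % 2 == 1))  # 1/2
print(prob(lambda s: s != 9))      # 8/9  (= 32/36)
```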


Probability Basics: Random Variables


Definition
A random variable is a mapping from random experiments to real numbers.

Example: Let X be the Dow Jones average at the close tomorrow.

We can easily relate events and random variables.

Example: What is the probability that X ≥ 11500?

- X is the random variable. It can be anything on the interval [0, ∞).
- The event is A = "X is no less than 11500".

To answer these types of questions, we need to know the probabilistic
distribution of the random variable X. Every random variable has a
cumulative distribution function (CDF) defined as

    F_X(x) := Prob[X ≤ x]

for all x ∈ R.

Probability Basics: Properties of the CDF


    F_X(x) := Prob[X ≤ x]

The following properties are true for any random variable X:

- F_X(−∞) = 0.
- F_X(+∞) = 1.
- If y > x, then F_X(y) ≥ F_X(x).

Example: Let X be the Dow Jones average at the close on Friday.

[Figure: sketch of a CDF F_X(x) rising from 0 to 1, with the x-axis marked at 11500.]

Probability Basics: The PDF


The probability density function (PDF) of the random variable X is

    p_X(x) := (d/dx) F_X(x)

The following properties are true for any random variable X:

- p_X(x) ≥ 0 for all x.
- ∫_{−∞}^{∞} p_X(x) dx = 1.
- Prob[a < X ≤ b] = ∫_a^b p_X(x) dx = F_X(b) − F_X(a).

Example: Let X be the Dow Jones average at the close on Friday.

[Figure: sketch of a PDF p_X(x), peaked near 11500.]
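As a numerical sanity check of the last property, here is a sketch (ours) using a standard Gaussian X and arbitrary endpoints a and b:

```python
from scipy.integrate import quad
from scipy.stats import norm

a, b = -0.5, 1.5
integral, _ = quad(norm.pdf, a, b)  # integrate the PDF from a to b
print(integral)                     # ~0.6247
print(norm.cdf(b) - norm.cdf(a))    # matches F_X(b) - F_X(a)
```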


Probability Basics: Mean and Variance


Definition
The mean of the random variable X is defined as

    E[X] = ∫_{−∞}^{∞} x p_X(x) dx.

The mean is also called the expectation.

Definition
The variance of the random variable X is defined as

    var[X] = ∫_{−∞}^{∞} (x − E[X])² p_X(x) dx.

Remark: The standard deviation of X is equal to std[X] = √(var[X]).

Probability Basics: Properties of Mean and Variance


Assuming c is a known constant, it is not difficult to show the following
properties of the mean:
1. E[cX] = cE[X] (by linearity)
2. E[X + c] = E[X] + c (by linearity)
3. E[c] = c
Assuming c is a known constant, it is not difficult to show the following
properties of the variance:
1. var[cX] = c2 var[X]
2. var[X + c] = var[X]
3. var[c] = 0
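Each property can be sanity-checked by simulation; a quick Monte Carlo sketch (ours, with an arbitrary distribution and constant):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1_000_000)  # any distribution works
c = 3.0

print(np.mean(c * x), c * np.mean(x))   # E[cX] = c E[X]
print(np.mean(x + c), np.mean(x) + c)   # E[X + c] = E[X] + c
print(np.var(c * x), c**2 * np.var(x))  # var[cX] = c^2 var[X]
print(np.var(x + c), np.var(x))         # var[X + c] = var[X]
```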


Uniform Random Variables


Uniform distribution: X ∼ U(a, b) for a < b.

    p_X(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise.

[Figure: the uniform PDF, a rectangle of height 1/(b − a) on [a, b].]

Sketch the CDF.

Suppose X ∼ U(1, 5).

- What is Prob[X = 3]?
- What is Prob[X < 2]?
- What is Prob[X > 1]?
- What is E[X]?
- What is var[X]?
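These can be checked against scipy's uniform distribution, which parameterizes U(a, b) as loc = a, scale = b − a (a sketch, ours):

```python
from scipy.stats import uniform

X = uniform(loc=1, scale=4)  # X ~ U(1, 5)

# Prob[X = 3] = 0: a continuous RV assigns zero probability to any point.
print(X.cdf(2))              # Prob[X < 2] = 0.25
print(X.sf(1))               # Prob[X > 1] = 1.0 (sf(x) = 1 - cdf(x))
print(X.mean())              # E[X] = 3.0
print(X.var())               # var[X] = (b - a)^2 / 12 = 4/3
```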

Discrete Uniform Random Variables


Uniform distribution: X ∼ U(S), where S is a finite set of discrete points
on the real line. Each element in the set is equally likely.

[Figure: an example PMF with equal point masses.]

Given S = {s_1, . . . , s_n}, then Prob[X = s_1] = · · · = Prob[X = s_n] = 1/n and

    p_X(x) = (1/n) (δ(x − s_1) + · · · + δ(x − s_n))

Sketch the CDF.

- What is Prob[X = 3]?
- What is Prob[X < 2]?
- What is Prob[X ≤ 2]?
- What is E[X]?
- What is var[X]?

Gaussian Random Variables


Gaussian distribution: X ∼ N(μ, σ²) for any μ and σ.

    p_X(x) = (1 / (√(2π) σ)) exp( −(x − μ)² / (2σ²) )

Remarks:

1. E[X] = μ.
2. var[X] = σ².
3. Gaussian random variables are completely specified by their mean and
   variance.
4. Lots of things in the real world are Gaussian or approximately
   Gaussian distributed, e.g. exam scores. The Central Limit
   Theorem explains why this is so.
5. Probability calculations for Gaussian random variables often require
   the use of erf and/or erfc functions (or the Q-function).
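For example, the Q-function (the tail probability Prob[X > x] for a standard Gaussian X) can be written in terms of erfc; a small sketch (ours):

```python
import math

def Q(x: float) -> float:
    """Standard Gaussian tail probability Prob[X > x] via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(Q(0.0))  # 0.5
print(Q(1.0))  # ~0.1587, i.e. 1 - F_X(1) for X ~ N(0, 1)
```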

Final Remarks on Scalar Random Variables


1. The PDF and CDF completely describe a random variable.
   - If X and Y have the same PDF, then they have the same CDF, the
     same mean, and the same variance.
2. The mean and variance are only partial statistical descriptions of a
   random variable.
   - If X and Y have the same mean and/or variance, they might have the
     same PDF/CDF, but not necessarily.

[Diagram: one random variable and its CDF, PDF, mean, and variance.]

Joint Events
Suppose you have two events A and B. We can define a new event

    C = "both A and B occur"

and we can write

    Prob[C] = Prob[A ∩ B] = Prob[A, B]

[Venn diagram: events A and B with intersection A ∩ B.]


Conditional Probability of Events


Suppose you have two events A and B. We can condition on the event B
to write the probability

    Prob[A | B] = the probability of event A given that event B happened

When Prob[B] > 0, this conditional probability is defined as

    Prob[A | B] = Prob[A, B] / Prob[B] = Prob[A ∩ B] / Prob[B]

Three special cases:

[Venn diagrams illustrating three special cases of the overlap between A and B.]


Conditional Probabilities: Our Earlier Example


It is not known if a coin is fair (HT) or double headed (HH). We can write
the conditional probabilities of a one-flip observation as

    Prob[observe H | coin is HT] = 0.5      Prob[observe H | coin is HH] = 1
    Prob[observe T | coin is HT] = 0.5      Prob[observe T | coin is HH] = 0

Can you compute Prob[coin is HH | observe H]? We can write

    Prob[coin is HH | observe H]
      = Prob[coin is HH, observe H] / Prob[observe H]
      = Prob[observe H | coin is HH] · Prob[coin is HH] / Prob[observe H]

We are missing two things: Prob[coin is HH] and Prob[observe H]....



Conditional Probabilities: Our Earlier Example


The term Prob[coin is HH] is called the prior probability, i.e. it is our
belief that the coin is unfair before we have any observations. This is
assumed to be given in some of the problems that we will be considering,
so let's say for now that Prob[coin is HH] = 0.5.

The term Prob[observe H] is the unconditional probability that we
observe heads. Can we calculate this?

Theorem (Total Probability Theorem)
If the events B_1, . . . , B_n are mutually exclusive, i.e. Prob[B_i, B_j] = 0
for all i ≠ j, and exhaustive, i.e. Σ_{i=1}^{n} Prob[B_i] = 1, then

    Prob[A] = Σ_{i=1}^{n} Prob[A | B_i] Prob[B_i].

So how can we use this result to compute Prob[observe H]?
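Applying the theorem to the coin example, with the 0.5 priors assumed above (a sketch of the computation):

```python
# Total probability: Prob[H] = Prob[H|HT] Prob[HT] + Prob[H|HH] Prob[HH]
p_H_given = {"HT": 0.5, "HH": 1.0}
prior = {"HT": 0.5, "HH": 0.5}

p_H = sum(p_H_given[s] * prior[s] for s in prior)
print(p_H)  # 0.75

# Bayes' rule: Prob[HH | H] = Prob[H | HH] Prob[HH] / Prob[H]
print(p_H_given["HH"] * prior["HH"] / p_H)  # 0.666...
```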



Independence of Events
Two events are independent if and only if their joint probability is equal to
the product of their individual probabilities, i.e.

    Prob[A, B] = Prob[A] Prob[B]

Lots of events can be assumed to be independent. For example, suppose
you flip a coin twice with A = "the first coin flip is heads", B = "the
second coin flip is heads", and C = "both coin flips are heads".

- Are A and B independent?
- Are A and C independent?

Note that when events A and B are independent,

    Prob[A | B] = Prob[A, B] / Prob[B] = Prob[A] Prob[B] / Prob[B] = Prob[A].

This should be intuitively satisfying, since knowing B happened doesn't
give you any useful information about A.

Conditional Probability Example


[Leon-Garcia p. 50] An urn contains two black balls and three white balls.
Two balls are selected at random from the urn (without replacement).

1. What is the probability that the first ball you select from the urn is
black?
2. Given that the first ball that you select from the urn is black, what is
the probability that the second ball you select from the urn is also
black?
3. What is the probability that both balls you select are black?
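A Monte Carlo sketch (ours) that approximates the three answers (analytically 2/5, 1/4, and 2/5 · 1/4 = 1/10):

```python
import random

random.seed(0)
N = 200_000
first_black = both_black = 0

for _ in range(N):
    urn = ["B", "B", "W", "W", "W"]  # two black balls, three white balls
    random.shuffle(urn)
    if urn[0] == "B":                # first draw (without replacement)
        first_black += 1
        if urn[1] == "B":            # second draw
            both_black += 1

print(first_black / N)               # ~0.4  = Prob[first black]
print(both_black / first_black)      # ~0.25 = Prob[second black | first black]
print(both_black / N)                # ~0.1  = Prob[both black]
```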

Another Conditional Probability Example


An urn contains one ball known to be either black or white with equal
probability. A white ball is added to the urn, the urn is shaken, and a ball
is removed randomly from the urn.
1. If the ball removed from the urn is black, what is the probability that
the ball remaining in the urn is white?

2. If the ball removed from the urn is white, what is the probability that
the ball remaining in the urn is white?


Joint Random Variables


When we have two random variables, we require a joint distribution. The joint
CDF is defined as

    F_{X,Y}(x, y) = Prob[X ≤ x ∩ Y ≤ y] = Prob[X ≤ x, Y ≤ y]

and the joint PDF is defined as

    p_{X,Y}(x, y) = ∂² F_{X,Y}(x, y) / ∂x ∂y

If you know the joint distribution, you can get the marginal distributions:

    F_X(x) = F_{X,Y}(x, ∞)                  F_Y(y) = F_{X,Y}(∞, y)
    p_X(x) = ∫_{−∞}^{∞} p_{X,Y}(x, y) dy    p_Y(y) = ∫_{−∞}^{∞} p_{X,Y}(x, y) dx

Marginals are not enough to specify the joint distribution, except in special cases.
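In the discrete case, marginalization is just summing out the other variable; a toy sketch (ours, with a made-up joint PMF):

```python
import numpy as np

# Made-up joint PMF: p_xy[i, j] = Prob[X = x_i, Y = y_j]
p_xy = np.array([[0.10, 0.20],
                 [0.30, 0.40]])

p_x = p_xy.sum(axis=1)  # marginal of X: sum over y -> [0.3, 0.7]
p_y = p_xy.sum(axis=0)  # marginal of Y: sum over x -> [0.4, 0.6]
print(p_x, p_y, p_xy.sum())  # total mass is 1.0
```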

Joint Statistics
Note that the means and variances are defined as usual for X and Y.
When we have a joint distribution, we have two new statistical quantities:

Definition
The correlation of the random variables X and Y is defined as

    E[XY] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y p_{X,Y}(x, y) dx dy.

Definition
The covariance of the random variables X and Y is defined as

    cov[X, Y] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − E[X]) (y − E[Y]) p_{X,Y}(x, y) dx dy.
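Both quantities are easy to estimate from samples; a sketch (ours, with a made-up dependence between X and Y):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
x = rng.standard_normal(n)
y = 0.5 * x + rng.standard_normal(n)  # Y is correlated with X

corr = np.mean(x * y)                           # estimate of E[XY]
cov = np.mean((x - x.mean()) * (y - y.mean()))  # estimate of cov[X, Y]
print(corr, cov)  # both ~0.5 here, since E[X] = E[Y] = 0
```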


Conditional Distributions
If X and Y are both discrete random variables (both are drawn from finite sets)
with Prob[X = x] > 0, then

    Prob[Y = y | X = x] = Prob[X = x, Y = y] / Prob[X = x]

If Y is a discrete random variable and X is a continuous random variable, then
the conditional probability that Y = y given X = x is

    Prob[Y = y | X = x] = p_{X,Y}(x, y) / p_X(x)

where p_{X,Y}(x, y) := lim_{h→0} Prob[x − h < X ≤ x, Y = y] / h is the joint
PDF-probability of the random variable X and the event Y = y.

If X and Y are both continuous random variables, then the conditional PDF of Y
given X = x is

    p_Y(y | X = x) = p_Y(y | x) = p_{X,Y}(x, y) / p_X(x)

with p_{X,Y}(x, y) as the usual joint distribution of X and Y.



Conditional Statistics
Definition
The conditional mean of a random variable Y given X = x is defined as

    E[Y | x] = ∫_{−∞}^{∞} y p_Y(y | x) dy.

The definition is identical to the regular mean except that we use the
conditional PDF.

Definition
The conditional variance of a random variable Y given X = x is defined as

    var[Y | x] = ∫_{−∞}^{∞} (y − E[Y | x])² p_Y(y | x) dy.


Independence of Random Variables


Two random variables are independent if and only if their joint distribution
is equal to the product of their marginal distributions, i.e.

    p_{X,Y}(x, y) = p_X(x) p_Y(y).

If X and Y are independent, the conditional PDFs can be written as

    p_Y(y | x) = p_{X,Y}(x, y) / p_X(x) = p_X(x) p_Y(y) / p_X(x) = p_Y(y)

and

    p_X(x | y) = p_{X,Y}(x, y) / p_Y(y) = p_X(x) p_Y(y) / p_Y(y) = p_X(x).

These results should be intuitively satisfying, since knowing X = x (or
Y = y) doesn't tell you anything about Y (or X).

Jointly Gaussian Random Variables


Definition: The random variables X = [X_1, . . . , X_k]^T are jointly Gaussian
if their joint density is given as

    p_X(x) = |2πP|^{−1/2} exp( −(x − μ_X)^T P^{−1} (x − μ_X) / 2 )

where μ_X = E[X] and P = E[(X − μ_X)(X − μ_X)^T].

Remarks:

1. μ_X = [E[X_1], . . . , E[X_k]]^T is a k-dimensional vector of means.
2. P is a k × k matrix of covariances, i.e.,

    P = [ E[(X_1 − μ_{X_1})(X_1 − μ_{X_1})]  . . .  E[(X_1 − μ_{X_1})(X_k − μ_{X_k})]
                      ...                    ...                ...
          E[(X_k − μ_{X_k})(X_1 − μ_{X_1})]  . . .  E[(X_k − μ_{X_k})(X_k − μ_{X_k})] ]
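A sketch (ours) evaluating this density for a made-up mean vector and covariance matrix, checked against scipy:

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -1.0])  # made-up mean vector
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # made-up covariance matrix
x = np.array([0.5, 0.0])

# p_X(x) = |2 pi P|^{-1/2} exp(-(x - mu)^T P^{-1} (x - mu) / 2)
d = x - mu
density = (np.linalg.det(2 * np.pi * P) ** -0.5
           * np.exp(-0.5 * d @ np.linalg.solve(P, d)))
print(density)
print(multivariate_normal(mu, P).pdf(x))  # should agree
```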

Where We Are Heading


We are going to begin our study of detection and estimation by learning
the fundamental concepts of hypothesis testing.

Hypothesis testing involves making inferences about unknown things
(states of nature) from observations. Examples:

- Infer if the coin is fair or unfair after observing one or more flips.

  [Diagram: states of nature θ ∈ {HT, HH} mapped through p(y | θ) to observations y.]

- Infer if the airplane is friend or foe by observing a radar signature.

We will be working with lots of conditional probability expressions in our
study of hypothesis testing. You should feel comfortable with this material.

Where We Will Be in April: Kalman Filtering


2008 Charles Stark Draper Prize: Dr. Rudolph Kalman

The Kalman Filter uses a mathematical technique that removes noise from
series of data. From incomplete information, it can optimally estimate and
control the state of a changing, complex system over time. Applications
include target tracking by radar, global positioning systems, hydrological
modeling, atmospheric observations, and time-series analyses in
econometrics.
(paraphrased from http://www.nae.edu/)