Week Two Notes


2.0 Mathematical Expectation

A very important concept in probability and statistics is that of the mathematical expectation (also called the expected value or, briefly, the expectation) of a random variable.

2.1 Expectation of a Discrete Random Variable

For a discrete random variable X having the possible values x1, x2, …, xn, the

expectation of X is defined as:

E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + \cdots + x_n P(X = x_n) = \sum_{i=1}^{n} x_i P(X = x_i),

or equivalently, if P(X = x_i) = f(x_i),

E(X) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_n f(x_n) = \sum_{i=1}^{n} x_i f(x_i) = \sum_x x f(x),

where the last summation is taken over all appropriate values of x.

If the probabilities are all equal, then

E(X) = \frac{x_1 + x_2 + \cdots + x_n}{n},

which is called the arithmetic mean, or simply the mean, of x_1, x_2, ..., x_n. When X takes on an infinite number of values x_1, x_2, ..., we have

E(X) = \sum_{i=1}^{\infty} x_i f(x_i),

provided that the infinite series converges absolutely.

Example 2.1

Suppose that a game is to be played with a single die assumed fair. In this game,

a player wins KShs. 20 if a 2 turns up; KShs. 40 if a 4 turns up; loses KShs. 30

if a 6 turns up; while the player neither wins nor loses if any other face turns

up. Find the expected sum of money to be won.

Solution.

Let X be the random variable giving the amount of money won on any toss. Then X takes the values -30, 0, 20 and 40.

The probability function is given in Table 2.1.

Table 2.1

x      -30    0     20    40
f(x)   1/6    1/2   1/6   1/6

∴ E(X) = \sum x f(x) = (-30)(1/6) + (0)(1/2) + (20)(1/6) + (40)(1/6)

= -30/6 + 20/6 + 40/6 = 30/6 = 5.

It follows that the player can expect to win KShs. 5. In a fair game, therefore, the player should expect to pay KShs. 5 in order to play the game.
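This arithmetic is easy to confirm in a few lines of Python. The sketch below (an addition, not part of the original notes) uses exact fractions so that no rounding enters:

from fractions import Fraction

# winnings mapped to probabilities: faces 1, 3 and 5 give 0 (probability 3/6 = 1/2)
dist = {-30: Fraction(1, 6), 0: Fraction(1, 2),
        20: Fraction(1, 6), 40: Fraction(1, 6)}

expectation = sum(x * p for x, p in dist.items())
print(expectation)   # 5, i.e. the player expects to win KShs. 5 per game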

2.2 Expectation of a Continuous Random Variable

For a continuous random variable X having density function f(x), the expectation of X is defined as

E(X) = \int_{-\infty}^{\infty} x f(x)\,dx,

provided that the integral converges absolutely. The expectation of X is very often called the mean of X and is denoted by μ_x, or simply μ when the particular random variable is understood. The mean or expectation of X gives a single value that acts as a representative or average of the values of X, and for this reason it is often called a measure of central tendency.

Example 2.2

1
 x, 0<x<2
The density function of a random variable X is given by f ( x)   2 .
0, otherwise

Find the expected value of X.

Solution.


E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^2 x \cdot \frac{1}{2}x\,dx = \frac{1}{2}\int_0^2 x^2\,dx = \frac{1}{2}\left[\frac{x^3}{3}\right]_0^2 = \frac{1}{2} \times \frac{8}{3} = \frac{4}{3}.

2.3 Functions of Random Variables

Let X be a discrete random variable with probability function f(x). Then Y=g(X) is

also a discrete random variable and the probability function of Y is:

h(y) = P(Y = y) = \sum_{x:\,g(x)=y} P(X = x) = \sum_{x:\,g(x)=y} f(x).

If X takes on values x1, x2, …, xn and Y the values y1, y2, …, ym (m≤n), then

y1h(y1)+ y2h(y2)+…+ ymh(ym)= g(x1)f(x1)+ g(x2)f(x2)+…+ g(xn)f(xn).

Therefore

E[g(X)] = g(x_1)f(x_1) + g(x_2)f(x_2) + \cdots + g(x_n)f(x_n) = \sum_{i=1}^{n} g(x_i) f(x_i) = \sum_x g(x) f(x).

Similarly, if X is a continuous random variable having probability density function f(x), then it can be shown that

E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.




Example 2.3

Let X be a random variable with density function

f(x) = \begin{cases} \frac{x}{2}, & 0 < x < 2 \\ 0, & \text{otherwise.} \end{cases}

Find E(3X^2 - 2X).

Solution.

 2

  3x  2 x  f ( x)dx    3x 2  2 x . dx
x
E(3x2-2x)= 2

 0
2

2
1 3 2 
2

=   3 x 3  2 x 2 dx   x 4  x 3 
1
20 2 4 3 0

1 3
 2    2    12    6  
4 2 3 1 16 8 10
= 
2 4 3  2  3 3 3

Some Theorems on Expectation

1) If K is any constant, then E(KX)=KE(X)

2) If X and Y are any random variables, then E (X+Y) = E(X) + E(Y)

3) If X and Y are independent, then E(XY)= E(X).E(Y)
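These three properties are easy to check empirically. The sketch below is an illustration only, with one arbitrary choice of independent X and Y; the sample averages should be close to, not exactly equal to, the theoretical values:

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
X = rng.exponential(scale=2.0, size=n)   # E(X) = 2
Y = rng.uniform(0.0, 1.0, size=n)        # E(Y) = 1/2, independent of X

print((5 * X).mean())    # ~ 10  = 5 E(X)        (theorem 1)
print((X + Y).mean())    # ~ 2.5 = E(X) + E(Y)   (theorem 2)
print((X * Y).mean())    # ~ 1.0 = E(X) E(Y)     (theorem 3)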

2.4 The Variance and Standard Deviation


Another quantity of great importance in probability and statistics is called the variance, defined by var(X) = E[(X - μ)^2], where μ is the mean of the random variable X. The variance is a non-negative number. The positive square root of the variance is called the standard deviation and is given by

σ_x = \sqrt{var(X)} = \sqrt{E[(X - μ)^2]}.

Where no confusion can result, the standard deviation is often denoted by σ instead of σ_x, and the variance in such a case is σ^2.

If X is a discrete random variable taking on the values of x1, x2, …, xn and having

probability function f(x), then the variance is given by

σ_x^2 = E[(X - μ)^2] = \sum_{i=1}^{n} (x_i - μ)^2 f(x_i) = \sum_x (x - μ)^2 f(x).

If the probabilities are all equal, we have

σ^2 = \frac{1}{n} \sum (x - μ)^2.

If X is a continuous random variable having density function f(x), then the

variance is given by


  E ( x   ) 2    (x  ) f ( x)dx provided the integral converges.
2 2
x


The variance (or standard deviation) is a measure of dispersion or scatter of the

values of the random variable about the mean μ. If the values tend to be

concentrated near the mean, the variance is small while if the values tend to be

distributed far from the mean, the variance is large.

[Figure: two density curves f(x) with the same mean, one with small variance (values concentrated near the mean) and one with large variance (values spread far from the mean).]

Example 2.4

A random variable X has a probability density function given by

f(x) = \begin{cases} \frac{x}{2}, & 0 < x < 2 \\ 0, & \text{otherwise.} \end{cases}

Find the variance and standard deviation of X.

Solution

var(X) = σ_x^2 = E[(X - μ)^2] = \int_{-\infty}^{\infty} (x - μ)^2 f(x)\,dx = \int_0^2 (x - μ)^2 \cdot \frac{x}{2}\,dx

But μ = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^2 x \cdot \frac{x}{2}\,dx = \frac{1}{2}\int_0^2 x^2\,dx = \frac{1}{2}\left[\frac{x^3}{3}\right]_0^2 = \frac{1}{2} \times \frac{8}{3} = \frac{4}{3}.

∴ var(X) = σ_x^2 = \int_0^2 \left(x - \frac{4}{3}\right)^2 \cdot \frac{x}{2}\,dx = \frac{1}{2}\int_0^2 \left(x^3 - \frac{8}{3}x^2 + \frac{16}{9}x\right) dx

= \frac{1}{2}\left[\frac{x^4}{4} - \frac{8}{9}x^3 + \frac{8}{9}x^2\right]_0^2 = \frac{1}{2}\left(4 - \frac{64}{9} + \frac{32}{9}\right) = \frac{2}{9}.

Hence the standard deviation is σ_x = \sqrt{\frac{2}{9}} = \frac{\sqrt{2}}{3}.
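As a check, the same computation in SymPy (a sketch, assuming the library is installed):

import sympy as sp

x = sp.symbols('x')
f = x / 2
mu = sp.integrate(x * f, (x, 0, 2))               # 4/3
var = sp.integrate((x - mu)**2 * f, (x, 0, 2))    # 2/9
print(mu, var, sp.sqrt(var))                      # 4/3  2/9  sqrt(2)/3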

Some Theorems on Variance


1) σ2 = E [(X-μ )2] = E(X2) - [E(X)]2 ; where μ = E(X).

2) If K is any constant, then var(KX) = K2 var (X)

3) The quantity E[(X-a)2] is a minimum when a=μ =E(x).

4) If X and Y are any two random variables, then

i). var(X+Y) = var(X) + var(Y) +2Cov(X,Y)

ii). var(X-Y) = var(X) + var(Y) - 2Cov(X,Y)

5) If X and Y are independent random variables, then

i). var(X+Y) = var(X) + var(Y)

ii). var(X-Y) = var(X) + var(Y), since var(-Y) = var(Y)
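Note that in 5(ii) the variances still add. A Monte Carlo sketch (illustration only, with arbitrarily chosen distributions) makes the point:

import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
X = rng.normal(0.0, 2.0, size=n)   # var(X) = 4
Y = rng.normal(0.0, 3.0, size=n)   # var(Y) = 9, independent of X

print(np.var(X + Y))   # ~ 13 = var(X) + var(Y)
print(np.var(X - Y))   # ~ 13 as well; the variances add, they do not subtract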

2.5 Moments and Moment Generating Functions

The nth moment of a random variable X about the mean μ, also called the nth central moment, is defined as μ_n = E[(X - μ)^n], where n = 0, 1, 2, ….

It follows that μ0=1, μ1=0 and μ2=σ2 i.e. the second central moment or the second

moment about the mean is the variance. Assuming absolute convergence, we

have the following:

i). μ_n = \sum_x (x - μ)^n f(x) when X is discrete;

ii). μ_n = \int_{-\infty}^{\infty} (x - μ)^n f(x)\,dx when X is continuous.


The nth moment of X about the origin, also called the nth raw moment, is given by μ_n' = E(X^n), where n = 0, 1, 2, …, and in this case there are formulae analogous to (i) and (ii) in which μ = 0. The relationship between these moments is given by

μ_n = μ_n' - \binom{n}{1} μ_{n-1}' μ + \binom{n}{2} μ_{n-2}' μ^2 - \cdots + (-1)^i \binom{n}{i} μ_{n-i}' μ^i + \cdots + (-1)^n μ_0' μ^n.

As special cases, noting that μ_1' = μ and μ_0' = 1, we have:

μ_2 = μ_2' - μ^2;  μ_3 = μ_3' - 3μ_2' μ + 2μ^3;  μ_4 = μ_4' - 4μ_3' μ + 6μ_2' μ^2 - 3μ^4.

The moment generating function of X is defined by M_X(t) = E(e^{tX}). Thus, assuming convergence:

i). M_X(t) = \sum_x e^{tx} f(x) if X is discrete;

ii). M_X(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx if X is continuous.



Using Taylor series expansion, we can easily show that

M_X(t) = 1 + tμ + \frac{t^2}{2!} μ_2' + \cdots + \frac{t^n}{n!} μ_n' + \cdots

Since the coefficients in this expansion enable us to find the moments, the

reason for the name moment generating function is apparent. From the

expansion, we can show that

μ_n' = \frac{d^n}{dt^n} M_X(t) \bigg|_{t=0},

i.e. μ_n' is the nth derivative of M_X(t) evaluated at t = 0. Where no confusion can result, we often write M(t) instead of M_X(t).

Example 2.5

n n i n


Prove that n  n'    n' 1    n'  2  2  ...   1   n' i  i  ...  (1) n 0'  n
1  2 i 

Solution.

 n  E  X    
n
 

 n  n i  n 
= E  X n    X n 1    X n  2  2  ...   1   X n i  i  ...  (1) n 0'  n 
 1   2 i  

n n i n


E  X n     E  X n 1      E  X n  2   2  ...   1   E  X n i   i  ...  (1) n 0'  n
= 1  2 i 

10
n n n
n'    n' 1    n'  2  2  ...   1   n' i  i  ...  (1) n 0'  n
i

= 1  2 i 

Example 2.6

Prove that M_X(t) = 1 + tμ + \frac{t^2}{2!} μ_2' + \cdots + \frac{t^n}{n!} μ_n' + \cdots

Solution

The Taylor series expansion of e^{tx} is

e^{tx} = 1 + tx + \frac{(tx)^2}{2!} + \frac{(tx)^3}{3!} + \cdots + \frac{(tx)^n}{n!} + \cdots

Therefore,

E(e^{tX}) = E\left[1 + tX + \frac{t^2 X^2}{2!} + \frac{t^3 X^3}{3!} + \cdots + \frac{t^n X^n}{n!} + \cdots\right]

= 1 + tE(X) + \frac{t^2}{2!} E(X^2) + \frac{t^3}{3!} E(X^3) + \cdots + \frac{t^n}{n!} E(X^n) + \cdots

= 1 + tμ + \frac{t^2}{2!} μ_2' + \frac{t^3}{3!} μ_3' + \cdots + \frac{t^n}{n!} μ_n' + \cdots

Example 2.7

A random variable X can assume the values 1 and -1 with probability ½ each. Find

a) Moment generating function

b) The first four moments about the origin

Solution.

a) The probability function is

f(x) = \begin{cases} \frac{1}{2}, & x = -1, 1 \\ 0, & \text{otherwise.} \end{cases}

M(t) = E(e^{tX}) = \sum_{x=-1,1} e^{tx} f(x) = \sum_{x=-1,1} e^{tx} \cdot \frac{1}{2} = \frac{1}{2}\left(e^{-t} + e^{t}\right).

b) Since μ_n' is the nth derivative of M(t) evaluated at t = 0, we have

μ_1' = \frac{d}{dt} M(t) \bigg|_{t=0} = \frac{1}{2}\left(-e^{-t} + e^{t}\right)\bigg|_{t=0} = \frac{1}{2}(-1 + 1) = 0

μ_2' = \frac{d^2}{dt^2} M(t) \bigg|_{t=0} = \frac{1}{2}\left(e^{-t} + e^{t}\right)\bigg|_{t=0} = 1

μ_3' = \frac{d^3}{dt^3} M(t) \bigg|_{t=0} = \frac{1}{2}\left(-e^{-t} + e^{t}\right)\bigg|_{t=0} = 0

μ_4' = \frac{d^4}{dt^4} M(t) \bigg|_{t=0} = \frac{1}{2}\left(e^{-t} + e^{t}\right)\bigg|_{t=0} = 1

Alternatively, from the series

e^{t} = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots + \frac{t^n}{n!} + \cdots and e^{-t} = 1 - t + \frac{t^2}{2!} - \frac{t^3}{3!} + \cdots + (-1)^n \frac{t^n}{n!} + \cdots,

we obtain

M(t) = \frac{1}{2}\left(e^{t} + e^{-t}\right) = 1 + \frac{t^2}{2!} + \frac{t^4}{4!} + \cdots ………………………..(2.1)

But by Example 2.6,

M(t) = E(e^{tX}) = 1 + tμ_1' + \frac{t^2}{2!} μ_2' + \frac{t^3}{3!} μ_3' + \frac{t^4}{4!} μ_4' + \cdots …………………………………………….(2.2)

Comparing (2.1) and (2.2), we have μ_1' = 0, μ_2' = 1, μ_3' = 0 and μ_4' = 1, i.e. the odd moments are all zero and the even moments are all one.
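Both routes can be confirmed symbolically; a minimal SymPy sketch of part (b):

import sympy as sp

t = sp.symbols('t')
M = (sp.exp(-t) + sp.exp(t)) / 2            # the MGF found in part (a)

for n in range(1, 5):
    print(n, sp.diff(M, t, n).subs(t, 0))   # prints 0, 1, 0, 1 for n = 1..4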

2.6 Markov and Chebychev Inequalities

2.6.1 Markov’s inequality

If X is a random variable that takes only non-negative values, then for any value

a>0,

P(X ≥ a) ≤ \frac{E(X)}{a}.

Proof

We give a proof for the case where X is continuous with density f(x).


E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} x f(x)\,dx \quad (since X ≥ 0)

= \int_0^{a} x f(x)\,dx + \int_a^{\infty} x f(x)\,dx

≥ \int_a^{\infty} x f(x)\,dx ≥ \int_a^{\infty} a f(x)\,dx = a \int_a^{\infty} f(x)\,dx = a P(X ≥ a).

Thus E(X) ≥ a P(X ≥ a), and hence

P(X ≥ a) ≤ \frac{E(X)}{a}.

Example 2.8

Suppose we know that the number of items produced in a factory during a week

is a random variable with mean 500. What can be said about the probability that

this week’s production will be at least 1000?

Solution

By Markov’s inequality,

P(X ≥ 1000) ≤ \frac{E(X)}{1000} = \frac{500}{1000} = \frac{1}{2}.
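The bound holds no matter how production is actually distributed. As an illustration only, the sketch below assumes (hypothetically, since the example specifies no distribution) an exponential law with mean 500 and compares the empirical tail with the Markov bound:

import numpy as np

rng = np.random.default_rng(2)
sample = rng.exponential(scale=500.0, size=1_000_000)   # assumed model, mean 500

print((sample >= 1000).mean())   # ~ e^-2 = 0.135, well below the bound
print(500 / 1000)                # Markov bound: 0.5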

2.6.2 Chebychev’s Inequality

Suppose that X is a random variable (discrete or continuous) having mean μ and variance σ^2, both finite. Then, if ε is any positive number,

P(|X - μ| ≥ ε) ≤ \frac{σ^2}{ε^2},

or equivalently, with ε = kσ,

P(|X - μ| ≥ kσ) ≤ \frac{1}{k^2},

where 1/k^2 is called the upper bound of P(|X - μ| ≥ kσ).

For example, if k = 2, then P(|X - μ| ≥ 2σ) ≤ 0.25 or P(|X - μ| < 2σ) ≥ 0.75.

In other words, the probability of X differing from its mean by 2 or more standard deviations is less than or equal to 0.25; equivalently, the probability that X will lie within 2 standard deviations of its mean is greater than or equal to 0.75.

Note:

In Chebychev’s inequality, we do not need to specify the probability distribution

of X. This inequality reveals a general property of discrete or continuous random

variables having finite mean and variance. More generally, the importance of Chebychev’s and Markov’s inequalities is that they enable us to derive bounds on probabilities when only the mean, or both the mean and the variance, of the probability distribution are known.

Example 2.9

2e2 x , x0
A random variable X has a density function given by f ( x)  
0, elsewhere

a. Find p (|x-μ|≥1)

b. Use Chevychev’s inequality to obtain an upper bound on p (|x-μ|≥1)

Solution.

a. P(|X - μ| ≤ 1) = P(-1 ≤ X - μ ≤ 1) = P(μ - 1 ≤ X ≤ μ + 1)

But μ = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^{\infty} x \cdot 2e^{-2x}\,dx = 2\int_0^{\infty} x e^{-2x}\,dx

Integrating by parts,

= 2\left[-\frac{x e^{-2x}}{2} - \frac{e^{-2x}}{4}\right]_0^{\infty} = 2\left[(0 - 0) - \left(0 - \frac{1}{4}\right)\right] = 2 \times \frac{1}{4} = \frac{1}{2}.

Alternatively,

μ = \frac{d}{dt} M(t) \bigg|_{t=0}, where M(t) = E(e^{tX}).

M(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx = \int_0^{\infty} e^{tx} \cdot 2e^{-2x}\,dx = 2\int_0^{\infty} e^{-(2-t)x}\,dx

= 2\left[\frac{-e^{-(2-t)x}}{2-t}\right]_0^{\infty} = 2\left(0 + \frac{1}{2-t}\right) = \frac{2}{2-t}, \quad t < 2.

∴ μ = \frac{d}{dt}\left(\frac{2}{2-t}\right)\bigg|_{t=0} = \frac{2}{(2-t)^2}\bigg|_{t=0} = \frac{2}{4} = \frac{1}{2}.

Therefore P(|X - μ| ≤ 1) = P(-1 + ½ ≤ X ≤ 1 + ½)

= P(-½ ≤ X ≤ 3/2)

= P(-½ ≤ X < 0) + P(0 ≤ X ≤ 3/2)

= 0 + \int_0^{3/2} 2e^{-2x}\,dx = \left[-e^{-2x}\right]_0^{3/2} = -e^{-3} + e^{0} = 1 - e^{-3}.

Hence P(|X - μ| ≥ 1) = 1 - (1 - e^{-3}) = e^{-3} = 0.04979.
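A numerical cross-check of part (a), as a sketch assuming NumPy and SciPy (note that μ - 1 = -½ < 0, so the interval is truncated at 0):

import numpy as np
from scipy.integrate import quad

mu = 0.5
f = lambda x: 2 * np.exp(-2 * x)   # the density for x > 0

inside, _ = quad(f, 0, mu + 1)     # P(|X - mu| <= 1) = P(0 <= X <= 3/2)
print(1 - inside)                  # 0.049787... = e^-3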

b. Using the inequality

P(|X - μ| ≥ kσ) ≤ \frac{1}{k^2},

we set kσ = 1 …………………………………………………………………(3.2.1)

But σ^2 = E(X^2) - [E(X)]^2 = E(X^2) - ¼.

Now

E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx = \int_0^{\infty} 2x^2 e^{-2x}\,dx

Integrating by parts twice,

= \left[-x^2 e^{-2x}\right]_0^{\infty} + 2\int_0^{\infty} x e^{-2x}\,dx = \left[-x^2 e^{-2x} - x e^{-2x} - \frac{e^{-2x}}{2}\right]_0^{\infty}

= (0 - 0 - 0) - \left(0 - 0 - \frac{e^{0}}{2}\right) = \frac{1}{2}.

Therefore σ^2 = ½ - ¼ = ¼ ⇒ σ = ½.

On substituting in (3.2.1), we obtain k × ½ = 1 ⇒ k = 2.

Hence the upper bound is 1/k^2 = ¼.

Alternatively,

E(X^2) = μ_2' = \frac{d^2}{dt^2} M(t) \bigg|_{t=0} = \frac{d}{dt}\left(\frac{2}{(2-t)^2}\right)\bigg|_{t=0} = \frac{4}{(2-t)^3}\bigg|_{t=0} = \frac{4}{8} = \frac{1}{2}.

∴ σ^2 = ½ - ¼ = ¼.

Since kσ = 1 ⇒ k × ½ = 1

∴ k = 2.

Hence upper bound = ¼.
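The whole example condenses to a few SymPy lines (a sketch), which also show how loose the Chebychev bound is here: the exact probability is about 0.0498 against the bound of ¼:

import sympy as sp

x = sp.symbols('x', positive=True)
f = 2 * sp.exp(-2 * x)

mu = sp.integrate(x * f, (x, 0, sp.oo))               # 1/2
var = sp.integrate((x - mu)**2 * f, (x, 0, sp.oo))    # 1/4
k = 1 / sp.sqrt(var)                                  # k*sigma = 1 gives k = 2

exact = 1 - sp.integrate(f, (x, 0, mu + 1))           # e^-3
print(sp.N(exact), 1 / k**2)                          # 0.0498 versus bound 1/4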
