Unit 3 - Probability and Stochastic Processes (18MAB203T)


Probability and Stochastic Processes (Unit III)

Department of Mathematics,
College of Engineering and Technology
SRM Institute of Science and Technology
Markov Inequality

Markov’s inequality gives an upper bound for the probability that a non-negative random variable is greater than or equal to some positive constant.

Theorem
If X is any nonnegative random variable, then
P(X ≥ c) ≤ E(X)/c, for any c > 0. (1)
Proof (X is discrete):

E(X) = Σ_x x P(X = x)
     = Σ_{x ≥ c} x P(X = x) + Σ_{x < c} x P(X = x)
     ≥ Σ_{x ≥ c} x P(X = x)      (for any c > 0)
     ≥ Σ_{x ≥ c} c P(X = x)      (since x ≥ c)
     = c Σ_{x ≥ c} P(X = x)
     = c P(X ≥ c).

Hence,

P(X ≥ c) ≤ E(X)/c.
Proof (X is continuous):

E(X) = ∫_{−∞}^{∞} x f(x) dx
     = ∫_0^∞ x f(x) dx      (since X is non-negative)
     ≥ ∫_c^∞ x f(x) dx      (for any c > 0)
     ≥ ∫_c^∞ c f(x) dx      (since x ≥ c on the region of integration)
     = c ∫_c^∞ f(x) dx
     = c P(X ≥ c).

Hence,

P(X ≥ c) ≤ E(X)/c.
Other form of Markov’s inequality
For the event that X is at least k times its mean E(X),

P(X ≥ k E(X)) ≤ 1/k, for any k > 0. (2)

Example 1
Suppose that the average grade on the upcoming math exam is 70%. Give an upper bound on the proportion of students who score at least 90%.

P(X ≥ 90) ≤ E(X)/90 = 70/90 = 7/9.
Example 2
Suppose X ∼ Binomial(100, 1/2). Find an upper bound of P(X ≥ 75).

Here, n = 100, p = 1/2. Mean E(X) = np = 50.

P(X ≥ 75) ≤ E(X)/75 = 50/75 = 2/3.
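The bound can be compared with the actual tail probability by simulation. A minimal Python sketch, assuming NumPy is available (the seed and the 10^5 sample size are arbitrary choices):

    import numpy as np

    # Estimate P(X >= 75) for X ~ Binomial(100, 1/2) by Monte Carlo and
    # compare it with the Markov bound E(X)/75 = 2/3.
    rng = np.random.default_rng(0)                 # fixed seed (arbitrary)
    samples = rng.binomial(n=100, p=0.5, size=100_000)

    empirical = np.mean(samples >= 75)             # Monte Carlo tail estimate
    markov_bound = 50 / 75                         # E(X)/c with E(X) = 50, c = 75

    print(f"empirical P(X >= 75) ~ {empirical}")
    print(f"Markov bound         = {markov_bound:.4f}")

The true tail probability is far below 2/3 (the simulated estimate is typically 0 at this sample size), so Markov's bound, while valid, is very loose here: it uses only the mean of X.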

General form
If U(X) is a non-negative function of the random variable X, then

P(U(X) ≥ c) ≤ E(U(X))/c, for any c > 0. (3)
Chebyshev’s Inequality

If we know the probability distribution (i.e., the pdf or pmf) of a random variable X, we may compute E(X) and Var(X). Conversely, if only E(X) and Var(X) are known, we cannot reconstruct the distribution of X, and hence cannot compute quantities such as P{|X − E(X)| ≤ c} exactly. Although we cannot evaluate such probabilities, Chebyshev’s inequality provides bounds on them.

Theorem
If X is a random variable with E(X) = µ and Var(X) = σ², then

P{|X − µ| ≥ c} ≤ σ²/c², for any c > 0. (4)

or, equivalently,

P{|X − µ| < c} ≥ 1 − σ²/c², for any c > 0. (5)
Alternative forms
If X is a random variable with E(X) = µ and Var(X) = σ², then putting c = kσ, where k > 0, gives

P{|X − µ| ≥ kσ} ≤ 1/k², (6)

or,

P{|X − µ| < kσ} ≥ 1 − 1/k². (7)
Example 3
Let X be a RV with E(X) = 3 and E(X²) = 13. Find a lower bound for P(−2 < X < 8).

σ² = E(X²) − {E(X)}² = 13 − 9 = 4

P(−2 < X < 8) = P(|X − 3| < 5) ≥ 1 − σ²/5² = 1 − 4/25 = 21/25

Example 4
Two dice are thrown once. If X is the sum of the numbers showing up, prove that P(|X − 7| ≥ 3) ≤ 35/54, and compare this bound with the exact probability.

X      2     3     4     5     6     7     8     9     10    11    12
P(X)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
E(X) = Σ_x x P(X = x) = 7

E(X²) = Σ_x x² P(X = x) = 1974/36

Var(X) = σ² = 1974/36 − 7² = 35/6

By Chebyshev’s inequality,

P(|X − 7| ≥ 3) ≤ (35/6)/3² = 35/54 ≈ 0.6481.

Exact probability:

P(|X − 7| ≥ 3) = 1 − P(|X − 7| < 3) = 1 − P(4 < X < 10)
             = 1 − {P(5) + P(6) + P(7) + P(8) + P(9)}
             = 1 − 24/36 = 0.3333.
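The comparison can be reproduced exactly with a short Python sketch using rational arithmetic from the standard library (an illustration, not part of the derivation):

    from itertools import product
    from fractions import Fraction

    # Exact pmf of the sum of two fair dice; compare the Chebyshev bound on
    # P(|X - 7| >= 3) with the exact probability.
    sums = [a + b for a, b in product(range(1, 7), repeat=2)]
    pmf = {s: Fraction(sums.count(s), 36) for s in set(sums)}

    mean = sum(s * p for s, p in pmf.items())                 # = 7
    var = sum((s - mean) ** 2 * p for s, p in pmf.items())    # = 35/6

    bound = var / 3**2                                        # sigma^2 / c^2 with c = 3
    exact = sum(p for s, p in pmf.items() if abs(s - mean) >= 3)

    print(f"Chebyshev bound = {bound} ({float(bound):.4f})")  # 35/54 (0.6481)
    print(f"exact           = {exact} ({float(exact):.4f})")  # 1/3  (0.3333)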
Example 5
Using Chebyshev’s inequality, find how many times a fair coin must be tossed in order that the probability that the ratio of the number of heads to the number of tosses lies between 0.4 and 0.6 will be at least 0.9.

Given that,

P(0.4 < X/n < 0.6) ≥ 0.9. (8)

We know from Chebyshev’s inequality,

P(|X/n − µ| < c) ≥ 1 − σ²/c² ⟹ P(µ − c < X/n < µ + c) ≥ 1 − σ²/c²,

where µ and σ² denote the mean and variance of X/n. Comparing with equation (8) we get µ = 0.5, c = 0.1, and we require 1 − σ²/c² = 0.9, i.e., σ² = 0.1 × (0.1)² = 0.001.

Now

Var(X/n) = (1/n²) Var(X) = npq/n² = pq/n = 0.001

⟹ n = 250 (here p = q = 1/2).
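A minimal simulation check, assuming NumPy (the repetition count and seed are arbitrary), confirms that n = 250 tosses suffice:

    import numpy as np

    # Estimate P(0.4 < X/n < 0.6) for n = 250 fair-coin tosses.
    rng = np.random.default_rng(1)
    n, reps = 250, 10_000
    ratio = rng.binomial(n=n, p=0.5, size=reps) / n
    print(f"P(0.4 < X/n < 0.6) ~ {np.mean((ratio > 0.4) & (ratio < 0.6)):.4f}")
    # The estimate is well above 0.9: Chebyshev's bound is conservative,
    # so n = 250 is sufficient but not necessary.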
Example 6
A random variable X has pdf f(x) = e^(−x), x ≥ 0. Use Chebyshev’s inequality to show that P{|X − 1| > 2} < 1/4, and show also that the actual probability is e^(−3).

Here X ∼ Exponential(1). So, µ = 1 and σ² = 1. Hence,

P(|X − µ| > c) ≤ σ²/c² ⟹ P(|X − 1| > 2) ≤ 1/4.

Actual probability:

P(|X − 1| > 2) = 1 − P(|X − 1| ≤ 2)
             = 1 − P(−1 ≤ X ≤ 3)
             = 1 − ∫_0^3 e^(−x) dx      (since X ≥ 0)
             = e^(−3).
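A quick numerical check, assuming NumPy (sample size and seed arbitrary):

    import numpy as np

    # X ~ Exponential(1): compare the simulated P(|X - 1| > 2) with the
    # exact value e^(-3) and the Chebyshev bound 1/4.
    rng = np.random.default_rng(2)
    x = rng.exponential(scale=1.0, size=1_000_000)
    print(f"simulated  ~ {np.mean(np.abs(x - 1) > 2):.5f}")
    print(f"exact e^-3 = {np.exp(-3):.5f}")   # ~ 0.04979, well below 1/4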
Weak Law of Large Numbers

Convergence in Probability
If P{|Xₙ − X| > ϵ} → 0 as n → ∞ for every ϵ > 0, then the sequence of RVs {Xₙ} is said to converge to X in probability.

Weak Law of Large Numbers

Let X₁, X₂, X₃, ..., Xₙ, ... be a sequence of independent random variables with E(Xᵢ) = µ and Var(Xᵢ) = σ².
Let X̄ₙ = (X₁ + X₂ + ··· + Xₙ)/n.
Then for any ϵ > 0, P{|X̄ₙ − µ| > ϵ} → 0 as n → ∞.

Proof:

E(X̄ₙ) = E(X₁/n) + E(X₂/n) + ··· + E(Xₙ/n)
      = (1/n){E(X₁) + E(X₂) + ··· + E(Xₙ)} = µ.

Var(X̄ₙ) = Var(X₁/n) + Var(X₂/n) + ··· + Var(Xₙ/n)      (by independence)
        = (1/n²){Var(X₁) + Var(X₂) + ··· + Var(Xₙ)} = σ²/n.

By Chebyshev’s inequality we have,

P(|X̄ₙ − µ| > ϵ) ≤ σ²/(nϵ²)

⟹ P(|X̄ₙ − µ| > ϵ) → 0 as n → ∞.
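The shrinking tail probability can be observed numerically. A minimal Python sketch, assuming NumPy and using Uniform(0, 1) samples with ϵ = 0.05 (both arbitrary choices):

    import numpy as np

    # Weak law in action: P(|Xbar_n - mu| > eps) shrinks as n grows.
    rng = np.random.default_rng(3)
    mu, eps, reps = 0.5, 0.05, 2_000

    for n in (10, 100, 1_000, 10_000):
        means = rng.uniform(size=(reps, n)).mean(axis=1)
        tail = np.mean(np.abs(means - mu) > eps)
        print(f"n = {n:>6}: P(|mean - mu| > eps) ~ {tail:.3f}")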
Central Limit Theorem
Central Limit Theorem
Let X₁, X₂, X₃, ..., Xₙ, ... be a sequence of independent and identically distributed random variables with E(Xᵢ) = µ < ∞ and 0 < Var(Xᵢ) = σ² < ∞.
Then Sₙ = X₁ + X₂ + ··· + Xₙ is approximately normal with mean nµ and variance nσ² as n tends to infinity, i.e., Sₙ ∼ N(nµ, nσ²) for large n.

Example 7
Let X₁, X₂, X₃, ..., X₁₀ be independent Poisson random variables with mean 1. Using the CLT, find P{(X₁ + X₂ + ··· + X₁₀) ≥ 15}.

E(Xᵢ) = Var(Xᵢ) = 1, so S₁₀ = X₁ + X₂ + ··· + X₁₀ ∼ N(10, 10).

P(S₁₀ ≥ 15) = P((S₁₀ − 10)/√10 ≥ (15 − 10)/√10)
           = P(z ≥ 1.58), where z = (S₁₀ − 10)/√10 ∼ N(0, 1)
           = 0.5 − P(0 < z < 1.58) = 0.0571
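Since a sum of independent Poisson variables is again Poisson, the CLT answer can be compared with a simulated value of the exact tail. A Python sketch, assuming NumPy (seed and sample size arbitrary):

    import math
    import numpy as np

    # The sum of 10 iid Poisson(1) variables is exactly Poisson(10), so
    # P(S >= 15) can be simulated and compared with the CLT approximation.
    rng = np.random.default_rng(4)
    s = rng.poisson(lam=1.0, size=(1_000_000, 10)).sum(axis=1)

    z = (15 - 10) / math.sqrt(10)
    clt = 0.5 * math.erfc(z / math.sqrt(2))   # standard normal tail P(Z >= z)
    print(f"simulated P(S >= 15) ~ {np.mean(s >= 15):.4f}")
    print(f"CLT approximation    = {clt:.4f}")
    # The two differ noticeably because S is discrete and n = 10 is small;
    # a continuity correction (using 14.5) would bring the CLT value closer.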
Example 8
A bank teller serves customers standing in the queue one by one.
Suppose that the service time Xᵢ for customer i has mean E(Xᵢ) = 2
and Var(Xᵢ) = 1. Assume that service times for different bank
customers are independent. Let S be the time the bank teller spends
serving 50 customers. Find P(90 < S < 110).

Here n = 50, S = X₁ + X₂ + ··· + X₅₀, E(Xᵢ) = 2 and Var(Xᵢ) = 1 (i = 1, 2, ..., 50). So S ∼ N(100, 50). Now

P(90 < S < 110) = P((90 − 100)/√50 < (S − 100)/√50 < (110 − 100)/√50)
             = P(−1.41 < z < 1.41)
             = P(z < 1.41) − P(z < −1.41)
             = 2 P(0 < z < 1.41)
             = 2 × 0.4207 = 0.8414
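Only the first two moments of the service times are given, so any distribution with mean 2 and variance 1 can serve for a simulation check; the Gamma(4, 0.5) choice below is one such arbitrary stand-in (Python with NumPy assumed):

    import numpy as np

    # Gamma(shape=4, scale=0.5) has mean 4*0.5 = 2 and variance 4*0.5^2 = 1,
    # matching the stated moments of each service time.
    rng = np.random.default_rng(5)
    s = rng.gamma(shape=4.0, scale=0.5, size=(200_000, 50)).sum(axis=1)
    print(f"simulated P(90 < S < 110) ~ {np.mean((s > 90) & (s < 110)):.4f}")
    # The estimate lands close to the CLT answer 0.8414, and would be
    # similar for any distribution with the same mean and variance.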
Corollary of CLT
If X̄ = (1/n)(X₁ + X₂ + ··· + Xₙ), then E(X̄) = µ and Var(X̄) = σ²/n.

Therefore, X̄ follows N(µ, σ²/n) as n → ∞.

Example 9
The life time of a certain brand of an electric bulb may be considered
as a random variable with mean 1200 hours and standard deviation
250 hours. Find the probability using central limit theorem, that the
average life time of 60 bulbs exceeds 1250 hours.

Here n = 60, E(Xᵢ) = 1200 and Var(Xᵢ) = 250² (i = 1, 2, ..., 60). Let
X̄ = (X₁ + X₂ + ··· + X₆₀)/60. By the above corollary, X̄ ∼ N(1200, 250²/60).
Now

P(X̄ > 1250) = P((X̄ − 1200)/(250/√60) > (1250 − 1200)/(250/√60))
           = P(z > 1.55)
           = 0.5 − P(0 < z < 1.55)
           = 0.5 − 0.4394
           = 0.0606
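A simulation check (Python with NumPy assumed; the Normal(1200, 250) lifetime distribution is an arbitrary stand-in matching the given mean and standard deviation):

    import numpy as np

    # Average lifetime of 60 bulbs; the CLT answer depends only on the
    # mean and standard deviation, not on the stand-in distribution.
    rng = np.random.default_rng(6)
    xbar = rng.normal(loc=1200, scale=250, size=(200_000, 60)).mean(axis=1)
    print(f"simulated P(mean > 1250) ~ {np.mean(xbar > 1250):.4f}")  # ~ 0.06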
Example 10
A random sample of size 100 is taken from a population whose mean is 60 and variance is 400. Using the central limit theorem, find the probability that the mean of the sample will not differ from the population mean by more than 4.
Here n = 100, E(Xᵢ) = 60 and Var(Xᵢ) = 400 (i = 1, 2, ..., 100). Let
X̄ = (X₁ + X₂ + ··· + X₁₀₀)/100. By the above corollary, X̄ ∼ N(60, 4).
Now

P(−4 < X̄ − µ < 4) = P(−4 < X̄ − 60 < 4)
               = P(−4/2 < (X̄ − 60)/2 < 4/2)
               = P(−2 < z < 2)
               = 2 P(0 < z < 2)
               = 2 × 0.4772 = 0.9544
Strong Law of Large Numbers

Strong Law of Large Numbers


Let X₁, X₂, X₃, ..., Xₙ, ... be a sequence of independent and identically distributed random variables with E(Xᵢ) = µ < ∞ (i = 1, 2, ...). Then, with probability 1,

(X₁ + X₂ + ··· + Xₙ)/n → µ as n → ∞,

i.e., P( lim_{n→∞} (X₁ + X₂ + ··· + Xₙ)/n = µ ) = 1.
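Unlike the weak law, the strong law concerns a single trajectory. A minimal Python sketch, assuming NumPy and using fair-coin flips as an arbitrary example:

    import numpy as np

    # Strong law in action: the running mean of one long sequence of
    # fair-coin flips settles at mu = 0.5.
    rng = np.random.default_rng(7)
    flips = rng.integers(0, 2, size=1_000_000)
    running_mean = np.cumsum(flips) / np.arange(1, flips.size + 1)
    for n in (100, 10_000, 1_000_000):
        print(f"n = {n:>9}: running mean = {running_mean[n - 1]:.4f}")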
One-Sided Chebyshev’s Inequality

One-Sided Chebyshev’s Inequality


If X is a random variable with mean 0 and finite variance σ², then for any c > 0,

P(X ≥ c) ≤ σ²/(σ² + c²).

Corollary of one-sided Chebyshev’s inequality


If E(X) = µ and Var(X) = σ², then for c > 0,

P(X ≥ µ + c) ≤ σ²/(σ² + c²),

P(X ≤ µ − c) ≤ σ²/(σ² + c²).
Example 11 (Markov inequality weaker than one-sided Chebyshev’s inequality)
If the number of items produced in a factory during a week is a
random variable with mean 100 and variance 400, compute an upper
bound on the probability that this week’s production will be at least
120.

From one-sided Chebyshev’s inequality:

P(X ≥ 120) = P(X − 100 ≥ 20) ≤ 400/(400 + 20²) = 1/2.

From Markov’s inequality:

P(X ≥ 120) ≤ E(X)/120 = 100/120 = 5/6,

which is far weaker than the one-sided Chebyshev bound.
Cauchy-Schwarz inequality
For any two random variables X and Y, we have

|E(XY)| ≤ √(E(X²) E(Y²)),

where equality holds iff X = cY for some real number c.

Example 12
Using the Cauchy-Schwarz inequality, show that for any two random variables X and Y,

|ρ(X, Y)| ≤ 1.

Let U = (X − E(X))/σ_X and V = (Y − E(Y))/σ_Y.
Then E(U) = E(V) = 0, and Var(U) = Var(V) = 1. Now applying the Cauchy-Schwarz inequality to U and V, we get

|E(UV)| ≤ √(E(U²) E(V²)) = 1 ⟹ |ρ(X, Y)| ≤ 1,

since ρ(X, Y) = E{(X − E(X))(Y − E(Y))}/(σ_X σ_Y) = E(UV).
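A numeric spot-check of the inequality on arbitrary sample data (Python with NumPy assumed; the dependent pair below is an arbitrary choice):

    import numpy as np

    # Spot-check of Cauchy-Schwarz: |E(XY)| <= sqrt(E(X^2) E(Y^2)).
    rng = np.random.default_rng(8)
    x = rng.normal(size=100_000)
    y = 0.7 * x + rng.normal(size=100_000)   # arbitrary dependence on x
    lhs = abs(np.mean(x * y))
    rhs = np.sqrt(np.mean(x**2) * np.mean(y**2))
    print(f"|E(XY)| ~ {lhs:.4f} <= sqrt(E(X^2) E(Y^2)) ~ {rhs:.4f}")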
Chernoff Bounds

When the moment generating function of the random variable X is known, we can obtain even more effective bounds on P(X ≥ c).

Chernoff Bounds
If M(t) = E(e^(tX)) is the moment generating function of the random variable X, then

P(X ≥ c) ≤ e^(−tc) M(t), for all t > 0,

P(X ≤ c) ≤ e^(−tc) M(t), for all t < 0.


Example 13
Find the Chernoff bounds for a standard normal random variable.

Let Z be a standard normal random variable; then its moment generating function is M(t) = e^(t²/2), so the Chernoff bound on P(Z ≥ c) is given by

P(Z ≥ c) ≤ e^(t²/2 − tc), for all t > 0.

Now the value of t, t > 0, that minimizes e^(t²/2 − tc) is the value that minimizes t²/2 − tc, which is t = c. Thus for c > 0 we see that

P(Z ≥ c) ≤ e^(−c²/2).

Similarly, we can show that for c < 0,

P(Z ≤ c) ≤ e^(−c²/2).
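The Chernoff bound can be compared with the exact normal tail; a short Python sketch using only the standard library (the chosen values of c are arbitrary):

    import math

    # Compare the Chernoff bound e^(-c^2/2) with the exact standard
    # normal tail P(Z >= c).
    for c in (0.5, 1.0, 2.0, 3.0):
        exact = 0.5 * math.erfc(c / math.sqrt(2))   # P(Z >= c)
        bound = math.exp(-c * c / 2)                # Chernoff bound at t = c
        print(f"c = {c}: exact = {exact:.5f}, bound = {bound:.5f}")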
Jensen’s Inequality
Convex function
A twice-differentiable real-valued function f(x) is said to be convex if f″(x) ≥ 0 for all x (examples: f(x) = x², f(x) = e^(cx)).

Jensen’s Inequality
If f(x) is a convex function, then

E(f(X)) ≥ f(E(X)),

provided that the expectations exist and are finite.

Example 14
Show that the variance of a random variable X is non-negative.

We know that f(x) = x² is a convex function. Now from Jensen’s inequality we get

E(X²) ≥ {E(X)}²

⟹ Var(X) = E(X²) − {E(X)}² ≥ 0.
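A numeric illustration of Jensen’s inequality for f(x) = x² (Python with NumPy assumed; the Exponential(2) sample is an arbitrary choice):

    import numpy as np

    # Jensen's inequality for the convex function f(x) = x^2:
    # E(X^2) >= (E(X))^2 on an arbitrary sample.
    rng = np.random.default_rng(9)
    x = rng.exponential(scale=2.0, size=100_000)   # arbitrary positive sample
    print(f"E(X^2) ~ {np.mean(x**2):.4f} >= (E(X))^2 ~ {np.mean(x)**2:.4f}")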
