
Stats2 Textbook Week4

This document discusses continuous random variables. It begins by introducing continuous random variables and explaining how they are useful for modeling phenomena with large amounts of data or outcomes. It then defines the cumulative distribution function (CDF) and provides examples of CDFs for Bernoulli and uniform random variables. The document goes on to discuss properties of CDFs and how they can be used to calculate probabilities for intervals of random variables. It also introduces the probability density function and discusses common continuous distributions like the uniform, exponential, and normal distributions.

Uploaded by

23f1002410

Practice Book

for
STATISTICS FOR DATA SCIENCE - 2
Contents
1 Continuous random variable
  1.1 Introduction
  1.2 Cumulative Distribution function
    1.2.1 CDF of a random variable
    1.2.2 Properties of CDF
    1.2.3 Examples
  1.3 Continuous Random Variable: Approximation of CDF from Discrete to Continuous
  1.4 Cumulative Distribution Functions
    1.4.1 Examples of valid CDF's
    1.4.2 Probability of intervals using continuous CDF
  1.5 General random variables and continuous random variables
      1.5.0.1 CDFs and random variables
    1.5.1 Properties of CDF
    1.5.2 Continuous random variable
    1.5.3 Some scenarios for continuous models
  1.6 Probability Density Function
    1.6.1 Properties of PDF
  1.7 Common distributions
    1.7.1 Uniform distribution
    1.7.2 Exponential distribution
      1.7.2.1 Memoryless property of Exponential
    1.7.3 Normal distribution
      1.7.3.1 PDF of Normal distribution
      1.7.3.2 CDF of Normal distribution
      1.7.3.3 Standard normal distribution
      1.7.3.4 Probability computations with normal distribution

Chapter 4

1 Continuous random variable


1.1 Introduction
So far, we have looked at discrete random variables, where X usually takes values like
{1, 2, 3, . . .}, and we saw the notion of a PMF. We now turn to what is called a
continuous random variable. We will see how unwieldy the numbers can become if we
insist on staying in the discrete domain, and how a continuous model gives much easier
ways to deal with such situations.

Consider the following two situations:


• Situation I: Suppose we have the weights of different meteorites [1] that reached the Earth.

– We have data for 45,000+ meteorites, and the weights range from 0.01
grams to 60 tons.
– If we want to do a statistical study on this vast data set, we treat the meteorite
weight as the random variable. If we stick with a discrete random variable, it
becomes really difficult to gain any insight from it.

• Situation II: Suppose we have a Binomial(n, p) distribution, where p is fixed and n
grows very large. If we look at Bernoulli trials where the population is very large and p
is a constant, we have to deal with heavy combinatorial calculations.

In such random phenomena, the notion of a continuous random variable enters the picture.
The number of outcomes is so large that dealing with a discrete PMF becomes difficult. The
core idea is that when we have a lot of data, and the data takes many different values within the
same range, we should use a continuous model rather than a discrete one.
A useful trick when modeling such data is to take its logarithm, which compresses the
range. We then focus on intervals rather than on the individual values that the random
variable takes. Let us see how to do this in the two situations above.

(i) Meteorite data

• Preprocessing: Take log2 of the data. The range shrinks from [0.01,
60000000] (in grams) to [−6.6, 25.8]. But we still have 45,000+ data points.
• Main idea: Divide [−6.6, 25.8] into intervals of width 0.3 (about 100 of them):

[−6.6, −6.3], [−6.3, −6], . . . , [25.5, 25.8]

Now count the number of values falling inside each interval.
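
The binning step can be sketched in a few lines of Python. This is only an illustration: the function name `log2_bin_counts` and the sample weights are hypothetical (the values are not taken from the NASA data set), and the bin width 0.3 follows the text.

```python
import math

def log2_bin_counts(weights_in_grams, width=0.3):
    """Count how many log2(weight) values fall in each bin of the given width."""
    logs = [math.log2(w) for w in weights_in_grams]
    low = math.floor(min(logs) / width) * width    # left edge of the first bin
    counts = {}
    for value in logs:
        index = int((value - low) // width)        # index of the bin holding this value
        left_edge = round(low + index * width, 10)
        counts[left_edge] = counts.get(left_edge, 0) + 1
    return dict(sorted(counts.items()))

# Endpoints of the log-transformed range quoted in the text.
log_min = math.log2(0.01)         # about -6.6
log_max = math.log2(60_000_000)   # about 25.8

# Hypothetical weights in grams, for illustration only.
sample_weights = [0.01, 0.5, 21.0, 720.0, 4000.0, 107000.0, 60_000_000.0]
counts = log2_bin_counts(sample_weights)
```

Plotting `counts` as bars gives a histogram of the kind described in this section.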

[Figure: Histogram of the log of the weights in the meteorite data]

The x-axis shows the bins and the y-axis gives the number of values
that lie in each bin.

(ii) Binomial(n, p)

[Figure: PMF of a Binomial(100, p) distribution]

We can see that the PMF has a nice shape, but calculating with it is not easy. So
we give up the precision of individual values, focus on the shape, and come
up with an alternative model.

[1] Meteorite data has been taken from NASA's open data portal.

1.2 Cumulative Distribution function
1.2.1 CDF of a random variable
Definition: The Cumulative Distribution Function (CDF) of a random variable X, denoted
$F_X(x)$, is a function from $\mathbb{R}$ to $[0, 1]$, defined as

\[ F_X(x) = P(X \le x) \]

The CDF is a very important bridge between the discrete world and the continuous world.

1.2.2 Properties of CDF


i) FX (b) − FX (a) = P (a < X ≤ b)

ii) FX : non-decreasing function taking non-negative values.

iii) As x → −∞, FX goes to 0.

iv) As x → ∞, FX goes to 1.

1.2.3 Examples
i) Bernoulli random variable
Consider a Bernoulli random variable X with X taking the values 0 and 1 with proba-
bilities (1 − p) and p, respectively.

For $x < 0$, $F_X(x) = P(X \le x) = 0$.

For $0 \le x < 1$, $F_X(x) = P(X \le x) = P(X = 0) = 1 - p$.

For $x \ge 1$, $F_X(x) = P(X \le x) = P(X = 0) + P(X = 1) = (1 - p) + p = 1$.

Therefore, the CDF of X is given by

\[
F_X(x) = \begin{cases} 0, & x < 0 \\ 1 - p, & 0 \le x < 1 \\ 1, & x \ge 1 \end{cases}
\]

Compute: (a) $F_X(-1)$, (b) $F_X(x)$, where $0 \le x < 1$.

Solution:

(a) $F_X(-1) = P(X \le -1) = 0$

(b) For $0 \le x < 1$, $F_X(x) = P(X \le x) = P(X = 0) = 1 - p$

ii) Throw a die
Consider a random variable X that represents the outcome of throwing a fair die. The
possible outcomes are {1, 2, 3, 4, 5, 6}.

X ∼ Uniform{1, 2, 3, 4, 5, 6}

For $x < 1$, $F_X(x) = P(X \le x) = 0$.

For $1 \le x < 2$, $F_X(x) = P(X \le x) = P(X = 1) = \frac{1}{6}$.

For $2 \le x < 3$, $F_X(x) = P(X \le x) = P(X = 1) + P(X = 2) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$.

For $3 \le x < 4$, $F_X(x) = P(X \le x) = \sum_{k=1}^{3} P(X = k) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$.

For $4 \le x < 5$, $F_X(x) = P(X \le x) = \sum_{k=1}^{4} P(X = k) = 4 \cdot \frac{1}{6} = \frac{2}{3}$.

For $5 \le x < 6$, $F_X(x) = P(X \le x) = \sum_{k=1}^{5} P(X = k) = 5 \cdot \frac{1}{6} = \frac{5}{6}$.

For $x \ge 6$, $F_X(x) = P(X \le x) = \sum_{k=1}^{6} P(X = k) = 6 \cdot \frac{1}{6} = 1$.

Therefore, the CDF of X is given by

\[
F_X(x) = \begin{cases}
0, & x < 1 \\
1/6, & 1 \le x < 2 \\
2/6, & 2 \le x < 3 \\
3/6, & 3 \le x < 4 \\
4/6, & 4 \le x < 5 \\
5/6, & 5 \le x < 6 \\
1, & x \ge 6
\end{cases}
\]

Compute: P (X = 4.5)
P (X = 4.5) = 0 (since there is no jump at x = 4.5)
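
The link between jumps of the CDF and point probabilities can be checked with a small Python sketch. The function names `die_cdf` and `point_probability` are hypothetical helpers introduced only for this illustration; exact fractions are used so the results match the hand computation.

```python
from fractions import Fraction
import math

def die_cdf(x):
    """CDF of a fair six-sided die: F(x) = floor(x)/6, kept within [0, 1]."""
    if x < 1:
        return Fraction(0)
    return Fraction(min(math.floor(x), 6), 6)

def point_probability(x):
    """P(X = x), computed as the size of the jump of the CDF at x."""
    # The CDF is constant between integers, so its value just to the left of x
    # equals its value at the largest integer strictly below x.
    left_limit = die_cdf(math.ceil(x) - 1)
    return die_cdf(x) - left_limit

p_at_4_5 = point_probability(4.5)   # no jump at 4.5
p_at_4 = point_probability(4)       # jump of size 1/6 at 4
```

The same idea works for any step CDF: the probability of a single value is the height of the step at that value.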

iii) CDF of a discrete random variable

Let the random variable X have the following PMF:

x        x1   x2   x3   x4   x5
fX(x)    p1   p2   p3   p4   p5

Remark: The CDF of a discrete random variable has a step-like structure, with a jump of size pi at each xi.

iv) Computing probability of intervals using CDF

Let X ∼ Uniform{1, 2, . . . , 100}


The CDF of X is given by

\[
F_X(x) = \begin{cases}
0, & x < 1 \\
k/100, & k \le x < k + 1,\ k = 1, 2, \ldots, 99 \\
1, & x \ge 100
\end{cases}
\]

Compute: (a) P(3 < X ≤ 10), (b) P(3.2 < X ≤ 10.6), (c) P(X ≤ 17), (d) P(X ≤ 17.3),
(e) P(X > 87), (f) P(X > 87.4)

Solution:

(a) $P(3 < X \le 10) = F_X(10) - F_X(3) = \frac{10}{100} - \frac{3}{100} = \frac{7}{100}$

(b) $P(3.2 < X \le 10.6) = F_X(10.6) - F_X(3.2) = \frac{10}{100} - \frac{3}{100} = \frac{7}{100}$

(c) $P(X \le 17) = F_X(17) = \frac{17}{100}$

(d) $P(X \le 17.3) = F_X(17.3) = \frac{17}{100}$

(e) $P(X > 87) = 1 - F_X(87) = 1 - \frac{87}{100} = \frac{13}{100}$

(f) $P(X > 87.4) = 1 - F_X(87.4) = 1 - \frac{87}{100} = \frac{13}{100}$

1.3 Continuous Random Variable: Approximation of CDF from
Discrete to Continuous
Consider the plot of the CDF of a Binomial(n, 0.6) random variable. Keep the scale of the
picture the same and increase the value of n.

Notice that the CDFs start to look like a continuous curve as the value of n
increases.

1.4 Cumulative Distribution Functions


Definition: A function F : R → [0, 1] is said to be a Cumulative Distribution Function (CDF)
if it satisfies the following four properties:

1. F is a non-decreasing function taking values between 0 and 1.

2. As x → −∞, F goes to 0.

3. As x → ∞, F goes to 1.

4. Technical: F is continuous from the right.

Functions defined in this manner mirror the properties of the CDF of a random
variable. An arbitrary CDF does not need to have a step-like structure; it can
also be smooth and continuous.

1.4.1 Examples of valid CDF’s

• All the CDFs shown are non-decreasing, start at 0 and end at 1, so
they are valid CDFs.

• Continuous CDFs appear to be close approximations to the CDFs of discrete random
variables, particularly as the alphabet (the set of values taken) grows.

• To move towards a continuous random variable, we need a
distribution function that is continuous. The graphs on the bottom are
continuous, yet they mirror the discrete pictures on the top very closely.

• Continuous curves can be described in many interesting ways. Also, calculating
probabilities of intervals becomes much simpler with a continuous model.

1.4.2 Probability of intervals using continuous CDF


Using a continuous CDF, we will now see how to approximate probabilities for the discrete
uniform distribution.

1. Let X ∼ Uniform{1, 2, . . . , 100}.

Let $F_X$ denote the exact CDF of X and let $F$ be the approximate (continuous) CDF of X.

\[
F_X(x) = \begin{cases}
0, & x < 1 \\
k/100, & k \le x < k + 1,\ k = 1, 2, \ldots, 99 \\
1, & x \ge 100
\end{cases}
\]

\[
F(x) = \begin{cases}
0, & x < 0 \\
x/100, & 0 \le x < 100 \\
1, & x \ge 100
\end{cases}
\]

[Figure: CDF of X]

Using the approximate CDF F:

(a) $P(3 < X \le 10) = F(10) - F(3) = \frac{10}{100} - \frac{3}{100} = \frac{7}{100}$

(b) $P(3.2 < X \le 10.6) = F(10.6) - F(3.2) = \frac{10.6}{100} - \frac{3.2}{100} = \frac{7.4}{100}$

(c) $P(X \le 17) = F(17) = \frac{17}{100}$

(d) $P(X \le 17.3) = F(17.3) = \frac{17.3}{100}$

(e) $P(X > 87) = 1 - F(87) = 1 - \frac{87}{100} = \frac{13}{100}$

(f) $P(X > 87.4) = 1 - F(87.4) = 1 - \frac{87.4}{100} = \frac{12.6}{100}$

Observations:

• We get the same value for P(3 < X ≤ 10) whether we use the exact CDF $F_X$
or the approximate CDF $F$.
• We get different values for P(3.2 < X ≤ 10.6): the exact CDF $F_X$ gives 7/100, while
$F$ gives 7.4/100. The difference is small.

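
The comparison between the exact step CDF and its straight-line approximation can be reproduced with a short Python sketch. The helper names `exact_cdf`, `approx_cdf` and `interval` are hypothetical; inputs are converted through `Fraction(str(x))` so that decimals such as 3.2 are treated exactly.

```python
import math
from fractions import Fraction

def exact_cdf(x):
    """Step CDF of X ~ Uniform{1, ..., 100}."""
    if x < 1:
        return Fraction(0)
    return Fraction(min(math.floor(x), 100), 100)

def approx_cdf(x):
    """Continuous approximation F(x) = x/100 on [0, 100)."""
    if x < 0:
        return Fraction(0)
    if x >= 100:
        return Fraction(1)
    return Fraction(str(x)) / 100   # str() keeps 3.2 as exactly 16/5

def interval(cdf, a, b):
    """P(a < X <= b) computed from a CDF."""
    return cdf(b) - cdf(a)

same_exact = interval(exact_cdf, 3, 10)          # 7/100
same_approx = interval(approx_cdf, 3, 10)        # 7/100
diff_exact = interval(exact_cdf, 3.2, 10.6)      # 7/100
diff_approx = interval(approx_cdf, 3.2, 10.6)    # 7.4/100
```

At integer endpoints the two CDFs agree; between integers the approximation differs from the exact value by less than 1/100 at each endpoint.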
2. Binomial using continuous CDF

Let X ∼ Binomial(100, 0.6).

Let $F_X(k)$ denote the CDF of X and let $F(x)$ be the approximate CDF of X.

\[
F_X(k) = \sum_{j=0}^{k} \binom{100}{j} (0.6)^j (0.4)^{n-j}, \quad \text{where } n = 100
\]

\[
F(x) = \frac{1}{1 + \exp\left( \dfrac{-1.65451\,(x - 60)}{\sqrt{24}} \right)}
\]

Here 24 is the variance of Binomial(100, 0.6) and 60 is the mean of Binomial(100, 0.6).

(a) P(40 < X ≤ 50) = F(50) − F(40) = 0.0318

(b) P(50 < X ≤ 60) = F(60) − F(50) = 0.4670

(c) P(60 < X ≤ 70) = F(70) − F(60) = 0.4670

(d) P(70 < X ≤ 80) = F(80) − F(70) = 0.0318

F(x) is a good approximation for $F_X(k)$. To check, we can compute the probabilities
using both $F_X(k)$ and $F(x)$; the two give close values, so we are
not losing much.

Remark: Better approximations are possible, particularly as n grows. The continuous CDF
turns out to be a valuable tool in our hands.
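
As a rough check of the claim, the exact binomial CDF and the approximate CDF can both be evaluated in Python. This is a sketch under the assumptions stated in the text (n = 100, p = 0.6, constant 1.65451); the function names are hypothetical.

```python
import math

N, P = 100, 0.6
MEAN = N * P                  # 60
VARIANCE = N * P * (1 - P)    # 24

def exact_cdf(k):
    """Exact Binomial(100, 0.6) CDF: sum of the PMF from 0 to k."""
    return sum(math.comb(N, j) * P**j * (1 - P)**(N - j) for j in range(k + 1))

def approx_cdf(x):
    """Approximate continuous CDF F(x) given in the text."""
    return 1.0 / (1.0 + math.exp(-1.65451 * (x - MEAN) / math.sqrt(VARIANCE)))

def interval(cdf, a, b):
    """P(a < X <= b) computed from a CDF."""
    return cdf(b) - cdf(a)

approx_40_50 = interval(approx_cdf, 40, 50)   # about 0.0318
approx_50_60 = interval(approx_cdf, 50, 60)   # about 0.4670
exact_40_50 = interval(exact_cdf, 40, 50)
```

Printing `exact_40_50` next to `approx_40_50` shows how far the approximation is from the exact value on that interval.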

1.5 General random variables and continuous random variables


We saw that the discrete CDF came from an actual probability space: the probability space
had outcomes, events and a PMF, and from these we got the CDF. The natural question is
where a continuous CDF comes from.
In this section, we accept that a continuous CDF corresponds to some random
variable and study what kind of random variable it is, how to work with it, and so on. This
opens up the possibility of general random variables and, in particular, continuous random
variables. Continuous random variables have interesting new properties and are
very useful in modeling, as we have already seen in Section 1.1.

1.5.0.1 CDFs and random variables

Theorem (Random variable with CDF F (x)) Given a valid CDF F (x), there exists a random
variable X taking values in R such that

P (X ≤ x) = F (x)
Remarks:

• This theorem allows us to start from a CDF: we may define any valid CDF we
want, and the theorem assures us that there is a random variable with that CDF in some probability space.

• The value of the CDF at a particular input x, F(x), is P(X ≤ x). This connection
between the random variable and the CDF is very important, and it allows us to
use the CDF directly to compute probabilities involving the random variable.

• For any event defined using the random variable X, for example X > a or X < a,
we can use this connection to derive its probability.

1.5.1 Properties of CDF:


i) P (a < X ≤ b) = F (b) − F (a).

ii) If F (x) rises from F1 to F2 at x1 , P (X = x1 ) = F2 − F1 .

iii) If F (x) is continuous at x0 , P (X = x0 ) = 0 (non-intuitive!)

Example: Let X be a random variable with CDF

\[
F(x) = \begin{cases}
0, & x < 0 \\
0.5 + 0.1x, & 0 \le x < 5 \\
1, & x \ge 5
\end{cases}
\]

Find:

(i) P (X = 0)

(ii) P (1.99 < X ≤ 2.01)

(iii) P (1.9999999 < X ≤ 2.0000001)

(iv) P (X = 2.00000 . . .)

Solution:

(i) P(X = 0) = 0.5, since F(x) jumps from 0 to 0.5 at x = 0.

(ii) P(1.99 < X ≤ 2.01) = F(2.01) − F(1.99) = 0.701 − 0.699 = 0.002 [using the properties of the CDF]

(iii) P(1.9999999 < X ≤ 2.0000001) = 0.1 × 0.0000002 = 0.00000002
We can observe that as the precision increases (the interval around 2 shrinks), the probability decreases.

(iv) P(X = 2.00000 . . .) = 0
Here X takes a value with infinite precision and F(x) is continuous at x = 2, so the
probability is 0.

Remark:

• If F(x) jumps at a point, then X takes that value with non-zero probability, equal to the size of the jump.

• If there is no jump in F(x) at a point, that is, F(x) is continuous there, then X takes that
value with probability 0.
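
The example above can be replayed numerically. The sketch below encodes the example CDF (0 for x < 0, 0.5 + 0.1x on [0, 5), 1 afterwards) and estimates the jump at a point by comparing F(x) with F just to the left of x; the names `cdf`, `interval_probability` and `jump_at`, and the step size `h`, are assumptions made for illustration.

```python
def cdf(x):
    """CDF of the example: a jump of 0.5 at x = 0, then a linear rise."""
    if x < 0:
        return 0.0
    if x < 5:
        return 0.5 + 0.1 * x
    return 1.0

def interval_probability(a, b):
    """P(a < X <= b) = F(b) - F(a)."""
    return cdf(b) - cdf(a)

def jump_at(x, h=1e-9):
    """Approximate P(X = x) as F(x) minus F just to the left of x."""
    return cdf(x) - cdf(x - h)

p_zero = jump_at(0)                           # about 0.5: F jumps at 0
p_wide = interval_probability(1.99, 2.01)     # about 0.002
p_two = jump_at(2)                            # about 0: F is continuous at 2
```

Shrinking the interval around 2 drives the probability towards 0, while the value at 0 stays at 0.5 because of the jump.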

1.5.2 Continuous random variable


Definition: A random variable X with CDF $F_X(x)$ is said to be a continuous random variable
if $F_X(x)$ is continuous at every x.

Remarks:

• The CDF has no jumps or steps.

• P(X = x) = 0 for all x.

• The probability of X falling in an interval can be nonzero:

\[ P(a < X \le b) = F(b) - F(a) \]

• Since P(X = a) = 0 and P(X = b) = 0, we have

\[ P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) \]

Examples:
1. Given below are the plots of a few CDFs. Identify the kind of distribution in each
case.

[Figures (a) to (d): plots of four CDFs]

Solution:

(a) Since the CDF is continuous, it is a continuous distribution.
(b) The CDF has a step-like structure, so it is a discrete distribution.
(c) At 0, there is a jump in the CDF, and then it is a continuous curve, so it is a
mixed (both discrete and continuous) distribution.
(d) Since the CDF is continuous, it is a continuous distribution.

2. Consider a random variable X with CDF

\[
F(x) = \begin{cases}
0, & x < -5 \\
0.2, & -5 \le x < 0 \\
0.2 + 0.2x, & 0 \le x < 4 \\
1, & x \ge 4
\end{cases}
\]

[Figure: CDF of X]

i) Find P (X < −3), P (−3 < X < −1), P (−1 < X < 1), P (X ≤ 3), P (0 ≤ X < 3).
ii) Is there an x0 for which P (X = x0 ) > 0?
iii) Is X a continuous random variable?

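Solution:

(i) F is continuous at every endpoint used below, so strict and non-strict inequalities give the same probabilities.

• P(X < −3) = F(−3) = 0.2

• P(−3 < X < −1) = F(−1) − F(−3) = 0.2 − 0.2 = 0 (using properties of the CDF)

• P(−1 < X < 1) = F(1) − F(−1) = (0.2 + 0.2) − (0.2) = 0.2 (using properties of the CDF)

• P(X ≤ 3) = F(3) = 0.2 + 0.6 = 0.8

• P(0 ≤ X < 3) = F(3) − F(0) = 0.8 − 0.2 = 0.6

(ii) As we can observe from the figure, P(X = −5) = 0.2, so there is an x0 for
which P(X = x0) > 0.
(iii) No. Since there is a jump in the CDF at x = −5, X is not a continuous random variable; it has a mixed distribution.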
3. Consider a random variable X with CDF

\[
F(x) = \begin{cases}
0, & x < -5 \\
0.04x + 0.2, & -5 \le x < 0 \\
0.2 + 0.2x, & 0 \le x < 4 \\
1, & x \ge 4
\end{cases}
\]

[Figure: CDF of X]

i) Find P (X < −3), P (−3 < X < −1), P (−1 < X < 1), P (X ≤ −3), P (0 ≤ X < 3).
ii) Is there an x0 for which P (X = x0 ) > 0?
iii) Is X a continuous random variable?

Solution:

(i)
• P(X < −3) = F(−3) = (0.04 × (−3)) + 0.2 = 0.08

• P(−3 < X < −1) = F(−1) − F(−3) = 0.16 − 0.08 = 0.08 (using properties of the CDF)

• P(−1 < X < 1) = F(1) − F(−1) = (0.2 + 0.2) − (0.2 − 0.04) = 0.4 − 0.16 = 0.24 (using properties of the CDF)

• P(X ≤ −3) = F(−3) = 0.08

• P(0 ≤ X < 3) = F(3) − F(0) = 0.8 − 0.2 = 0.6

(ii) The CDF is continuous for all x, so there is no x0 for which P(X = x0) > 0.
(iii) Since the CDF is continuous, the random variable X is continuous.
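
The two piecewise CDFs from Examples 2 and 3 can be compared with a short Python sketch. The function names and the jump test (comparing F(x) with F slightly to the left of x, using an assumed step `h` and tolerance `tol`) are illustrative choices, not part of the original text.

```python
def cdf_example_2(x):
    """CDF of Example 2: flat at 0.2 on [-5, 0), so it jumps at x = -5."""
    if x < -5:
        return 0.0
    if x < 0:
        return 0.2
    if x < 4:
        return 0.2 + 0.2 * x
    return 1.0

def cdf_example_3(x):
    """CDF of Example 3: rises linearly on [-5, 0), so it has no jumps."""
    if x < -5:
        return 0.0
    if x < 0:
        return 0.04 * x + 0.2
    if x < 4:
        return 0.2 + 0.2 * x
    return 1.0

def jump_points(cdf, candidates, h=1e-9, tol=1e-6):
    """Return the candidate points where the CDF has a visible jump."""
    return [x for x in candidates if cdf(x) - cdf(x - h) > tol]

breakpoints = [-5, 0, 4]   # only the piece boundaries can carry a jump
jumps_2 = jump_points(cdf_example_2, breakpoints)   # [-5]: mixed distribution
jumps_3 = jump_points(cdf_example_3, breakpoints)   # []: continuous distribution

p3_a = cdf_example_3(-1) - cdf_example_3(-3)   # P(-3 < X < -1), about 0.08
p3_b = cdf_example_3(1) - cdf_example_3(-1)    # P(-1 < X < 1), about 0.24
```

An empty list of jump points corresponds to a continuous random variable; any non-empty list means some value is taken with positive probability.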

1.5.3 Some scenarios for continuous models


Given below are a few scenarios in which the random phenomenon takes so many
values that a discrete model may not be very useful. In such cases, we may want to use a
continuous random variable to approximate the situation.

• Throw a dart onto a circular board: the distance of the point of impact from the center
of the board.

• Weight of a meteorite hitting the Earth.

• Weight of a human being; height of a human being.

• Speed of a delivery in cricket.

• Price of a stock.

1.6 Probability Density Function


For a discrete random variable, we looked at the probability mass function: the probability
that the random variable takes a particular value. For a continuous random variable,
every such probability is zero, so the PMF is of no use. Instead, we need something called the density.
A continuous random variable takes values over an interval and has a certain
density over that interval, but no probability mass at any particular value. Recall that a random
variable whose CDF is continuous at every point is called a continuous random variable.

Definition: A continuous random variable X with CDF $F_X(x)$ is said to have a PDF $f_X(x)$
if, for all $x_0$,

\[ F_X(x_0) = \int_{-\infty}^{x_0} f_X(x)\,dx \]

The PDF is the derivative of the CDF, wherever the derivative exists. Since $F_X(-\infty) = 0$,

\[ \int_{-\infty}^{x_0} f_X(x)\,dx = F_X(x_0) - F_X(-\infty) = F_X(x_0) \]
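
To make the definition concrete, the sketch below recovers a CDF from a density by numerical integration. The density f(x) = 2x on (0, 1) is an assumed example chosen for illustration, and the midpoint rule, the function names and the parameter `steps` are likewise assumptions rather than part of the text.

```python
def pdf(x):
    """Assumed example density: f(x) = 2x on (0, 1), and 0 elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def cdf_from_pdf(x0, lower=0.0, steps=20_000):
    """Approximate the integral of the PDF up to x0 with the midpoint rule.

    The density is zero below `lower`, so starting the integral there
    gives the same value as starting at minus infinity.
    """
    if x0 <= lower:
        return 0.0
    width = (x0 - lower) / steps
    return sum(pdf(lower + (i + 0.5) * width) for i in range(steps)) * width

f_half = cdf_from_pdf(0.5)   # close to 0.5**2 = 0.25
f_two = cdf_from_pdf(2.0)    # close to 1, the total probability
```

For this density the exact CDF on [0, 1] is x squared, whose derivative 2x is the density again.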

Why do we need PDF when we already have a CDF?

1. A distribution can always be described by its CDF, but the PDF shows directly where the
distribution is concentrated: where the PDF is high, the probability of X taking a value nearby is
high, and where it is low, that probability is low.

2. The CDF is a non-decreasing function. The CDF being higher at some x does not mean that X
takes more values there. On the other hand, if the density is higher, then X takes
more values around those points.

3. The density gives a clear picture of what the distribution looks like, whereas the CDF
only shows how the probability accumulates.

4. The PDF is often easier to use in probability computations.

Examples:

1. CDF and PDF of Uniform distributions

[Figure: CDF and PDF of Uniform[0, 5] (left) and Uniform[0, 1/2] (right)]

2. CDF and PDF of Exponential and Normal distributions

[Figure: CDF and PDF of the Exponential distribution (left) and the Normal distribution (right)]

1.6.1 Properties of PDF
A function $f : \mathbb{R} \to \mathbb{R}$ is said to be a density function if

1. $f(x) \ge 0$

2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

3. $f(x)$ is piecewise continuous.

Remark: Given a density function f, there is a continuous random variable X with PDF f.

Support of a random variable: The support is the set of points where the density function is
strictly greater than 0. Mathematically, for any random variable X with density $f_X(x)$,

\[ \mathrm{Supp}(X) = \{x : f_X(x) > 0\} \]

Note: Supp(X) contains the intervals in which X can fall with positive probability.
For any event A defined using the random variable X, the probability of the event is computed as

\[ P(A) = \int_A f_X(x)\,dx \]

Examples:

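i) Consider the function

\[
f(x) = \begin{cases} 3x^2, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}
\]

(a) Show that f is a density function.

• $f(x) \ge 0$ for all x.

• f integrates to 1:

\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{0}^{1} 3x^2\,dx = \left[ x^3 \right]_0^1 = 1 \]

• f is piecewise continuous.

Since f(x) satisfies all the properties, it is a valid density function.

(b) Consider a random variable X with density f. Find P(X = 1/5), P(X = 2/5), P(X ∈
[1/5 − ϵ, 1/5 + ϵ]), P(X ∈ [2/5 − ϵ, 2/5 + ϵ]).

Solution:
• P(X = 1/5) = 0 (since X is continuous)
• P(X = 2/5) = 0 (since X is continuous)
• P(X ∈ [1/5 − ϵ, 1/5 + ϵ]):

\[
P\left( \tfrac{1}{5} - \epsilon < X < \tfrac{1}{5} + \epsilon \right)
= \int_{1/5 - \epsilon}^{1/5 + \epsilon} 3x^2\,dx
= \left[ x^3 \right]_{1/5 - \epsilon}^{1/5 + \epsilon}
= \left( \tfrac{1}{5} + \epsilon \right)^3 - \left( \tfrac{1}{5} - \epsilon \right)^3
= \tfrac{6}{25}\epsilon + 2\epsilon^3, \quad \text{where } 0 < \epsilon \ll 1
\]

• Similarly,

\[
P\left( X \in \left[ \tfrac{2}{5} - \epsilon, \tfrac{2}{5} + \epsilon \right] \right)
= \left( \tfrac{2}{5} + \epsilon \right)^3 - \left( \tfrac{2}{5} - \epsilon \right)^3
= \tfrac{24}{25}\epsilon + 2\epsilon^3, \quad \text{where } 0 < \epsilon \ll 1
\]
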
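
The two window probabilities can be verified exactly with Python fractions. The sketch assumes the density f(x) = 3x² on (0, 1) from the example, whose CDF is x³ on [0, 1]; the helper names and the particular value of ϵ are hypothetical.

```python
from fractions import Fraction

def cdf(x):
    """CDF of the density f(x) = 3x^2 on (0, 1): F(x) = x^3 on [0, 1]."""
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    return Fraction(x) ** 3

def window_probability(center, eps):
    """P(center - eps < X < center + eps) for a continuous X."""
    return cdf(center + eps) - cdf(center - eps)

eps = Fraction(1, 100)   # an arbitrary small window half-width
p_near_one_fifth = window_probability(Fraction(1, 5), eps)
p_near_two_fifths = window_probability(Fraction(2, 5), eps)
ratio = p_near_two_fifths / p_near_one_fifth   # close to f(2/5) / f(1/5) = 4
```

For small ϵ the window probability is roughly 2ϵ times the density at the center, which is why the ratio is close to 4.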
ii) Consider a random variable X with density

\[
f(x) = \begin{cases} 2x, & 0 < x < 1 \\ 0, & \text{otherwise} \end{cases}
\]

Find P(X ∈ [0.1, 0.3]), P(X ∈ (0.1, 0.3]), P(X ∈ [0.1, 0.3)), P(X ∈ (0.1, 0.3)).

Solution:

• P(X ∈ [0.1, 0.3]):

\[ P(0.1 \le X \le 0.3) = \int_{0.1}^{0.3} 2x\,dx = \left[ x^2 \right]_{0.1}^{0.3} = (0.3)^2 - (0.1)^2 = 0.08 \]

• P(X ∈ (0.1, 0.3]):

\[ P(0.1 < X \le 0.3) = \int_{0.1}^{0.3} 2x\,dx = \left[ x^2 \right]_{0.1}^{0.3} = (0.3)^2 - (0.1)^2 = 0.08 \]

• P(X ∈ [0.1, 0.3)):

\[ P(0.1 \le X < 0.3) = \int_{0.1}^{0.3} 2x\,dx = \left[ x^2 \right]_{0.1}^{0.3} = (0.3)^2 - (0.1)^2 = 0.08 \]

• P(X ∈ (0.1, 0.3)):

\[ P(0.1 < X < 0.3) = \int_{0.1}^{0.3} 2x\,dx = \left[ x^2 \right]_{0.1}^{0.3} = (0.3)^2 - (0.1)^2 = 0.08 \]

Observe that the probabilities are the same in all four cases, because a continuous random
variable takes each endpoint with probability 0.

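iii) Consider the function

\[
f(x) = \begin{cases}
k, & 0 < x < 1/4 \\
2k, & 1/4 < x < 3/4 \\
3k, & 3/4 < x < 1 \\
0, & \text{otherwise}
\end{cases}
\]

Find k such that f(x) is a valid density function.

Solution:

For f(x) to be a valid density function, $\int_{-\infty}^{\infty} f(x)\,dx$ should be 1. Therefore,

\[
\begin{aligned}
& \int_{-\infty}^{\infty} f(x)\,dx = 1 \\
\implies & \int_{-\infty}^{0} f(x)\,dx + \int_{0}^{1/4} f(x)\,dx + \int_{1/4}^{3/4} f(x)\,dx + \int_{3/4}^{1} f(x)\,dx + \int_{1}^{\infty} f(x)\,dx = 1 \\
\implies & \int_{-\infty}^{0} 0\,dx + \int_{0}^{1/4} k\,dx + \int_{1/4}^{3/4} 2k\,dx + \int_{3/4}^{1} 3k\,dx + \int_{1}^{\infty} 0\,dx = 1 \\
\implies & k \left[ x \right]_{0}^{1/4} + 2k \left[ x \right]_{1/4}^{3/4} + 3k \left[ x \right]_{3/4}^{1} = 1 \\
\implies & \frac{k}{4} + k + \frac{3k}{4} = 2k = 1 \\
\implies & k = \frac{1}{2}
\end{aligned}
\]
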
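
The normalisation step for a piecewise-constant density amounts to summing rectangle areas, which the following Python sketch does with exact fractions. The variable names are hypothetical; each tuple holds the left end, the right end and the multiple of k on that piece.

```python
from fractions import Fraction

# (left end, right end, multiple of k) for each constant piece of f.
pieces = [
    (Fraction(0), Fraction(1, 4), 1),
    (Fraction(1, 4), Fraction(3, 4), 2),
    (Fraction(3, 4), Fraction(1), 3),
]

# Total area under f when k = 1; the area for general k is k times this.
area_per_unit_k = sum((right - left) * multiple for left, right, multiple in pieces)

# The total area must equal 1, so k is the reciprocal of the unit-k area.
k = 1 / area_per_unit_k
```

With these pieces the unit-k area is 2, so k comes out as 1/2, matching the hand computation.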
1.7 Common distributions
1.7.1 Uniform distribution
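A continuous random variable X is said to be uniform in [a, b] if it has a flat PDF over the
range [a, b].

PDF of X:

\[
f_X(x) = \begin{cases} \dfrac{1}{b - a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}
\]

CDF of X: for $a \le x \le b$,

\[
F_X(x) = \int_{-\infty}^{a} 0\,du + \int_{a}^{x} \frac{1}{b - a}\,du = 0 + \frac{x - a}{b - a} = \frac{x - a}{b - a}
\]

so that

\[
F_X(x) = \begin{cases}
0, & x \le a \\
\dfrac{x - a}{b - a}, & a < x < b \\
1, & x \ge b
\end{cases}
\]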

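Example: Suppose X ∼ Uniform[−10, 10].

Find: P(−3 ≤ X ≤ 2), P(5 < |X| < 7), P(1 − ϵ < X < 1 + ϵ), P(9 − ϵ < X < 9 + ϵ),
P(X > 7 | X > 3).

Solution: (Using the PDF)

Since X ∼ Uniform[−10, 10], the PDF of X is given by

\[
f_X(x) = \begin{cases} \dfrac{1}{20}, & -10 \le x \le 10 \\ 0, & \text{elsewhere} \end{cases}
\]

Now,

• $P(-3 \le X \le 2) = \int_{-3}^{2} \frac{1}{20}\,dx = \frac{5}{20} = \frac{1}{4}$

• P(5 < |X| < 7):

\[
P(5 < |X| < 7) = P(5 < X < 7) + P(-7 < X < -5)
= \int_{5}^{7} \frac{1}{20}\,dx + \int_{-7}^{-5} \frac{1}{20}\,dx
= \frac{2}{20} + \frac{2}{20} = \frac{1}{5}
\]

• P(1 − ϵ < X < 1 + ϵ):

\[
P(1 - \epsilon < X < 1 + \epsilon) = \int_{1 - \epsilon}^{1 + \epsilon} \frac{1}{20}\,dx = \frac{2\epsilon}{20}, \quad \text{where } 0 < \epsilon \ll 1
\]

• P(9 − ϵ < X < 9 + ϵ): Similarly, $P(9 - \epsilon < X < 9 + \epsilon) = \frac{2\epsilon}{20}$. The same value is
obtained around any point $x_0$ whose ϵ-interval lies inside [−10, 10]:
$P(x_0 - \epsilon < X < x_0 + \epsilon) = \frac{2\epsilon}{20}$.

• P(X > 7 | X > 3):

\[
P(X > 7 \mid X > 3) = \frac{P(X > 7, X > 3)}{P(X > 3)} = \frac{P(X > 7)}{P(X > 3)}
= \frac{\int_{7}^{10} \frac{1}{20}\,dx}{\int_{3}^{10} \frac{1}{20}\,dx}
= \frac{3/20}{7/20} = \frac{3}{7}
\]
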
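Solution: (Using the CDF)

Since X ∼ Uniform[−10, 10], the CDF of X is given by

\[
F_X(x) = \begin{cases}
0, & x < -10 \\
\dfrac{x + 10}{20}, & -10 \le x < 10 \\
1, & x \ge 10
\end{cases}
\]

Now,

• $P(-3 \le X \le 2) = F_X(2) - F_X(-3) = \frac{2 + 10}{20} - \frac{-3 + 10}{20} = \frac{5}{20}$

• P(5 < |X| < 7):

\[
\begin{aligned}
P(5 < |X| < 7) &= P(5 < X < 7) + P(-7 < X < -5) \\
&= [F_X(7) - F_X(5)] + [F_X(-5) - F_X(-7)] \\
&= \left( \frac{7 + 10}{20} - \frac{5 + 10}{20} \right) + \left( \frac{-5 + 10}{20} - \frac{-7 + 10}{20} \right) \\
&= \frac{2}{20} + \frac{2}{20} = \frac{4}{20}
\end{aligned}
\]

• P(1 − ϵ < X < 1 + ϵ):

\[
P(1 - \epsilon < X < 1 + \epsilon) = F_X(1 + \epsilon) - F_X(1 - \epsilon)
= \frac{1 + \epsilon + 10}{20} - \frac{1 - \epsilon + 10}{20}
= \frac{2\epsilon}{20}, \quad \text{where } 0 < \epsilon \ll 1
\]

• P(9 − ϵ < X < 9 + ϵ): Similarly, $P(9 - \epsilon < X < 9 + \epsilon) = \frac{2\epsilon}{20}$. The same value is
obtained around any point $x_0$ whose ϵ-interval lies inside [−10, 10]:
$P(x_0 - \epsilon < X < x_0 + \epsilon) = \frac{2\epsilon}{20}$.

• P(X > 7 | X > 3):

\[
\begin{aligned}
P(X > 7 \mid X > 3) &= \frac{P(X > 7, X > 3)}{P(X > 3)} = \frac{P(X > 7)}{P(X > 3)} \\
&= \frac{1 - P(X \le 7)}{1 - P(X \le 3)} = \frac{1 - F_X(7)}{1 - F_X(3)} \\
&= \frac{1 - \frac{7 + 10}{20}}{1 - \frac{3 + 10}{20}} = \frac{3/20}{7/20} = \frac{3}{7}
\end{aligned}
\]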
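
The CDF-based computations for Uniform[−10, 10] can be checked with exact fractions in Python. This is a minimal sketch; the constant names `A`, `B` and the function `uniform_cdf` are hypothetical.

```python
from fractions import Fraction

A, B = -10, 10   # endpoints of the Uniform[a, b] distribution in the example

def uniform_cdf(x):
    """CDF of Uniform[A, B]: 0 below A, (x - A)/(B - A) inside, 1 above B."""
    if x <= A:
        return Fraction(0)
    if x >= B:
        return Fraction(1)
    return Fraction(x - A) / (B - A)

p_interval = uniform_cdf(2) - uniform_cdf(-3)                       # 1/4
p_abs = (uniform_cdf(7) - uniform_cdf(5)) + (uniform_cdf(-5) - uniform_cdf(-7))  # 1/5
p_conditional = (1 - uniform_cdf(7)) / (1 - uniform_cdf(3))         # 3/7

eps = Fraction(1, 1000)
p_window = uniform_cdf(1 + eps) - uniform_cdf(1 - eps)              # 2*eps/20
```

The window probability depends only on ϵ, not on where the window sits inside the interval, which reflects the flat density.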

1.7.2 Exponential distribution


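X ∼ Exponential(λ), where λ > 0

If X is a random variable with PDF given by

\[
f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}
\]

then X is exponentially distributed with parameter λ.

Check: $f_X(x)$ is a valid PDF.

1. Clearly $f_X(x) \ge 0$.

2. The support of X is {x : x > 0}, and

\[
\int_{-\infty}^{\infty} f_X(x)\,dx = \int_{-\infty}^{0} 0\,dx + \int_{0}^{\infty} \lambda e^{-\lambda x}\,dx
= 0 + \lambda \left[ \frac{e^{-\lambda x}}{-\lambda} \right]_{0}^{\infty}
= \left[ -e^{-\lambda x} \right]_{0}^{\infty} = 1
\]

Therefore, $f_X(x)$ is a valid PDF.

Now, for x > 0,

\[
P(X \le x) = \int_{-\infty}^{x} f_X(u)\,du = \int_{0}^{x} \lambda e^{-\lambda u}\,du = \left[ -e^{-\lambda u} \right]_{0}^{x} = 1 - e^{-\lambda x}
\]

Therefore, the CDF of the exponential distribution is

\[
F_X(x) = \begin{cases} 0, & x \le 0 \\ 1 - e^{-\lambda x}, & x > 0 \end{cases}
\]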
Remarks:

1. For a Uniform random variable, the support is a bounded interval, whereas for an Exponential
random variable, the support is unbounded. Because the density decays as x grows, X is much more
likely to take values close to 0 than values close to, say, 100.

2. $f_X(x)$ never becomes 0 when x is positive.

3. In practice, it is a good model for many situations.

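Example: Suppose X ∼ Exponential(2).

Find: P(5 < X < 7), P(X > 4), P(1 − ϵ < X < 1 + ϵ), P(9 − ϵ < X < 9 + ϵ), P(X > 7 | X > 3).

Solution: (Using the PDF)

Since X ∼ Exponential(2), the PDF of X is given by

\[
f_X(x) = \begin{cases} 2e^{-2x}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}
\]

Now,

• $P(5 < X < 7) = \int_{5}^{7} 2e^{-2x}\,dx = 2 \left[ \frac{e^{-2x}}{-2} \right]_{5}^{7} = e^{-10} - e^{-14}$

• $P(X > 4) = \int_{4}^{\infty} 2e^{-2x}\,dx = \left[ -e^{-2x} \right]_{4}^{\infty} = e^{-8}$

• P(1 − ϵ < X < 1 + ϵ):

\[
P(1 - \epsilon < X < 1 + \epsilon) = \int_{1 - \epsilon}^{1 + \epsilon} 2e^{-2x}\,dx
= 2 \left[ \frac{e^{-2x}}{-2} \right]_{1 - \epsilon}^{1 + \epsilon}
= e^{-2(1 - \epsilon)} - e^{-2(1 + \epsilon)}, \quad \text{where } 0 < \epsilon \ll 1
\]

Exercise: Find P(9 − ϵ < X < 9 + ϵ).

• P(X > 7 | X > 3):

\[
P(X > 7 \mid X > 3) = \frac{P(X > 7, X > 3)}{P(X > 3)} = \frac{P(X > 7)}{P(X > 3)}
= \frac{\int_{7}^{\infty} 2e^{-2x}\,dx}{\int_{3}^{\infty} 2e^{-2x}\,dx}
= \frac{\left[ -e^{-2x} \right]_{7}^{\infty}}{\left[ -e^{-2x} \right]_{3}^{\infty}}
= \frac{e^{-14}}{e^{-6}} = e^{-8}
\]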

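Solution: (Using the CDF)

Since X ∼ Exponential(2), the CDF of X is given by

\[
F_X(x) = \begin{cases} 1 - e^{-2x}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}
\]

Now,
• $P(5 < X < 7) = F_X(7) - F_X(5) = (1 - e^{-7 \times 2}) - (1 - e^{-5 \times 2}) = e^{-10} - e^{-14}$
• P(1 − ϵ < X < 1 + ϵ):

\[
\begin{aligned}
P(1 - \epsilon < X < 1 + \epsilon) &= F_X(1 + \epsilon) - F_X(1 - \epsilon) \\
&= (1 - e^{-2(1 + \epsilon)}) - (1 - e^{-2(1 - \epsilon)}) \\
&= e^{-2(1 - \epsilon)} - e^{-2(1 + \epsilon)}, \quad \text{where } 0 < \epsilon \ll 1
\end{aligned}
\]

• P(X > 7 | X > 3):

\[
\begin{aligned}
P(X > 7 \mid X > 3) &= \frac{P(X > 7, X > 3)}{P(X > 3)} = \frac{P(X > 7)}{P(X > 3)} \\
&= \frac{1 - F_X(7)}{1 - F_X(3)} = \frac{1 - (1 - e^{-14})}{1 - (1 - e^{-6})} = e^{-8}
\end{aligned}
\]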
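
The Exponential(2) answers can be reproduced numerically with a few lines of Python. The sketch uses the CDF 1 − e^(−λx) and the tail probability e^(−λx) directly; the function names are hypothetical, and the tail function is used for P(X > x) to avoid subtracting two numbers that are both very close to 1.

```python
import math

RATE = 2.0   # the parameter lambda of the example

def exponential_cdf(x, rate=RATE):
    """F(x) = 1 - exp(-rate * x) for x > 0, and 0 otherwise."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def exponential_tail(x, rate=RATE):
    """P(X > x) = exp(-rate * x) for x > 0, and 1 otherwise."""
    return math.exp(-rate * x) if x > 0 else 1.0

p_5_7 = exponential_cdf(7) - exponential_cdf(5)           # e^-10 - e^-14
p_gt_4 = exponential_tail(4)                              # e^-8
p_conditional = exponential_tail(7) / exponential_tail(3) # e^-8
```

Note that P(X > 7 | X > 3) equals P(X > 4), which is a first numerical hint of the memoryless property.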

1.7.2.1 Memoryless property of Exponential

If X ∼ Exponential(λ), then for any s, t > 0, we have

P (X > s + t | X > s) = P (X > t)

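Proof:

\[
\begin{aligned}
P(X > s + t \mid X > s) &= \frac{P(X > s + t, X > s)}{P(X > s)} = \frac{P(X > s + t)}{P(X > s)} \\
&= \frac{1 - F_X(s + t)}{1 - F_X(s)} = \frac{1 - (1 - e^{-\lambda (s + t)})}{1 - (1 - e^{-\lambda s})} \\
&= e^{-\lambda t} = P(X > t)
\end{aligned}
\]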

Suppose you are waiting for a bus at a bus stop. The waiting time is random, and it is
very commonly modeled as an exponential random variable. By the memoryless property, no
matter how long you have already waited, the distribution of the remaining waiting time is the
same. This makes the exponential distribution a convenient model in many real-life situations.
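
The memoryless property can be spot-checked numerically. The sketch below compares P(X > s + t | X > s) with P(X > t) for a few arbitrarily chosen values of s, t and λ; the function names and the test values are assumptions made for illustration.

```python
import math

def tail(x, rate):
    """P(X > x) for X ~ Exponential(rate), with x >= 0."""
    return math.exp(-rate * x)

def conditional_tail(s, t, rate):
    """P(X > s + t | X > s) = P(X > s + t) / P(X > s)."""
    return tail(s + t, rate) / tail(s, rate)

# Arbitrary (s, t, rate) triples used only for the check.
cases = [(3.0, 4.0, 2.0), (0.5, 1.5, 0.3), (10.0, 0.1, 1.0)]
results = [(conditional_tail(s, t, rate), tail(t, rate)) for s, t, rate in cases]
```

Each pair in `results` holds two numbers that agree up to floating-point rounding, as the proof predicts.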

1.7.3 Normal distribution


The most commonly used distribution among all the models is the Normal distribution. It is also
known as the Gaussian distribution.
The Normal distribution has two parameters: the first, denoted µ, represents the mean,
and the second, denoted σ², represents the variance.
The distribution is denoted Normal(µ, σ²).

1.7.3.1 PDF of Normal distribution

The PDF of the normal distribution is given by

\[
f_X(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad \text{where } \mathrm{Supp}(X) = \mathbb{R},
\]

µ ∈ R, and σ is a positive real number.

Observations:

• As x becomes much larger than µ, $f_X(x)$ goes to 0. Similarly, as x becomes much
smaller than µ, $f_X(x)$ goes to 0.

• $f_X(x)$ takes its maximum value at x = µ.

• The height of the curve at its peak is $\frac{1}{\sigma \sqrt{2\pi}}$.

• X takes values closer to µ with higher probability; as we move away from µ, the
probabilities decrease.

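Exercise: Show that $f_X(x)$ is a valid PDF.

1. $f_X(x)$ is non-negative.

2. $\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx = 1$.

Proof: Let $z = \frac{x - \mu}{\sigma}$. Then $x = \sigma z + \mu$, so $dx = \sigma\,dz$ and

\[
\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx
= \int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-z^2/2}\, \sigma\,dz \tag{1}
\]

Now substitute $\frac{z^2}{2} = y^2$ (that is, $z = \sqrt{2}\,y$ and $dz = \sqrt{2}\,dy$) in (1). We get

\[
\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-z^2/2}\, \sigma\,dz
= \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-y^2}\,dy
\]

We know that $\int_{-\infty}^{\infty} e^{-y^2}\,dy = \sqrt{\pi}$.

Therefore, $\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx = 1$.

Exercise: Show that $P(X < \mu) = P(X > \mu) = 1/2$.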

1.7.3.2 CDF of Normal distribution

\[ F_X(x) = \int_{-\infty}^{x} f_X(u)\,du \]

There is no closed-form expression for the CDF of the normal distribution.

1.7.3.3 Standard normal distribution


The Normal distribution with mean 0 and variance 1 is called the standard normal distribution. It
is denoted Normal(0, 1).

Standardization: If X ∼ Normal(µ, σ²), then

\[ Z = \frac{X - \mu}{\sigma} \sim \text{Normal}(0, 1). \]

PDF:

\[ f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \]

CDF:

\[ F_Z(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\,du \]

1.7.3.4 Probability computations with normal distribution

1. First convert the probability computation to the standardized case.

2. Then evaluate $F_Z(z)$ using a scientific calculator or software. A standard normal
table can also be used.

Example 1: Suppose X ∼ Normal(2, 5). Find:

i) P(X < 5)
Solution:

\[
P(X < 5) = P\left( \frac{X - 2}{\sqrt{5}} < \frac{5 - 2}{\sqrt{5}} \right) = P(Z < 3/\sqrt{5}) = F_Z(3/\sqrt{5})
\]

ii) P(X < −5)
Solution:

\[
P(X < -5) = P\left( \frac{X - 2}{\sqrt{5}} < \frac{-5 - 2}{\sqrt{5}} \right) = P(Z < -7/\sqrt{5}) = F_Z(-7/\sqrt{5})
\]

iii) P(X > 5)
Solution:

\[
P(X > 5) = P\left( \frac{X - 2}{\sqrt{5}} > \frac{5 - 2}{\sqrt{5}} \right) = P(Z > 3/\sqrt{5}) = 1 - F_Z(3/\sqrt{5})
\]

Similarly, try to find P(X < 10), P(X < −10), P(X > 10).
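
One way to evaluate $F_Z$ in software is through the error function, using the identity $F_Z(z) = \frac{1}{2}\left(1 + \mathrm{erf}(z/\sqrt{2})\right)$. The Python sketch below applies it to Example 1; the function names are hypothetical, and `math.erf` is the standard-library error function.

```python
import math

def standard_normal_cdf(z):
    """F_Z(z) for Z ~ Normal(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mean, variance):
    """P(X <= x) for X ~ Normal(mean, variance), by standardizing first."""
    return standard_normal_cdf((x - mean) / math.sqrt(variance))

MEAN, VARIANCE = 2.0, 5.0   # parameters of Example 1

p_less_5 = normal_cdf(5, MEAN, VARIANCE)          # F_Z(3 / sqrt(5))
p_less_minus_5 = normal_cdf(-5, MEAN, VARIANCE)   # F_Z(-7 / sqrt(5))
p_greater_5 = 1.0 - p_less_5                      # 1 - F_Z(3 / sqrt(5))
```

The second argument of Normal(µ, σ²) in this text is the variance, so the sketch divides by its square root when standardizing.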

Example 2: Suppose X ∼ Normal(3, 1). Find:

i) P(5 < X < 7)
Solution:

\[
P(5 < X < 7) = P\left( \frac{5 - 3}{\sqrt{1}} < \frac{X - 3}{\sqrt{1}} < \frac{7 - 3}{\sqrt{1}} \right) = P(2 < Z < 4) = F_Z(4) - F_Z(2)
\]

ii) P(1 − ϵ < X < 1 + ϵ)
Solution:

\[
P(1 - \epsilon < X < 1 + \epsilon) = P\left( \frac{1 - \epsilon - 3}{\sqrt{1}} < \frac{X - 3}{\sqrt{1}} < \frac{1 + \epsilon - 3}{\sqrt{1}} \right)
= P(-2 - \epsilon < Z < -2 + \epsilon) = F_Z(-2 + \epsilon) - F_Z(-2 - \epsilon)
\]

iii) P(X > 7 | X > 3)
Solution:

\[
\begin{aligned}
P(X > 7 \mid X > 3) &= \frac{P(X > 7, X > 3)}{P(X > 3)} = \frac{P(X > 7)}{P(X > 3)} \\
&= \frac{P(X - 3 > 7 - 3)}{P(X - 3 > 3 - 3)} = \frac{P(Z > 4)}{P(Z > 0)} = \frac{1 - F_Z(4)}{1 - F_Z(0)}
\end{aligned}
\]

Similarly, try to find P(9 − ϵ < X < 9 + ϵ), P(−5 < X < 5), P(X > 4).
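
Example 2 can also be evaluated with Python's standard library, which provides `statistics.NormalDist` (available from Python 3.8). Note that `NormalDist` takes the standard deviation `sigma`, not the variance; here both equal 1. The variable names and the value of ϵ are chosen only for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=3, sigma=1)   # Normal(3, 1): variance 1, so sigma = 1

p_5_7 = X.cdf(7) - X.cdf(5)                      # F_Z(4) - F_Z(2)
p_conditional = (1 - X.cdf(7)) / (1 - X.cdf(3))  # (1 - F_Z(4)) / (1 - F_Z(0))

eps = 0.01                                       # an arbitrary small half-width
p_window = X.cdf(1 + eps) - X.cdf(1 - eps)       # F_Z(-2 + eps) - F_Z(-2 - eps)
density_estimate = 2 * eps * X.pdf(1)            # window width times the density at 1
```

For small ϵ the window probability is close to `density_estimate`, which ties the CDF computation back to the meaning of the density.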
