Lecture Note 3

This document covers statistical methods related to random variables and probability distributions, including definitions and examples of discrete and continuous random variables. It explains probability mass functions, cumulative distribution functions, moments, moment-generating functions, and characteristic functions, along with specific distributions like Bernoulli, Binomial, and Poisson. The document also provides exercises and solutions to illustrate the concepts discussed.


MTL390: Statistical Methods

Instructor: Dr. Biplab Paul


January 8, 2025

Lecture 3

Revision of Probability Distribution (cont.)

Random variable and Probability distributions


A random variable X is a function defined on a sample space, Ω, that associates a real
number, X(ω) = x, with each outcome ω in Ω.
The random variable (RV) is represented by a capital letter (X, Y, Z, ...), and any
particular real value of the random variable is denoted by the corresponding lowercase
letter (x, y, z, ...). We define two types of random variables, discrete and continuous.

Discrete random variable: A random variable X is said to be of the discrete type, or simply discrete, if there exists a countable (finite or countably infinite) set E ⊂ R such that P{X ∈ E} = 1.
The probability mass function (pmf) of a discrete random variable X is the function:

p(x_i) = P(X = x_i),  i = 1, 2, 3, . . .

The cumulative distribution function (cdf) F of the random variable X is defined by:

F(x) = P(X ≤ x) = Σ_{y ≤ x} p(y),  for −∞ < x < ∞.

A cumulative distribution function is also called a probability distribution function or simply the distribution function.
Example Suppose that a fair coin is tossed twice,

Ω = {HH, HT, TH, TT}.

Let X be the number of heads, i.e., X can assume values 0, 1 and 2.

(a) Find the probability function for X.

(b) Find the cumulative distribution function of X.

Solution We have:

P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.

(a) The probability mass function (pmf) is given by:

p(x) = 1/4, if x = 0,
       1/2, if x = 1,
       1/4, if x = 2,
       0,   otherwise.

(b) The cumulative distribution function (CDF) F(x) is given by:

F(x) = 0,   if −∞ < x < 0,
       1/4, if 0 ≤ x < 1,
       3/4, if 1 ≤ x < 2,
       1,   if x ≥ 2.
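For readers who like to check such tabulations by brute force, the following Python sketch (an illustration added to these notes, not part of the original text) enumerates the four equally likely outcomes and builds the same pmf and CDF:

from itertools import product

# Sample space of two fair coin tosses: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))
prob_each = 1 / len(outcomes)            # each outcome has probability 1/4

# pmf of X = number of heads
pmf = {}
for omega in outcomes:
    x = omega.count("H")
    pmf[x] = pmf.get(x, 0.0) + prob_each

# CDF: F(x) = P(X <= x)
cdf = {x: sum(p for xi, p in pmf.items() if xi <= x) for x in sorted(pmf)}

print(pmf)   # {0: 0.25, 1: 0.5, 2: 0.25}
print(cdf)   # {0: 0.25, 1: 0.75, 2: 1.0}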

Continuous random variable: We define a continuous random variable as one that assumes uncountably many values, such as the points on a real line. Below is the definition of a continuous random variable.
Let X be a random variable. Suppose that there exists a nonnegative real-valued
function
f : R → [0, ∞)
such that for any interval [a, b],

P(X ∈ [a, b]) = ∫_a^b f(t) dt.

Then X is called a continuous random variable. The function f is called the probability
density function (pdf ) of X.
The cumulative distribution function (cdf) is given by:

F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt.

For a given function f to be a pdf, it must satisfy the following two conditions:
1. f(x) ≥ 0 for all values of x, and
2. ∫_{−∞}^{∞} f(x) dx = 1.
Additionally, if f is continuous, then:

dF(x)/dx = f(x),

where F(x) is the cdf. This follows from the fundamental theorem of calculus.
If f is the pdf of a random variable X, then:

P(a ≤ X ≤ b) = ∫_a^b f(t) dt.

As a result, for any real number a,

P (X = a) = 0.

Additionally,

P (a ≤ X ≤ b) = P (a < X ≤ b) = P (a ≤ X < b) = P (a < X < b).

If we have the cumulative distribution function (cdf) F (x), then:

P (a ≤ X ≤ b) = F (b) − F (a).

Some Properties of the Distribution Function

1. 0 ≤ F (x) ≤ 1.

2. limx→−∞ F (x) = 0, and limx→+∞ F (x) = 1.

3. F (x) is a nondecreasing function and is right continuous.

Example Suppose that a large grocery store has shelf space for 150 cartons of fruit
drink that are delivered on a particular day of each week. The weekly sale for fruit drink
shows that the demand increases steadily up to 100 cartons and then levels off between
100 and 150 cartons. Let Y denote the weekly demand in hundreds of cartons. It is
known that the pdf of Y can be approximated by:

f(y) = y, 0 ≤ y ≤ 1,
       1, 1 < y ≤ 1.5,
       0, elsewhere.

(a) Find F (y).

(b) Find P (0.5 ≤ Y ≤ 1.2).

Solution

(a)

F(y) = 0,                            y < 0,
       ∫_0^y t dt,                   0 ≤ y < 1,
       ∫_0^1 t dt + ∫_1^y 1 dt,      1 ≤ y < 1.5,
       ∫_0^1 t dt + ∫_1^{1.5} 1 dt,  y ≥ 1.5,

     = 0,         y < 0,
       y²/2,      0 ≤ y < 1,
       y − 1/2,   1 ≤ y < 1.5,
       1,         y ≥ 1.5.

(b)

P(0.5 ≤ Y ≤ 1.2) = F(1.2) − F(0.5) = 0.7 − 0.125 = 0.575.
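As a quick numerical check (a sketch added here, not from the original notes), the probability in (b) can also be obtained by integrating the pdf directly with scipy:

from scipy.integrate import quad

# pdf of the weekly demand Y (in hundreds of cartons)
def f(y):
    if 0 <= y <= 1:
        return y
    if 1 < y <= 1.5:
        return 1.0
    return 0.0

# integrate the pdf over [0.5, 1.2]; the kink at y = 1 is passed via `points`
prob, _ = quad(f, 0.5, 1.2, points=[1.0])
print(round(prob, 3))   # 0.575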

Moments and Moment-Generating Functions
Moments
Moments are statistical measures that provide important information about the shape
and characteristics of a probability distribution.
The k-th moment of a random variable X about the origin is defined as:

µ'_k = E[X^k],

where E[·] denotes the expected value.


The k-th central moment of X is defined as:

µ_k = E[(X − µ)^k],

where µ = E[X] is the mean of the random variable X.


Special cases:

• The first moment about the origin is the mean: µ'_1 = E[X].

• The second central moment is the variance: µ_2 = E[(X − µ)^2].

• The third central moment relates to skewness, indicating asymmetry. The standardized third moment about the mean:

  α_3 = E[(X − µ)^3] / σ^3 = µ_3 / µ_2^{3/2}

  is called the skewness of the distribution of X.

• The fourth central moment relates to kurtosis, indicating the “peakedness” of the distribution. The standardized fourth moment about the mean:

  α_4 = E[(X − µ)^4] / σ^4

  is called the kurtosis of the distribution. The kurtosis of a standard normal distribution is three. For this reason, some sources use the following definition of kurtosis (often referred to as “excess kurtosis”):

  Excess Kurtosis, β = α_4 − 3.

  Skewness and kurtosis are illustrated numerically in the sketch after this list.
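As a numerical illustration (added to these notes; the simulated exponential sample is only an example), the standardized third and fourth moments can be estimated from data and compared with scipy's built-in functions. Note that scipy.stats.kurtosis reports excess kurtosis by default:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)   # a right-skewed sample

mu = x.mean()
m2 = np.mean((x - mu) ** 2)                    # second central moment (variance)
m3 = np.mean((x - mu) ** 3)                    # third central moment
m4 = np.mean((x - mu) ** 4)                    # fourth central moment

alpha3 = m3 / m2 ** 1.5                        # skewness
alpha4 = m4 / m2 ** 2                          # kurtosis (equals 3 for a normal)

print(alpha3, stats.skew(x))                   # both close to 2 for an exponential
print(alpha4 - 3, stats.kurtosis(x))           # excess kurtosis, close to 6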

Moment-Generating Functions
The moment-generating function (MGF) of a random variable X is defined as:

M_X(t) = E[e^{tX}],

for all values of t for which the expectation exists.

Properties of the MGF:

• The k-th moment of X can be obtained by differentiating M_X(t) k times with respect to t and evaluating at t = 0:

  µ'_k = [d^k M_X(t) / dt^k]_{t=0}.

• If two random variables have the same MGF, they have the same distribution
(uniqueness property).
• If X and Y are independent, then:
M_{X+Y}(t) = M_X(t) · M_Y(t).
That is, the mgf of the sum of two independent random variables is the product
of the mgfs of the individual random variables. This result can be extended to n
random variables.
• Let Y = aX + b. Then:
M_Y(t) = e^{bt} M_X(at).

Examples
1. MGF of a Bernoulli Random Variable: Let X ∼ Bernoulli(p). The MGF is:
M_X(t) = E[e^{tX}] = (1 − p) + p e^t.

2. MGF of a Normal Random Variable: Let X ∼ N(µ, σ²). The MGF is:

M_X(t) = e^{µt + σ²t²/2}.

Remark If the MGF of a random variable exists, it is unique, and it uniquely determines the distribution.
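To illustrate the differentiation property above (a symbolic sketch added to these notes, assuming the sympy package is available), the first two moments of N(µ, σ²) can be recovered from its MGF:

import sympy as sp

t, mu = sp.symbols("t mu", real=True)
sigma = sp.symbols("sigma", positive=True)

M = sp.exp(mu * t + sigma**2 * t**2 / 2)   # MGF of N(mu, sigma^2)

m1 = sp.diff(M, t, 1).subs(t, 0)           # first moment
m2 = sp.diff(M, t, 2).subs(t, 0)           # second moment

print(sp.simplify(m1))                     # mu
print(sp.simplify(m2 - m1**2))             # variance: sigma**2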
Exercise Calculate the mgf for
1. Binomial distribution
2. Poisson distribution, and
3. Let X be a random variable with the probability density function (pdf):
f(x) = (1/β) e^{−x/β}, x > 0,
       0,              otherwise.

Find the moment-generating function (mgf) M_X(t).

Characteristic Functions
Let X be a random variable (RV). The complex-valued function ϕ defined on R by:

ϕ(t) = E(e^{itX}) = E(cos(tX)) + i E(sin(tX)),  t ∈ R,

where i = √−1 is the imaginary unit, is called the characteristic function (CF) of the random variable X.

Examples

1. CF of a Bernoulli Random Variable: Let X ∼ Bernoulli(p). The CF is:

ϕ(t) = E[e^{itX}] = (1 − p) + p e^{it}.

2. CF of a Normal Random Variable: Let X ∼ N(µ, σ²). The CF is:

ϕ(t) = e^{iµt − σ²t²/2}.

Remark Unlike the moment-generating function (MGF), which may not exist for some distributions (e.g., the Cauchy distribution), the characteristic function (CF) always exists and is unique, which makes it a much more convenient tool.
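As an illustration (a sketch added to these notes, using only numpy), the empirical characteristic function of simulated N(µ, σ²) data can be compared with the closed form e^{iµt − σ²t²/2}:

import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=200_000)

t = 0.7
empirical = np.mean(np.exp(1j * t * x))                  # estimate of E[e^{itX}]
theoretical = np.exp(1j * mu * t - sigma**2 * t**2 / 2)  # CF of N(mu, sigma^2)

print(empirical)      # the two complex numbers should nearly agree
print(theoretical)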
Exercise Calculate the CF for

1. Binomial distribution

2. Poisson distribution, and

3. Let X be a random variable with the probability density function (pdf):


f(x) = (1/β) e^{−x/β}, x > 0,
       0,              otherwise.

Find the characteristic function (CF) ϕ_X(t).

Some Special Distributions


Bernoulli distribution
Consider an experiment with exactly two outcomes, for example a coin toss in which heads occurs with probability p and tails with probability q = 1 − p. We define a random variable X associated with this experiment as taking the value 1 with probability p if heads occurs, and the value 0 with probability q if tails occurs. Such a random variable X is said to have a Bernoulli distribution. That is, X is a Bernoulli random variable if for some p, 0 ≤ p ≤ 1, the probability P(X = 1) = p and P(X = 0) = 1 − p.
The pmf of a Bernoulli random variable X can be expressed as:
p(x) = P(X = x) = p^x (1 − p)^{1−x}, x = 0, 1,
                  0,                 otherwise.

Note that this distribution is characterized by the single parameter p. It can be easily
verified that the mean and variance of X are:

E[X] = p, Var(X) = pq,

respectively. The moment-generating function is given by:

M_X(t) = p e^t + (1 − p).

Binomial distribution
A random variable X is said to have a binomial probability distribution with parameters
(n, p) if and only if:

p(x) = P(X = x) = (n choose x) p^x q^{n−x}, x = 0, 1, 2, . . . , n, 0 ≤ p ≤ 1, q = 1 − p,
                  0,                        otherwise,

In this case, we write X ∼ b(n, p).


If X is a binomial random variable with parameters n and p, then the following
properties hold:

• Expected Value:
E(X) = µ = np

• Variance:
Var(X) = σ 2 = np(1 − p)

• Moment-Generating Function (MGF):

  M_X(t) = (p e^t + (1 − p))^n


Remark 1 The binomial distribution can also be viewed as the distribution of the sum of n independent, identically distributed Bernoulli (b(1, p)) random variables.
Remark 2 Let X_i (i = 1, 2, . . . , k) be independent random variables with X_i ∼ b(n_i, p). Then S_k = Σ_{i=1}^{k} X_i has a b(n_1 + n_2 + · · · + n_k, p) distribution.
In practice, the binomial probability distribution is used when we are concerned with
the occurrence of an event, not its magnitude. For example, in a clinical trial, we may
be more interested in the number of survivors after a treatment.

Example It is known that screws produced by a certain machine will be defective with
probability 0.01, independently of each other. If we randomly pick 10 screws produced
by this machine, what is the probability that at least two screws will be defective?
Solution Let X be the number of defective screws out of 10. Then X can be considered
as a binomial random variable with parameters (10, 0.01). Hence, using the binomial
probability function p(x), we obtain:
P(X ≥ 2) = Σ_{x=2}^{10} (10 choose x) (0.01)^x (0.99)^{10−x}
         = 1 − [P(X = 0) + P(X = 1)]
         ≈ 0.004.
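The same answer can be checked in one line with scipy (a sketch added for illustration):

from scipy.stats import binom

# P(X >= 2) = 1 - P(X <= 1) for X ~ b(10, 0.01)
print(round(1 - binom.cdf(1, n=10, p=0.01), 4))   # about 0.0043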

Poisson distribution
Consider a statistical experiment in which A is an event of interest. A random variable
that counts the number of occurrences of A is called a counting random variable. The
Poisson random variable is an example of a counting random variable. Here, we assume

that the numbers of occurrences in disjoint intervals are independent and that the mean
number of occurrences is constant.
A discrete random variable X is said to follow the Poisson distribution with parameter
λ > 0, denoted by Pois(λ), if:

p(x) = P(X = x) = e^{−λ} λ^x / x!,  x = 0, 1, 2, . . .
If X is a Poisson random variable with parameter λ, then:

• Expected Value:
E(X) = µ = λ

• Variance:
Var(X) = σ 2 = λ

• Moment-Generating Function (MGF):

  M_X(t) = e^{λ(e^t − 1)}.

Remark When n is large and p is small, binomial probabilities are often approximated by Poisson probabilities. If X is a binomial random variable with parameters n and p, then for each value x = 0, 1, 2, . . ., as p → 0 and n → ∞ with np = λ held constant, we have:

lim_{n→∞} (n choose x) p^x (1 − p)^{n−x} = e^{−λ} λ^x / x!.
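This convergence is easy to see numerically; the sketch below (added to these notes) compares binomial and Poisson probabilities for growing n with np held fixed at λ = 2:

from scipy.stats import binom, poisson

lam, x = 2.0, 3
for n in (10, 100, 1000, 10000):
    p = lam / n                                   # keep np = lambda fixed
    print(n, binom.pmf(x, n, p), poisson.pmf(x, lam))
# the binomial column approaches the Poisson value e^{-2} * 2^3 / 3! ≈ 0.1804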

This distribution is of fundamental theoretical and practical importance. Rare events are modeled by the Poisson distribution. For example, the Poisson probability distribution has been used in the study of telephone systems. The number of incoming calls into a telephone exchange during a unit of time might be modeled by a Poisson variable, assuming that the exchange services a large number of customers who call more or less independently.

Example If the probability that an individual suffers an adverse reaction from a particular drug is known to be 0.001, determine the probability that out of 2000 individuals,

1. exactly three individuals will suffer an adverse reaction, and

2. more than two individuals will suffer an adverse reaction.

Solution Let Y be the number of individuals who suffer an adverse reaction. Then Y
follows a binomial distribution with parameters n = 2000 and p = 0.001. Because n is
large and p is small, we can use the Poisson approximation with λ = np = 2.

1. The probability that exactly three individuals will suffer an adverse reaction is:

   P(Y = 3) = λ^3 e^{−λ} / 3! = 2^3 e^{−2} / 3! ≈ 0.18.

   Thus, there is approximately an 18% chance that exactly three individuals out of 2000 will suffer an adverse reaction.

2. The probability that more than two individuals will suffer an adverse reaction is:

P (Y > 2) = 1 − P (Y ≤ 2) = 1 − [P (Y = 0) + P (Y = 1) + P (Y = 2)] .

We calculate:

P(Y = 0) = 2^0 e^{−2} / 0! = e^{−2},
P(Y = 1) = 2^1 e^{−2} / 1! = 2e^{−2},
P(Y = 2) = 2^2 e^{−2} / 2! = 2e^{−2}.

Thus:

P(Y > 2) = 1 − 5e^{−2} ≈ 0.323.
Therefore, there is approximately a 32.3% chance that more than two individuals
will have an adverse reaction.
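Both the Poisson approximation and the exact binomial answers can be verified with scipy (a sketch added here for illustration):

from scipy.stats import binom, poisson

n, p = 2000, 0.001
lam = n * p                                              # lambda = 2

print(poisson.pmf(3, lam), binom.pmf(3, n, p))           # both about 0.180
print(1 - poisson.cdf(2, lam), 1 - binom.cdf(2, n, p))   # both about 0.323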

Uniform distribution
A random variable X is said to have a uniform probability distribution on (a, b), denoted
by U (a, b), if the density function of X is given by:
f(x) = 1/(b − a), a ≤ x ≤ b,
       0,         otherwise.

The cumulative distribution function is given by:



Z x 0,
 x < a,
x−a
F (x) = f (t) dt = b−a , a ≤ x < b,
−∞ 
1, x ≥ b.

If X is a uniformly distributed random variable on (a, b), then:


E[X] = (a + b)/2,

and

Var(X) = (b − a)²/12.

Also, the moment-generating function is:

M_X(t) = (e^{bt} − e^{at}) / (t(b − a)), t ≠ 0,
         1,                              t = 0.

Example The melting point, X, of a certain solid may be assumed to be a continuous random variable that is uniformly distributed between the temperatures 100°C and 120°C. Find the probability that such a solid will melt between 112°C and 115°C.
Solution: The probability density function is given by:

f(x) = 1/20, 100 ≤ x ≤ 120,
       0,    otherwise.

Hence, the probability is:

P(112 ≤ X ≤ 115) = ∫_{112}^{115} (1/20) dx = 3/20 = 0.15.

Thus, there is a 15% chance that this solid will melt between 112°C and 115°C.
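The same probability can be computed via scipy's uniform distribution (a sketch added here; scipy parameterizes U(a, b) by loc = a and scale = b − a):

from scipy.stats import uniform

melt = uniform(loc=100, scale=20)        # U(100, 120)
print(melt.cdf(115) - melt.cdf(112))     # 0.15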

Normal distribution
The single most important distribution in probability and statistics is the normal dis-
tribution. The density function of a normal distribution is bell-shaped and symmetric
about the mean. The normal probability distribution was introduced by the French
mathematician Abraham de Moivre in 1733. He used it to approximate probabilities associated with binomial random variables when n is large. This was later
extended by Laplace to the so-called Central Limit Theorem, which is one of the most
important results in probability. Because Gauss played such a prominent role in deter-
mining the usefulness of the normal distribution, the normal distribution is often called
the Gaussian distribution.
A random variable X is said to have a normal probability distribution with parameters
µ and σ 2 , if it has a probability density function given by
f(x) = (1/(√(2π) σ)) e^{−(x−µ)²/(2σ²)},  −∞ < x < ∞, −∞ < µ < ∞, σ > 0.
If µ = 0 and σ = 1, we call it a standard normal random variable. For any normal
random variable with mean µ and variance σ 2 , we use the notation

X ∼ N (µ, σ 2 ).

If X ∼ N(µ, σ²), then E(X) = µ and Var(X) = σ². Also, the moment-generating function is given by:

M_X(t) = e^{tµ + t²σ²/2}.
We should also mention here that almost all basic statistical inference is based on the normal distribution. The question that often arises is: how do we know that our data follow the normal distribution? To answer this question, there are specific statistical procedures that we will study later. At this point, however, we can obtain some constructive indication of whether the data follow the normal distribution by using descriptive statistics. That is, if the histogram of the data can be closely capped by a bell-shaped curve, this may suggest that the data follow a normal distribution.
If X ∼ N(µ, σ²), then the z-transform (or z-score) of X, given by:

Z = (X − µ)/σ,
is a random variable that follows the standard normal distribution:

Z ∼ N (0, 1).

Example For a standard normal random variable Z, find the value of z0 such that:

(a) P (Z > z0 ) = 0.25,

(b) P (Z < z0 ) = 0.95.

Solution: The required values of z0 can be obtained using the standard normal distri-
bution table or a statistical calculator. For a standard normal random variable Z, we
have:

(a) From the normal table, using the fact that the upper-tail area must be 0.25, i.e., P(Z ≤ z0) = 0.75, we obtain:

z0 ≈ 0.675.

(b) P (Z < z0 ) = 0.95. From the standard normal table, z0 ≈ 1.645.
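Both quantiles can also be read off with the inverse of the standard normal CDF in scipy (a sketch added for illustration):

from scipy.stats import norm

z_a = norm.ppf(0.75)    # P(Z > z0) = 0.25  <=>  P(Z <= z0) = 0.75
z_b = norm.ppf(0.95)    # P(Z < z0) = 0.95
print(round(z_a, 3), round(z_b, 3))   # 0.674, 1.645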

