Lecture Note 3
Example A fair coin is tossed twice and X denotes the number of heads obtained. The sample space is

Ω = {HH, HT, TH, TT}.

(a) Find the probability mass function (pmf) of X.
Solution We have:

P({HH}) = P({HT}) = P({TH}) = P({TT}) = 1/4.
(a) The probability mass function (pmf) of X is given by:

p(x) = 1/4,  if x = 0,
       1/2,  if x = 1,
       1/4,  if x = 2,
       0,    otherwise.
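As an optional illustration (a small Python sketch, not part of the original notes), the pmf above can be recovered by enumerating the four equally likely outcomes and counting heads:

from fractions import Fraction
from collections import Counter

# The four equally likely outcomes of tossing a fair coin twice.
omega = ["HH", "HT", "TH", "TT"]

# X = number of heads; accumulate probability 1/4 per outcome.
pmf = Counter()
for outcome in omega:
    pmf[outcome.count("H")] += Fraction(1, 4)

print(dict(pmf))  # x = 0, 1, 2 receive probabilities 1/4, 1/2, 1/4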
Suppose there exists a nonnegative function f such that

P(X ≤ x) = ∫_{−∞}^{x} f(t) dt   for all real x.

Then X is called a continuous random variable. The function f is called the probability density function (pdf) of X.
The cumulative distribution function (cdf) is given by:
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt.
For a given function f to be a pdf, it must satisfy the following two conditions:
1. f (x) ≥ 0 for all values of x, and
2. ∫_{−∞}^{∞} f(x) dx = 1.
Additionally, if f is continuous, then:
dF(x)/dx = f(x),
where F (x) is the cdf. This follows from the fundamental theorem of calculus.
If f is the pdf of a random variable X, then:
P(a ≤ X ≤ b) = ∫_{a}^{b} f(t) dt.

In particular, for a continuous random variable X, P(X = a) = 0 for every single value a.
Additionally,
P (a ≤ X ≤ b) = F (b) − F (a).
Note also that 0 ≤ F(x) ≤ 1 for all x.
Example Suppose that a large grocery store has shelf space for 150 cartons of fruit
drink that are delivered on a particular day of each week. Weekly sales data for the fruit
drink show that demand increases steadily up to 100 cartons and then levels off between
100 and 150 cartons. Let Y denote the weekly demand in hundreds of cartons. It is
known that the pdf of Y can be approximated by:
f(y) = y,   0 ≤ y ≤ 1,
       1,   1 < y ≤ 1.5,
       0,   elsewhere.

(a) Find the cdf F(y).
Solution
(a)
F(y) = 0,                              y < 0,
       ∫_0^y t dt,                     0 ≤ y < 1,
       ∫_0^1 t dt + ∫_1^y 1 dt,        1 ≤ y < 1.5,
       ∫_0^1 t dt + ∫_1^{1.5} 1 dt,    y ≥ 1.5,

     = 0,          y < 0,
       y²/2,       0 ≤ y < 1,
       y − 1/2,    1 ≤ y < 1.5,
       1,          y ≥ 1.5.
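As an optional numerical check (a Python sketch assuming scipy is available; not part of the original solution), the closed-form cdf above can be compared with direct numerical integration of the pdf:

from scipy.integrate import quad

def f(y):
    # pdf of the weekly demand Y (in hundreds of cartons) from the example.
    if 0 <= y <= 1:
        return y
    if 1 < y <= 1.5:
        return 1.0
    return 0.0

def F(y):
    # Closed-form cdf derived in part (a).
    if y < 0:
        return 0.0
    if y < 1:
        return y ** 2 / 2
    if y < 1.5:
        return y - 0.5
    return 1.0

for y in [0.5, 1.0, 1.2, 1.5]:
    numeric, _ = quad(f, 0, y)          # integrate the pdf from 0 to y
    print(y, F(y), round(numeric, 6))   # the two values agree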
Moments and Moment-Generating Functions
Moments
Moments are statistical measures that provide important information about the shape
and characteristics of a probability distribution.
The k-th moment of a random variable X about the origin is defined as

µ′_k = E[X^k],

and the k-th moment about the mean (the k-th central moment) is defined as

µ_k = E[(X − µ)^k],

where µ = E[X].
• The first moment about the origin is the mean: µ′1 = E[X].
• The third central moment relates to skewness, indicating asymmetry. The standardized third moment about the mean,

α_3 = E[(X − µ)^3] / σ^3 = µ_3 / µ_2^{3/2},

is called the skewness of the distribution.
• The fourth central moment relates to kurtosis, indicating the “peakedness” of the
distribution. The standardized fourth moment about the mean,

α_4 = E[(X − µ)^4] / σ^4,
is called the kurtosis of the distribution. The kurtosis for a standard normal
distribution is three. For this reason, some sources use the following definition of
kurtosis (often referred to as “excess kurtosis”):
Excess Kurtosis, β = α4 − 3.
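As an optional check (a Python sketch assuming scipy is available, not part of the original notes), the standardized moments α_3 and α_4 can be read off for familiar distributions; note that scipy reports the excess kurtosis α_4 − 3:

from scipy.stats import norm, expon

# stats(moments="mvsk") returns mean, variance, skewness (alpha_3) and excess kurtosis (alpha_4 - 3).
for name, dist in [("standard normal", norm()), ("standard exponential", expon())]:
    _, _, skew, excess = dist.stats(moments="mvsk")
    print(f"{name}: alpha_3 = {float(skew)}, alpha_4 = {float(excess) + 3}")
    # standard normal: alpha_3 = 0, alpha_4 = 3; exponential: alpha_3 = 2, alpha_4 = 9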
Moment-Generating Functions
The moment-generating function (MGF) of a random variable X is defined as:
M_X(t) = E[e^{tX}], provided the expectation exists (is finite) for all t in some open interval containing 0.
Properties of the MGF:
• The k-th moment of X can be obtained by differentiating MX (t) k times with
respect to t and evaluating at t = 0:
µ′_k = (d^k / dt^k) M_X(t) |_{t=0}.
• If two random variables have the same MGF, they have the same distribution
(uniqueness property).
• If X and Y are independent, then:
MX+Y (t) = MX (t) · MY (t).
That is, the mgf of the sum of two independent random variables is the product
of the mgfs of the individual random variables. This result can be extended to n
random variables.
• Let Y = aX + b. Then:
M_Y(t) = e^{bt} M_X(at).
Examples
1. MGF of a Bernoulli Random Variable: Let X ∼ Bernoulli(p). The MGF is:
M_X(t) = E[e^{tX}] = (1 − p) + p e^t.
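As an optional check (a sympy sketch, not part of the original notes), differentiating this MGF at t = 0 recovers the Bernoulli moments:

import sympy as sp

t, p = sp.symbols("t p")
M = (1 - p) + p * sp.exp(t)             # MGF of Bernoulli(p)

m1 = sp.diff(M, t, 1).subs(t, 0)        # E[X]   = p
m2 = sp.diff(M, t, 2).subs(t, 0)        # E[X^2] = p
print(m1, m2, sp.expand(m2 - m1 ** 2))  # variance = p - p^2 = p(1 - p)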
Remark The MGF uniquely determines a distribution and, conversely, if the MGF exists,
it is unique.
Exercise Calculate the mgf for
1. Binomial distribution
2. Poisson distribution, and
3. Let X be a random variable with the probability density function (pdf):
f(x) = (1/β) e^{−x/β},   x > 0,
       0,                otherwise.
Characteristic Functions
Let X be a random variable (RV). The complex-valued function φ_X defined on R by

φ_X(t) = E[e^{itX}] = E[cos(tX)] + i E[sin(tX)],   t ∈ R,

is called the characteristic function (CF) of X.
Remark Unlike the moment-generating function (MGF), which may not exist for some distributions (e.g., the Cauchy distribution), the characteristic function (CF) always exists and is unique, which makes it a much more convenient tool.
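As an optional illustration of this remark (a Python sketch using numpy and scipy; the closed form e^{−|t|} for the standard Cauchy CF is a known result), the empirical CF of a Cauchy sample is perfectly well behaved even though the Cauchy MGF does not exist:

import numpy as np
from scipy.stats import cauchy

# |e^{itx}| = 1, so the sample average of e^{itX_j} is always well defined,
# whereas the sample average of e^{tX_j} blows up for the Cauchy when t != 0.
x = cauchy.rvs(size=1_000_000, random_state=42)

for t in [0.5, 1.0, 2.0]:
    ecf = np.mean(np.exp(1j * t * x))    # empirical characteristic function at t
    print(t, ecf.real, np.exp(-abs(t)))  # agrees with e^{-|t|} up to Monte Carlo error (~0.001)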
Exercise Calculate the CF for
1. Binomial distribution
Note that the Bernoulli distribution b(1, p) is characterized by the single parameter p. It can be easily verified that the mean and variance of a Bernoulli random variable X are E(X) = p and Var(X) = p(1 − p).
Binomial distribution
A random variable X is said to have a binomial probability distribution with parameters
(n, p) if and only if:
p(x) = P(X = x) = C(n, x) p^x q^{n−x},   x = 0, 1, 2, . . . , n,   0 ≤ p ≤ 1,   q = 1 − p,
       0,                                otherwise,

where C(n, x) = n!/(x!(n − x)!) is the binomial coefficient.
• Expected Value:
E(X) = µ = np
• Variance:
Var(X) = σ 2 = np(1 − p)
Remark 1 The binomial distribution can also be considered as the distribution of the sum of n independent, identically distributed Bernoulli (b(1, p)) random variables.

Remark 2 Let X_i (i = 1, 2, . . . , k) be independent random variables with X_i ∼ b(n_i, p). Then S_k = X_1 + X_2 + · · · + X_k has a b(n_1 + n_2 + · · · + n_k, p) distribution.
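As an optional illustration of Remark 1 (a Python sketch assuming numpy and scipy are available, with arbitrarily chosen n = 20 and p = 0.3), summing independent Bernoulli variables reproduces the binomial mean np and variance np(1 − p):

import numpy as np
from scipy.stats import binom

n, p = 20, 0.3
rng = np.random.default_rng(1)

# Each row: n independent Bernoulli(p) draws; the row sum is one binomial(n, p) draw.
sums = rng.binomial(1, p, size=(200_000, n)).sum(axis=1)

print(round(binom.mean(n, p), 2), round(binom.var(n, p), 2))  # exact: np = 6.0, np(1-p) = 4.2
print(round(sums.mean(), 2), round(sums.var(), 2))            # simulated values, close to 6.0 and 4.2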
In practice, the binomial probability distribution is used when we are concerned with
the occurrence of an event, not its magnitude. For example, in a clinical trial, we may
be more interested in the number of survivors after a treatment.
Example It is known that screws produced by a certain machine will be defective with
probability 0.01, independently of each other. If we randomly pick 10 screws produced
by this machine, what is the probability that at least two screws will be defective?
Solution Let X be the number of defective screws out of 10. Then X can be considered
as a binomial random variable with parameters (10, 0.01). Hence, using the binomial
probability function p(x), we obtain:
P(X ≥ 2) = Σ_{x=2}^{10} C(10, x) (0.01)^x (0.99)^{10−x}
         = 1 − [P(X = 0) + P(X = 1)]
         ≈ 0.004.
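The same answer can be checked with a one-line computation (a Python sketch assuming scipy is available):

from scipy.stats import binom

# P(X >= 2) = 1 - P(X <= 1) for X ~ binomial(10, 0.01).
print(round(1 - binom.cdf(1, 10, 0.01), 4))  # 0.0043, i.e. approximately 0.004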
Poisson distribution
Consider a statistical experiment of which A is an event of interest. A random variable
that counts the number of occurrences of A is called a counting random variable. The
Poisson random variable is an example of a counting random variable. Here, we assume
that the numbers of occurrences in disjoint intervals are independent and that the mean
number of occurrences is constant.
A discrete random variable X is said to follow the Poisson distribution with parameter
λ > 0, denoted by Pois(λ), if:
p(x) = P(X = x) = e^{−λ} λ^x / x!,   x = 0, 1, 2, . . .
If X is a Poisson random variable with parameter λ, then:
• Expected Value:
E(X) = µ = λ
• Variance:
Var(X) = σ 2 = λ
Remark When n is large and p small, binomial probabilities are often approximated by
Poisson probabilities. If X is a binomial random variable with parameters n and p, then
for each value x = 0, 1, 2, . . ., and as p → 0, n → ∞ with np = λ constant, we have:
lim_{n→∞} C(n, x) p^x (1 − p)^{n−x} = e^{−λ} λ^x / x!.
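As an optional numerical illustration of this limit (a Python sketch assuming scipy is available, with λ = 2 and x = 3 chosen arbitrarily), the binomial probabilities approach the Poisson probability as n grows with np = λ fixed:

from scipy.stats import binom, poisson

lam, x = 2.0, 3
print("Poisson limit:", round(poisson.pmf(x, lam), 4))
for n in [10, 100, 1000, 10_000]:
    # Keep np = lambda fixed while n grows; the binomial pmf approaches the Poisson pmf.
    print(n, round(binom.pmf(x, n, lam / n), 4))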
Example If the probability that an individual suffers an adverse reaction from a particular drug is known to be 0.001, determine the probability that, out of 2000 individuals, (1) exactly three and (2) more than two will suffer an adverse reaction.
Solution Let Y be the number of individuals who suffer an adverse reaction. Then Y
follows a binomial distribution with parameters n = 2000 and p = 0.001. Because n is
large and p is small, we can use the Poisson approximation with λ = np = 2.
1. The probability that exactly three individuals will suffer an adverse reaction is:
P(Y = 3) = λ^3 e^{−λ} / 3! = 2^3 e^{−2} / 3! ≈ 0.18.
Thus, there is approximately an 18% chance that exactly three individuals out of
2000 will suffer an adverse reaction.
2. The probability that more than two individuals will suffer an adverse reaction is:
P (Y > 2) = 1 − P (Y ≤ 2) = 1 − [P (Y = 0) + P (Y = 1) + P (Y = 2)] .
We calculate:
P(Y = 0) = 2^0 e^{−2} / 0! = e^{−2},
P(Y = 1) = 2^1 e^{−2} / 1! = 2e^{−2},
P(Y = 2) = 2^2 e^{−2} / 2! = 2e^{−2}.
Thus:
P(Y > 2) = 1 − 5e^{−2} ≈ 0.323.
Therefore, there is approximately a 32.3% chance that more than two individuals
will have an adverse reaction.
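As an optional check (a Python sketch assuming scipy is available), both parts of the example can be compared against the exact binomial probabilities:

from scipy.stats import binom, poisson

n, p = 2000, 0.001
lam = n * p  # Poisson approximation parameter, lambda = 2

# Poisson approximation vs. exact binomial probability.
print(round(poisson.pmf(3, lam), 2), round(binom.pmf(3, n, p), 2))          # exactly three: both about 0.18
print(round(1 - poisson.cdf(2, lam), 2), round(1 - binom.cdf(2, n, p), 2))  # more than two: both about 0.32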
Uniform distribution
A random variable X is said to have a uniform probability distribution on (a, b), denoted
by U (a, b), if the density function of X is given by:
f(x) = 1/(b − a),   a ≤ x ≤ b,
       0,           otherwise.
Example The melting point X (in °C) of a certain solid is modeled as a uniform random variable over an interval of length 20 (so that its density is 1/20) containing [112, 115]. The probability that the solid melts between 112°C and 115°C is:

P(112 ≤ X ≤ 115) = ∫_{112}^{115} (1/20) dx = 3/20 = 0.15.
Thus, there is a 15% chance that this solid will melt between 112°C and 115°C.
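As an optional check (a Python sketch assuming scipy is available; the placement U(110, 130) is a hypothetical choice, since only the interval length 20 matters here), the same probability follows from the uniform cdf:

from scipy.stats import uniform

# Hypothetical interval (110, 130): length 20, so the density is 1/20 on [112, 115].
X = uniform(loc=110, scale=20)
print(round(X.cdf(115) - X.cdf(112), 2))  # 0.15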
Normal distribution
The single most important distribution in probability and statistics is the normal distribution. The density function of a normal distribution is bell-shaped and symmetric about the mean. The normal probability distribution was introduced by the French mathematician Abraham de Moivre in 1733. He used it to approximate probabilities associated with binomial random variables when n is large. This was later extended by Laplace to the so-called Central Limit Theorem, which is one of the most important results in probability. Because Gauss played such a prominent role in determining the usefulness of the normal distribution, the normal distribution is often called the Gaussian distribution.
A random variable X is said to have a normal probability distribution with parameters
µ and σ 2 , if it has a probability density function given by
f(x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)},   −∞ < x < ∞,   −∞ < µ < ∞,   σ > 0.
If µ = 0 and σ = 1, we call it a standard normal random variable. For any normal
random variable with mean µ and variance σ 2 , we use the notation
X ∼ N (µ, σ 2 ).
If X ∼ N(µ, σ²), then

Z = (X − µ)/σ
is a random variable that follows the standard normal distribution:
Z ∼ N (0, 1).
Example For a standard normal random variable Z, find the value of z0 such that:
(a) P(0 < Z < z0) = 0.25, and
(b) P (Z < z0 ) = 0.95.
Solution: The required values of z0 can be obtained using the standard normal distri-
bution table or a statistical calculator. For a standard normal random variable Z, we
have:
(a) From the normal table, using the fact that the area between 0 and z0 is 0.25 (so that
P(Z < z0) = 0.75), we obtain:
z0 ≈ 0.675.
(b) Similarly, since P(Z < z0) = 0.95, the normal table gives z0 ≈ 1.645.
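As an optional check (a Python sketch assuming scipy is available), the same quantiles can be obtained from the inverse standard normal cdf:

from scipy.stats import norm

print(round(norm.ppf(0.75), 3))  # (a) P(Z < z0) = 0.75 gives z0 ≈ 0.674 (table value 0.675)
print(round(norm.ppf(0.95), 3))  # (b) P(Z < z0) = 0.95 gives z0 ≈ 1.645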