Chapter 3
RANDOM VARIABLES
and
PROBABILITY DISTRIBUTIONS
A random variable (r.v.) is a variable whose value is determined by the outcome of
a random experiment.
A r.v. whose values are countable is called a discrete r.v. In most practical problems, discrete r.v.'s represent count data, such as the number of defectives in a sample of N items, the number of customers in a shop, or the number of successes of an experiment.
A r.v. that can assume any value contained in one or more intervals is called a continuous r.v. In most practical problems, continuous r.v.'s represent measured data, such as all possible heights, weights, temperatures, times, distances, pressures, etc.
Example 3.1
A public health nurse has a case load of 200 families. Let X be the number of
children in a randomly selected family. Construct the probability distribution of X.
Solution
We can construct the probability distribution of X as a table in which one column, x,
lists the possible values that X assumes, and another column, P(X=x), lists the
probability with which X assumes each value x, taken as the corresponding relative
frequency. Table 3.1 lists the probability distribution of the discrete r.v. X.
Alternatively, we can present this probability distribution in the form of a graph such
as Fig. 3.1 or Fig. 3.2. In these graphs the length of each vertical bar indicates the probability
for the corresponding value of x.
Fig. 3.1 Line bars of the probability distribution of the number
of children per family for a population of 200 families.
(Horizontal axis: number of children, 0 to 8; vertical axis: probability, with ticks at 10/200, 20/200, 30/200 and 40/200.)
The probability P(X=x) of a discrete r.v. is denoted by the function f(x), say, and possesses
the following two characteristics:
1. $f(x) \ge 0$, for all x
2. $\sum_{x} f(x) = 1$
The probability function f(x) for a discrete r.v. is called the “probability mass function
(pmf)” or simply the pf.
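The two characteristics above can be checked mechanically. The following is a minimal sketch, assuming the pmf is stored as a Python dictionary mapping each value x to f(x); the function name `is_valid_pmf` and the sample pmf are hypothetical and are not taken from Table 3.1.

```python
from fractions import Fraction

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every f(x) >= 0 and the values sum to 1 (within tol)."""
    values = list(pmf.values())
    non_negative = all(p >= 0 for p in values)
    sums_to_one = abs(sum(values) - 1) <= tol
    return non_negative and sums_to_one

# Hypothetical pmf used only for illustration (not Table 3.1).
example_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(is_valid_pmf(example_pmf))        # both characteristics hold
print(is_valid_pmf({0: 0.7, 1: 0.5}))   # fails: the values sum to 1.2
```

Exact fractions avoid rounding noise in the sum; the tolerance only matters when the probabilities are given as decimals.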
The expected value E(X) can be interpreted as the mean of the population from
which the sample is drawn. If we call $\bar{X}$ the sample mean, we can denote the population
mean by μ. So
$\mu = E(X) = \sum_{i=1}^{k} x_i f(x_i)$
where μ is the mean of X. Another formula for the variance, which can be derived from the
above formula, is
$\sigma^2 = E(X^2) - \mu^2 = \sum_{i=1}^{k} x_i^2 f(x_i) - \mu^2$
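As a sketch of that derivation, assuming the usual definition of the variance as the expected squared deviation from the mean, $\sigma^2 = E(X-\mu)^2$ (the quantity asked for in part (c) of Example 3.2), expanding the square and using $\sum f(x_i) = 1$ gives:

```latex
\begin{aligned}
\sigma^2 &= E\big[(X-\mu)^2\big] = \sum_{i=1}^{k} (x_i-\mu)^2 f(x_i) \\
         &= \sum_{i=1}^{k} x_i^2 f(x_i) \;-\; 2\mu \sum_{i=1}^{k} x_i f(x_i) \;+\; \mu^2 \sum_{i=1}^{k} f(x_i) \\
         &= E(X^2) - 2\mu\cdot\mu + \mu^2\cdot 1 \\
         &= E(X^2) - \mu^2 .
\end{aligned}
```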
The standard deviation of X, denoted by σ, is
$\text{S.D.} = \sigma = \sqrt{\operatorname{var}(X)}$
Example 3.2
Find (a) μ, (b) $E(X^2)$, (c) $E(X-\mu)^2$
for the following probability distribution
x      8     12    16    20    24
f(x)   1/8   1/6   3/8   1/4   1/12
Solution
(a) $\mu = E(X) = \sum_{i=1}^{k} x_i f(x_i) = 8(1/8) + 12(1/6) + 16(3/8) + 20(1/4) + 24(1/12) = 16$
(b) $E(X^2) = \sum_{i=1}^{k} x_i^2 f(x_i) = 8^2(1/8) + 12^2(1/6) + 16^2(3/8) + 20^2(1/4) + 24^2(1/12) = 276$
(c) $E(X-\mu)^2 = E(X^2) - \mu^2 = 276 - 16^2 = 20$
Thus μ = 16 and σ² = 20.
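The arithmetic of Example 3.2 can be reproduced in a few lines. This is a minimal sketch, assuming the distribution is stored as a dictionary of exact fractions; the helper names `mean`, `second_moment` and `variance` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import sqrt

def mean(pmf):
    """E(X): sum of x * f(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def second_moment(pmf):
    """E(X^2): sum of x^2 * f(x) over all values x."""
    return sum(x**2 * p for x, p in pmf.items())

def variance(pmf):
    """Shortcut formula: E(X^2) - mu^2."""
    return second_moment(pmf) - mean(pmf) ** 2

# Distribution from Example 3.2.
pmf_32 = {
    8: Fraction(1, 8),
    12: Fraction(1, 6),
    16: Fraction(3, 8),
    20: Fraction(1, 4),
    24: Fraction(1, 12),
}

mu = mean(pmf_32)
# Variance straight from the definition E(X - mu)^2, for comparison.
var_by_definition = sum((x - mu) ** 2 * p for x, p in pmf_32.items())

print(mu, second_moment(pmf_32), variance(pmf_32), var_by_definition)
print(sqrt(variance(pmf_32)))  # standard deviation
```

Both routes to the variance agree, which is a quick check on the shortcut formula.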
The probability function of the binomial distribution is
$f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$
where
n = total number of trials
p = probability of success
q = 1 - p = probability of failure
x = number of successes in n trials
This is a discrete probability distribution and it should be noted that
f(0) + f(1) + f(2) + ...+ f(n) = 1
In the binomial experiment the following conditions must be satisfied,
1) The experiment consists of n repeated trials.
2) Each trial results in an outcome that may be classified as a success or a failure.
3) The probability of a success, denoted by p, remains constant from trial to trial.
4) The repeated trials are independent.
5) The number x of successes observed during the n trials is recorded.
The mean and the variance of the binomial distribution are given by
μ = E(X) = np, and σ² = Var(X) = npq.
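The statements that the binomial probabilities sum to 1 and that the mean and variance are np and npq can be checked numerically. The sketch below assumes illustrative values n = 10 and p = 0.3, which do not come from the text; `binomial_pmf` and `binomial_summary` are hypothetical helper names.

```python
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_summary(n, p):
    """Return (sum of f(x), mean, variance) computed directly from the pmf."""
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    total = sum(probs)
    mean = sum(x * f for x, f in enumerate(probs))
    var = sum(x**2 * f for x, f in enumerate(probs)) - mean**2
    return total, mean, var

# Illustrative parameters (an assumption, not an example from the chapter).
total, mean, var = binomial_summary(n=10, p=0.3)
print(total)        # close to 1
print(mean, var)    # close to np = 3 and npq = 2.1
```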
Example 3.3
The probability that a patient recovers from a rare blood disease is 0.6. If 7 people
are known to have contracted this disease, what is the probability that:
(a) exactly 3 recover, (b) at least 5 recover, (c) at most 5 recover?
Solution:
Let X be the number of people that recover, then
P(one patient will recover) = p = 0.6
P(one patient will not recover) = q = 1 - p = 0.4
number of all patients = n = 7
(a) $P(\text{exactly 3 recover}) = P(X = 3) = \binom{7}{3}(0.6)^3(0.4)^4 = 0.194$
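All three parts of Example 3.3 can be evaluated with the same binomial formula. The following is a sketch, not the worked solution from the text: it treats "at least 5" as X = 5, 6 or 7 and "at most 5" as the complement of X = 6 or 7, and the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 7, 0.6  # values from Example 3.3

exactly_3 = binomial_pmf(3, n, p)
at_least_5 = sum(binomial_pmf(x, n, p) for x in range(5, n + 1))
at_most_5 = 1 - binomial_pmf(6, n, p) - binomial_pmf(7, n, p)

print(round(exactly_3, 3))   # part (a)
print(round(at_least_5, 3))  # part (b)
print(round(at_most_5, 3))   # part (c)
```

Using the complement in part (c) means only two terms are needed instead of six.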
Example 3.4
A manufacturer of metal spoons finds that, on average, 12% of his spoons are
rejected because they are either oversize or undersize. What is the probability that a
batch of 10 spoons will contain:
Solution
Let X be the number of rejected spoons (in this case, "success" means rejection); here
n = 10, p = 0.12 and q = 0.88.
(a) $P(\text{exactly 3 rejects}) = P(X = 3) = \binom{10}{3}(0.12)^3(0.88)^7 \approx 0.085$
Example 3.5
Suppose that, on average, 1% of people are left-handed. Find the
probability that (a) exactly 3, (b) at least 3 are left-handed among 100 people.
Solution
(a) The exact probability that 3 are left-handed, using the binomial distribution with p = 0.01
and n = 100, is
$f(3) = \binom{100}{3}(0.01)^3(0.99)^{97} = 0.061$
while the Poisson approximation with λ = np = 1.0 is
$f(3) = \dfrac{e^{-1}\,1^3}{3!} = 0.0613$
(b) Using the Poisson approximation,
$P(\text{at least 3 are left-handed}) = f(3) + f(4) + \cdots = 1 - f(0) - f(1) - f(2) = 1 - F(2)$
$= 1 - \dfrac{e^{-1}\,1^0}{0!} - \dfrac{e^{-1}\,1^1}{1!} - \dfrac{e^{-1}\,1^2}{2!} = 1 - 2.5e^{-1} \approx 0.08$
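The comparison in Example 3.5 between the exact binomial value and the Poisson approximation can be reproduced as follows. This is a minimal sketch using the standard Poisson probability function $f(x) = e^{-\lambda}\lambda^x/x!$ with λ = np; the helper names are illustrative.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 100, 0.01     # values from Example 3.5
lam = n * p          # Poisson parameter, lambda = np

exact_3 = binomial_pmf(3, n, p)
approx_3 = poisson_pmf(3, lam)

# P(X >= 3) = 1 - f(0) - f(1) - f(2) under each model.
exact_at_least_3 = 1 - sum(binomial_pmf(x, n, p) for x in range(3))
approx_at_least_3 = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(exact_3, 4), round(approx_3, 4))
print(round(exact_at_least_3, 4), round(approx_at_least_3, 4))
```

With n large and p small the two models agree to about two decimal places here, which is why the approximation is convenient for hand calculation.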
(2) $\int_{-\infty}^{\infty} f(x)\,dx = 1.$
(3) The area under the curve between the lines X = a and X = b (shaded in Fig. 3.1) gives
the probability that X lies between a and b, i.e.
$P(a \le X \le b) = \int_a^b f(x)\,dx$ = area under the curve from a to b
It should be noted that the probability that a continuous r.v. X assumes a single value is
always zero, i.e.
P(X = c) = 0 for any real constant c
From this we can deduce that
$P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = P(a < X < b)$
Figure 3.1 .
The expected value of X is
$E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$
and
$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$
As for discrete r.v.'s, the mean and the variance of X are given by
μ = E(X) and σ² = E(X²) – μ²
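For a continuous r.v. these integrals can be approximated numerically. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (zero elsewhere), chosen only because its exact answers are easy to verify by hand: E(X) = 2/3, E(X²) = 1/2 and σ² = 1/18. The midpoint rule used here is one simple choice of integration method.

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def f(x):
    """Hypothetical density for illustration: f(x) = 2x on [0, 1], else 0."""
    return 2 * x if 0 <= x <= 1 else 0.0

area = integrate(f, 0, 1)                        # total area under f
mu = integrate(lambda x: x * f(x), 0, 1)         # E(X)
ex2 = integrate(lambda x: x * x * f(x), 0, 1)    # E(X^2)
var = ex2 - mu**2                                # sigma^2 = E(X^2) - mu^2
prob = integrate(f, 0.5, 1)                      # P(0.5 <= X <= 1)

print(area, mu, var, prob)
```

Because f vanishes outside [0, 1], integrating over that interval is equivalent to integrating over the whole real line.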
(3) The mode, which is the point on the horizontal axis where the curve is a maximum,
occurs at x = μ (i.e. mode = mean μ).
(4) The median = mean μ.
(5) The two tails of the curve extend indefinitely.
A normal distribution is completely specified by two parameters: the mean μ
and the variance σ². Thus, for any given standard deviation σ, there are infinitely many
possible normal curves, one for each value of μ. Fig. 3.3 shows normal curves for σ = 1 and μ
= 0, 1, 2.
The normal distribution with μ = 0 and σ = 1 is called the standard normal distribution. Fig. 3.5
displays the standard normal distribution curve.
Z Values or Z Scores
A random variable X is said to have been standardized when it has been adjusted so
that its mean is 0 and its standard deviation is 1 [i.e. N(0,1)]. Standardization can be
effected by subtracting μ from X and dividing the resulting difference by σ; the standardized
variable is $Z = (X - \mu)/\sigma$.
Figure 3.6
Most standard normal tables give the values of Φ(z), shown in Fig. 3.6, for positive
values of z. If z is negative we can use rule (iv) to determine Φ(z).
Example 3.6
i- $P(Z \le 1.2) = \Phi(1.2) = 0.8849$
ii- $P(-0.5 \le Z \le 0.9) = \Phi(0.9) - \Phi(-0.5)$
$= \Phi(0.9) - (1 - \Phi(0.5))$
$= \Phi(0.9) - 1 + \Phi(0.5)$
$= 0.8159 - 1 + 0.6915$
$= 0.5074$
iii- $P(Z \ge -1.2) = 1 - \Phi(-1.2) = 1 - (1 - \Phi(1.2)) = \Phi(1.2) = 0.8849$
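When a table is not at hand, Φ(z) can be computed from the error function via the identity Φ(z) = ½[1 + erf(z/√2)]. The sketch below uses that identity (not the table lookup the text relies on) to reproduce the three parts of Example 3.6; the function name `phi` is an illustrative choice.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf: Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p_i = phi(1.2)                  # P(Z <= 1.2)
p_ii = phi(0.9) - phi(-0.5)     # P(-0.5 <= Z <= 0.9)
p_iii = 1 - phi(-1.2)           # P(Z >= -1.2)

print(round(p_i, 4), round(p_ii, 4), round(p_iii, 4))

# Symmetry used in the example: Phi(-z) = 1 - Phi(z).
print(abs(phi(-0.5) - (1 - phi(0.5))) < 1e-12)
```

Because the function accepts negative arguments directly, the symmetry step is only needed when working from a table of positive z values.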
Now, if X ~ N(μ, σ²), then
$P(a \le X \le b) = \Phi\left(\dfrac{b-\mu}{\sigma}\right) - \Phi\left(\dfrac{a-\mu}{\sigma}\right)$
Example 3.7
If X has a normal distribution with mean 400 and standard deviation 50, find
$P(360 \le X \le 469)$.
Solution:
Here, we have μ = 400, σ = 50, hence
$P(360 \le X \le 469) = P\left(\dfrac{360-400}{50} \le \dfrac{X-400}{50} \le \dfrac{469-400}{50}\right) = P(-0.8 \le Z \le 1.38)$
$= \Phi(1.38) - \Phi(-0.8) = \Phi(1.38) - 1 + \Phi(0.8) = 0.9162 - 1 + 0.7881 = 0.7043$.
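The standardization step of Example 3.7 can be wrapped in a small function. This is a sketch that computes Φ from the error function rather than from a table, so the last digit may differ slightly from the table-based value 0.7043; `normal_prob` is a hypothetical helper name.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), by standardizing both limits."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return phi(z_b) - phi(z_a)

# Example 3.7: mu = 400, sigma = 50.
p = normal_prob(360, 469, mu=400, sigma=50)
print(round(p, 3))
```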
Example 3.8
The heights of 1000 students in a certain college are normally distributed with a
mean of 68 inches and a standard deviation of 3 inches. How many of these students would you
expect to have heights:
i- less than 64 inches, ii- between 67 and 71 inches.
Solution
Let X denote the height of a student, then
X ~ N(68, 9)
i- $P(X < 64) = P\left(Z < \dfrac{64-68}{3}\right) = P(Z < -1.33) = 1 - \Phi(1.33) = 1 - 0.908 = 0.092$
Hence, the number of students having heights less than 64 inches is
1000 (0.092) = 92 students.
ii- $P(67 < X < 71) = P\left(\dfrac{67-68}{3} < Z < \dfrac{71-68}{3}\right) = P(-0.33 < Z < 1)$
$= \Phi(1) - \Phi(-0.33) = \Phi(1) - 1 + \Phi(0.33) = 0.841 - 1 + 0.629$
$= 0.470$
Hence the number of students having heights between 67 and 71 inches is
1000 (0.470) = 470 students.
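Expected counts such as those in Example 3.8 are simply the population size times the probability. The sketch below does not round z to two decimals as the table method does, so its counts can differ from 92 and 470 by a student or two; `expected_count` is a hypothetical helper name.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def expected_count(total, lower, upper, mu, sigma):
    """Expected number out of `total` with lower < X < upper, X ~ N(mu, sigma^2).

    Pass float('-inf') or float('inf') for an open-ended interval.
    """
    p_upper = phi((upper - mu) / sigma)
    p_lower = phi((lower - mu) / sigma)
    return total * (p_upper - p_lower)

# Example 3.8: 1000 students, mu = 68 inches, sigma = 3 inches.
count_lt_64 = expected_count(1000, float("-inf"), 64, mu=68, sigma=3)
count_between = expected_count(1000, 67, 71, mu=68, sigma=3)

print(round(count_lt_64), round(count_between))
```

The small gap between these counts and the table-based ones comes only from rounding z = -4/3 and z = -1/3 to -1.33 and -0.33.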
EXERCISES
[1] Identify the given random variable as being discrete or continuous.
i. The height of a player on a basketball team
ii. The number of cups of coffee sold in a cafeteria during lunch
iii. The cost of a randomly selected orange
iv. The number of oil spills occurring off the Alaskan coast
v. The pH level in a shampoo
vi. The number of phone calls between Cairo and Alexandria on the first day of Ramadan
vii. The number of students in the required course, English 101
viii. The braking time of a car
[2] Determine whether the following is a probability distribution. If not, identify the requirement
that is not satisfied.
a- b- c- d-
e- A police department reports that the probabilities that 0, 1, 2, 3, and 4 car thefts will be
reported in a given day are 0.150, 0.284, 0.270, 0.171, and 0.081, respectively.
[3] For each of the following, determine whether the given function can serve as the probability
mass function of a r.v. with the given range.
a- f(x) = (x-2)/5 , for x = 1, 2, 3, 4, 5;
b- f(x) = x^2/30 , for x = 0, 1, 2, 3, 4;
c- f(x) = 1/5 , for x = 0, 1, 2, 3, 4, 5.
[4] For each of the following, determine the constant c so that the function can serve as the
probability mass function of a r.v. with the given range.
a- f(x) = c , for x = 1, 2, 3, 4, 5;
b- f(x) = c2 , for x = 1, 2, 3,..., n;
c- f(x) = c(1/4)^x , for x = 1, 2, 3, ...
[5] Find the mean and the standard deviation of the following:
i. In a pizza takeout restaurant, the following probability distribution was obtained for the
number of toppings ordered on a large pizza.
x 0 1 2 3 4
P(x) 0.3 0.4 0.2 0.06 0.04
ii. The probabilities that a batch of 4 computers will contain 0, 1, 2, 3, and 4 defective
computers are 0.6274, 0.3102, 0.0575, 0.0047, and 0.0001, respectively. Round
answer to the nearest hundredth.
[6] If X is a discrete r.v. with the following p.m.f.
x -1 0 1
f(x) β 0.3 0.2 0.1
[7] Let X be a r.v. having the binomial distribution with parameters n, p such that E[X]=10 and
var(X) = 6. Find n and p.
[8] 10% of American adults are left-handed. For a statistics class of 35 students, find the mean
and standard deviation for the number of left-handed students.
[9] The incidence of occupational disease in an industry is such that the workers have a 30%
chance of suffering from it. What is the probability that out of 5 workers, at least 2 will catch
the disease?
[10] It is known that 40% of mice inoculated with a serum are protected from a certain disease. If
5 mice are inoculated, what is the probability that at most 3 of the mice contract the disease?
[11] The probability that a patient recovers from a rare blood disease is 0.6. If 7 people are
known to have contracted this disease, what is the probability that:
(a) exactly 3 recover, (b) at least 5 recover,
(c) at most 5 survive.
[12] A test consists of 10 true/false questions. To pass the test a student must answer at least 7
questions correctly. If a student guesses on each question, what is the probability that the
student will pass the test?
[13] Find the probability of at least 2 girls in 7 births. Assume that male and female births are
equally likely and that the births are independent events.
[14] The probability that a radish seed will germinate is 0.7. A gardener plants seeds in batches of
15. Find the mean and the standard deviation for the number of seeds germinating in each
batch.
[15] A company manufactures batteries in batches of 21 and there is a 3% rate of defects. Find
the mean and the standard deviation for the number of defects per batch.
[16] Suppose that 1% of all transistors produced by a certain company are defective. A new
model of computer requires 100 of these transistors, and 100 are selected at random from
the company's assembly line. Find the probability of obtaining 3 defectives.
[17] Suppose 2% of the people on the average are left handed. Find the probability that at least
four are left handed among 200 people.
[18] Given that X has the normal distribution with mean 50 and variance 100, find
i- P( X 60) ii- P(45 ≤ X ≤ 65.3)
[19] Given that X is normally distributed with mean 18 and standard deviation 2.5, find
a- The value of k such that P(X k) = 0.2236
b- P(X>15)
c- P(17<X<210)
[20] Suppose the average length of stay in a chronic disease hospital of a certain type of patient
is 60 days with a standard deviation of 15. If it is reasonable to assume an approximately
normal distribution of lengths of stay, find the probability that a randomly selected patient
from this group will have a length of stay:
a- Greater than 50 days. b- Less than 30 days.
c- Between 30 and 60 days . d- Greater than 30 days.
[21] A study showed that the life of army shoes is normally distributed with μ = 15 months and
σ = 1.5 months. If 30,000 pairs are issued, how many pairs would be replaced after 18
months?
[22] The weekly bonuses paid to 100 lecturers working for professional entrance examinations
are normally distributed with mean LE. (700) and variance LE. (2500).
a- Estimate the number of lecturers whose bonus will be more than LE. (750)
b- To what value must the mean be altered to increase the probability in (a) to 70%
(assuming the standard deviation is unaltered)?
[23] Circle the correct answer from each of the following multiple-choice questions:
2. If X has the binomial distribution with μ=2.4 and σ=1.2, then P(X=2) is:
a. 0.112 b. 0.138 c. 0.311 d. None of the above
3. Given the normally distributed r.v. X with σ =2 and P(X > 15) = 0.9938, then μ equals
a. 20.0 b. 5.0 c. -10.0 d. Otherwise
7. Given the normally distributed r.v. X with μ=25 and P(X > 10) = 0.9332, then σ equals
8. Given the normally distributed r.v. X with σ =2 and P( X < 15) = 0.9938, then μ equals
a. -20.0 b. 5.0 c. 10.0 d. None of the above
[Figure: graph plotted against x, with the horizontal axis marked at 0, 1, 2]
P( 0.5 ≤ X ≤ 1) has value;
a. 0.375 b. 0.5 c. 0.75 d. 0.875
10. The following table gives the distribution of a r.v. X ,where and β are constants,
x -2 0 2
P(X=x) β 0.3 0.2 0.1
11. The probability density function of a r.v. X is given in the figure below. P(1 ≤ X ≤ 2) has
value;
a. 0.8 b. 0.75
c. 0.5 d. 0.25
[Figure: graph of the density f(x) against x, with the horizontal axis marked at 0, 1, 2]
12. If X has the Poisson distribution with P(X=1) = 2P(X=0), then P( X > 1) is
approximately:
a. 0.6 b. 0.7 c. 0.9 d. Otherwise.