Chapter 3

Chapter 3 discusses random variables (r.v.) and their probability distributions, distinguishing between discrete and continuous r.v.s. It covers the probability distribution of discrete r.v.s, including the concept of probability mass function (pmf), and introduces mathematical expectation and variance. The chapter also explores binomial and Poisson distributions, providing examples and formulas for calculating probabilities and expected values.

Uploaded by

omar2463ll

Chapter 3

RANDOM VARIABLES
and
PROBABILITY DISTRIBUTIONS
A random variable (r.v.) is a variable whose value is determined by the outcome of
a random experiment.

So each value of a random variable is a real number determined by the outcome of an
experiment that varies from trial to trial. Random variables are usually denoted by capital
letters such as X, Y, Z, ... . A r.v. is classified as either discrete or continuous.
Discrete r.v.
A r.v. whose values are countable is called a discrete r.v. In most practical problems,
discrete r.v.'s represent count data, such as the number of defectives in a sample of N
items, the number of customers in a shop, or the number of successes of an experiment.

Continuous r.v.
A r.v. that can assume any value contained in one or more intervals is called a
continuous r.v. In most practical problems, continuous r.v.'s represent measured data,
such as heights, weights, temperatures, times, distances, pressures, etc.

3.1 Probability Distribution of A Discrete R.V.

The probability distribution of a discrete r.v. is a table, graph, formula, or other
device used to specify all possible values of a discrete r.v. along with their respective
probabilities.

Example 3.1
A public health nurse has a case load of 200 families. Let X be the number of
children for a randomly selected family. Construct the probability distribution of X.
Solution
We can construct the probability distribution of X by a table in which we list in one
column, x, the possible values that X assumes, and in another column, P(X=x), the
probability with which X assumes the value x, taken here to be the corresponding relative
frequency. Table 3.1 lists the probability distribution of the discrete r.v. X.
Alternatively, we can present this probability distribution in the form of a graph such
as Fig. 3.1 or Fig. 3.2. In these graphs the length of each vertical bar indicates the probability
for the corresponding value of x.

x      Frequency   Relative frequency, P(X=x) = f(x)
0      3           3/200 = 0.015
1      15          15/200 = 0.075
2      28          28/200 = 0.140
3      25          25/200 = 0.125
4      35          35/200 = 0.175
5      44          44/200 = 0.220
6      35          35/200 = 0.175
7      10          10/200 = 0.050
8      5           5/200 = 0.025
Sum    200         1
Table 3.1 Frequency and Relative Frequency
(or probability) Distributions for Example 3.1
Fig. 3.1 Vertical-bar graph of the probability distribution of the number
of children per family for a population of 200 families.

The probability P(X=x) of a discrete r.v. is denoted by the function f(x), say, and possesses
the following two characteristics:
1. $f(x) \ge 0$ for all x
2. $\sum_{x} f(x) = 1$

The probability function f(x) for a discrete r.v. is called the “probability mass function
(pmf)” or simply the pf.
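
The two pmf conditions can be checked mechanically. The following is a minimal Python sketch that does so for the frequencies of Table 3.1; the helper name `is_valid_pmf` and the tolerance `tol` are illustrative choices, not part of the chapter.

```python
# Frequencies from Table 3.1: number of children x -> number of families
freq = {0: 3, 1: 15, 2: 28, 3: 25, 4: 35, 5: 44, 6: 35, 7: 10, 8: 5}
total = sum(freq.values())  # 200 families in the case load

# The relative frequencies play the role of the pmf f(x)
pmf = {x: count / total for x, count in freq.items()}


def is_valid_pmf(f, tol=1e-9):
    """Return True if every f(x) >= 0 and the values sum to 1 (within tol)."""
    non_negative = all(p >= 0 for p in f.values())
    sums_to_one = abs(sum(f.values()) - 1.0) <= tol
    return non_negative and sums_to_one


print(is_valid_pmf(pmf))                 # both conditions hold for Table 3.1
print(is_valid_pmf({0: 0.5, 1: 0.6}))    # fails: the values sum to 1.1
```

The tolerance is there only because the relative frequencies are stored as floating-point numbers.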

3.2 Mathematical Expectation


If X denotes a discrete random variable which can assume values x1, x2,..., xk with
respective probabilities p1, p2,..., pk where p1 + p2 +...+ pk = 1, the mathematical expectation
of X or simply the expected (or mean) value of X, denoted by E(X), is defined as
$$E(X) = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum_{i=1}^{k} x_i p_i$$

Or, in terms of the pmf,

$$E(X) = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_k f(x_k) = \sum_{i=1}^{k} x_i f(x_i)$$

E(X) can be interpreted as the mean of the population from which the sample is drawn.
If $\bar{X}$ denotes the sample mean, we denote the population mean by μ. So
$$\mu = E(X) = \sum_{i=1}^{k} x_i f(x_i)$$

Also the variance of X (which is the population variance), denoted by σ², is defined by

$$\sigma^2 = E(X - \mu)^2 = \sum_{i=1}^{k} (x_i - \mu)^2 f(x_i)$$

where μ is the mean of X. Another formula for the variance, which can be derived from the
above formula, is
$$\sigma^2 = E(X^2) - \mu^2 = \sum_{i=1}^{k} x_i^2 f(x_i) - \mu^2$$

The standard deviation of X, denoted by σ, is
$$\text{S.D.} = \sigma = \sqrt{\operatorname{var}(X)}$$
Example 3.2
Find (a) μ, (b) E(X²), (c) E(X - μ)²
for the following probability distribution

x 8 12 16 20 24
f(x) 1/8 1/6 3/8 1/4 1/12
Solution
(a) $\mu = E(X) = \sum_{i=1}^{k} x_i f(x_i) = 8(1/8) + 12(1/6) + 16(3/8) + 20(1/4) + 24(1/12) = 16$

(b) $E(X^2) = \sum_{i=1}^{k} x_i^2 f(x_i) = (8)^2(1/8) + (12)^2(1/6) + (16)^2(3/8) + (20)^2(1/4) + (24)^2(1/12) = 276$

(c) $E(X - \mu)^2 = \sigma^2 = E(X^2) - \mu^2 = 276 - (16)^2 = 20$

Thus μ = 16 and σ² = 20.
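
As a cross-check of Example 3.2, the sketch below recomputes the mean and the variance with exact fractions, using both the defining formula and the shortcut formula for σ². The function name `expectation` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction as F

# Distribution from Example 3.2
xs = [8, 12, 16, 20, 24]
fs = [F(1, 8), F(1, 6), F(3, 8), F(1, 4), F(1, 12)]


def expectation(values, probs, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) * f(x_i) over all possible values x_i."""
    return sum(g(x) * p for x, p in zip(values, probs))


mu = expectation(xs, fs)                                 # mean, E(X)
ex2 = expectation(xs, fs, lambda x: x ** 2)              # E(X^2)
var_def = expectation(xs, fs, lambda x: (x - mu) ** 2)   # E(X - mu)^2
var_short = ex2 - mu ** 2                                # E(X^2) - mu^2

print(mu, ex2, var_def, var_short)
```

Exact fractions avoid rounding noise, so the two variance formulas agree exactly rather than only approximately.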

3.3 The Binomial Probability Distribution


If p is the probability that an event will happen in any single trial (called the
probability of a success) and q = 1-p is the probability that this event will not happen in any
single trial (called the probability of a failure) then the probability that the event will happen
exactly x times in n trials (i.e. x successes and n-x failures will occur) is given by

$$f(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
where
n = total number of trials
p = probability of success
q = 1 - p = probability of failure
x = number of successes in n trials
This is a discrete probability distribution and it should be noted that
f(0) + f(1) + f(2) + ... + f(n) = 1
In the binomial experiment the following conditions must be satisfied,
1) The experiment consists of n repeated trials.
2) Each trial results in an outcome that may be classified as a success or a failure.
3) The probability of a success, denoted by p, remains constant from trial to trial.
4) The repeated trials are independent.
5) The number x of successes observed during the n trials is recorded.
The mean and the variance of the binomial distribution are given by
μ = E(X) = np and σ² = Var(X) = npq.
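
The binomial formula and its moments can be verified numerically. The following sketch assumes illustrative values n = 7 and p = 0.6; the function name `binomial_pmf` is an assumption of this example. It evaluates f(x) for every x and compares the resulting mean and variance with np and npq.

```python
from math import comb


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 7, 0.6  # illustrative parameters
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                           # f(0) + ... + f(n)
mean = sum(x * f for x, f in enumerate(probs))               # compare with n*p
var = sum((x - mean) ** 2 * f for x, f in enumerate(probs))  # compare with n*p*q

print(total, mean, var)
print(n * p, n * p * (1 - p))
```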

Example 3.3
The probability that a patient recovers from a rare blood disease is 0.6. If 7 people
are known to have contracted this disease what is the probability that:
(a) exactly 3 recover, (b) at least 5 recover, (c) at most 5 recover
Solution:
Let X be the number of patients who recover, then
P(one patient will recover) = p = 0.6
P(one patient will not recover) = q = 1 - p = 0.4
number of all patients = n = 7
(a) $P(\text{exactly 3 recover}) = P(X = 3) = \binom{7}{3} (0.6)^3 (0.4)^4 = 0.194$

(b) P(at least 5 recover) = P(X = 5) + P(X = 6) + P(X = 7)
$$= \binom{7}{5} (0.6)^5 (0.4)^2 + \binom{7}{6} (0.6)^6 (0.4)^1 + \binom{7}{7} (0.6)^7 (0.4)^0$$
= 0.261 + 0.131 + 0.028 = 0.42
(Note that P(at least 5 recover) = P(X ≥ 5) = 1 - P(X ≤ 4) = 1 - F(4) = 1 - 0.58 = 0.42, where F(x) = P(X ≤ x) is the cumulative distribution function.)

(c) P(at most 5 recover) = P(X = 0) + P(X = 1) + ... + P(X = 5) = F(5)
= 1 - P(X = 6) - P(X = 7) = 1 - 0.131 - 0.028 = 0.841.
(Note that P(at most 5 recover) = P(X ≤ 5) = F(5).)
Example 3.4

A manufacturer of metal spoons finds that on the average, 12% of his spoons are
rejected because they are either oversize or undersize. What is the probability that a
batch of 10 spoons will contain:

(a) Exactly 3 rejects? (b) No more than 2 rejects?

Solution

Let X be the number of rejected spoons (In this case, "success" means rejection), here

P(the spoon is rejected) = p = 0.12,


P(the spoon is not rejected) = q = 0.88,

number of all spoons = n = 10.

(a) $P(\text{exactly 3 rejects}) = P(X = 3) = \binom{10}{3} (0.12)^3 (0.88)^7 = 0.085$

(b) P(no more than 2 rejects) = P(X ≤ 2)
= P(X = 0) + P(X = 1) + P(X = 2)
$$= \binom{10}{0} (0.12)^0 (0.88)^{10} + \binom{10}{1} (0.12)^1 (0.88)^9 + \binom{10}{2} (0.12)^2 (0.88)^8$$
= 0.2785 + 0.37977 + 0.23304 = 0.89131
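
The cumulative probabilities in Examples 3.3 and 3.4 can be reproduced with a short Python sketch. The names `binom_pmf` and `binom_cdf` are hypothetical helpers written for this illustration, with `binom_cdf(x, n, p)` standing for F(x) = P(X ≤ x).

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for a binomial r.v. with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def binom_cdf(x, n, p):
    """F(x) = P(X <= x), obtained by summing the pmf from 0 to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


# Example 3.3: n = 7 patients, p = 0.6
at_least_5 = 1 - binom_cdf(4, 7, 0.6)   # P(X >= 5) = 1 - F(4)
at_most_5 = binom_cdf(5, 7, 0.6)        # P(X <= 5) = F(5)

# Example 3.4: n = 10 spoons, p = 0.12
exactly_3_rejects = binom_pmf(3, 10, 0.12)
at_most_2_rejects = binom_cdf(2, 10, 0.12)

print(round(at_least_5, 3), round(at_most_5, 3))
print(round(exactly_3_rejects, 3), round(at_most_2_rejects, 5))
```

Using the complement 1 - F(4) for "at least 5" mirrors the note in Example 3.3 and avoids summing the upper tail term by term.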

3.4 The Poisson Distribution


The r.v. X is said to have the Poisson distribution with parameter λ (λ > 0) if and only
if its p.m.f. is given by
$$f(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$
This distribution is named after the French mathematician Siméon Denis Poisson (1781-1840).
For the Poisson distribution the mean and the variance are given by
μ = E(X) = λ, σ² = Var(X) = λ.
The Poisson distribution with λ = np is a close approximation to the binomial
distribution provided that p is small and n is large enough. In general, the Poisson
distribution will provide a good approximation to binomial probabilities when n ≥ 20 and
p ≤ 0.05. When n ≥ 100 and np < 10, the approximation will generally be excellent.

Example 3.5
Suppose that, on average, 1% of people are left-handed. Find the probability that,
among 100 people, (a) exactly 3, (b) at least 3 are left-handed.
Solution
(a) The exact probability that 3 are left-handed, using the binomial distribution with
p = 0.01 and n = 100, is
$$f(3) = \binom{100}{3} (0.01)^3 (0.99)^{97} = 0.061$$
while the Poisson approximation, with λ = np = 1.0, is
$$f(3) = \frac{e^{-1} 1^{3}}{3!} = 0.0613$$
(b) P(at least 3 are left-handed) = f(3) + f(4) + ... = 1 - f(0) - f(1) - f(2) = 1 - F(2)
$$= 1 - \frac{e^{-1} 1^{0}}{0!} - \frac{e^{-1} 1^{1}}{1!} - \frac{e^{-1} 1^{2}}{2!} = 1 - 2.5e^{-1} \approx 0.08$$
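
The quality of the Poisson approximation in Example 3.5 can be seen by computing both pmfs side by side. This is a minimal sketch; `poisson_pmf` and `binomial_pmf` are helper names chosen for the illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """f(x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam ** x / factorial(x)


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)"""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 100, 0.01
lam = n * p  # Poisson parameter used for the approximation

exact_3 = binomial_pmf(3, n, p)      # exact binomial probability of 3
approx_3 = poisson_pmf(3, lam)       # Poisson approximation of the same event
at_least_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))  # 1 - F(2)

print(round(exact_3, 4), round(approx_3, 4), round(at_least_3, 4))
```

Here n = 100 and np = 1, so the example sits in the range where the text describes the approximation as generally excellent.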

3.5 Probability Distribution of a Continuous R.V.


The above ideas can be extended to the case where the variable X may assume a
continuous set of values. The relative frequency polygon of a sample becomes, in the
theoretical or limiting case of a population, a continuous curve such as shown in Fig. 1.3,
whose equation is
y = f(x)
This function is called the probability density function (abbreviated pdf). The pdf is a
formula used to represent the distribution of a continuous r.v. and is characterized by:
(1) The whole curve lies on or above the x-axis, i.e.
$$f(x) \ge 0 \text{ for all } x$$
(2) The total area under the curve bounded by the x-axis is equal to one, i.e.
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
(3) The area under the curve between the lines X = a and X = b (shaded in Fig. 3.1) gives
the probability that X lies between a and b, i.e.
$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx = \text{Area under the curve from } a \text{ to } b$$
It should be noted that the probability that a continuous r.v. X assumes a single value is
always zero, i.e.
P(X = c) = 0 for any real constant c
From this we can deduce that
P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)

Figure 3.1 .
The expected value of X is
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$
and
$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$
As for discrete r.v.'s, the mean and the variance of X are given by
μ = E(X) and σ² = E(X²) - μ²
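
To make the integral definitions concrete, the sketch below applies them to a hypothetical pdf, f(x) = 2x on [0, 1] and 0 elsewhere. This density is not taken from the chapter and is assumed purely for illustration. The integrals are approximated with a simple midpoint rule; the function name `integrate` and the number of subintervals are likewise assumptions.

```python
def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def f(x):
    """Hypothetical pdf (an assumption): f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


area = integrate(f, 0, 1)                        # total area, should be 1
mean = integrate(lambda x: x * f(x), 0, 1)       # E(X), exact value 2/3
ex2 = integrate(lambda x: x ** 2 * f(x), 0, 1)   # E(X^2), exact value 1/2
var = ex2 - mean ** 2                            # sigma^2, exact value 1/18
prob = integrate(f, 0.25, 0.5)                   # P(0.25 <= X <= 0.5) = 0.1875

print(area, mean, var, prob)
```

Since f vanishes outside [0, 1], integrating over [0, 1] is equivalent to integrating over the whole real line.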

3.6 The Normal Distribution


The most important and most widely used probability distribution in the entire field
of statistics is the normal distribution. A large number of phenomena in the real world are
normally distributed, either exactly or approximately. Its graph, called the normal curve, is
the bell-shaped curve of Fig. 3.2, which describes so many sets of data that occur in nature,
industry, and research.

Figure 3.2 The Normal Distribution Curve


A normal distribution is completely specified by two parameters: the mean μ and
the variance σ². If a r.v. X has a normal distribution with mean μ and variance σ², we
write in symbols
$$X \sim N(\mu, \sigma^2)$$
and the pdf of X is given by
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}}$$
where π = 3.1416 and e = 2.7183 is the base of natural logarithms.
From an inspection of Fig. 3.2, we list the following properties of the normal curve.
(1) The total area under the curve and above the horizontal axis is equal to 1.
(2) The curve is symmetric about the mean μ.
(3) The mode, which is the point on the horizontal axis where the curve is a maximum,
occurs at x = μ (i.e. mode = mean μ).
(4) The median = mean μ.
(5) The two tails of the curve extend indefinitely.
A normal distribution is completely specified by the two parameters, the mean μ
and the variance σ². Thus, for any given standard deviation σ, there are an infinite number
of normal curves possible, depending on μ. Fig. 3.3 shows normal curves for σ = 1 and
μ = 0, 1, 2.

Figure 3.3 Normal distributions with σ = 1, varying in location with different means.

Figure 3.4 Normal distributions with μ = 0, varying in scale with different standard deviations.
Standard Normal distribution.
The standard normal distribution is a special case of the normal distribution. The
normal distribution with μ = 0 and σ = 1 is called the standard normal distribution. Fig. 3.5
displays the standard normal distribution curve.

Z Values or Z Scores
A random variable X is said to have been standardized when it has been adjusted so
that its mean is 0 and its standard deviation is 1 [i.e. N(0,1)]. Standardization can be
effected by subtracting μ from X and dividing the resulting difference by σ, the standard
deviation of X. Thus the variable
$$\frac{X - \mu}{\sigma}$$
is a standardized variable and is known as the standard normal variable, and is usually
denoted by Z.

Figure 3.5 The standard normal distribution curve.

One of the most important theorems of statistics states that:
If the r.v. X has the normal distribution with mean μ and variance σ², the
standardized variable
$$Z = \frac{X - \mu}{\sigma}$$
has the normal distribution with mean 0 and variance 1. In symbols:
$$\text{If } X \sim N(\mu, \sigma^2) \text{ then } Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

Areas under the standard normal curve are frequently needed and are therefore widely
tabulated. Let us denote by Φ(z) the area underneath the standard normal curve from -∞ to
z, where z is any number, positive, negative, or zero, i.e. we have for Z ~ N(0,1):
$$\Phi(z) = P(Z \le z)$$

Figure 3.6

Clearly, Φ(z) has the following properties:

i) Φ(-∞) = 0, Φ(∞) = 1 and Φ(0) = ½.
ii) P(Z > c) = 1 - Φ(c)
iii) P(a ≤ Z ≤ b) = Φ(b) - Φ(a)
iv) Φ(-c) = 1 - Φ(c)

Most standard normal tables give the values of Φ(z), shown in Fig. 3.6, for positive
values of z. If z is negative we can use rule (iv) to determine Φ(z).
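
When a printed table is not at hand, Φ(z) can be evaluated from the error function through the standard identity Φ(z) = ½[1 + erf(z/√2)]. The sketch below uses `math.erf` from the Python standard library; the function name `Phi` is chosen for this illustration.

```python
from math import erf, sqrt


def Phi(z):
    """Standard normal cdf P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


c = 1.2
print(round(Phi(0.0), 4))         # property (i): Phi(0) = 1/2
print(round(1 - Phi(c), 4))       # property (ii): P(Z > c)
print(round(Phi(-c), 4))          # property (iv): equals 1 - Phi(c)
print(round(Phi(c), 4))           # table value for z = 1.2
```

Unlike a table restricted to positive z, this function accepts negative arguments directly, so rule (iv) serves here only as a consistency check.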

Example 3.6
i- P(Z ≤ 1.2) = Φ(1.2) = 0.8849
ii- P(-0.5 ≤ Z ≤ 0.9) = Φ(0.9) - Φ(-0.5)
= Φ(0.9) - (1 - Φ(0.5))
= Φ(0.9) - 1 + Φ(0.5)
= 0.8159 - 1 + 0.6915
= 0.5074
iii- P(Z ≥ -1.2) = 1 - Φ(-1.2) = 1 - (1 - Φ(1.2)) = Φ(1.2) = 0.8849
Now, if X ~ N(μ, σ²), then
$$P(a \le X \le b) = \Phi\left( \frac{b - \mu}{\sigma} \right) - \Phi\left( \frac{a - \mu}{\sigma} \right)$$
Example 3.7
If X has a normal distribution with mean 400 and standard deviation 50, find
P(360 ≤ X ≤ 469).
Solution:
Here, we have μ = 400, σ = 50, hence
$$P(360 \le X \le 469) = P\left( \frac{360 - 400}{50} \le \frac{X - \mu}{\sigma} \le \frac{469 - 400}{50} \right) = P(-0.8 \le Z \le 1.38)$$
= Φ(1.38) - Φ(-0.8) = Φ(1.38) - 1 + Φ(0.8) = 0.9162 - 1 + 0.7881 = 0.7043.
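
The standardization step of Example 3.7 can be wrapped in a small function. This is a sketch under the assumption that Φ is computed from `math.erf`; the names `Phi` and `normal_prob` are hypothetical helpers, not notation from the chapter.

```python
from math import erf, sqrt


def Phi(z):
    """Standard normal cdf P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    return Phi(z_high) - Phi(z_low)


# Example 3.7: mu = 400, sigma = 50, interval [360, 469]
p = normal_prob(360, 469, 400, 50)
print(round(p, 4))
```

Results computed this way can differ in the third or fourth decimal place from table-based answers whenever the z values are rounded to two decimals before the table lookup.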

Example 3.8
The heights of 1000 students in a certain college are normally distributed with a
mean 68 inches and standard deviation of 3 inches. How many of these students would you
expect to have heights:
i- less than 64 inches, ii- between 67 and 71 inches.
Solution
Let X denote the height of a student, then
X ~ N(68, 9)
i- $P(X < 64) = P\left( Z < \frac{64 - 68}{3} \right) = P(Z < -1.33) = 1 - \Phi(1.33) = 1 - 0.908 = 0.092$
Hence, the number of students having heights less than 64 inches is
1000 (0.092) = 92 students.
ii- $P(67 < X < 71) = P\left( \frac{67 - 68}{3} < Z < \frac{71 - 68}{3} \right) = P(-0.33 < Z < 1)$
= Φ(1) - Φ(-0.33) = Φ(1) - 1 + Φ(0.33) = 0.841 - 1 + 0.629
= 0.470
Hence the number of students having heights between 67 and 71 inches is
1000 (0.470) = 470 students.

EXERCISES
[1] Identify the given random variable as being discrete or continuous.
i. The height of a player on a basketball team
ii. The number of cups of coffee sold in a cafeteria during lunch
iii. The cost of a randomly selected orange
iv. The number of oil spills occurring off the Alaskan coast
v. The pH level in a shampoo
vi. The number of phone calls between Cairo and Alexandria on first of Ramadan day
vii. The number of students in the required course, English 101
viii. The braking time of a car

[2] Determine whether the following is a probability distribution. If not, identify the requirement
that is not satisfied.

a- b- c- d-

e- A police department reports that the probabilities that 0, 1, 2, 3, and 4 car thefts will be
reported in a given day are 0.150, 0.284, 0.270, 0.171, and 0.081, respectively.

[3] For each of the following, determine whether the given function can serve as the probability
mass function of a r.v. with the given range.
a- f(x) = (x-2)/5 , for x = 1, 2, 3, 4, 5;
b- f(x) = x²/30 , for x = 0, 1, 2, 3, 4;
c- f(x) = 1/5 , for x = 0, 1, 2, 3, 4, 5.
[4] For each of the following, determine the constant c so that the function can serve as the
probability mass function of a r.v. with the given range.
a- f(x) = c , for x = 1, 2, 3, 4, 5;
b- f(x) = c2 , for x = 1, 2, 3,..., n;
c- f(x) = c(1/4)^x , for x = 1, 2, 3, ...
[5] Find the mean and the standard deviation of the following:
i. In a pizza takeout restaurant, the following probability distribution was obtained for the
number of toppings ordered on a large pizza.
x 0 1 2 3 4
P(x) 0.3 0.4 0.2 0.06 0.04

ii. The probabilities that a batch of 4 computers will contain 0, 1, 2, 3, and 4 defective
computers are 0.6274, 0.3102, 0.0575, 0.0047, and 0.0001, respectively. Round
answer to the nearest hundredth.
[6] If X is a discrete r.v. with the following p.m.f.

x      -1   0    1    α
f(x)   β    0.3  0.2  0.1

i- Find the constants α and β so that E(X) = 0.


ii- Calculate Var(X).

[7] Let X be a r.v. having the binomial distribution with parameters n, p such that E[X]=10 and
var(X) = 6. Find n and p.

[8] 10% of American adults are left-handed. For a statistics class of 35 students, find the mean
and standard deviation for the number of left-handed students.

[9] The incidence of occupational disease in an industry is such that the workers have a 30%
chance of suffering from it. What is the probability that out of 5 workers, at least 2 will catch
the disease?

[10] It is known that 40% of mice inoculated with a serum are protected from a certain disease. If
5 mice are inoculated, what is the probability that at most 3 of the mice contract the disease?

[11] The probability that a patient recovers from a rare blood disease is 0.6. If 7 people are
known to have contracted this disease, what is the probability that:
(a) exactly 3 recover, (b) at least 5 recover,
(c) at most 5 survive.

[12] A test consists of 10 true/false questions. To pass the test a student must answer at least 7
questions correctly. If a student guesses on each question, what is the probability that the
student will pass the test?

[13] Find the probability of at least 2 girls in 7 births. Assume that male and female births are
equally likely and that the births are independent events.

[14] The probability that a radish seed will germinate is 0.7. A gardener plants seeds in batches of
15. Find the mean and the standard deviation for the number of seeds germinating in each
batch.

[15] A company manufactures batteries in batches of 21 and there is a 3% rate of defects. Find
the mean and the standard deviation for the number of defects per batch.

[16] Suppose that 1% of all transistors produced by a certain company is defective. A new
model of computer requires 100 of these transistors, and 100 are selected at random from
the company's assembly line. Find the probability of obtaining 3 defectives.

[17] Suppose 2% of the people on the average are left handed. Find the probability that at least
four are left handed among 200 people.

[18] Given that X has the normal distribution with mean 50 and variance 100, find
i- P( X  60) ii- P( 45  X  65.3)

[19] Given that X is normally distributed with mean 18 and standard deviation 2.5, find
a- The value of k such that P(X  k) = 0.2236
b- P(X>15)
c- P(17<X<210)

[20] Suppose the average length of stay in a chronic disease hospital of a certain type of patient
is 60 days with a standard deviation of 15. If it is reasonable to assume an approximately
normal distribution of lengths of stay, find the probability that a randomly selected patient
from this group will have a length of stay:
a- Greater than 50 days. b- Less than 30 days.
c- Between 30 and 60 days . d- Greater than 30 days.

[21] A study showed that the life of army shoes is normally distributed with μ = 15 months and
σ = 1.5 months. If 30,000 pairs are issued, how many pairs would be replaced after 18
months?

[22] The weekly bonuses paid to 100 lecturers working on professional entrance examinations
are normally distributed with mean LE. (700) and variance LE. (2500).
a- Estimate the number of lecturers whose bonus will be more than LE. (750).
b- To what value must the mean be altered to increase the probability in (a) to 70%
(assuming the standard deviation is unaltered)?

[23] Circle the correct answer from each of the following multiple-choice questions:

1. If X has the binomial distribution with μ = 2.0 and σ² = 1.2, then P(X > 1) equals:
a. 0.663 b. 0.922 c. 0.7408 d. None of the above

2. If X has the binomial distribution with μ=2.4 and σ=1.2, then P(X=2) is:
a. 0.112 b. 0.138 c. 0.311 d. None of the above

3. Given the normally distributed r.v. X with σ =2 and P(X > 15) = 0.9938, then μ equals
a. 20.0 b. 5.0 c. -10.0 d. Otherwise

4. If Z ~ N(0,1) and P( 0 ≤ Z ≤ k) = 0.3, then P(Z > k) equals to:


a. 0.7 b. 0.5 c. 0.2 d. Otherwise

5. If X~N(5.4,100) and P(X > k) = 0.67, then k equals


a. 1.0 b. 10.0 c. 49.6 d. Otherwise
6. If Z ~ N(0,1) and P(-k ≤ Z ≤ k) = 0.8, then P(Z > k) equals:
a. 0.1 b. 0.2 c. 0.4 d. Otherwise

7. Given the normally distributed r.v. X with μ=25 and P(X > 10) = 0.9332, then σ equals

a. 10.0 b. 10.6 c. 21.6 d. None of the above.

8. Given the normally distributed r.v. X with σ =2 and P( X < 15) = 0.9938, then μ equals
a. -20.0 b. 5.0 c. 10.0 d. None of the above

9. The p.d.f. of a r.v. X is given in the figure below; then P(0.5 ≤ X ≤ 1) has value:
[Figure: graph of f(x) against x, with the x-axis marked at 0, 1, 2]
a. 0.375 b. 0.5 c. 0.75 d. 0.875

10. The following table gives the distribution of a r.v. X, where α and β are constants,

x        -2   0    2    α
P(X=x)   β    0.3  0.2  0.1

If E(X)=0, then var(X) is


a. 1.0 b. 1.5 c. 4.0 d. None of the above

11. The probability density function of a r.v. X is given in the figure below. P(1 ≤ X ≤ 2) has
value:
[Figure: graph of f(x) against x, with the x-axis marked at 0, 1, 2]
a. 0.8 b. 0.75 c. 0.5 d. 0.25

12. If X has the Poisson distribution with P(X=1) = 2P(X=0), then P( X > 1) is
approximately:
a. 0.6 b. 0.7 c. 0.9 d. Otherwise.

[24] Explain why the following statement is wrong:


(i) For a r.v. X we have E(X) = 5 and E(X²) = 24
