Chapters 6, 7, 8

Chapter 6
Mathematical Expectation
The expectation (mean) of the random variable X gives a single value that acts as a representative or average
of the values of X, and for this reason it is often called a measure of central tendency.
Definition 6.1.1:
Let X be a discrete random variable which takes the values x_1, ..., x_n with corresponding probabilities P(X = x_i) = p(x_i), i = 1, ..., n. Then the expectation of X (or mean value of X) is denoted by E(X) and is defined as:

E(X) = x_1 p(x_1) + ... + x_n p(x_n) = Σ_{i=1}^{n} x_i p(x_i) = Σ_x x p(x).

If X takes countably infinitely many values x_1, x_2, ..., then E(X) = Σ_{i=1}^{∞} x_i p(x_i), provided that Σ_{i=1}^{∞} |x_i| p(x_i) converges.
Example 6.1: Let a fair die be rolled once. Find the mean of the number rolled, say X.
Solution: Since S = {1, 2, 3, 4, 5, 6} and all outcomes are equally likely with probability 1/6, we have

E(X) = 1·(1/6) + 2·(1/6) + 3·(1/6) + 4·(1/6) + 5·(1/6) + 6·(1/6) = 21/6 = 3.5.
Example 6.2: A lot of 12 TV sets includes two which are defective. If two of the sets are chosen at random, find the expected number of defective sets.
Solution: Let X be the number of defective sets chosen; the possible values of X are 0, 1, 2. Using the conditional probability (multiplication) rule, we get:

P(X = 2) = P(both defective) = (2/12)(1/11) = 1/66,
P(X = 1) = P(first defective and second good) + P(first good and second defective) = (2/12)(10/11) + (10/12)(2/11) = 10/66 + 10/66 = 10/33,
P(X = 0) = P(both good) = (10/12)(9/11) = 15/22.

Therefore, E(X) = Σ_{i=0}^{2} x_i P(X = x_i) = 0·(15/22) + 1·(10/33) + 2·(1/66) = 1/3.
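As a quick check, the pmf above can be evaluated with exact arithmetic; this is a minimal sketch using Python's standard fractions module (the pmf values are those derived in the solution):

    from fractions import Fraction

    # pmf of X, the number of defective sets, from Example 6.2
    pmf = {0: Fraction(15, 22), 1: Fraction(10, 33), 2: Fraction(1, 66)}

    assert sum(pmf.values()) == 1           # probabilities sum to one
    mean = sum(x * p for x, p in pmf.items())
    print(mean)                             # 1/3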
Definition 6.1.2:
The mathematical expectation of a continuous random variable is defined in a similar way to that of a discrete random variable, with the exception that summations are replaced by integrations over the specified domain. Let the random variable X be continuous with p.d.f. f(x); its expectation is defined by:

E(X) = ∫_{−∞}^{∞} x f(x) dx, provided this integral exists.
Example 6.5: The probability density function of a random variable X is given by:

f(x) = x/2 for 0 ≤ x ≤ 2, and 0 otherwise.

Then, find the expected value of X.
Solution: E(X) = ∫_0^2 x·(x/2) dx = ∫_0^2 (x²/2) dx = [x³/6]_0^2 = 8/6 = 4/3.
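For continuous densities the integral can also be checked numerically; a small sketch with scipy (assuming it is installed):

    from scipy.integrate import quad

    f = lambda x: x / 2                     # pdf of Example 6.5 on [0, 2]
    mean, _ = quad(lambda x: x * f(x), 0, 2)
    print(mean)                             # 1.3333... = 4/3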
Example 6.6: Find the expected value of the random variable X with the cumulative distribution function F(x) = x³, 0 < x < 1. This implies f(x) = F′(x) = 3x².
Solution: E(X) = ∫_0^1 x·3x² dx = ∫_0^1 3x³ dx = [3x⁴/4]_0^1 = 3/4.
Example 6.7: Let X have the pdf f(x) = …, 0 < x < c, and suppose you are told that E(X) = 6; find c.
6.2 Expectation of a Function of a Random Variable
Definition 6.2:
Now let us consider a new random variable g(X), which depends on X; that is, each value of g(X) is determined by the value of X. In particular, let X be a discrete random variable with probability function p(x). Then Y = g(X) is also a discrete random variable.
If X is a discrete random variable and p(x_i) = P(X = x_i) is the probability mass function, we will have:

E(Y) = E(g(X)) = Σ_i g(x_i) p(x_i).
Example 6.8: Suppose that a balanced die is rolled once. If X is the number that shows up, find the expected value of g(X) = 2X² + 1.
Solution: Since each possible outcome has probability 1/6, we get:

E(g(X)) = Σ_{x=1}^{6} (2x² + 1)·(1/6) = (2·1² + 1)·(1/6) + ... + (2·6² + 1)·(1/6) = 188/6 = 94/3.
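The same sum can be spelled out in a couple of lines of Python with exact fractions; a minimal sketch:

    from fractions import Fraction

    # E(g(X)) for g(x) = 2x^2 + 1 when X is a fair die
    mean = sum((2 * x**2 + 1) * Fraction(1, 6) for x in range(1, 7))
    print(mean)                             # 94/3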
Exercise: Let X be a random variable with p.d.f. f(x) = 3x², 0 < x < 1.
Definition 6.3:
Let X and Y be random variables with joint probability distribution p(x, y) [or f(x, y)] and let H = g(X, Y) be a real-valued function of (X, Y). Then the mean, or expected value, is:

E[g(X, Y)] = Σ_x Σ_y g(x, y) p(x, y) if X and Y are discrete random variables;
E[g(X, Y)] = ∫∫ g(x, y) f(x, y) dx dy if X and Y are continuous random variables.

Example:
E[XY] = Σ_x Σ_y x y p(x, y) if X and Y are discrete random variables;
E[XY] = ∫∫ x y f(x, y) dx dy if X and Y are continuous random variables.
Remark
In calculating E(X) over a two-dimensional space, one may use either the joint probability distribution of X and Y or the marginal distribution of X as:

E[X] = Σ_x Σ_y x p(x, y) = Σ_x x p_X(x) if X is a discrete random variable, where p_X(x) is the marginal distribution of X;
E[X] = ∫∫ x f(x, y) dx dy = ∫ x f_X(x) dx if X is a continuous random variable, where f_X(x) is the marginal density of X.
6.4 Variance of a Random Variable
Let X be a random variable with mean µ = E(X). The variance of X is:

Var(X) = E[(X − µ)²] = E(X²) − µ² = E(X²) − [E(X)]².

Var(X) = Σ_x (x − µ)² p(x) = Σ_x x² p(x) − µ² if X is a discrete random variable.
Var(X) = ∫ (x − µ)² f(x) dx = ∫ x² f(x) dx − µ² if X is a continuous random variable.

Note that the positive square root of Var(X) is called the standard deviation of X and is denoted by σ. Unlike the variance, the standard deviation is measured in the same units as X (and E(X)).
Example 6.10: Find the expected value and the variance of the random variable X with pdf:

f(x) = x for 0 ≤ x ≤ 1; f(x) = 2 − x for 1 < x ≤ 2; f(x) = 0 elsewhere.

Solution:
E(X) = ∫ x f(x) dx = ∫_0^1 x·x dx + ∫_1^2 x(2 − x) dx = ∫_0^1 x² dx + ∫_1^2 (2x − x²) dx
= [x³/3]_0^1 + [x² − x³/3]_1^2 = 1/3 + (4 − 8/3) − (1 − 1/3) = 1/3 + 4/3 − 2/3 = 1.

E(X²) = ∫ x² f(x) dx = ∫_0^1 x³ dx + ∫_1^2 (2x² − x³) dx
= [x⁴/4]_0^1 + [2x³/3 − x⁴/4]_1^2 = 1/4 + (16/3 − 4) − (2/3 − 1/4) = 7/6.

Var(X) = E(X²) − [E(X)]² = 7/6 − 1 = 1/6.
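Integrals over a piecewise pdf are easy to mistype, so a short numerical check with scipy can be reassuring (a sketch, assuming scipy is available; the integration is split at the kink x = 1 to mirror the worked solution):

    from scipy.integrate import quad

    # triangular pdf of Example 6.10 on [0, 2]
    m1 = quad(lambda x: x * x, 0, 1)[0] + quad(lambda x: x * (2 - x), 1, 2)[0]
    m2 = quad(lambda x: x**2 * x, 0, 1)[0] + quad(lambda x: x**2 * (2 - x), 1, 2)[0]
    print(m1, m2 - m1**2)                   # 1.0  0.1666... = 1/6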
Note: Let X and Y be random variables with joint probability distribution p(x, y) [or f(x, y)] and let H = g(X, Y) be a real-valued function of (X, Y). Then the variance of the random variable g(X, Y) is:

Var(g(X, Y)) = E[{g(X, Y)}²] − [E{g(X, Y)}]².

Example: if H = XY, then Var(XY) = E(X²Y²) − [E(XY)]².
6.5. Properties of Expectation and Variance
There are cases where our interest may not only be on the expected value of a random variable, but also on
the expected value of a random variable related to X. In general, such relations are useful to explain the
properties of the mean and the variance.
4: Let X and Y be any two random variables. Then E(X + Y) = E(X) + E(Y). This can be generalized to n random variables: if X_1, X_2, X_3, ..., X_n are random variables, then E(X_1 + X_2 + X_3 + ... + X_n) = E(X_1) + E(X_2) + E(X_3) + ... + E(X_n).
5: Let X and Y be any two random variables. If X and Y are independent, then E(XY) = E(X)E(Y).
6: Let (X, Y) be a two-dimensional random variable with a joint probability distribution. Let Z = H_1(X, Y) and W = H_2(X, Y). Then E(Z + W) = E(Z) + E(W).
12: If (X, Y) is a two-dimensional random variable, and if X and Y are independent, then Var(X + Y) = Var(X) + Var(Y) and Var(X − Y) = Var(X) + Var(Y). (Recall also that, for constants a and b, Var(aX + b) = a²Var(X).)
Example 6.11: Let X be a random variable with p.d.f. f(x) = 3x², for 0 < x < 1.
(a) Calculate Var(X). (b) If the random variable Y is defined by Y = 3X − 2, calculate Var(Y).
Solution: (a) E(X) = ∫_0^1 3x³ dx = 3/4 and E(X²) = ∫_0^1 3x⁴ dx = 3/5, so Var(X) = E(X²) − [E(X)]² = 3/5 − (3/4)² = 3/80.
(b) Var(Y) = Var(3X − 2) = 9·Var(X) = 27/80.
6.6 Chebyshev's Theorem
Let X be a random variable with E(X) = µ and variance σ², and let k be any positive constant. Then the probability that the random variable X will assume a value within k standard deviations of the mean is at least 1 − 1/k². That is, P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k².
Note that Chebyshev's theorem holds for any distribution of observations, and for this reason the results are usually weak. The value given by the theorem is a lower bound only. That is, we know that the probability of a random variable falling within two standard deviations of the mean can be no less than 3/4, but we never know how much more it might actually be. Only when the probability distribution is known can we determine exact probabilities. For this reason we call the theorem a distribution-free result. The use of Chebyshev's theorem is relegated to situations where the form of the distribution is unknown.
Example 6.12: A random variable X has a mean µ = 8, a variance σ² = 9, and an unknown probability distribution. Find
(a) P(−4 < X < 20),
(b) P(|X − 8| ≥ 6).
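A short solution sketch using Chebyshev's theorem (here σ = 3, since σ² = 9):
(a) The interval (−4, 20) is (8 − 12, 8 + 12) = (µ − 4σ, µ + 4σ), so with k = 4, P(−4 < X < 20) ≥ 1 − 1/4² = 15/16.
(b) P(|X − 8| ≥ 6) is the complement of P(|X − 8| < 6) = P(µ − 2σ < X < µ + 2σ); with k = 2, P(|X − 8| ≥ 6) ≤ 1/k² = 1/4.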
6.7.1 Covariance
The covariance between two random variables is a measure of the nature of the association between the two. If large values of X often result in large values of Y, or small values of X result in small values of Y, positive X − µ_X will often result in positive Y − µ_Y, and negative X − µ_X will often result in negative Y − µ_Y. Thus, the product (X − µ_X)(Y − µ_Y) will tend to be positive. On the other hand, if large X values often result in small Y values, the product (X − µ_X)(Y − µ_Y) will tend to be negative. The sign of the covariance indicates whether the relationship between two dependent random variables is positive or negative.
Definition 6.7.1:
The covariance of X and Y is Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − E(X)E(Y).
N.B.: When X and Y are statistically independent, it can be shown that the covariance is zero:

Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E[X − E(X)]·E[Y − E(Y)] = 0.

Thus if X and Y are independent, they are also uncorrelated. However, the reverse is not true, as illustrated by the following example.
6.7.2 Properties of Covariance
Property 1:Cov(X, Y) = Cov (Y, X)
6.7.3 Correlation Coefficient
Although the covariance between two random variables does provide information regarding the nature of the relationship, the magnitude of σ_XY does not indicate anything regarding the strength of the relationship, since σ_XY is not scale-free. Its magnitude will depend on the units used to measure both X and Y. There is a scale-free version of the covariance called the correlation coefficient that is used widely in statistics.
Definition 6.7.3
Let X and Y be random variables with covariance Cov(X, Y) and standard deviations σ_X and σ_Y, respectively. The correlation coefficient (or coefficient of correlation) of two random variables X and Y that have nonzero variances is defined as:

ρ_XY = Cov(X, Y) / √(Var(X)·Var(Y)) = (E(XY) − E(X)E(Y)) / √(Var(X)·Var(Y)).

It should be clear to the reader that ρ_XY is free of the units of X and Y. The correlation coefficient satisfies the inequality −1 ≤ ρ_XY ≤ 1, and it assumes a value of zero when σ_XY = 0.
Example 6.13: Let X and Y be random variables having joint probability density function

f(x, y) = x + y for 0 < x < 1, 0 < y < 1; f(x, y) = 0 elsewhere.

Then find the correlation coefficient.
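No solution is worked in the text, so the following sympy sketch (an illustration, not part of the original notes) computes the needed moments symbolically; for this density it gives ρ = −1/11:

    import sympy as sp

    x, y = sp.symbols('x y')
    f = x + y                               # joint pdf on the unit square

    E = lambda g: sp.integrate(g * f, (x, 0, 1), (y, 0, 1))
    EX, EY, EXY = E(x), E(y), E(x * y)
    VX, VY = E(x**2) - EX**2, E(y**2) - EY**2
    rho = (EXY - EX * EY) / sp.sqrt(VX * VY)
    print(sp.simplify(rho))                 # -1/11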
6.8 Conditional Expectation
Recall that if X and Y are jointly discrete, the conditional probability mass function of X, given that Y = y, is defined for all y such that P(Y = y) > 0 by p(x|y) = P{X = x | Y = y} = p(x, y)/p_Y(y), and similarly p(y|x) = P{Y = y | X = x} = p(x, y)/p_X(x).
Similarly, let us recall that if X and Y are jointly continuous with a joint probability density function f(x, y), then the conditional probability density of X given that Y = y, and of Y given that X = x, are defined, for all values of y and x such that f_Y(y) > 0 and f_X(x) > 0 respectively, by f(x|y) = f(x, y)/f_Y(y) and f(y|x) = f(x, y)/f_X(x).
Once a conditional distribution is at hand, an expectation can be defined as done in relation to the expectation of a one-dimensional random variable. However, a modified notation will be needed to reveal the fact that the expectation is calculated with respect to a conditional pdf. The resulting expectation is the conditional expectation of one random variable, given the other random variable, as specified below.
Definition 6.8:
If X and Y have joint probability mass function p(x, y), then the conditional expectations of X given that Y = y, and of Y given that X = x, for all values of y and x, are:

E(X | Y = y) = Σ_x x p(x|y) and E(Y | X = x) = Σ_y y p(y|x).

If X and Y have joint probability density function f(x, y), then the conditional expectations of X given that Y = y, and of Y given that X = x, for all values of y and x, are:

E(X | Y = y) = ∫ x f(x|y) dx and E(Y | X = x) = ∫ y f(y|x) dy.
Property 2: Let X and Y be independent random variables. Then E(X | Y) = E(X) and E(Y | X) = E(Y)
Example 6.14: Let X and Y be random variables having joint probability density function

f(x, y) = x + y for 0 < x < 1, 0 < y < 1; f(x, y) = 0 elsewhere.

Then find the conditional expectation of (a) Y given X = 0.5, (b) X given Y = 0.25.
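A solution sketch (not worked in the original notes): the marginal of X is f_X(x) = ∫_0^1 (x + y) dy = x + 1/2, so f(y|x) = (x + y)/(x + 1/2) and

E(Y | X = x) = ∫_0^1 y(x + y)/(x + 1/2) dy = (x/2 + 1/3)/(x + 1/2).

At x = 0.5 this gives E(Y | X = 0.5) = (1/4 + 1/3)/1 = 7/12, and by the symmetry of f, E(X | Y = y) = (y/2 + 1/3)/(y + 1/2), so E(X | Y = 0.25) = (1/8 + 1/3)/(3/4) = 11/18.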
Remark
Just as we have defined the conditional expectation of X given the value of Y, we can also define the
conditional variance of X given that Y = y: Var(X|Y) = E[(X − E[X|Y])2 ]. That is, Var(X|Y) is equal to the
(conditional) expected square of the difference between X and its (conditional) mean when the value of Y is
given. In other words, Var(X|Y) is exactly analogous to the usual definition of variance, but now all
expectations are conditional on the fact that Y is known.
Exercises
1. The joint probability density function of two random variables X and Y is given by

f(x, y) = c(2x + y) for 2 < x < 6, 0 < y < 5; f(x, y) = 0 otherwise.
Calculate (a) the constant c, (b) E(X), (c) E(Y), (d) E(XY), (e) Cov(X, Y), (f) Corr(X, Y).
2. The joint probability function of two discrete random variables X and Y is given by p(x, y) = c(2x + y),
where x and y can assume all integers such that 0 ≤ x ≤ 2, 0 ≤ y ≤ 3, and p(x, y) = 0 otherwise. Then
find(a) E(X), (b) E(Y), (c) E(XY), (d) Cov(X, Y) (e) Corr(X,Y)
3. Let X and Y be continuous random variables with joint density function

f(x, y) = e^{−(x+y)} for x > 0, y > 0; f(x, y) = 0 otherwise.

Then find
(a) Var(X), (b) Var(Y), (c) …, (d) …, (e) …, (f) …
6.9 Moment and Moment Generating functions
The obvious purpose of the moment-generating function is in determining moments of random variables. However, its most important contribution is to establish distributions of functions of random variables.
6.9.1. Moment
If g(X) = X^r for r = 0, 1, 2, 3, ..., the following definition yields an expected value called the rth moment about the origin of the random variable X, which we denote by µ′_r.
Definition
The rth moment about the origin of the random variable X is given by:

µ′_r = E(X^r) = Σ_x x^r p(x) if X is discrete, and µ′_r = E(X^r) = ∫ x^r f(x) dx if X is continuous.

Remark
The rth moment of a random variable X about the mean µ, also called the rth central moment, is defined as µ_r = E[(X − µ)^r]; that is,

µ_r = Σ_x (x − µ)^r p(x), if X is a discrete random variable,

with the corresponding integral if X is continuous. Since the first and second moments about the origin are given by µ′_1 = E(X) and µ′_2 = E(X²), we can write the mean and variance of a random variable as µ = µ′_1 and σ² = µ′_2 − (µ′_1)².

6.9.2. Moment Generating Function
The moment generating function M(t) of the random variable X is defined for all real values of t by:

M_X(t) = E(e^{tX}) = Σ_x e^{tx} p(x) if X is discrete, and M_X(t) = ∫ e^{tx} f(x) dx if X is continuous.

Property 3: In general, the nth derivative of M(t) is given by M^{(n)}(t) = E(X^n e^{tX}), which implies M^{(n)}(0) = E(X^n) for n ≥ 1.
Remarks:
Two useful results concerning moment generating functions are, first, that the moment generating function uniquely determines the distribution function of the random variable and, second, that the moment generating function of the sum of independent random variables is equal to the product of their moment generating functions.
It is also possible to define the joint moment generating function of two or more random variables.
Let X and Y be two random variables with m.g.f.'s M_X(t) and M_Y(t), respectively. If M_X(t) = M_Y(t) for all values of t, then X and Y have the same probability distribution. That is, the moment generating function (m.g.f.) is unique and completely determines the distribution function of the random variable. Thus, if two random variables have the same m.g.f., they have the same distribution.
Suppose that X and Y are two independent random variables, and let Z = X + Y. Let M_X(t), M_Y(t) and M_Z(t) be the m.g.f.'s of the random variables X, Y and Z respectively. Then for all values of t, M_Z(t) = M_X(t)·M_Y(t). For independent random variables X_1, ..., X_n this extends to the joint m.g.f.: M(t_1, ..., t_n) = M_{X_1}(t_1) ··· M_{X_n}(t_n).
Exercises
1. Let X be a discrete random variable with pmf p(x) = 1/3, for x = 0, 1, 2. Find the moment generating function, E(X) and V(X).
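As an illustration of Property 3 (a sympy sketch, not part of the original exercises), the m.g.f. for this pmf and its first two derivatives at t = 0 give E(X) and V(X):

    import sympy as sp

    t = sp.symbols('t')
    # M(t) = E(e^{tX}) for p(x) = 1/3 on x = 0, 1, 2
    M = sum(sp.exp(t * x) * sp.Rational(1, 3) for x in (0, 1, 2))
    EX = sp.diff(M, t).subs(t, 0)           # M'(0)  = E(X)   = 1
    EX2 = sp.diff(M, t, 2).subs(t, 0)       # M''(0) = E(X^2) = 5/3
    print(EX, sp.simplify(EX2 - EX**2))     # 1  2/3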
2. Find the moment generating function of a random variable X having density function:

f(x) = x/2 for 0 ≤ x ≤ 2, and 0 otherwise.

4. X is a continuous random variable with the probability density function f(x) = x/2 for 0 < x < 2, and 0 otherwise. Find the moment generating function of the random variable Y = (X + 1)/2.
Chapter 7
Common discrete probability distributions and their properties
7.1. Bernoulli Distribution
A Bernoulli trial is an experiment where there are only two possible outcomes, "success" or "failure". An experiment is turned into a Bernoulli trial by defining one or more of the possible results in which we are interested as "success" and all other possible results as "failure". For instance, while rolling a fair die, a "success" may be defined as "getting an even number on top" and an odd number as "failure". Generally, the sample space in a Bernoulli trial is S = {S, F}, S = success, F = failure.
Therefore, if an experiment has two possible outcomes, "success" and "failure", with probabilities p and 1 − p respectively, then the number of successes (0 or 1) has a Bernoulli distribution.
Definition 7.1:
A random variable X has a Bernoulli distribution, and is referred to as a Bernoulli random variable, if and only if its probability distribution is given by:

p(x) = P(X = x) = p^x (1 − p)^{1−x}, for x = 0, 1.

Mean: E(X) = Σ_{x=0}^{1} x p^x (1 − p)^{1−x} = 0·(1 − p) + 1·p = p, and Var(X) = p(1 − p).
7.2 Binomial Distribution
Many real-life experiments result from conducting a series of Bernoulli trials. Repeated trials play an important role in probability and statistics, especially when the number of trials is fixed, the parameter p (the probability of success) is the same for each trial, and the trials are all independent. Several random variables arise in connection with repeated trials. The one we shall study here concerns the total number of successes.
Examples of Binomial Experiments
Tossing a coin 20 times to see how many tails occur.
Rolling a die 10 times to see if a 5 appears.
Derivation of the Binomial Distribution
Consider a set of n independent Bernoulli trials (n being finite) in which the probability of success in any trial is constant. Then this gives rise to a binomial distribution.
To sum up these conditions, the binomial distribution requires that:
The experiment is repeated n times.
There are only two possible outcomes: success (S) or failure (F).
P(S) = p (fixed at any trial).
The n trials are independent.
Any experiment satisfying these four assumptions (or conditions) is said to have a binomial probability distribution.
Note: Since S and F are complementary events, P(F) = 1 − P(S) = 1 − p = q.
Let X be the number of successes in the n trials. Consider the event of getting k successes (X = k). Out of the n trials, if k are successes, then the remaining (n − k) are failures, observed in any order, say, S S F S F F S F S F.
Since each trial is independent of the others, by the probability of independent events,

P(S S F S F F S F S F) = P(S)·P(S)·P(F)·P(S)·P(F)·P(F)·P(S)·P(F)·P(S)·P(F) = p^k q^{n−k}.

But k successes in n trials can occur in C(n, k) ways (this is the number of possible selections of k out of n), and hence P(X = k) = C(n, k) p^k q^{n−k}, which is the (k + 1)st term of the binomial expansion of (q + p)^n.
Definition 7.2:
A random variable X has a binomial distribution, and is referred to as a binomial random variable, if and only if its probability distribution is given by:

P(X = x) = C(n, x) p^x q^{n−x}, for x = 0, 1, ..., n.

The binomial distribution has the following characteristics:
Remark:
The numbers given by C(n, r) (also written nCr) are often called binomial coefficients, because they appear in the binomial expansion and have many interesting properties in connection with the binomial distribution.
The two constants, n and p, are known as the parameters of the distribution, and the notation X ~ B(n, p) shall be used to denote that the r.v. X follows a binomial distribution with parameters n and p. The above pmf is also denoted by B(k; n, p).
Let X be a binomial random variable with n trials and probability of success p. Then:
Mean: E(X) = µ = np; Variance: σ² = npq.
Remark
The mean of the binomial distribution is

E(X) = Σ_{x=0}^{n} x P(X = x) = Σ_{x=0}^{n} x C(n, x) p^x q^{n−x}
= Σ_{x=0}^{n} x · [n!/(x!(n−x)!)] p^x q^{n−x}
= Σ_{x=1}^{n} [n(n−1)!/((x−1)!(n−x)!)] p·p^{x−1} q^{n−x}
= np Σ_{x=1}^{n} C(n−1, x−1) p^{x−1} q^{n−x}
= np (q + p)^{n−1} = np.
For the variance, first compute

E(X²) = Σ_{x=0}^{n} x² C(n, x) p^x q^{n−x} = Σ_{x=0}^{n} [x(x−1) + x] C(n, x) p^x q^{n−x}
= Σ_{x=0}^{n} x(x−1)·[n!/(x!(n−x)!)] p^x q^{n−x} + Σ_{x=0}^{n} x·[n!/(x!(n−x)!)] p^x q^{n−x}
= n(n−1)p² Σ_{x=2}^{n} [(n−2)!/((x−2)!(n−x)!)] p^{x−2} q^{n−x} + E(X)
= n(n−1)p² Σ_{x=2}^{n} C(n−2, x−2) p^{x−2} q^{n−x} + np
= n(n−1)p² (q + p)^{n−2} + np = n(n−1)p² + np.

Hence Var(X) = E(X²) − [E(X)]² = n(n−1)p² + np − (np)² = np(1 − p) = npq.
Example 7.2: The probability that a single stamping produced by a certain machine is defective is 0.05. If 5 stampings are selected at random, find the probability that exactly 3 are defective.
Solution: Let X equal the number of defectives in n = 5 trials. Then X is a binomial random variable with p, the probability that a single stamping will be defective, equal to 0.05, and q = 1 − 0.05 = 0.95.
The probability that X = 3 is:

P(X = 3) = C(5, 3) (0.05)³ (1 − 0.05)^{5−3}
= [5!/(3!(5 − 3)!)] (0.05)³ (0.95)²
= 10 × (0.05)³ × (0.95)² ≈ 0.0011.
Example 7.3: Find the probability of getting five heads and seven tails in 12 flips of a balanced coin.
Solution: Given n = 12 trials, let X be the number of heads. Then p = probability of getting a head = 1/2, and q = probability of not getting a head = 1/2. Therefore, the probability of getting x heads in 12 flips of the coin is:

P(X = x) = C(12, x) (1/2)^x (1/2)^{12−x} = C(12, x) (1/2)^{12}.

And for x = 5, P(X = 5) = C(12, 5) (1/2)^{12} = 792/4096 ≈ 0.1934.
Example 7.4: If the probability is 0.20 that a person traveling on a certain airplane flight will request a vegetarian lunch, what is the probability that exactly three of 10 people traveling on this flight will request a vegetarian lunch?
Solution: P(X = 3) = C(10, 3) (0.2)³ (0.8)⁷ ≈ 0.2013.
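Probabilities like these are one-liners with scipy's binomial distribution; a quick numerical check of Examples 7.2 to 7.4:

    from scipy.stats import binom

    print(binom.pmf(3, 5, 0.05))    # Example 7.2: ~0.0011
    print(binom.pmf(5, 12, 0.5))    # Example 7.3: ~0.1934
    print(binom.pmf(3, 10, 0.2))    # Example 7.4: ~0.2013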
Exercises
1. The probability that a patient recovers from a rare blood disease is 0.4. If 100 people are known to have contracted this disease, what is the probability that fewer than 30 survive?
2. A multiple-choice quiz has 60 questions, each with 4 possible answers of which only 1 is correct. What is the probability that sheer guesswork yields from 25 to 30 correct answers out of the 60 problems about which the student has no knowledge?
3. A company owns 400 laptops. Each laptop has an 8% probability of not working. You randomly select 20 laptops for your salespeople.
7.3 Poisson Distribution
The properties of Poisson random variables are the following:
The experiment consists of counting the number of times X a particular event occurs during a given unit (of time, area, volume, etc.).
The number of events that occur in one unit is independent of the number that occur in other units.
Definition 7.3:
A random variable X has a Poisson distribution with parameter λ, and is referred to as a Poisson random variable, if and only if its probability distribution is given by:

P(X = x) = (λ^x e^{−λ})/x!, for x = 0, 1, 2, ....

Let X be a Poisson random variable with λ the average number of times the event occurs (the parameter). Then:
Mean: E(X) = µ = λ.
Remark
The mean and variance of a Poisson distribution are both λ.
E(X) = Σ_{x=0}^{∞} x·(λ^x e^{−λ})/x! = λ e^{−λ} Σ_{x=1}^{∞} λ^{x−1}/(x−1)! = λ e^{−λ} Σ_{y=0}^{∞} λ^y/y! (letting y = x − 1) = λ e^{−λ} e^{λ} = λ.
Remark:
When n is large, the calculation of binomial probabilities will usually be tedious. In such cases, the binomial can be approximated by the Poisson distribution. Let X be a binomial random variable with parameters n and p. Then the Poisson distribution is the limiting case of the binomial distribution under the conditions: the number of trials n is indefinitely large, i.e., n → ∞; P(S) = p → 0 (indefinitely small); and np = λ (say) is constant. Then P(X = x) ≈ (λ^x e^{−λ})/x!.
Example 7.5: Suppose that customers enter a waiting line at random at a rate of 4 per minute. Assuming that the number entering the line during a given time interval has a Poisson distribution, find the probability that:
a) exactly one customer enters during a given one-minute interval;
b) at least one customer enters during a given half-minute interval.
Solution: a) Given λ = 4 per minute, P(X = 1) = (4¹ e^{−4})/1! = 4e^{−4} ≈ 0.0733.
b) Per half-minute, the expected number of customers is λ = 2, which is the new parameter, so P(X ≥ 1) = 1 − P(X = 0) = 1 − e^{−2} ≈ 0.8647.
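The same two probabilities with scipy's Poisson distribution (a quick numerical check):

    from scipy.stats import poisson

    print(poisson.pmf(1, 4))   # a) P(X = 1) with lambda = 4: ~0.0733
    print(poisson.sf(0, 2))    # b) P(X >= 1) with lambda = 2: ~0.8647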
Exercise:A certain kind of carpet has, on the average, five defects per 10 square meters. Assuming Poisson
distribution, find the probability that a 15 square meter of the carpet will have at least 2 defects.
7.4 Geometric Distribution
The geometric distribution arises in a binomial experiment situation when trials are carried out independently (with constant probability p of success) until the first success occurs. The random variable X denoting the number of required trials is geometrically distributed with parameter p.
Often we will be interested in measuring the length of time before some event occurs, for example, the length of time a customer must wait in line until receiving service, or the length of time until a piece of equipment fails. For this application, we view each unit of time as a Bernoulli trial and consider a series of trials identical to those described for the binomial experiment. Unlike the binomial experiment, where X is the total number of successes, the random variable of interest here is X, the number of trials (time units) until the first success is observed.
Definition 7.4: A random variable X has a geometric distribution with parameter p, and is referred to as a geometric random variable, if and only if its probability distribution is given by:

P(X = x) = p(1 − p)^{x−1} = pq^{x−1}, for x = 1, 2, ...,

where p is the probability of success and x is the number of trials until the first success occurs.
Let X be a geometric random variable with parameter p. Then:
Mean: E(X) = µ = 1/p.
Example 7.6: If the probability is 0.75 that an applicant for a driver's license will pass the road test on any given try, what is the probability that an applicant will finally pass the test on the fourth try?
Solution: Assuming that trials are independent, we substitute x = 4 and p = 0.75 into the formula for the geometric distribution to get: p(4) = 0.75(1 − 0.75)^{4−1} = 0.75(0.25)³ ≈ 0.0117.
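A one-line check with scipy's geometric distribution (whose support is x = 1, 2, ..., matching Definition 7.4):

    from scipy.stats import geom

    print(geom.pmf(4, 0.75))   # P(first success on the 4th try) ~0.0117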
Exercise: A manufacturer uses electrical fuses in an electronic system. The fuses are purchased in large lots and tested sequentially until the first defective fuse is observed. Assume that the lot contains 10% defective fuses. What is the probability that the first defective fuse will be one of the first five fuses tested?
7.5 Hypergeometric Distribution
We are interested in computing probabilities for the number of observations that fall into a particular category. In the case of the binomial distribution, independence among trials is required. As a result, if that distribution is applied to, say, sampling from a lot of items (deck of cards, batch of production items), the sampling must be done with replacement of each item after it is observed. On the other hand, the hypergeometric distribution does not require independence and is based on sampling done without replacement.
Applications of the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic testing, and quality assurance. Obviously, in many of these fields, testing is done at the expense of the item being tested. That is, the item is destroyed and hence cannot be replaced in the sample. Thus, sampling without replacement is necessary.
In general, we are interested in the probability of selecting x successes from the M items labeled successes and n − x failures from the N − M items labeled failures when a random sample of size n is selected from N items. This is known as a hypergeometric experiment, that is, one that possesses the following two properties: a random sample of size n is selected without replacement from N items; and of the N items, M may be classified as successes and N − M are classified as failures. The number X of successes in a hypergeometric experiment is called a hypergeometric random variable.
Definition 7.5: The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which M are labeled success and N − M labeled failure, is:

P(X = x) = [C(M, x)·C(N − M, n − x)] / C(N, n), for x = 0, 1, 2, ..., n; x ≤ M, n − x ≤ N − M.

The range of x can be determined by the three binomial coefficients in the definition, where x and n − x are no more than M and N − M, respectively, and both of them cannot be less than 0. Usually, when both M (the number of successes) and N − M (the number of failures) are larger than the sample size n, the range of a hypergeometric random variable will be x = 0, 1, ..., n.
Remark
When the number of items in the lot is large relative to the sample size, the hypergeometric probability mass function is approximated by the probability mass function of a binomial random variable.
Example 7.7: Lots of 40 components each are deemed unacceptable if they contain 3 or more defectives. The procedure for sampling a lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot?
Solution: Using the hypergeometric distribution with n = 5, N = 40, M = 3, and x = 1, we find the probability to be

P(X = 1) = [C(3, 1)·C(37, 4)] / C(40, 5) = (3 × 66045)/658008 ≈ 0.3011.
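With scipy's hypergeometric distribution (its parameter order is population size, number of success items, sample size), this is:

    from scipy.stats import hypergeom

    # N = 40 components, M = 3 defectives, sample of n = 5
    print(hypergeom.pmf(1, 40, 3, 5))   # P(X = 1) ~0.3011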
Example 7.8: Two balls are selected at random, in succession and without replacement, from a bag containing 5 blue and 3 green balls. Find the probability mass function of the number of blue balls drawn.
Exercises:
1. An urn contains 8 blue balls and 12 white balls. If five are drawn at random without replacement, what is the probability that the sample will contain two blue and three white?
2. Among 16 applicants for a job, 10 have college degrees. If three of the applicants are randomly chosen for interviews, what are the probabilities that: (a) none has a college degree; (b) two have college degrees; (c) one has a college degree; (d) all three have college degrees?
Chapter 8
Common continuous probability distributions and their properties
8.1 Uniform Distribution
Note that the density function forms a rectangle with base b − a and constant height 1/(b − a) to ensure that the area under the rectangle equals one. As a result, the uniform distribution is often called the rectangular distribution.
Definition 8.1: The probability density function of a uniform random variable X with parameters a and b is given by:

f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise.

Mean: E(X) = µ = (a + b)/2; Variance: σ² = (b − a)²/12.
Example 8.1: A random variable X has a uniform distribution on the interval [5/8, 2].
(a) Find the mean and standard deviation of X.
(b) Find P(X > 1).
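A sketch of the computation with scipy (its uniform distribution takes loc = a and scale = b − a):

    from scipy.stats import uniform

    X = uniform(loc=5/8, scale=2 - 5/8)   # U(5/8, 2)
    print(X.mean(), X.std())              # 1.3125  ~0.3969
    print(1 - X.cdf(1))                   # P(X > 1) = 8/11 ~0.7273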
Exercise: Suppose the research department of a steel manufacturer believes that one of the company’s
rolling machines is producing sheets of steel of varying thickness. The thickness X is a uniform random
variable with values between 150 and 200 millimeters. Any sheets less than 160 millimeters thick must be
scrapped, since they are unacceptable to buyers.
(b) Find the fraction of steel sheets produced by this machine that have to be scrapped.
8.2 Normal Distribution
The most important continuous probability distribution in the entire field of statistics is the normal distribution. Its graph, called the normal curve, is the bell-shaped curve which approximately describes many phenomena that occur in nature, industry, and research. For example, physical measurements in areas such as meteorological experiments, rainfall studies, and measurements of manufactured parts are often more than adequately explained with a normal distribution. In addition, errors in scientific measurements are extremely well approximated by a normal distribution. The normal distribution is often referred to as the Gaussian distribution, in honor of Karl Friedrich Gauss (1777–1855), who also derived its equation from a study of errors in repeated measurements of the same quantity.
A continuous random variable X having the bell-shaped distribution described above is called a normal random variable. The mathematical equation for the probability distribution of the normal variable depends on the two parameters μ and σ, its mean and standard deviation, respectively. Hence, we denote the values of the density of X by f(x; μ, σ) or f(x).
Definition 8.2: A random variable X is normal, or normally distributed, with parameters μ and σ² (abbreviated N(μ, σ²)), if it is continuous with probability density function:

f(x) = (1/(σ√(2π))) e^{−(1/2)((x−μ)/σ)²}, −∞ < x < ∞, σ > 0 and −∞ < μ < ∞.

The parameters μ and σ² are the mean and the variance, respectively, of the normal random variable.
Remark
Let X be a binomial random variable with parameters n and p. For large n, X has approximately a normal distribution with μ = np and σ² = npq = np(1 − p), and

P(X ≤ x) ≈ area under the normal curve to the left of x + 0.5 = P(Z ≤ (x + 0.5 − np)/√(npq)),

where +0.5 is called a continuity correction. This is easily evaluated using a table of probabilities prepared for a special kind of normal distribution, called the standard normal distribution.
If X is a normal random variable with mean μ and variance σ², then the variable Z = (X − μ)/σ is the standardized normal random variable. In particular, if μ = 0 and σ = 1, the density function is called the standardized normal density, and the graph of the standardized normal density is similar in shape to that of the normal distribution.
We convert normal random variables to standard normal in order to easily obtain areas under the curve with the help of the standard normal table.
Definition 8.3: Let X be a normal r.v. with mean μ and standard deviation σ. Then we define the standard normal variable Z as Z = (X − μ)/σ. The pdf of Z is thus given by:

f(z) = (1/√(2π)) e^{−z²/2}, −∞ < z < ∞.
Example 8.3: Find the probabilities that a random variable having the standard normal distribution will take on a value
a) less than 1.72; b) less than −0.88;
c) between 1.30 and 1.75; d) between −0.25 and 0.45.
Solution:
a) P(Z < 1.72) = P(Z < 0) + P(0 < Z < 1.72) = 0.5 + 0.4573 = 0.9573.
b) P(Z < −0.88) = P(Z > 0.88) = 0.5 − P(0 < Z < 0.88) = 0.5 − 0.3106 = 0.1894.
c) P(1.30 < Z < 1.75) = P(0 < Z < 1.75) − P(0 < Z < 1.30) = 0.4599 − 0.4032 = 0.0567.
d) P(−0.25 < Z < 0.45) = P(−0.25 < Z < 0) + P(0 < Z < 0.45) = P(0 < Z < 0.25) + P(0 < Z < 0.45) = 0.0987 + 0.1736 = 0.2723.
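Instead of the table, the standard normal cdf in scipy gives the same values; a quick check:

    from scipy.stats import norm

    print(norm.cdf(1.72))                    # a) ~0.9573
    print(norm.cdf(-0.88))                   # b) ~0.1894
    print(norm.cdf(1.75) - norm.cdf(1.30))   # c) ~0.0567
    print(norm.cdf(0.45) - norm.cdf(-0.25))  # d) ~0.2723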
Remark:
P(a < X < b) = P((a − μ)/σ < Z < (b − μ)/σ) = P(z₁ < Z < z₂). Now we need only get the readings from the Z-table corresponding to z₁ and z₂ to obtain the required probability, as we have done in the preceding example.
If X is a binomial random variable with mean μ = np and variance σ² = npq, then the limiting form of the distribution of Z = (X − np)/√(npq), as n → ∞, is the standard normal distribution n(z; 0, 1).
Example 8.4: If the scores for an IQ test have a mean of 100 and a standard deviation of 15, find the probability that IQ scores will fall below 112.
Solution: IQ ~ N(100, 225), so

P(Y < 112) = P((Y − μ)/σ < (112 − 100)/15) = P(Z < 0.80) = 0.5 + P(0 < Z < 0.80) = 0.5 + 0.2881 = 0.7881.
Example 8.5: Suppose that X ~ N(165, 9), where X = the breaking strength of cotton fabric. A sample is defective if X < 162. Find the probability that a randomly chosen fabric will be defective.
Solution: Given that μ = 165 and σ² = 9 (so σ = 3),

P(X < 162) = P(Z < (162 − 165)/3) = P(Z < −1) = 0.5 − P(0 < Z < 1) = 0.5 − 0.3413 = 0.1587.
Exercises
1. The average IQ score of students in a school for gifted children is 165 with a standard deviation of 27. A random sample of 36 students is taken. What is the probability that:
a) the sample mean score will be greater than 170;
b) the sample mean score will be less than 158;
c) the sample mean score will be between 155 and 160;
d) the samples mean score is less than 170 or more than 175?
2. Find the value of Z if the area between -Z and Z is 0.4038;
3. The reduction of a person's oxygen consumption during periods of deep meditation may be looked upon as a random variable having the normal distribution with μ = 38.6 cc per minute and σ = 6.5 cc per minute. Find the probabilities that during such a period a person's oxygen consumption will be reduced by
(a) at least 33.4 cc per minute;
(b) at most 34.7 cc per minute
4. A normal distribution has mean μ = 62.5. Find σ if 20% of the area under the curve lies to the right of 79.2.
8.3 Exponential Distribution
Definition 8.4: The continuous random variable X has an exponential distribution, with parameter β, if its density function is given by:

f(x) = (1/β) e^{−x/β} for x > 0, and 0 elsewhere.

Mean: E(X) = µ = β; Variance: σ² = β².
Example 8.6: Let X be an exponential random variable with pdf f(x) = (1/2)e^{−x/2}, x > 0. Find the mean and variance of the random variable X.
Solution: Here β = 2, so E(X) = µ = β = 2 and Var(X) = E(X − E(X))² = β² = 4.
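A final numerical sketch with scipy (its exponential distribution takes scale = β):

    from scipy.stats import expon

    X = expon(scale=2)            # beta = 2
    print(X.mean(), X.var())      # 2.0  4.0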