
CHAPTER FOUR

PROBABILITY AND PROBABILITY DISTRIBUTION


Since life is full of uncertainties, people have always been interested in evaluating probabilities.
The theory of probability is an indispensable tool in the analysis of situations involving
uncertainty. Uncertainty affects the decision-making process. Therefore, there is a need to handle
uncertainty systematically and scientifically. Probability theory helps us to make wiser decisions.

Probability
Definition of Probability
Probability can be defined as a mathematical means of studying uncertainty and variability. It is
a number that conveys the strength of our belief in the occurrence of an uncertain event.
Probability is a numerical measure of the likelihood that an event will occur.
Probability values are always assigned on a scale from 0 to 1, inclusive. A probability near zero
indicates an event is unlikely to occur; a probability near 1 indicates an event is almost certain to
occur. The probability of a sure event is one and the probability of an impossible event is zero.
In short, probability is the chance, the possibility or the likelihood that an event will occur in an experiment.

Basic Concepts and Terminologies in Probability Theory


Experiment: is a process that leads to the occurrence of one and only one of several possible observations. It is any activity whose outcomes cannot be predicted in advance. Examples of experiments include tossing a coin, rolling a die, playing a football game, and drawing a card from a deck of playing cards.
Sample Space: is the set of all possible outcomes of an experiment. If you toss a coin and
observe which side lands up, there are two possible results: heads (H) and tails (T). The two
possible results H and T are the possible outcomes of the experiment, and the set S = {H, T} of
all possible outcomes is the sample space for the experiment.
Event: is any subset of the sample space. It is a set of possible outcomes consisting of one or more outcomes of a probability experiment. In rolling a die, for example, "getting an odd number" and "getting an even number" are events. If we are interested in the occurrence of odd numbers, the event is E = {1, 3, 5}.

Outcome: a particular result of an experiment. In the case of tossing a coin, if the head faces up we consider head as the outcome of the experiment.
Sample point: each distinct outcome is called a sample space outcome, sample point or elementary event. It is the same as a single outcome.
Principles of Counting
Being able to identify and count the experimental outcomes is a necessary step in assigning probabilities. We can determine the number of possible outcomes of a given experiment either by the listing method or by using one of the counting rules: the general multiplication rule, permutations or combinations.

A. Listing Method (Probability tree diagram)


If we have an experiment in which we know the possible outcomes are equally likely, to list all
elements of the sample space and to get the number of possible outcomes, we sometimes sketch
a tree diagram. A tree diagram is a graphical representation that helps in visualizing possible
outcomes of an experiment.
Example 1: In an experiment of tossing a fair coin once, the sample space is S = {H, T}.
Example 2: In an experiment of tossing a fair coin and rolling regular die at the same time, list
the elements of the sample space.

[Tree diagram: the first branch splits into the coin outcomes H and T; each coin outcome then branches into the die outcomes 1, 2, 3, 4, 5 and 6.]

Hence the sample space S is given as:

S = {(H, 1), (H, 2), (H, 3), (H, 4), (H, 5), (H, 6), (T, 1), (T, 2), (T, 3), (T, 4), (T, 5), (T, 6)}
i.e. n(S) = 2 × 6 = 12
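The same listing can be produced programmatically. Below is a minimal Python sketch (not part of the original notes) that builds the sample space with itertools.product; the variable names are illustrative only:

```python
from itertools import product

coin = ["H", "T"]          # outcomes of tossing the coin
die = [1, 2, 3, 4, 5, 6]   # outcomes of rolling the die

# Each element of the sample space is a (coin, die) pair.
sample_space = list(product(coin, die))

print(sample_space)        # [('H', 1), ('H', 2), ..., ('T', 6)]
print(len(sample_space))   # 12 = 2 x 6, as given by the multiplication rule
```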
B. Generalized Multiplication rule

If an experiment can be described as a sequence of k steps with n1 possible outcomes on the first step, n2 possible outcomes on the second step, and so on, then the total number of experimental outcomes is given by (n1)(n2) . . . (nk). This gives the number of experimental outcomes without listing them. Total number of arrangements = n1 × n2 × n3 × … × nk.

Example
• How many two-digit numerals can be written by choosing the tens digit from A = {1, 3, 5, 7, 9} and the units digit from B = {2, 4, 6, 8}?
Solution: The selection consists of two steps. The first (the tens digit) can be made in 5 different ways and, for each of these, the second (the units digit) can be made in 4 different ways, so n1 = 5 and n2 = 4. Hence the whole selection, one step after the other, can be made in 5 × 4 = 20 different ways. There are 20 two-digit numerals.
• Suppose a library has 6 different Mathematics books, 5 different Economics books and 8 different Accounting books that are to be given to a student, one book from each kind. In how many ways can the books be given to a student?
Solution: n1 = 6, n2 = 5 and n3 = 8; the books can be given in 6 × 5 × 8 = 240 different ways.
C. Combination Rule
A combination is a useful counting rule that allows one to count the number of experimental outcomes when the experiment involves selecting n objects from a (usually larger) set of N objects. A combination is a selection of objects or things without regard to order. The counting rule for combinations is:

NCn = N! / (n!(N - n)!)

The notation "!" means factorial; for example, 3! = 1 × 2 × 3 = 6 and 4! = 1 × 2 × 3 × 4 = 24.


Example

a. Consider a quality control procedure in which an inspector randomly selects two of five
parts to test for defects. In a group of five parts, how many combinations of two parts can
be selected? The counting rule for combinations shows that with N = 5 and n = 2, we have

5C2 = 5! / (2!(5 - 2)!) = 120/12 = 10

Thus, 10 outcomes are possible for the experiment of randomly selecting two parts from a group
of five. If we label the five parts as A, B, C, D, and E, the 10 combinations or experimental
outcomes can be identified as AB, AC, AD, AE, BC, BD, BE, CD, CE, and DE.
b. Find the number of combinations of the letters A, B, C, D, E, F taking three at a time.
Solution: N = 6, n = 3

6C3 = 6! / (3!(6 - 3)!) = 720/36 = 20

c. In how many different ways can an association of 7 members select a committee of 3 members?
Solution: N = 7, n = 3

7C3 = 7! / (3!(7 - 3)!) = 5040/144 = 35

D. Permutations
Another counting rule that is sometimes useful is the counting rule for permutations. It allows one to compute the number of experimental outcomes when n objects are to be selected from a set of N objects and the order of selection is important. The same n objects selected in a different order are considered a different experimental outcome. The counting rule for permutations is:

NPn = N! / (N - n)!

Example
• Consider again the quality control process in which an inspector selects two of five parts to inspect for defects. How many permutations may be selected?
Solution: N = 5 and n = 2

5P2 = 5! / (5 - 2)! = 120/6 = 20
Thus, 20 outcomes are possible for the experiment of randomly selecting two parts from a group
of five when the order of selection must be taken into account. If we label the parts A, B, C, D,
and E, the 20 permutations (list of possible permutation outcomes) are AB, BA, AC, CA, AD,
DA, AE, EA, BC, CB, BD, DB, BE, EB, CD, DC, CE, EC, DE, and ED.

• In how many different ways can an association of 7 members choose a president, a vice president and a secretary, if no one can be chosen for two of the positions at a time and everyone can be chosen for any one of them?
Solution: N = 7, n = 3

7P3 = 7! / (7 - 3)! = 7!/4! = 210
Approaches to Probability
In the assignment of probabilities to the experimental outcomes, objective or subjective probability can be used. Objective probability is the likelihood of occurrence of an event that is based on data; analysts use data and mathematical equations to derive the objective probability of an event. In contrast, subjective probability is personal and is not data-driven. There are three basic approaches to determining and assigning probabilities to outcomes.

i. Classical Probability
The classical approach is based on the premise that all possible outcomes of an experiment are
mutually exclusive and equally likely. E.g. the numbers 1, 2, 3, 4, 5, and 6 on a fair die are equally likely to occur (they have an equal chance of occurrence).
Under this approach the probability of an event is known before conducting the experiment. The
probability of an event ‘A’ is defined as:

Probability of an event: P(A) = Number of favorable outcomes / Total number of possible outcomes
P(A) = m/n
Where, ‘m’ is the number of favorable outcomes of an event ‘A’ and ‘n’ is the total number of
outcomes of the experiment.
Example: In rolling a regular die what is the probability of getting an even number on the upper
face?
Solution: When a regular die is rolled, the number that faces up can be any one of the six equally likely outcomes 1, 2, 3, 4, 5, or 6, and three of these are even.

Hence S = {1, 2, 3, 4, 5, 6} and E = {2, 4, 6}, so m = 3 and n = 6.
P(E) = m/n = 3/6 = 0.5
ii. Relative Frequency Probability
It is also called empirical or experimental probability. The probability of an event happening in the long term is determined by observing what fraction of the time similar events happened in the past or in experimental trials. We often think of a probability in terms of the percentage of the time the event would occur in many repetitions of the experiment. Suppose that "A" is an event
that might occur when a particular experiment is performed then the probability that the event A
will occur, P (A), can be interpreted to be the number that would be approached by the relative
frequency of the event “A” if we perform the experiment an indefinitely large number of times.
E.g. when we say that the probability of obtaining a head when we toss a coin is 0.5 we are
saying that, when we repeatedly toss the coin an indefinitely large number of times, we will
obtain a head 50% of the repetitions. In terms of formula:
Probability of an event happening = (Number of times the event occurred in the past) / (Total number of observations)
Examples:
1. If a truck operator experienced 5 accidents out of 50 trucks last year, then the probability that a truck will have an accident next year can be estimated as 5/50 = 0.10.
2. A coin is tossed 20 times and lands 15 times on heads. What is the probability (relative
frequency) of observing the coin land on heads?
Solution: Total number of trials (observations) =20, Number of times occurred or number
of positive trial=15. Using the above formula the probability of observing
head=15/20=0.75. Hence the probability of observing the coin land on heads is 0.75.
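The relative-frequency idea can also be illustrated by simulation. The following is a minimal Python sketch (illustrative only; the trial counts are assumptions, not data from the text) showing how the observed fraction of heads approaches 0.5 as the number of tosses grows:

```python
import random

def relative_frequency_of_heads(trials: int) -> float:
    """Estimate P(heads) as (number of heads) / (number of tosses)."""
    heads = sum(1 for _ in range(trials) if random.random() < 0.5)
    return heads / trials

# As the number of repetitions grows, the estimate approaches 0.5.
for n in (20, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```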
iii. Subjective Probability
When there is little or no past experience on which to base a probability, personal judgment, experience, intuition, expertise or any other subjective evaluation criterion is applied to estimate or assign the probability. This probability is subjective probability.
It is also called personal probability. Unlike objective probability, one person's subjective probability may very well differ from another person's subjective probability of the same event.

E.g. A physician assessing the probability of a patient's recovery and an expert in the national bank assessing the probability of currency devaluation are both making a personal judgment based on what they know and feel about the situation; another group of physicians or experts may arrive at different probabilities, even though they may employ identical techniques, approaches and information.
Both classical and long-term relative frequency probabilities are objective in the sense that no
personal judgment is involved.
Whatever the kind of probability involved (subjective or objective), the same set of mathematical
rules holds for manipulating and analyzing probability.
Probability Rules
The Addition Rule
The addition rule is applied to compute the probability of the union of two or more events. Here we are interested in knowing the probability that at least one of the events will occur. Before we present
the addition law, we need to discuss two concepts related to the combination of events: the union
of events and the intersection of events.
Union of two events
The union of A and B is the event consisting of sample space outcomes belonging to either A or
B. The union is denoted by AUB. Furthermore, P(AUB) denotes the probability that either A or
B will occur. It can also be denoted as P(A or B).

The Venn diagram in Figure 4.1 depicts the union of events A and B. Note that the two circles
contain all the sample points in event A as well as all the sample points in event B.

[Figure 4.1: Venn diagram of the sample space S showing the union of events A and B]

Intersection of two events


Given two events A and B, the intersection of A and B is the event containing the sample points belonging to both A and B. The intersection is denoted by AnB. Furthermore, P(AnB) denotes the probability that both A and B will simultaneously occur. It can also be written as P(A and B).
The area where the two circles overlap is the intersection; it contains the sample points that are in both A and B.

[Venn diagram of the sample space S showing the intersection of events A and B]

Addition Rule for two non-mutually exclusive Events


Non-mutually exclusive events are events that can occur at the same time. They are also called
joint events.
Let A and B be events. Then the probability that either A or B will occur is:
P(AUB) = P(A) + P(B) - P(AnB), where (AnB) is the intersection of A and B.
Example 1: A student is taking two courses Human Resource Management and Business
Mathematics. The probability that the student will pass the Human Resource Management course
is 0.6 and that of passing Business Mathematics course is 0.7. The probability of passing both is
0.5. What is the probability of passing Human Resource Management or Business Mathematics
P(AuB)?
Solution:
Let H be the event that the student passes Human Resource Management
M be the event that the student passes Business Mathematics:

P(H ) 0.6 P(M )=0.7, (HnM )=0.5

P(HUM ) P(H ) P(M )-P(HnM )


0.6 0.7-0.5 0.8
Example 2: Consider a small assembly plant with 50 employees. Each worker is expected to
complete work assignments on time and with no defective product. At the end of a performance
evaluation period, the production manager found that 5 of the 50 workers completed work late, 6
of the 50 workers assembled a defective product, and 2 of the 50 workers both completed work
late and assembled a defective product. After reviewing the performance data, the production
manager decided to assign a poor performance rating to any employee whose work was either late or defective. What is the probability that the production manager assigned an employee a
poor performance rating?
Solution:
Let
L = the event that the work is completed late
D = the event that the assembled product is defective

P(L) = 5/50 = 0.10, P(D) = 6/50 = 0.12, P(LnD) = 2/50 = 0.04


Note that the probability question is about the union of two events. Specifically, we want to
know P(LUD). Using equation, we have
P(LUD) = P(L) + P(D) - P(LnD)
Knowing values for the three probabilities on the right side of this expression, we can write
P(LUD) = 0.10 + 0.12 - 0.04 = 0.18
Addition Rule for Two Mutually Exclusive Events
Mutually exclusive events are events that cannot occur at the same time. Two events are said to be mutually exclusive if they have no sample space outcomes in common. They are also called disjoint or incompatible events. In this case the events A and B cannot occur simultaneously and thus P(AnB) = 0.
Let A and B be mutually exclusive events; then the probability that either A or B will occur is
P(AUB) = P(A) + P(B)
Example: Consider randomly selecting a card from a standard deck of 52 playing cards and
define the events.
Q, a randomly drawn card is Queen; and K, a randomly drawn card is a king.
Since there are 4 Queens and 4 Kings in the deck.
P(Q) = 4/52 and P(K) = 4/52

Since there is no card that is both a Queen and a King, the events Q and K are mutually exclusive and thus P(QnK) = 0. It follows that the probability that the randomly selected card is either Q or K is
P(QUK) = P(Q) + P(K)
= 4/52 + 4/52 = 8/52 ≈ 0.15
Complements of an event
The complement of an event A is the set of all possible outcomes in the sample space S that are not included in A. The complement of A is denoted by Ac and P(Ac) denotes the probability that A will not occur. In any probability situation, either an event A or its complement Ac must occur.
Mathematically, this is given as: P(A) + P(Ac) = 1, which implies P(Ac) = 1 - P(A)
On the next figure is a diagram, known as a Venn diagram, which illustrates the concept of a
complement. The rectangular area represents the sample space for the experiment and as such
contains all possible sample points. The circle represents event A and contains only the sample
points that belong to A. The shaded region of the rectangle contains all sample points not in
event A and is by definition the complement of A.

[Venn diagram: the circle represents event A; the shaded region outside the circle is Ac, the complement of event A]

Example: After reviewing sales reports, the sales manager states that 80% of new customer
contacts result in no sale. By allowing A to denote the event of a sale and Ac to denote the event
of no sale, the manager is stating that P (Ac) = .80.
Using the equation above, we see that P(A) = 1 - P(Ac) = 1 - 0.80 = 0.20.
We can conclude that a new customer contact has a .20 probability of resulting in a sale.
In another example, a purchasing agent states a 0.90 probability, P(A), that a supplier will send
a shipment that is free of defective parts. Using the complement, we can conclude that there is a
1-0.90 = .10 probability that the shipment will contain defective parts.

Conditional probability
Probability is conditional upon information. We may define the probability of event A
conditional upon the occurrence of event B. Often, the probability of an event is influenced by
whether a related event already occurred. Suppose we have an event A with probability P(A). If
we obtain new information and learn that a related event, denoted by B, already occurred, we
will want to take advantage of this information by calculating a new probability for event A. This
new probability of event A is called a conditional probability and is written P(A/B). We use the notation "/" to indicate that we are considering the probability of event A given the condition that event B has occurred. Hence, the notation P(A/B) reads "the probability of A given B."
If we think about two adjacent rooms, R1 and R2, the probability that R1 will catch fire is highly conditional on whether R2 catches fire.
We express these conditional probabilities in terms of P(A), P(B) and P(AnB):
P(A/B) = P(AnB)/P(B), so P(AnB) = P(B)·P(A/B) by simple cross multiplication
P(B/A) = P(AnB)/P(A), so P(AnB) = P(A)·P(B/A)
Example: In a firm 20% of the employees have an accounting background, while 5% of the
employees are executives and have accounting backgrounds. If an employee has accounting
background, what is the probability that the employee is an executive?
Let us define the events
E= an employee is an executive and
A= an employee has an accounting background
P(A) = 0.2, P(AnE) = 0.05, then P(E/A) = P(AnE)/P(A) = 0.05/0.2 = 0.25

Exercise: Given data on cars worked on by a garage over a certain period of time.
Accident (A) No accident (A') Total
Minor repair (B) 30 50 80
Major repair (B') 180 20 200
Total 210 70 280
Find i. P(A) ii. P(AnB') iii. P(A/B) iv. P(A'/B')
Ans:
o P(A) = 210/280 – Marginal probability
o P(AnB') = 180/280 – Joint probability
o P(A/B) = P(AnB)/P(B) = (30/280)/(80/280) = 3/8 – Conditional probability
o P(A'/B') = P(A'nB')/P(B') = (20/280)/(200/280) = 1/10
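The exercise can be verified with a few lines of Python; the joint counts below are taken directly from the garage table, and the variable names are illustrative:

```python
total = 280
counts = {                       # joint counts from the garage table
    ("A", "B"): 30, ("A'", "B"): 50,
    ("A", "B'"): 180, ("A'", "B'"): 20,
}

p_A = (counts[("A", "B")] + counts[("A", "B'")]) / total   # marginal: 210/280
p_A_and_Bp = counts[("A", "B'")] / total                   # joint: 180/280
p_B = (counts[("A", "B")] + counts[("A'", "B")]) / total   # 80/280
p_A_given_B = (counts[("A", "B")] / total) / p_B           # 3/8
p_Bp = 1 - p_B                                              # 200/280
p_Ap_given_Bp = (counts[("A'", "B'")] / total) / p_Bp      # 1/10

print(p_A, p_A_and_Bp, p_A_given_B, p_Ap_given_Bp)
```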

Multiplication Rule
Whereas the addition law of probability is used to compute the probability of a union of two
events, the multiplication law is used to compute the probability of the intersection of two
events. The multiplication law is based on the definition of conditional probability.
Independent events
If the occurrence of events A and B have nothing to do with each other, then we know that A and
B are independent events. The probability of occurrence of A will not influence the probability
of occurrence of B.
This implies that
P(A/B)= P(A) and that
P(B/A) = P(B)
Therefore, two events A and B are independent if P(A/B)= P(A) and P(B/A) = P(B), otherwise
the events are dependent.
The General multiplication rule tells us that, for any two events A and B we can say that
P(AnB) = P(A) P(B/A) therefore if P(B/A)= P(B) it follows that
P(AnB) = P(A) *P(B)
This is called the multiplication rule for two independent events.
For example, the probability of drawing the ace of diamonds from 52 well-shuffled cards is 1/52. If the first card is replaced, the probability of then drawing the ace of clubs is 1/52; but without replacement it is 1/51. Thus, with replacement the events are independent; without replacement they are dependent.
Dependent events
Two random events are said to be dependent when the probability of one event is affected by the
occurrence of the other event.
If the probability of an event is influenced by whether or not another event occurs, we say the
two events are dependent.
P(AnB) = P(A)*P(B/A)

Examples
• An urn contains 6 red marbles and 4 black marbles. Two marbles are drawn with
replacement from the urn. What is the probability that both of the marbles are black?
Solution
Let A = the event that the first marble is black; and let B = the event that the second marble is
black. We know the following:
• In the beginning, there are 10 marbles in the urn, 4 of which are black. Therefore, P(A) = 4/10.
• After the first selection, we replace the selected marble, so there are still 10 marbles in the urn, 4 of which are black. Therefore, P(B) = 4/10.
Therefore, based on the multiplication rule for independent events:
P(AnB) = P(A) P(B) = (4/10)*(4/10) = 16/100 = 0.16
• An urn contains 6 red marbles and 4 black marbles. Two marbles are drawn without
replacement from the urn. What is the probability that both of the marbles are black?
Solution: Let A = the event that the first marble is black; and let B = the event that
the second marble is black. We know the following:
1. In the beginning, there are 10 marbles in the urn, 4 of which are black. Therefore, P(A) =
4/10.
2. After the first selection, there are 9 marbles in the urn, 3 of which are black. Therefore,
P(B|A)= 3/9.
Therefore, based on the rule of multiplication of dependent events:
P(AnB) = P(A) P(B|A) = (4/10) * (3/9) = 12/90 = 2/15
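A short Python sketch contrasting the two cases, with and without replacement, using the urn counts from the examples (exact fractions are used so the results match the hand calculations):

```python
from fractions import Fraction

black, total = 4, 10

# With replacement: the two draws are independent.
p_with = Fraction(black, total) * Fraction(black, total)
# Without replacement: the second probability is conditional on the first draw.
p_without = Fraction(black, total) * Fraction(black - 1, total - 1)

print(p_with, float(p_with))        # 4/25  0.16
print(p_without, float(p_without))  # 2/15  0.1333...
```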
Bayes’ Theorem
Bayes’ theorem is applicable when the events for which we want to compute posterior
probabilities are mutually exclusive and their union is the entire sample space (if the union of
events is the entire sample space, the events are said to be collectively exhaustive). In the
discussion of conditional probability, we indicate that revising probabilities when new
information is obtained is an important part of probability analysis. Often, we begin our analysis with
initial or prior probability estimates for specific events of interest. Then, from sources such as a
sample, a special report, or a product test, we obtain some additional information about the
events. Given this new information, we update the prior probability values by calculating revised
probabilities, referred to as posterior probabilities by using Bayes’ theorem.

Prior Probabilities → New Information → Application of Bayes' Theorem → Posterior Probabilities

Bayes’ theorem reads as follows:


P(AiB) = P(Ai)P(BAi) .
P(A1)P(BA1) + P(A2)P(BA2) + … + P(An)P(B/An)

a. Probability Distribution
i. Basic concepts
Random Variable
A random variable is a variable whose values are determined by chance. Or a random variable is
a numerical description of the outcome of an experiment. A random variable may be either
discrete or continuous.
Discrete Random Variable: is a random variable that can assume only certain clearly separated values resulting from a count of some item of interest. A discrete random variable can only assume values such as 0, 1, 2, 3, …, n. Examples: the number of students in a class, the number of telephone calls received in a given hour, the number of people living in a certain area, and so on.

Continuous Random Variable: is a random variable that can assume any value in an interval. These are random variables that can assume an uncountably infinite number of values. A continuous random variable, as the name implies, assumes all possible values between any two values and mostly results from measurement. Examples: weight, time, temperature, etc.
Probability distribution
A probability distribution is a listing of all possible values of the random variable with their corresponding probabilities, i.e. the listing of all the probable outcomes of a random experiment along with their respective probabilities. The probability distribution for a random variable describes how
the probabilities are distributed over the values of the random variable.
A probability distribution can be classified as a discrete or continuous probability distribution according to whether it describes a discrete or a continuous random variable. First let's discuss the
discrete probability distribution and then we will look at the continuous probability distribution.

ii. Discrete Probability Distribution


A discrete probability distribution is any representation of the values of a discrete random variable and the associated probabilities. The most commonly used discrete probability distributions include the binomial, the hyper-geometric and the Poisson distributions.
1. The Binomial Distribution

The binomial distribution is a discrete probability distribution that deals with populations that
can be divided into two categories with reference to the presence or absence of a particular
attribute.

The binomial distribution has the following characteristics.
• There must be a fixed number of trials, and the data collected are the results of counts.
• Each trial results in a success or a failure. An outcome of the experiment is classified into one of two mutually exclusive categories: a success or a failure.
• The probability of success remains the same for each trial, and so does the probability of failure. This implies that the probability of failure on any trial is 1 - (probability of success). Probability of success is denoted by p and probability of failure by q, where q = 1 - p.
• The trials are independent. The outcome of one trial does not affect the outcome of any other trial.
In a Binomial experiment, the probability of exactly x successes in n trials is given by:

P(x) = [n! / (x!(n - x)!)] · p^x · q^(n-x) = nCx · p^x · q^(n-x)   (the Binomial formula)

Where x is the number of successes
P(x) is the probability of exactly x successes
n is the number of trials
p is the numerical probability of success
q is the numerical probability of failure
Note: q = 1 - p and 0 ≤ x ≤ n
Example 1: Suppose that 40% of all customers who enter a department store make a purchase.
What is the probability that exactly 2 of the next 3 customers will make a purchase?
We can solve for P(x = 2) as follows
n=3 p = 0.4 q = 1 - 0.4 = 0.6
P(x = 2) = [3! / (2!(3 - 2)!)] × 0.4² × 0.6¹ = 3 × 0.16 × 0.6 = 0.288
Example 2: A new drug is effective 60% of the time. What is the probability that in a random
sample of 4 patients, it will be effective on two of them?
Solution:
This is a Binomial experiment as the points of the experiment are satisfied. Define ‘effective’ as
‘success’ and ‘non effective’ as ‘failure’. Then,
p = 0.6, q = 1 - 0.6 = 0.4, n = 4, x = 2
Required p (2) = ?
P(2) = [4! / (2!(4 - 2)!)] × 0.6² × 0.4² = 6 × 0.36 × 0.16 = 6 × 0.0576 = 0.3456
Hence, the drug will be effective on two of a random sample of 4 patients with a probability of
0.3456 (or 34.56%).
Example 3: If we toss a coin three times, what is the probability of getting
exactly two heads?
Using the Binomial formula
p = 0.50, q = 1 - 0.50 = 0.50, n = 3, x = 2
P(x = 2) = nCx · p^x · q^(n-x)
= 3C2 × 0.5² × 0.5¹
= 3 × (0.25 × 0.5) = 3 × 0.125 = 0.375
Remark: (Using the Binomial tables)
We recognize that it is tedious to calculate probabilities using the binomial formula when n is large. For such cases, you may use the binomial probability distribution table that is given at the end of this chapter.
Let's demonstrate how to read the table with an illustrative example. Consider example 2 above and find the probability using the binomial table. As stated, the experiment is a Binomial experiment where n = 4, p = 0.6 and x = 2.
First choose the table where n = 4, find 2 in the x column and 0.6 in the probability row; the value at the intersection of x = 2 and p = 0.6 is the required probability, that is 0.346.
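Rather than using the table, the binomial formula is also easy to evaluate directly. The following minimal Python sketch re-computes Examples 1 to 3 above (and, for the next subsection, the mean and standard deviation of a binomial variable):

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(exactly x successes in n trials) = nCx * p^x * (1-p)^(n-x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(2, 3, 0.4))   # 0.288   (store purchases)
print(binomial_pmf(2, 4, 0.6))   # 0.3456  (drug effectiveness)
print(binomial_pmf(2, 3, 0.5))   # 0.375   (two heads in three tosses)

# Mean, variance and standard deviation for the coin tossed four times
n, p = 4, 0.5
mean, var = n * p, n * p * (1 - p)
print(mean, var, math.sqrt(var))  # 2.0 1.0 1.0
```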
Mean, Variance and Standard deviation of a Binomial Probability Distribution
The mean, variance and standard deviation of a variable that has the Binomial distribution are found as:
Mean (µ) = n·p
Variance (σ²) = n·p·q
Standard deviation (σ) = √(n·p·q)

Examples
1. A coin is tossed four times. Find the mean, variance and SD of the number of heads that
will be obtained.
Solution:
Here n = 4, p = 1/2, and q = 1/2
µ = n·p = 4 × 1/2 = 2
σ² = n·p·q = 4 × 1/2 × 1/2 = 1
σ = √σ² = √1 = 1

2. A die is rolled 240 times. Find the mean, variance and standard deviation for the number
of 3’s that will be rolled.

Solution:
n = 240, p = 1/6, q = 5/6
µ = n·p = 240 × (1/6) = 40
σ² = n·p·q = 240 × (1/6) × (5/6) ≈ 33.33
σ = √33.33 ≈ 5.77
2. Hyper-geometric Distribution

The binomial distribution assumes that the probability of success (p) and failure (q = 1 - p) are
the same throughout the experiment. This is because
• events are independent
• sampling is done with replacement
• n < 0.05N
• the population is infinite

However, in cases where sampling is without replacement and the sample size exceeds 5% of the
population size, it is necessary to use the hyper-geometric distribution to determine correct
probability.
The hyper-geometric distribution has the following characteristics.
A) It is a discrete distribution.

B) Each outcome consists of either a success or a failure.

C) Sampling is done without replacement.

D) The population size is finite and known.

E) It is described by three parameters: N, r and n. Because of the multitude of possible


combinations of these three parameters, creating tables for the hyper-geometric
distribution is practically impossible.

F) The number of successes in the population, r, is known.

G) The sample size is ≥ 5% of the population.

Under the above conditions, we can use the hyper-geometric distribution for determining the
correct probability, with the following formula:

P(X) = (rCx × N-rCn-x) / NCn
Where: P(X) = the probability N = population size
n = sample size
r = number of successes in the population
x = number of successes in the sample for which a probability is
desired
C = combination
N-r = the number of items in the population that are labeled as
success
NCn = the number of ways a sample of size n can be selected from a
population of size N.
rCx = the number of ways x successes can be selected from a total of r
successes in the population.
N-rCn-x = the number of ways n-x failures can be selected from a total of
N-r failures in the population
Example: 24 people, of whom eight are women, have applied for a job. If five of the applicants
are randomly selected, what is the probability that exactly three of those sampled are women?
Solution:
N = 24 r=8 n=5 x=3
24  8 8
 *
P( X 3)  5  3 24 3
5 = (120x56)/42,504 = 0.1581
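The hyper-geometric formula can be evaluated with math.comb. A minimal Python sketch re-computing the job-applicant example (function and variable names are illustrative):

```python
import math

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """P(x successes in a sample of n drawn without replacement
    from a population of N that contains r successes)."""
    return math.comb(r, x) * math.comb(N - r, n - x) / math.comb(N, n)

# 24 applicants, 8 of them women, sample of 5, exactly 3 women selected
print(hypergeom_pmf(3, N=24, r=8, n=5))   # approximately 0.1581
```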
The Binomial Approximation to Hyper-geometric Distribution
The binomial probability distribution with parameters n and p = r/N provides a good approximation of the hyper-geometric probability distribution if the sample size, n, is no more than five percent of the population size, N; that is, n ≤ 0.05N. As n/N decreases, the binomial distribution better approximates the hyper-geometric distribution.
3. The Poisson Probability Distribution

The Poisson distribution is another discrete probability distribution which is used to describe a
number of processes, including the distribution of telephone calls going through a switch board
system, the demand of patients for service at a health institution, the arrival of trucks and cars at
a toll booth, and the number of accidents at an intersection.
While a binomial random variable counts the number of successes that occur in a fixed number
of trials, a Poisson random variable counts the number of rare events (successes) that occur in a
specified continuous time interval or specified region.
The Poisson distribution has the following characteristics.

o The probability of an occurrence is the same throughout the time interval or space per unit.
o The number of occurrences in one interval is independent of the number of occurrences in another interval.
o The probability of two or more occurrences in a subinterval is small enough to be ignored.
o It must be possible to divide the time interval of interest into many subintervals.
o The expected number of occurrences in an interval is proportional to the size of the interval.

Examples:
A. The number of machine failures in a week
B. The number of traffic accidents per month in a town
C. The number of emergency patients arriving at a hospital in an hour

The Poisson distribution is described mathematically by the formula:

P(x) = (µ^x · e^(-µ)) / x!   or equivalently   P(x) = ((λt)^x · e^(-λt)) / x!

Where:
µ = λt is the mean number of successes (the average rate times the number of time units)
λ = the average occurrence rate (average number of occurrences per unit of time)
t = the number of units of time
e = the base of the natural logarithm, a mathematical constant approximately equal to 2.7183
x = the number of successes in the interval
P(x) = the probability of x successes in the interval
The Poisson distribution can be used to approximate the binomial distribution when the
probability of a success is small and the number of trials is very large.
Example 1: A bank manager wants to provide prompt service for customers at the bank's drive-up window. The bank currently can serve up to 10 customers per 15-minute period without significant delay. The average arrival rate is 7 customers per 15-minute period. Assuming X has a Poisson distribution, find the probability that exactly 10 customers will arrive in a particular 15-minute period.
µ = 7, x = 10
P(10) = (7^10 × e^(-7)) / 10! = (7^10 × 2.7183^(-7)) / 10! = 0.071
Example 2: Suppose that bank customers arrive randomly on weekday afternoons at an average rate of 3.2 customers every four minutes. What is the probability of getting 10 customers during an eight-minute interval?
λ = 3.2 customers per 4 minutes = 0.8 customers per minute, t = 8 minutes, x = 10 customers
µ = λ·t = 0.8 × 8 = 6.4 customers
P(x = 10) = (µ^x · e^(-µ)) / x! = (6.4^10 × e^(-6.4)) / 10! = 0.0528
The probability of getting 10 customers during the next eight minutes in a
bank is 0.0528. Or there is 5.28% chance that exactly 10 customers will
arrive in eight minutes at a bank.
Remark: Although the above probability was determined by evaluating the probability function, it is often easier to refer to the table for the Poisson probability distribution. These tables provide probabilities for specific values of x and µ. The table is included at the end of this chapter.
For convenience, suppose µ = 5 and x = 3. In the first column of the table choose x = 3 and correspond it with µ = 5; the intersection of these two numbers gives you the required probability, which is 0.1404.
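The Poisson probabilities above can be reproduced directly from the formula rather than from the table. A minimal Python sketch:

```python
import math

def poisson_pmf(x: int, mu: float) -> float:
    """P(x occurrences) = mu^x * e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

print(poisson_pmf(10, 7.0))        # ~0.0710  (10 arrivals, mean 7 per 15 minutes)
print(poisson_pmf(10, 0.8 * 8))    # ~0.0528  (mu = lambda * t = 6.4)
print(poisson_pmf(3, 5.0))         # ~0.1404  (the table-lookup example)
```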
Variance and Standard Deviation of the Poisson Probability Distribution
The variance of the Poisson distribution is equal to the mean of the distribution:
σ² = µ, and therefore σ = √µ

Poisson Approximation to Binomial Probability Distribution


The Poisson probability distribution can be used as an approximation to the binomial probability
distribution when P, the probability of success is small and n, the number of trials/sample size, is
large. Simply set µ = np and use the Poisson tables. As a rule of thumb, the approximation will
be good whenever P≤0.05 and n≥20. However, this approximation is reasonably accurate if n>20
and np≤5.
Why approximation?
The Poisson formula is easier to use than the binomial formula.
iii. Continuous Probability Distribution

As noted earlier in this unit a continuous random variable is one that can assume an infinite
number of possible values within a specified range. It usually results from measuring something.
It is not possible to list every possible value of the continuous random variable along with a
corresponding probability.
The most convenient approach is to construct a probability curve. The proportion of area
included between any two points under the probability curve identifies the probability that a
randomly selected continuous variable has a value between those points.
The Normal Distribution
The normal distribution is a continuous distribution that has a bell shape and is determined by its
mean and standard deviation. It occupies a place of central importance in continuous probability
distribution in particular and statistics in general. It is the most important theoretical distribution.
Characteristics of a normal probability distribution
• It is a continuous distribution.
• The normal distribution curve is bell-shaped.
• The curve is symmetrical about the mean.
• The mean, median and mode are equal and are located at the center of the distribution.
• The curve is asymptotic to the x-axis; it approaches but never touches the x-axis.
• The total area under the normal distribution curve is equal to 1, or 100%; 50% of the area is above the mean and 50% is below the mean.
• It is defined by two parameters: µ and δ. Each combination of these two parameters specifies a unique normal distribution. The value of µ indicates where the center of the bell lies, while δ represents how spread out (or wide) the distribution is.

Each combination of µ and δ specifies a unique normal distribution. This brings about having an
infinite family of normal distributions. This problem of dealing with an infinite family of
distributions can be solved by transforming all normal distributions to the standard normal
distribution, which has a mean equal to 0 and a standard deviation equal to 1. Standard Normal
Distribution is a normal distribution in which the mean is 0 and the standard deviation is 1. It is
denoted by z.

Any normal distribution can be converted to the standard normal distribution by standardizing
each of its observations in terms of Z- values. The Z- value measures the distance in standard
deviations between the mean of the normal curve and the X- value of interest. Any random
variable can be transformed to a standard random variable by subtracting the mean and dividing
by the standard deviation.
If a random variable X has mean µ and standard deviation δ, the standardized variable Z is
defined as:
Z = (X - µ) / δ

Where: Z = number of standard deviations from the mean
X = value of interest
µ = mean of the distribution
δ = standard deviation of the distribution
A Z-score is the number of standard deviations that a value, X, is away from the mean. If the value of X is less than the mean, the Z-score is negative; if the value of X is greater than the mean, the Z-score is positive. The Z-score is also known as the z-value; it is a standardized score for which the mean is zero and the standard deviation is 1. The Z-score is used to represent the standard normal distribution.
The probability calculations in normal distribution are made by computing areas under the graph.
Thus, to find the probability that a random variable lies within any specific interval we must
compute the area under the normal curve over that interval.
Probabilities for some commonly used intervals are:
 68.26% of the time, a normal random variable assumes a value within ±1δ of its mean.

 95.44% of the time, a normal random variable assumes a value within ±2δ of its mean.

 99.72% of the time, a normal random variable assumes a value within ±3δ of its mean.

 3.1

 4   2 .5

 0 22
The shape of the curve is determined by the standard deviation. The smaller the standard deviation, the more peaked the curve will be; the larger the standard deviation, the flatter and wider the curve will be.
Steps to find the probability of a random variable using the normal distribution:
a. Transform the given value to z-value

b. Draw a picture and Shade the desired area (optional)

c. Read the area from the standard normal distribution table.

d. Interpret your results

Transforming Random Variable to z-value


Example – The weekly incomes of a large group of middle managers are normally distributed with a mean of Br. 1000 and a standard deviation of Br. 100. What is the Z value for an income of
a) Br. 1100? µ = 1000, δ = 100
Z = (X - µ)/δ = (1100 - 1000)/100 = 1
This means an income of Br. 1100 is one standard deviation above the mean.
b) Br. 900? Z = (900 - 1000)/100 = -1
This implies that an income of Br. 900 is one standard deviation below the mean.
Example 1: The scores for an IQ test are normally distributed with a mean of 100 and a standard deviation of 15. Find the probability that an IQ score will fall below 112.
Solution
Find the z-value corresponding to an IQ score of 112:
Z = (x - µ)/δ = (112 - 100)/15 = 0.8
From the table,
P(z < 0.8) = P(z < 0) + P(0 < z < 0.8) = 0.5000 + 0.2881 = 0.7881
Hence, 78.81% of the IQ scores fall below 112.

Example 2: The Graduate Management Admission Test (GMAT) is widely used by graduate
school of business as an entrance requirement. In one particular year, the mean score for the
GMAT was 485, with a standard deviation of 105. Assuming that GMAT scores are normally
distributed, what is the probability that a randomly selected score from this administration of the
GMAT:
1. Falls between 600 and the mean?
2. Is greater than 650?
3. Is less than 300?
4. Falls between 350 and 550?
5. Is less than 700?
6. Is exactly 500?
7. If 500 applicants take the test, how many would you expect to score 590 or below?
Solution:
µ = 485, δ = 105
1. P(485 ≤ X ≤ 600) = ?
First convert the X value into a Z-score using the formula Z = (X - µ)/δ
Z485 = (485 - 485)/105 = 0
Z600 = (600 - 485)/105 = +1.10
P(485 ≤ X ≤ 600) = P(0 ≤ Z ≤ +1.10)
= P(0 to +1.10)
= 0.3643
2. P(X > 650) = ?
Z650 = (650 - 485)/105 = +1.57
P(X > 650) = P(Z > +1.57)
= 0.5 - P(0 to +1.57)
= 0.5 - 0.4418
= 0.0582
3. P(X < 300) = ?
Z300 = (300 - 485)/105 = -1.76
P(X < 300) = P(Z < -1.76)
= 0.5 - P(-1.76 to 0)
= 0.5 - 0.4608
= 0.0392
4. P(350 ≤ X ≤ 550) = ?
Z350 = (350 - 485)/105 = -1.29
Z550 = (550 - 485)/105 = +0.62
P(350 ≤ X ≤ 550) = P(-1.29 ≤ Z ≤ +0.62)
= P(-1.29 to 0) + P(0 to +0.62)
= 0.4015 + 0.2324
= 0.6339
5. P(X < 700) = ?
Z700 = (700 - 485)/105 = +2.05
P(X < 700) = P(Z < +2.05)
= P(X < 485) + P(485 ≤ X < 700)
= 0.5 + P(0 to +2.05)
= 0.5 + 0.4798
= 0.9798
6. P(X=500) = 0. The probability of an exact/single value of a continuous
random variable is zero. Consequently, the probability of an interval is the
same whether the end points are included or not.

7. To find the expected number of applicants who score 590 or below, we first
find P (X≤590) and we multiply it by the number of applicants.

P(X ≤ 590) = ?
Z590 = (590 - 485)/105 = +1.00
P(X ≤ 590) = P(Z ≤ +1.00)
= P(X < 485) + P(485 ≤ X < 590)
= 0.5 + P(0 to +1.00)
= 0.5 + 0.3413
= 0.8413
If 500 applicants take the test, the number of students expected to score
590 or below is 500(0.8413) = 420.65 thus 421 students.
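The table lookups in this example can be reproduced with the standard normal cumulative distribution function, Φ(z) = ½(1 + erf(z/√2)), which is available through math.erf. A minimal Python sketch for the GMAT example (small differences from the hand results come from rounding z to two decimals in the table method):

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 485, 105
z = lambda x: (x - mu) / sigma   # transform an X value to a Z-score

print(phi(z(600)) - phi(z(485)))   # P(485 <= X <= 600) ~ 0.364
print(1 - phi(z(650)))             # P(X > 650)         ~ 0.058
print(phi(z(300)))                 # P(X < 300)         ~ 0.039
print(phi(z(550)) - phi(z(350)))   # P(350 <= X <= 550) ~ 0.634
print(500 * phi(z(590)))           # expected number scoring 590 or below ~ 421
```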
Normal Approximation
One of the reasons why we apply the normal probability distribution is that it is more efficient
than the binomial or Poisson when these distributions involve larger n or µ values respectively.
The Normal approximation to the Binomial
The table of the binomial probabilities goes successively from an n of 1 to n of 25 or 30. Suppose
a problem involved taking a sample of 60. Generating a binomial distribution for that large a
number using the formula would be very time consuming. A more efficient approach is to apply
the normal approximation. This seems reasonable because as n increases, a binomial distribution
gets closer and closer to a normal distribution.
The normal probability distribution is generally deemed a good approximation to the binomial
probability distribution when np and nq are both greater than 5.
Since there is no area under the normal curve at a single point, we assign an interval on the real line to the discrete value of X by applying what we call a continuity correction factor. The continuity
correction factor is subtracting or adding, depending on the problem, the value 0.5 to a selected
value when a binomial probability distribution is being approximated by a normal distribution.
We add 0.5 to x when we want P(X ≤ x) or P(X > x), and we subtract 0.5 from x when we want P(X < x) or P(X ≥ x).
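A minimal, self-contained Python sketch of the continuity correction; the values n = 60, p = 0.3 and the probability P(X ≤ 20) are assumptions chosen for illustration, since the text does not work a numeric example here:

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n, p = 60, 0.3                       # assumed values; np = 18 and nq = 42 both exceed 5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

# P(X <= 20) for the discrete binomial: add 0.5 to x before standardizing.
x = 20
approx = phi((x + 0.5 - mu) / sigma)
print(approx)                        # ~0.76, close to the exact binomial sum

exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))
print(exact)                         # exact binomial probability for comparison
```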
Normal Approximation of Poisson distribution
When the mean of a Poisson distribution is relatively large, the normal probability distribution
can be used to approximate the Poisson distribution. For a good normal approximation to the
Poisson, µ must be greater than or equal to 10.

Standard Normal Distribution (Z-table)

