Chapter 3
Basic Probability Concepts
3.1 General Definitions and Concepts
Statistics, as defined earlier, has two major components: descriptive measures and
inference. Probability is the foundation for making inferences about a population
from a sample that represents it. In other words, probability links the population
and the sample: it quantifies the uncertainty involved when we draw conclusions
about population characteristics from sample characteristics.
Probability
Probability is a measure of the chance that an event occurs. It ranges from 0 to 1,
where 0 indicates that the event is impossible and 1 indicates that the event is
certain.
Experiment
An experiment is a procedure (or process) whose outcome cannot be predicted in
advance.
Example: Tossing a coin has two possible outcomes, head or tail, but the outcome
is not known until the coin is tossed. The two outcomes are exhaustive, because one
of them must occur in each toss and no other outcome is possible.
Sample Space
The sample space of an experiment is the set of all possible outcomes of the
experiment. It is also called the universal set and is denoted by $\Omega$.
Example: In the coin tossing experiment with a single coin, the possible outcomes
are head (H) or tail (T). Hence, the sample space is $\Omega = \{H, T\}$.
Event
Any subset of the sample space $\Omega$ is called an event. For example, in the coin
tossing experiment, the event called success may be said to occur if the outcome is
a head (H). If a tail (T) appears, the outcome may be called a failure. It may be
noted that
(i) $\emptyset \subseteq \Omega$ is an event, called the impossible event, and
(ii) $\Omega \subseteq \Omega$ is an event, called the sure or certain event.
The following example illustrates the sample space and events.
Consider selecting a patient from a hospital room with six beds numbered from 1 to 6
and observing the patient in the selected bed. Here, the patients are identified by
their bed numbers.
This experiment has six possible outcomes or elements.
The sample space is $\Omega = \{1, 2, 3, 4, 5, 6\}$.
Consider the following events and the elements corresponding to them:
$E_1$ = getting an even number $= \{2, 4, 6\} \subseteq \Omega$,
$E_2$ = getting a number less than 4 $= \{1, 2, 3\} \subseteq \Omega$,
$E_3$ = getting 1 or 3 $= \{1, 3\} \subseteq \Omega$,
$E_4$ = getting an odd number $= \{1, 3, 5\} \subseteq \Omega$,
$E_5$ = getting a negative number $= \{\} = \emptyset \subseteq \Omega$, and
$E_6$ = getting a number less than 10 $= \{1, 2, 3, 4, 5, 6\} = \Omega \subseteq \Omega$.
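The six-bed example can be sketched with Python sets, where each event is built as a subset of the sample space. This is only an illustration; the variable names (`omega`, `events`, `all_subsets`) are assumptions and not part of the original text.

```python
# Sample space for the six-bed example: patients identified by bed number.
omega = {1, 2, 3, 4, 5, 6}

# Each event is the subset of outcomes satisfying a condition.
events = {
    "E1": {x for x in omega if x % 2 == 0},  # even number
    "E2": {x for x in omega if x < 4},       # number less than 4
    "E3": {1, 3},                            # 1 or 3
    "E4": {x for x in omega if x % 2 == 1},  # odd number
    "E5": {x for x in omega if x < 0},       # negative number: impossible event
    "E6": {x for x in omega if x < 10},      # number less than 10: sure event
}

# Every event must be a subset of the sample space.
all_subsets = all(event <= omega for event in events.values())

for name, event in events.items():
    print(name, sorted(event))
```

The impossible event comes out as the empty set and the sure event as the whole sample space, matching notes (i) and (ii) above.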
3.2 Probability of an Event
If the experiment has $n(\Omega)$ equally likely outcomes, then the probability of the
event E is denoted by P(E) and is defined by
$$P(E) = \frac{n(E)}{n(\Omega)},$$
where $n(E)$ is the number of outcomes in E.
This is the classical definition of probability. It requires that the outcomes of
the experiment be equally likely, a precondition that does not hold in many
practical situations.
Example: If we toss a fair coin, the outcomes H and T are equally likely. If instead
we classify products from an experiment as defective or non-defective, the two
outcomes are not necessarily equally likely, so the classical definition does not
apply. In that case, we may use an alternative definition, known as the relative
frequency or empirical definition of probability, stated below.
Suppose an event E occurs $n(E)$ times in a series of n trials conducted under the
same conditions, where n is the total number of trials or sample size. The ratio
$n(E)/n$ is the relative frequency of the event E in n trials. As n tends to
infinity, we define the probability of E as
$$P(E) = \lim_{n \to \infty} \frac{n(E)}{n}.$$
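The relative frequency definition can be illustrated by simulation. The sketch below assumes a fair coin simulated with Python's pseudo-random generator; the function name `relative_frequency` and the fixed seed are illustrative choices, and a finite simulation can only suggest the limiting behaviour, not prove it.

```python
import random

def relative_frequency(n_trials, seed=0):
    """Relative frequency n(E)/n of heads in n_trials simulated fair-coin tosses."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The ratio n(E)/n tends to settle near 0.5 as the number of trials grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```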
Solution
$\Omega = \{1, 2, 3, 4, 5, 6\}$; $n(\Omega) = 6$
$E_1 = \{2, 4, 6\}$; $n(E_1) = 3$
$E_2 = \{1, 2, 3\}$; $n(E_2) = 3$
$E_3 = \{1, 3\}$; $n(E_3) = 2$
The outcomes are equally likely. Then, by definition, the probabilities of the
events $E_1$, $E_2$, and $E_3$ are
$$P(E_1) = \frac{3}{6}, \quad P(E_2) = \frac{3}{6}, \quad \text{and} \quad P(E_3) = \frac{2}{6}.$$
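$$P(E_1 \cup E_2) = \frac{n(E_1 \cup E_2)}{n(\Omega)} = \frac{5}{6}$$
$$P(E_1 \cup E_4) = \frac{n(E_1 \cup E_4)}{n(\Omega)} = \frac{6}{6} = 1.$$
$$P(E_1 \cap E_2) = \frac{n(E_1 \cap E_2)}{n(\Omega)} = \frac{1}{6}$$
$$P(E_1 \cap E_4) = \frac{n(E_1 \cap E_4)}{n(\Omega)} = \frac{n(\emptyset)}{6} = \frac{0}{6} = 0.$$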
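These union and intersection probabilities can be checked with set operations and exact fractions. The helper `prob` below is a hypothetical name introduced for this sketch; it assumes the six equally likely outcomes of the bed example.

```python
from fractions import Fraction

# Sample space of the six-bed example (equally likely outcomes assumed).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Classical probability n(E) / n(Omega)."""
    return Fraction(len(OMEGA & set(event)), len(OMEGA))

E1 = {2, 4, 6}  # even bed number
E2 = {1, 2, 3}  # bed number less than 4
E4 = {1, 3, 5}  # odd bed number

p_union_12 = prob(E1 | E2)  # union of E1 and E2
p_union_14 = prob(E1 | E4)  # union of E1 and E4
p_inter_12 = prob(E1 & E2)  # intersection of E1 and E2
p_inter_14 = prob(E1 & E4)  # intersection of E1 and E4 (empty set)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
addition_rule_holds = p_union_12 == prob(E1) + prob(E2) - p_inter_12

print(p_union_12, p_union_14, p_inter_12, p_inter_14)
```

Using `Fraction` keeps the results as exact ratios such as 5/6 rather than rounded decimals.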
Two events A and B are mutually exclusive (disjoint) if
$$A \cap B = \emptyset.$$
In this case, it is impossible for both events to occur simultaneously (i.e., at
the same time). Hence,
(i) $P(A \cap B) = 0$
(ii) $P(A \cup B) = P(A) + P(B)$.
If $A \cap B \neq \emptyset$, then A and B are not mutually exclusive (not disjoint).
$A \cap B \neq \emptyset$: A and B are not mutually exclusive. It is possible that both
events occur at the same time.
$A \cap B = \emptyset$: A and B are mutually exclusive (disjoint). It is impossible that
both events occur at the same time.
Exhaustive Events
The events $A_1, A_2, \ldots, A_n$ are exhaustive events if
$$A_1 \cup A_2 \cup \cdots \cup A_n = \Omega.$$
For this case, $P(A_1 \cup A_2 \cup \cdots \cup A_n) = P(\Omega) = 1$.
Note (here $\bar{A}$ denotes the complement of A)
1. $A \cup \bar{A} = \Omega$ (A and $\bar{A}$ are exhaustive events),
2. $A \cap \bar{A} = \emptyset$ (A and $\bar{A}$ are mutually exclusive (disjoint) events),
3. $n(\bar{A}) = n(\Omega) - n(A)$, and
4. $P(\bar{A}) = 1 - P(A)$.
Special Cases
1. For mutually exclusive (disjoint) events A and B,
$$P(A \cup B) = P(A) + P(B).$$
Suppose a variable is broken down into m categories designated by
$A_1, A_2, \ldots, A_m$ and another jointly occurring variable is broken down into
s categories designated by $B_1, B_2, \ldots, B_s$ (Tables 3.1 and 3.2).
The marginal probability of $A_i$, $P(A_i)$, is equal to the sum of the joint
probabilities of $A_i$ with all categories of B. That is,
$$P(A_i) = \sum_{j=1}^{s} P(A_i \cap B_j).$$
For example,
$$P(A_1) = P(A_1 \cap B_1) + P(A_1 \cap B_2) + \cdots + P(A_1 \cap B_s).$$
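A marginal probability is a row or column sum of the joint probability table. The sketch below uses hypothetical counts for a variable A with two categories and a variable B with three; the numbers are invented purely for illustration and do not come from Tables 3.1 or 3.2.

```python
from fractions import Fraction

# Hypothetical joint counts n(A_i and B_j): rows are A_1, A_2; columns are B_1, B_2, B_3.
counts = [
    [10, 20, 30],
    [15, 5, 20],
]
n = sum(sum(row) for row in counts)

# Joint probabilities P(A_i and B_j) = n(A_i and B_j) / n.
joint = [[Fraction(c, n) for c in row] for row in counts]

# Marginal of A_i: sum the joint probabilities across all categories of B (row sums).
marginal_A = [sum(row) for row in joint]
# Marginal of B_j: sum across all categories of A (column sums).
marginal_B = [sum(col) for col in zip(*joint)]

print(marginal_A, marginal_B)
```

Both sets of marginal probabilities add up to 1, as they must for exhaustive, mutually exclusive categories.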
3.4 Applications of Relative Frequency or Empirical Probability
Example 3.4 Let us consider hypothetical data on four types of diseases among 200
patients from a hospital, as shown below:
$$P(E_2 \cap E_4) = P(\emptyset) = 0,$$
since $E_2 \cap E_4 = \emptyset$.
3. $\bar{E}_1$ = the disease type of the selected patient is not “A”.
$$n(\bar{E}_1) = n - n(E_1) = 200 - 90 = 110$$
$$P(\bar{E}_1) = \frac{n(\bar{E}_1)}{n} = \frac{110}{200} = 0.55.$$
Another solution
$$P(A_3) = \frac{n(A_3)}{n} = \frac{316}{699} = 0.4521.$$
$$P(B_2) = \frac{n(B_2)}{n} = \frac{241}{699} = 0.3448.$$
$A_3 \cap B_2$ = the selected subject has clump thickness category 5 or above and the
case is malignant,
$$P(A_3 \cap B_2) = \frac{n(A_3 \cap B_2)}{n} = \frac{210}{699} = 0.3004.$$
$$P(\bar{A}_1) = 1 - P(A_1) = 1 - \frac{n(A_1)}{n} = 1 - \frac{195}{699} = 0.7210.$$
since $A_2 \cap A_3 = \emptyset$.
Percentages:
$$P(D) = \frac{\%(D)}{100\%} = \frac{20\%}{100\%} = 0.2$$
$$P(M) = \frac{\%(M)}{100\%} = \frac{41\%}{100\%} = 0.41$$
$$P(D \cap M) = \frac{\%(D \cap M)}{100\%} = \frac{15\%}{100\%} = 0.15.$$
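$\bar{D} \cap \bar{M}$ = the selected patient did not visit a physician and did not use
medication.
$$P(\bar{D} \cap \bar{M}) = \frac{54}{100} = 0.54.$$
$\bar{D} \cap M$ = the selected patient did not visit a physician and used medication.
$$P(\bar{D} \cap M) = \frac{41 - 15}{100} = \frac{26}{100} = 0.26.$$
$D \cap \bar{M}$ = the selected patient visited a physician and did not use medication.
$$P(D \cap \bar{M}) = \frac{20 - 15}{100} = \frac{5}{100} = 0.05.$$
$\bar{D} \cup \bar{M}$ = the selected patient did not visit a physician or did not use
medication.
$$P(\bar{D} \cup \bar{M}) = P(\bar{D}) + P(\bar{M}) - P(\bar{D} \cap \bar{M}) = 0.80 + 0.59 - 0.54 = 0.85.$$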
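The four cells of the two-way table for physician visits (D) and medication use (M) follow by subtraction from the three given percentages, and the sketch below makes that bookkeeping explicit. The variable names are assumptions made for this illustration; the inputs 20%, 41%, and 15% are the values quoted above.

```python
from fractions import Fraction

# Given percentages, expressed as exact probabilities.
p_d = Fraction(20, 100)   # visited a physician
p_m = Fraction(41, 100)   # used medication
p_dm = Fraction(15, 100)  # visited a physician and used medication

# Remaining cells of the 2x2 table, obtained by subtraction.
p_d_notm = p_d - p_dm                  # visited, did not use medication
p_notd_m = p_m - p_dm                  # did not visit, used medication
p_notd_notm = 1 - (p_d + p_m - p_dm)   # neither: complement of (D or M)

# Addition rule applied to the complements.
p_notd_or_notm = (1 - p_d) + (1 - p_m) - p_notd_notm

for label, p in [("D and not M", p_d_notm), ("not D and M", p_notd_m),
                 ("not D and not M", p_notd_notm), ("not D or not M", p_notd_or_notm)]:
    print(label, float(p))
```

The four cells sum to 1, which is a quick consistency check on any table of this kind.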
3.5 Conditional Probability
The conditional probability of the event A, given that the event B has already
occurred, is defined by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0.$$
Notes:
i. $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$
ii. $P(A \mid B) = \dfrac{n(A \cap B)}{n(B)}$
iii. Using the restricted table directly.
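Rearranging the definition of conditional probability gives the multiplication
rules of probability:
$$P(A \cap B) = P(B)\,P(A \mid B),$$
$$P(A \cap B) = P(A)\,P(B \mid A).$$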
Example 3.7 Let us consider a hypothetical set of data on 600 adult males classified
by age and smoking habit, as summarized in Table 3.7.
Consider the following event:
$(B_1 \mid A_2)$ = the respondent smokes daily given that his age is between 30 and 39
Table 3.7 Two-way table displaying number of respondents by age and smoking habit

                        Smoking habit
Age             Daily (B1)   Occasionally (B2)   Not at all (B3)   Total
20–29 (A1)          57              18                  13            88
30–39 (A2)         200              55                  90           345
40–49 (A3)          50              40                  55           145
50+   (A4)           7               0                  15            22
Total              314             113                 173           600
$$P(B_1) = \frac{n(B_1)}{n} = \frac{314}{600} = 0.523$$
$$P(B_1 \mid A_2) = \frac{P(B_1 \cap A_2)}{P(A_2)} = \frac{200/600}{345/600} = 0.580,$$
where
$$P(B_1 \cap A_2) = \frac{n(B_1 \cap A_2)}{n} = \frac{200}{600} = 0.333 \quad \text{and} \quad P(A_2) = \frac{n(A_2)}{n} = \frac{345}{600} = 0.575.$$
Another solution
$$P(B_1 \mid A_2) = \frac{n(B_1 \cap A_2)}{n(A_2)} = \frac{200}{345} = 0.580.$$
Notice that
$$P(B_1) = 0.523, \quad P(B_1 \mid A_2) = 0.580,$$
so that $P(B_1 \mid A_2) > P(B_1)$ and hence $P(B_1) \neq P(B_1 \mid A_2)$.
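The same conditional probability can be read off the two-way table by restricting attention to the 30–39 row. The sketch below stores the counts of Table 3.7 in a nested dictionary; the dictionary layout and key names are assumptions made for illustration.

```python
# Counts from Table 3.7: age group -> smoking habit -> number of respondents.
table = {
    "20-29": {"daily": 57, "occasionally": 18, "not at all": 13},
    "30-39": {"daily": 200, "occasionally": 55, "not at all": 90},
    "40-49": {"daily": 50, "occasionally": 40, "not at all": 55},
    "50+": {"daily": 7, "occasionally": 0, "not at all": 15},
}

n = sum(sum(row.values()) for row in table.values())  # all respondents
n_a2 = sum(table["30-39"].values())                   # n(A2): aged 30-39
n_b1 = sum(row["daily"] for row in table.values())    # n(B1): daily smokers

p_b1 = n_b1 / n                               # marginal probability P(B1)
p_b1_given_a2 = table["30-39"]["daily"] / n_a2  # P(B1 | A2) from the restricted row

print(round(p_b1, 3), round(p_b1_given_a2, 3))
```

Dividing by the row total instead of the grand total is exactly what "using the restricted table directly" means in the notes on conditional probability.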
The probability of using medication given that the patient visited a physician is
$$P(B \mid A) = \frac{75\%}{100\%} = 0.75.$$
We can conclude that 15% of the patients with general health problems visited a
physician and used medication.
Independent Events
There are three cases for the conditional probability of A given B:
(i) $P(A \mid B) > P(A)$
(knowing B increases the probability of occurrence of A),
(ii) $P(A \mid B) < P(A)$
(knowing B decreases the probability of occurrence of A), and
(iii) $P(A \mid B) = P(A)$
(knowing B has no effect on the probability of occurrence of A). In this case, A is
independent of B.
Independent Events: Two events A and B are independent if one of the following
conditions is satisfied:
(i) $P(A \mid B) = P(A)$,
(ii) $P(B \mid A) = P(B)$, or
(iii) $P(B \cap A) = P(A)P(B)$.
Note: The third condition is the multiplication rule for independent events.
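Example 3.9 Suppose that A and B are two events such that $P(A) = 0.5$,
$P(B) = 0.6$, and $P(A \cap B) = 0.2$. These events are not independent, because
$P(A)P(B) = (0.5)(0.6) = 0.3 \neq P(A \cap B) = 0.2$.
Also,
$$P(A) = 0.5 \neq P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{0.2}{0.6} = 0.3333 \quad \text{and}$$
$$P(B) = 0.6 \neq P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{0.2}{0.5} = 0.4.$$

              $B$      $\bar{B}$    Total
$A$           0.2      ?            0.5
$\bar{A}$     ?        ?            ?
Total         0.6      ?            1.00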
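The independence check and the blank cells of the table can both be worked out from the three given values. This is a sketch with illustrative variable names; it assumes only P(A) = 0.5, P(B) = 0.6, and P(A ∩ B) = 0.2 as read from the table and the computations above.

```python
from fractions import Fraction

p_a = Fraction(5, 10)   # P(A)
p_b = Fraction(6, 10)   # P(B)
p_ab = Fraction(2, 10)  # P(A and B)

# Multiplication rule for independent events: P(A and B) == P(A) * P(B)?
independent = p_ab == p_a * p_b

# Fill the remaining cells of the 2x2 table by subtraction.
cells = {
    ("A", "B"): p_ab,
    ("A", "not B"): p_a - p_ab,
    ("not A", "B"): p_b - p_ab,
    ("not A", "not B"): 1 - p_a - p_b + p_ab,
}

print("independent:", independent)
for key, value in cells.items():
    print(key, float(value))
```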
Example 3.10 Suppose that 12 patients admitted to a hospital with fever ($F$ = high
fever, $\bar{F}$ = low fever) are selected randomly and asked whether they were
satisfied with the services in the hospital ($S$ = satisfied, $\bar{S}$ = not
satisfied). The table below summarizes the data.
The experiment is to randomly choose one of these patients. Consider the following
events:
Solution
(a) The experiment has n = 12.
P(patient is satisfied with service)
$$P(S) = \frac{n(S)}{n} = \frac{3}{12} = 0.25.$$
P(patient suffers from high fever)
$$P(F) = \frac{n(F)}{n} = \frac{8}{12} = 0.6667.$$
P(satisfied with service and suffers from high fever)
$$P(S \cap F) = \frac{n(S \cap F)}{n} = \frac{2}{12} = 0.1667.$$
P(satisfied with service and does not suffer from high fever)
$$P(S \cap \bar{F}) = \frac{n(S \cap \bar{F})}{n} = \frac{1}{12} = 0.0833.$$
(b) The probability of selecting a patient who suffers from high fever given that the
patient is satisfied with service is
$$P(F \mid S) = \frac{P(S \cap F)}{P(S)} = \frac{2/12}{0.25} = 0.6667.$$
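The counts used in this solution also allow a direct check of whether satisfaction (S) and high fever (F) are independent. The sketch below assumes only the counts that appear in the solution (n = 12, n(S) = 3, n(F) = 8, n(S ∩ F) = 2); the variable names are illustrative.

```python
from fractions import Fraction

n = 12          # patients in the sample
n_s = 3         # satisfied with service
n_f = 8         # high fever
n_s_and_f = 2   # satisfied and high fever

p_s = Fraction(n_s, n)
p_f = Fraction(n_f, n)
p_s_and_f = Fraction(n_s_and_f, n)

p_s_and_not_f = p_s - p_s_and_f  # satisfied and low fever
p_f_given_s = p_s_and_f / p_s    # P(F | S)

# Compare P(F | S) with P(F), and test the product rule.
independent = p_s_and_f == p_s * p_f

print(float(p_f_given_s), float(p_f), independent)
```

With these counts P(F | S) equals P(F), so the product rule holds and the two events would be classified as independent.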
Bayes’ theorem gives a posterior probability by updating a prior probability, using
the concept of conditional probability. Suppose there are two states of the disease
and two possible results of a test, so that the outcomes are
$D$ = the individual has the disease, $\bar{D}$ = the individual does not have the disease,
$T$ = the test result is positive, $\bar{T}$ = the test result is negative.
We are interested in the true disease status given the test result, which is a
posterior probability. This is a conditional probability, for instance, the
probability of having the disease given that the screening test result is positive.
On the other hand, the estimated prevalence of the disease in the population may be
considered the prior probability, which is a marginal probability and does not
depend on any other condition. The posterior probability tells us how well a given
test result reflects the true status of the disease.
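There are two false results that can occur in an experiment concerning disease
status and test results.
1. A false-positive result: This is defined by the conditional probability
$$P(T \mid \bar{D}),$$
and this result occurs when a test shows a positive status although the true status
is known to be negative.
2. A false-negative result: This is defined by the conditional probability
$$P(\bar{T} \mid D),$$
where the conditional probability states that a test shows a negative status;
however, the true status is known to be positive.
The sensitivity of a test is $P(T \mid D)$, the probability of a positive test result
given the presence of the disease. The specificity of a test is
$P(\bar{T} \mid \bar{D})$, which is the probability of a negative test result given the
absence of the disease.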
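To clarify these concepts, suppose we have a sample of n subjects who are
cross-classified according to disease status and screening test result (Table 3.9).
For example, there are “a” subjects who have the disease and whose screening
test result was positive.
From this table, we may compute the following conditional probabilities:
1. The probability of a false-positive result
$$P(T \mid \bar{D}) = \frac{n(T \cap \bar{D})}{n(\bar{D})} = \frac{b}{b + d}.$$
2. The probability of a false-negative result
$$P(\bar{T} \mid D) = \frac{n(\bar{T} \cap D)}{n(D)} = \frac{c}{a + c}.$$
3. The sensitivity of the screening test
$$P(T \mid D) = \frac{n(T \cap D)}{n(D)} = \frac{a}{a + c}.$$
4. The specificity of the screening test
$$P(\bar{T} \mid \bar{D}) = \frac{n(\bar{T} \cap \bar{D})}{n(\bar{D})} = \frac{d}{b + d}.$$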
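The four rates above depend only on the cell counts a, b, c, and d of the cross-classification. The function below is a minimal sketch; the name `screening_rates` is hypothetical, and the counts passed to it (950, 20, 50, 980) are the ones used in the worked solution later in this chapter.

```python
def screening_rates(a, b, c, d):
    """Rates from a 2x2 screening table.

    a: test positive, disease present    b: test positive, disease absent
    c: test negative, disease present    d: test negative, disease absent
    """
    return {
        "false_positive": b / (b + d),  # P(T | not D)
        "false_negative": c / (a + c),  # P(not T | D)
        "sensitivity": a / (a + c),     # P(T | D)
        "specificity": d / (b + d),     # P(not T | not D)
    }

rates = screening_rates(a=950, b=20, c=50, d=980)
for name, value in rates.items():
    print(name, round(value, 3))
```

Sensitivity and the false-negative rate share the denominator a + c, so they sum to 1; the same holds for specificity and the false-positive rate.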
The predictive value positive: The predictive value positive is defined as the
conditional probability
$$P(D \mid T),$$
where the conditional probability states the chance that a subject has the disease,
given that the subject has a positive test result.
The predictive value negative: The predictive value negative is defined as the
conditional probability
$$P(\bar{D} \mid \bar{T}),$$
where the conditional probability states the chance that a subject does not have the
disease, given that the subject has a negative test result.
We calculate these conditional probabilities using the following:
1. The sensitivity of the test = $P(T \mid D)$,
2. The specificity of the test = $P(\bar{T} \mid \bar{D})$, and
3. The probability of the relevant disease in the general population, P(D). It is
usually obtained from another independent study or source.
We know that
$$P(D \mid T) = \frac{P(T \cap D)}{P(T)}.$$
Using the rules of probability defined earlier, it can be shown that
$$P(D \mid T) = \frac{P(T \mid D)P(D)}{P(T \mid D)P(D) + P(T \mid \bar{D})P(\bar{D})}.$$
It is noteworthy that
$P(T \mid D)$ = sensitivity,
$P(T \mid \bar{D}) = 1 - P(\bar{T} \mid \bar{D}) = 1 - \text{specificity}$,
$P(D)$ = the probability of the relevant disease in the general population,
$P(\bar{D}) = 1 - P(D)$.
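Similarly, the predictive value negative is
$$P(\bar{D} \mid \bar{T}) = \frac{P(\bar{T} \mid \bar{D})P(\bar{D})}{P(\bar{T} \mid \bar{D})P(\bar{D}) + P(\bar{T} \mid D)P(D)}.$$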
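Bayes' formula for the two predictive values can be written as a small function of sensitivity, specificity, and prevalence. The sketch below is illustrative only: the function name `predictive_values` is an assumption, and the inputs in the usage line are hypothetical values chosen to give round numbers, not data from the text.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (predictive value positive, predictive value negative) by Bayes' theorem."""
    # Total probability of a positive and of a negative test result.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
    ppv = sensitivity * prevalence / p_pos        # P(D | T)
    npv = specificity * (1 - prevalence) / p_neg  # P(not D | not T)
    return ppv, npv

# Hypothetical inputs: 90% sensitivity, 90% specificity, 10% prevalence.
ppv, npv = predictive_values(sensitivity=0.9, specificity=0.9, prevalence=0.1)
print(round(ppv, 3), round(npv, 3))
```

Even with a fairly accurate test, a low prevalence pulls the predictive value positive down, because most positive results then come from the large disease-free group.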
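Solution
Using these data, we estimate the following probabilities:
1. The sensitivity of the test
$$P(T \mid D) = \frac{n(T \cap D)}{n(D)} = \frac{950}{1000} = 0.95.$$
2. The specificity of the test
$$P(\bar{T} \mid \bar{D}) = \frac{n(\bar{T} \cap \bar{D})}{n(\bar{D})} = \frac{980}{1000} = 0.98.$$
3. The probability of the disease in the general population
$$P(D) = \frac{8\%}{100\%} = 0.08.$$
4. The predictive value positive of the test
Using Bayes’ formula, we obtain
$$P(D \mid T) = \frac{P(T \mid D)P(D)}{P(T \mid D)P(D) + P(T \mid \bar{D})P(\bar{D})}.$$
We know that
$$P(T \mid D) = \frac{950}{1000} = 0.95, \quad \text{and}$$
$$P(T \mid \bar{D}) = \frac{n(T \cap \bar{D})}{n(\bar{D})} = \frac{20}{1000} = 0.02.$$
Therefore,
$$P(D \mid T) = \frac{(0.95)P(D)}{(0.95)P(D) + (0.02)P(\bar{D})} = \frac{(0.95)(0.08)}{(0.95)(0.08) + (0.02)(1 - 0.08)} = 0.81.$$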
As we see, in this case, the predictive value positive of the test is moderately
high.
5. The predictive value negative of the test
We wish to estimate the probability that a subject who is negative on the test
does not have the disease. Using Bayes’ formula, we obtain
$$P(\bar{D} \mid \bar{T}) = \frac{P(\bar{T} \mid \bar{D})P(\bar{D})}{P(\bar{T} \mid \bar{D})P(\bar{D}) + P(\bar{T} \mid D)P(D)}.$$
We know that
$$P(\bar{T} \mid \bar{D}) = \frac{980}{1000} = 0.98,$$
$$P(\bar{D}) = 1 - P(D) = 1 - 0.08 = 0.92,$$
$$P(\bar{T} \mid D) = \frac{n(\bar{T} \cap D)}{n(D)} = \frac{50}{1000} = 0.05.$$
Therefore,
$$P(\bar{D} \mid \bar{T}) = \frac{(0.98)(0.92)}{(0.98)(0.92) + (0.05)(0.08)} = 0.996.$$
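The two predictive values can be cross-checked by counting instead of applying Bayes' formula. The sketch below imagines a hypothetical population of 100,000 people with the 8% prevalence, 95% sensitivity, and 98% specificity used in this solution; the population size and variable names are assumptions made for illustration.

```python
# Hypothetical population split by the rates used in the solution.
population = 100_000
diseased = population * 8 // 100   # 8% prevalence
healthy = population - diseased

true_pos = diseased * 95 // 100    # sensitivity 0.95
false_neg = diseased - true_pos
false_pos = healthy * 2 // 100     # 1 - specificity = 0.02
true_neg = healthy - false_pos

# Predictive values as simple proportions among test positives / negatives.
ppv = true_pos / (true_pos + false_pos)
npv = true_neg / (true_neg + false_neg)

print(true_pos, false_pos, true_neg, false_neg)
print(round(ppv, 3), round(npv, 3))
```

Counting people who test positive and asking what share of them are diseased gives the same answer as Bayes' formula, which can make the result easier to explain to a non-technical audience.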
3.7 Summary
This chapter introduced the basic concepts of probability. As mentioned in Chap. 1,
probability plays a vital role in linking measures based on samples to the
corresponding population values, and random sampling provides the foundation for
that link. The definitions of probability, experiment, sample space, event, and
equally likely and mutually exclusive outcomes were presented. The important
operations on events were discussed with examples, as were marginal and conditional
probabilities. The multiplication rules of probability and the concept of
independent events were introduced. Applications of conditional probability and
Bayes’ theorem to epidemiological data were shown, and the sensitivity and
specificity of tests were illustrated with examples.