Understanding Conditional Probability

This article offers a new approach to teaching the difficult concept of conditional probability using Venn diagrams and tree diagrams. It discusses how people's natural reasoning can lead them to make errors in probability judgements and to ignore relevant information. The approach presented uses visual representations to help students understand conditional relationships and solve problems without memorising complex formulas.

Stephen Tomlinson
University of Alabama, USA

Robert Quinn
University of Nevada, USA

KEYWORDS: Teaching; Reasoning; Venn diagram; Tree diagram

Summary
This article offers a new approach to teaching
the difficult concept of conditional probability.

◆INTRODUCTION◆

The fundamental aim of education is the development of understanding: the mastery of concepts and procedures that permit the intelligent solution of problems in new and novel situations. In mathematics this goal is particularly challenging, for students are often adept at memorising the rules and formulas necessary to solve well-defined test and textbook problems, without ever coming to terms with the meaning or logic of the arguments they employ. As research shows, outside of the classroom even the most familiar operations can be ignored in favour of intuitive judgements (Gardner, 1991; Piattelli-Palmarini, 1994). But while commonsense may provide serviceable solutions to many problems, experience also reveals that our natural theories can lead us into serious errors. To improve their intellectual skills, students must learn to recognise the limits of their spontaneous problem-solving strategies and develop secondary intuitions for the more precise and powerful reasoning tools developed by mathematicians (Fischbein, 1987).

◆PROBABILITY AND JUDGEMENT◆

One of the most difficult areas in which to achieve these goals is probability theory. On one hand, the laws of probability contain many abstract expressions, complex terms, and nested relationships that pupils find hard to understand. On the other hand, psychologists, building upon the pioneering work of Amos Tversky and Daniel Kahneman, have shown that all human beings approach experience with expectations and stereotypes that exert a strong and oftentimes distorting grip over judgement (Tversky and Kahneman, 1974). Consider, for example, how the mind imposes a script on the following problem: If Linda is a thirty-one year old single woman who is outspoken on social issues such as disarmament and equal rights, which of the following statements is more likely to be true?

P: Linda is a bank teller.
Q: Linda is a bank teller and active in the feminist movement.

According to Kahneman and Tversky more than 80% of those questioned, including many schooled in statistics, chose Q, even though the set of bank tellers who are feminists is included within the set of bank tellers, and is thus a smaller proportion of the population (i.e. since Q ⊆ P then p(Q) ≤ p(P)) (Kahneman, Slovic and Tversky, 1982). We posed the same question to 18 undergraduate mathematics education majors and found that only two judged P the more probable event!

The distorting influence of our natural reasoning skills can also be seen in the way people tend to systematically ignore relevant knowledge in the decision-making processes. For example, to test how subjects integrated “base-rates” into their judgements, Tversky and Kahneman presented two groups with 100 character profiles. Group A was told that their dossiers consisted of 70 lawyers and 30 engineers, while group B was informed there were 70 engineers and 30 lawyers. Yet, despite these different populations, each group divided their reports in roughly the same proportion, even though group A should have identified more than
twice as many lawyers as group B! Interestingly, when given a stack of neutral personality sketches, such as:

“Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.”

both groups split the lawyers and engineers fifty-fifty, not seventy-thirty or thirty-seventy as would be expected (Tversky and Kahneman, 1974).

According to Howard Gardner the cognitive heuristics that naturally guide reasoning in such cases are rather like optical illusions: pathways in thought that lead to conclusions which, upon reflection, can be recognised as erroneous. And yet, despite their intuitive necessity, Gardner believes such judgements can be overcome through meta-cognition, the conscious use of mathematical tools. Just as we can employ the rules of logic to construct valid arguments, so, he argues, a knowledge of probability can help us formulate sound inferences. The problem is that the ability to employ such secondary reasoning is not fostered by traditional teaching practices resting upon the memorisation of formulas. In recent years educators interested in stochastics have responded to this challenge by constructing numerous pedagogical strategies that help students recognise the limits of their subjective theories and learn how to explore probabilistic relationships objectively. For instance, many papers included in the proceedings of the first and second International Conferences on Teaching Statistics (ICOTS) stress the need for pupils to conduct experiments, play games, employ computer simulations, and use graphical representations in order to gain an empirically grounded measure of events (Grey, Holmes, Barnett and Constable, 1983; Davidson and Swift, 1988). Such concrete activities are necessary to generate what might be called a “Socratic encounter” with the laws of probability; they provide a level of personal investment in the resolution of problematic situations that permits serious questioning of judgement and an openness to the consideration of mathematical laws. However, if students are to understand and employ normative methods of problem-solving to enhance their decision-making, these practical explorations must be augmented with the theoretical modelling of events, for, whether frequentist or classical, it is only through an intuition of mathematical arguments that intelligent thought and rational judgement can be promoted.

Yet, as research reveals, students find the notion of independent events and the laws of conditional probability extremely difficult to understand (Shaughnessy, 1992). In the following discussion we offer a new approach to these concepts that can help students develop a working understanding of probability laws. By representing events pictorially, we show how the Venn diagram and the Tree diagram can be integrated into a single problem-solving instrument that provides a graphical mechanism for constructing conditional probabilities and answering Bayesian questions without having students memorise complex formulas. Since conditional relationships and Bayes’ Theorem lie at the heart of the scientific method, and indeed permeate everyday experience, the development of such reasoning skills is vital for informed decision-making in many situations.

◆VENN DIAGRAMS◆

Consider the following three events based upon a single roll of a fair die:

X: An even number
Y: A number greater than two
Z: A prime number

While Venn diagrams are helpful in sorting out which outcomes in an experiment are common to different events, they do not clearly reveal the underlying relationships of dependence and independence that are essential to the solution of conditional probability problems. Events may be logically or contingently connected to one another, in such a way that the occurrence of one increases or decreases the probability of the other. Thus, while the chance of rain is independent of our wishes, the number of people carrying umbrellas to work depends upon the weather forecast. In the case of the events X, Y and Z, however, such relationships are not at all obvious. Does knowledge that a prime has been rolled increase, decrease, or not affect the probability of the outcome being an even number? Further, given such conditions, how can the probability of getting both a prime and an even number be determined? As the story about Linda demonstrates, questions like these, which are central to many probability problems, can only be answered when the implicit relationships between events are fully understood.

As Fig. 1 illustrates, it is possible to find the probability of combined events like “A and B”, hereafter p(AB), by exhaustively arranging all the simple events within the appropriate regions of the Venn diagram. But, importantly, this value does not
Figure 1 Intersection Tables for sets X, Y and Z
X: an even number, Y: a number greater than 2, Z: a prime number
(In the original figure each cell is a Venn diagram of the outcomes 1 to 6 with the corresponding region shaded.)

XY            | p(Y) = 2/3     | p(Ȳ) = 1/3
p(X) = 1/2    | p(XY) = 1/3    | p(XȲ) = 1/6
p(X̄) = 1/2    | p(X̄Y) = 1/3    | p(X̄Ȳ) = 1/6

YZ            | p(Z) = 1/2     | p(Z̄) = 1/2
p(Y) = 2/3    | p(YZ) = 1/3    | p(YZ̄) = 1/3
p(Ȳ) = 1/3    | p(ȲZ) = 1/6    | p(ȲZ̄) = 1/6

XZ            | p(Z) = 1/2     | p(Z̄) = 1/2
p(X) = 1/2    | p(XZ) = 1/6    | p(XZ̄) = 1/3
p(X̄) = 1/2    | p(X̄Z) = 1/3    | p(X̄Z̄) = 1/6
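As an illustrative check on Fig. 1, the entries of the three intersection tables can be recomputed by listing the six outcomes of the die and counting. The sketch below is not part of the original article; the helper names `p`, `intersection_table` and `independent` are hypothetical, and it assumes a fair die so that every outcome has probability 1/6.

```python
from fractions import Fraction
from itertools import combinations

# One roll of a fair die: six equally likely simple events (assumption: fair die).
OUTCOMES = frozenset(range(1, 7))

# The three events defined in the text.
EVENTS = {
    "X": frozenset({2, 4, 6}),      # an even number
    "Y": frozenset({3, 4, 5, 6}),   # a number greater than two
    "Z": frozenset({2, 3, 5}),      # a prime number
}


def p(event):
    """Probability of an event as (outcomes in the event) / (all outcomes)."""
    return Fraction(len(event), len(OUTCOMES))


def intersection_table(a, b):
    """Return the four joint probabilities for events a and b and their complements."""
    not_a, not_b = OUTCOMES - a, OUTCOMES - b
    return {
        ("A", "B"): p(a & b),
        ("A", "not B"): p(a & not_b),
        ("not A", "B"): p(not_a & b),
        ("not A", "not B"): p(not_a & not_b),
    }


def independent(a, b):
    """Two events are independent exactly when p(AB) = p(A)p(B)."""
    return p(a & b) == p(a) * p(b)


if __name__ == "__main__":
    for name_a, name_b in combinations(EVENTS, 2):
        a, b = EVENTS[name_a], EVENTS[name_b]
        print(f"p({name_a}{name_b}) = {p(a & b)}, "
              f"p({name_a})p({name_b}) = {p(a) * p(b)}, "
              f"independent: {independent(a, b)}")
```

Exact fractions are used rather than floating-point numbers so that the comparison p(AB) = p(A)p(B) is an exact test and not a rounding-sensitive one.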
[Figure 2: the unit square partitioned into event regions. Set A is represented by the triangle and set B by the upper rectangle.]
p(A) = 1/2, p(B) = 1/2, p(AB) = 1/8
p(B|A) = p(AB)/p(A) = (1/8)/(1/2) = 1/4
Figure 2 Conditional Probability Laws

always equal the product p(A)p(B), as the symbols “AB” or the term “and” might suggest. For example, while the “intersection tables” for XY and YZ show that p(XY) = p(X)p(Y) and p(YZ) = p(Y)p(Z), the table for XZ reveals that p(XZ) ≠ p(X)p(Z). This is because X and Z are dependent events, influencing one another’s probability, while events X and Y, and events Y and Z, are both independent.

As teachers will recognise, one of the most common assumptions students make is that the probability of a combined event can be calculated by multiplying its constituent probabilities. While this empirical “product rule” holds for independent events, such as Y and Z, students must come to appreciate how this formula breaks down with dependent relationships. Having pupils postulate and then compute the values p(X), p(XY), p(XȲ), and so on, for every entry in these tables will lead them to face this fact squarely and, as we have argued, provide a powerful stimulus for motivating the mathematical exploration of these events.

Clearly, having students memorise and apply the formula p(AB) = p(A)p(B|A) in textbook exercises will not ensure their grasp of probability laws. Because the expression p(B|A), the probability of B given A, is a comparison of the simple events in set A and the simple events in its subset AB, the underlying relationships between events must be understood through the subset relation. Indeed, all probabilities are conditional, and are computed from the same basic definition:

p(Event) = p(Solution set)/p(Outcome Set)

Even event Z, rolling a prime number, depends upon the die being fair. Thus,

$$p(Z) = p(Z \mid \text{Fair die}) = p(Z \mid \text{Universal Set}) = \frac{\text{the sum of the simple events in set } Z}{\text{the sum of the probabilities in set } U} = \frac{3/6}{6/6} = \frac{1}{2}$$

This argument can be presented pictorially by making the area of each region within the Venn diagram correspond to the probability of the event it represents. Thus, the universal set has an area of 1, and, in the example above, set Z has an area of 1/2.

In general, for any two events A and B, if p(A) ≠ 0

$$p(B \mid A) = \frac{p(AB)}{p(A)} = \frac{\text{Area of region } AB}{\text{Area of region } A}$$

Thus, from figure 1
$$p(Z \mid X) = \frac{p(ZX)}{p(X)} = \frac{1/6}{1/2} = \frac{1}{3}$$

But how are values like p(ZX) determined without exhaustively enumerating all the simple events in a Venn diagram? Here we must turn to the second of our mathematical tools, the Tree diagram.

◆TREE DIAGRAMS◆

The Tree diagram is most commonly used to determine the probabilities of events in a multistage experiment: for example, the probability of getting two heads when a coin is tossed three times, or the probability of randomly selecting two green marbles from a jar that contains one blue, three green, and two white ones. In such cases, the dependence or independence of events is clearly defined by the causal sequence of the experiment. It is easy to see that the result of tossing one coin does not influence how the next one will land, but that removing a blue marble from the jar does alter the probability of future selections. The difficulty comes with events such as X, Y and Z, where conditional relationships must be determined by looking at the nest of logical connections defined by the structure of their subsets. We therefore propose that Venn diagrams be integrated with the Tree diagram by defining the various branches to represent an ordered proper subset relation on exclusive events. (When read in the opposite direction, the Venn diagrams of each stage thus become the union of sets at the previous level.) The subsets of A are AB and AB̄, while, conversely, AB ∪ AB̄ = A. The basic atom of this scheme, defining the relationships between p(A), p(AB) and p(B|A), is shown in Fig. 2. Further, as this diagram illustrates, by partitioning the unit square into “event regions”, area can be used to represent the probability values of all simple and combined events.

◆BAYES’ THEOREM◆

An important class of questions turns upon the computation of conditional probabilities that are implicit in an experiment or a set of data. For example, one might ask “What is p(Green on first draw | Blue on second)?” Like asking for p(A|B) in Fig. 2, this problem inverts the direction of the tree diagram; if the branches lead from A to B, how can a probability be computed that depends on a knowledge of B prior to A? As Falk shows, when faced with such “diagnostic” questions (so named because Bayesian relationships often arise in medical testing), most students have little or no idea of how non-causal connections can be computed; some even believe that the problems themselves are nonsensical (Falk, 1994). The traditional approach to such questions is to employ Bayes’ Theorem, which in its simplest form states

$$p(A \mid B) = \frac{p(A)\,p(B \mid A)}{p(A)\,p(B \mid A) + p(\bar{A})\,p(B \mid \bar{A})}$$

However useful this formula may be, it clearly does not provide the student with an intuition of the reasoning process necessary to solve such embedded problems. Yet the expression p(A|B) depends upon a subset relation similar to those in the previous conditional probability problems. That is,

$$p(A \mid B) = \frac{p(AB)}{p(B)} = \frac{\text{Area of region } AB}{\text{Area of region } B}$$

The difficulty, of course, is that there is no Venn diagram for set B in a tree diagram that starts with event A. If there were, p(A|B) could be read off as easily as p(B|A). But, as we have argued, since the tree diagram is composed of the mutually exclusive events whose union is the Universal set, it does contain the building blocks out of which new Venn diagrams can be constructed. In short, set B, and set B̄, can be produced very simply by extending the tree using the operation of set union. The result, as the diagram in Fig. 3 shows for X and Z, is a two-directional tree that contains all the events necessary to solve any conditional probability problem. Of course, as students become familiar with this method the Venn diagrams can be dropped from the tree and replaced with probability values (see Fig. 4). Consider the following problem originally presented by Tversky and Kahneman which has become something of a classic in the literature on conditional judgement (Tversky and Kahneman, 1974; Scholtz, 1987).

An old man witnesses a hit and run accident and reports that the car involved was a blue cab (statement “b”). There are two taxi companies in the town, the Blue (B) with 15 cars and the Green (G) with 85. At the trial the man’s vision is tested
[Figure 3: two-directional tree for events X and Z. Each node is a Venn diagram of the die outcomes 1 to 6 with the relevant region shaded. The branches carry the probabilities 1/2, 1/3 and 2/3, and the joint regions are labelled with the values 1/6 and 1/3.]
Figure 3 The Tree Diagram for X and Z
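The two-directional tree of Fig. 3 can be sketched in a few lines: multiply along the branches to obtain the four joint regions, take unions to rebuild the second event, and read the inverse conditional off as a ratio. This is a minimal illustration rather than the authors' own procedure; the function name `extended_tree` is hypothetical, the branch value p(Z|X̄) = 2/3 is derived from the XZ table of Fig. 1, and the code assumes 0 < p(B) < 1 so that both divisions are defined.

```python
from fractions import Fraction


def extended_tree(p_a, p_b_given_a, p_b_given_not_a):
    """Complete a two-directional tree for events A and B.

    Forward stage: multiply along each branch to get the four joint regions.
    Backward stage: take unions of joint regions to rebuild B and not-B,
    then read the inverse conditionals off as ratios of areas.
    Assumes 0 < p(B) < 1, otherwise a division by zero occurs.
    """
    p_not_a = 1 - p_a
    joint = {
        ("A", "B"): p_a * p_b_given_a,
        ("A", "not B"): p_a * (1 - p_b_given_a),
        ("not A", "B"): p_not_a * p_b_given_not_a,
        ("not A", "not B"): p_not_a * (1 - p_b_given_not_a),
    }
    # Union of mutually exclusive regions: probabilities simply add.
    p_b = joint[("A", "B")] + joint[("not A", "B")]
    p_not_b = joint[("A", "not B")] + joint[("not A", "not B")]
    inverse = {
        "p(A|B)": joint[("A", "B")] / p_b,
        "p(A|not B)": joint[("A", "not B")] / p_not_b,
    }
    return joint, p_b, inverse


if __name__ == "__main__":
    # A = X (even number), B = Z (prime number) on one roll of a fair die.
    joint, p_z, inverse = extended_tree(
        p_a=Fraction(1, 2),              # p(X)
        p_b_given_a=Fraction(1, 3),      # p(Z|X)
        p_b_given_not_a=Fraction(2, 3),  # p(Z|not X), derived from Fig. 1
    )
    print("joint regions:", joint)
    print("p(Z) =", p_z)
    print("p(X|Z) =", inverse["p(A|B)"])
```

The final ratio is Bayes’ Theorem in the form given above, since the denominator p(B) is exactly the sum p(A)p(B|A) + p(Ā)p(B|Ā) built by the union step.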

Figure 4 The Hit and Run Affair
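The hit and run problem can also be explored empirically, in the spirit of the simulated trials that Shaughnessy advocates. The following sketch is an assumption-laden illustration, not the authors' classroom activity: the function name `simulate_hit_and_run` and its parameters are hypothetical, each of the 100 cabs is assumed equally likely to be the one involved, and the witness is assumed to name the correct colour 80% of the time whichever colour the cab is.

```python
import random


def simulate_hit_and_run(trials=200_000, blue_cabs=15, green_cabs=85,
                         witness_accuracy=0.8, seed=0):
    """Estimate p(B|b): among reports of 'blue', the share where the cab was blue."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same estimate
    p_blue = blue_cabs / (blue_cabs + green_cabs)
    said_blue = 0
    said_blue_and_was_blue = 0
    for _ in range(trials):
        cab_is_blue = rng.random() < p_blue
        witness_correct = rng.random() < witness_accuracy
        # A correct witness names the true colour; an incorrect one names the other.
        witness_says_blue = cab_is_blue if witness_correct else not cab_is_blue
        if witness_says_blue:
            said_blue += 1
            said_blue_and_was_blue += cab_is_blue
    return said_blue_and_was_blue / said_blue


if __name__ == "__main__":
    # Exact value from the tree: p(Bb) / (p(Bb) + p(Gb)).
    exact = (0.15 * 0.8) / (0.15 * 0.8 + 0.85 * 0.2)
    print(f"simulated p(B|b) = {simulate_hit_and_run():.3f}")
    print(f"exact     p(B|b) = {exact:.3f}")
```

Counting only the trials in which the witness says “blue” mirrors the union step of the extended tree: the denominator collects both the blue cabs correctly identified and the green cabs mistaken for blue.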


and he is found to be capable of identifying the correct colour cab 80% of the time (i.e. p(b|B) = p(g|G) = 0.8).

Given this information Tversky and Kahneman then pose the following Bayesian question:

Q: What is the probability that a blue car was involved in the crime?

According to their report most subjects guess that the value of Q is over 1/2. In our own trials we found that undergraduate mathematics education majors were also inclined to “accept the old man’s judgement as reliable”, and assume the blue car was therefore most likely the one involved. Yet, however compelling this argument may seem, a jury which adopted the same reasoning would probably send the wrong driver to jail! For, following our argument, the probability that the blue car was involved, given the old man’s claim, is

p(B|b) = p(Bb) / p(b)

As Fig. 4 shows,

p(b) = p(Bb) + p(Gb) = 0.12 + 0.17 = 0.29
and therefore p(B|b) = 0.12 / 0.29 = 0.41

While Shaughnessy is surely correct to advocate simulating these frequencies in a number of trials, the simple method we propose for computing conditional probabilities appears, from our experience, to be well within the grasp of most competent students. Indeed, recognising that theory ought to follow practice, we even found that students could derive Bayes’ Theorem for themselves through the simple exercise of completing the extended tree diagram for events A and B.

◆CONCLUSION◆

Conditional probability is a difficult topic for students to master. Often counter-intuitive, its central laws are composed of abstract terms and complex equations that do not immediately mesh with subjective intuitions of experience. If students are to acquire the mathematical skills necessary for rational judgement, teaching must focus on challenging the personal biases and cognitive heuristics identified by psychologists, and demonstrate, in the most accessible way, the power of probabilistic reasoning. In our discussion we suggest one graphic tool that can help in this endeavour. While the model we propose may not fit all conditional probability problems, we believe that, combined with the kind of empirical explorations discussed by Shaughnessy, it can add to the teacher’s armoury of methods for helping students come to grips with dependent and independent events, normative rules for combined events, and the logic of Bayesian questions.

References

Falk, R. (1994). Inference Under Uncertainty Via Conditional Probabilities. Studies of Mathematics Education: Vol 7. Teaching Statistics in Schools. Paris: UNESCO.
Fischbein, E. (1987). Intuition in Science and Mathematics. Dordrecht, The Netherlands: Reidel.
Gardner, H. (1991). The Unschooled Mind. New York: Basic Books.
Grey, D. R., Holmes, P., Barnett, V. and Constable, G. M. (Eds.) (1983). Proceedings of the First International Conference on Teaching Statistics. Sheffield: Teaching Statistics Trust.
Kahneman, D., Slovic, P. and Tversky, A. (1982). Judgement Under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Piattelli-Palmarini, M. (1994). Inevitable Illusions. New York: John Wiley & Sons.
Scholtz, R. (1987). Cognitive Strategies in Stochastic Thinking. Dordrecht, The Netherlands: Reidel.
Shaughnessy, J. M. (1992). Research in Probability and Statistics: Reflections and Directions. Handbook of Research on Teaching Mathematics and Learning, D. A. Grouws (Ed.). New York: Macmillan.
Tversky, A. and Kahneman, D. (1974). Judgement under Uncertainty: Heuristics and Biases. Science 185, 1124-1131.
