Lecture 9
She could either have a positive test and have breast cancer or have a positive test but not
have cancer (false positive).
Review: Breast cancer screening example
We use Bayes’s rule, where A1 is the event that the person has cancer,
A2 is the event that the person does not have cancer, and B is a
positive test result.
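The rule itself is not written out in the text above. In the notation just defined, and assuming A1 and A2 partition the population (every person either has cancer or does not), the standard form of Bayes's rule would read:

```latex
P(A_1 \mid B)
  = \frac{P(B \mid A_1)\,P(A_1)}
         {P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2)}
```

The numerator is the "positive test and cancer" branch; the denominator adds the false-positive branch, so it equals P(B).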
Review: Breast cancer screening example
• We determined that the probability that a person has cancer given
that they tested positive is only about 0.3% (or 3 in 1000). This seems
very low!
• How do we rationalize this low probability?
• Let's step back a little. Before any screening (testing) was done, there
was only a 4-in-10,000 chance that a randomly selected person in this
population would have cancer.
• The test did its job: a positive result raises the probability of
cancer from the unconditional 0.04% in the population to about 0.3%,
roughly a sevenfold increase.
Plan for today
• Formal definition of independent events
• Introduction to random variables
• Excel: Normal quantile plot
Definition of independent events
• Recall: A and B are independent when they have no
influence on each other’s occurrence.
• The formal definition of independence is:
• Definition: Two events A and B that both have positive
probability are independent iff P(B|A) = P(B).
• Notice that the multiplication rule for independent events is
a special case of the general multiplication rule. P(A and B) =
P(A)P(B|A) = P(A)P(B).
Example of non-independent events
Let’s use a deck of cards to illustrate independence and dependence.
Suppose that two cards are drawn from a standard deck of playing
cards without replacement. What is the probability that both cards
are spades?
Let A = first card is a spade and let B = next card is a spade.
We want P(A and B).
The multiplication rule tells us that P(A and B) = P(A)*P(B|A).
In this case, we have P(A and B) = (13/52)*(12/51) = 0.0588.
Example of independent events
Now, suppose that two cards are drawn from a standard deck of
playing cards with replacement. What is the probability that both cards
are spades?
Let A = first card is a spade and let B = next card is a spade.
We want P(A and B).
The multiplication rule tells us that P(A and B) = P(A)*P(B|A).
In this case, P(B|A) = P(B) = 13/52, so P(A and B) = (13/52)*(13/52) = 0.0625.
You see that the probability of the second card being a spade is not
affected by the first draw. This demonstrates the independence of the
draws when the cards are sampled with replacement.
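As a check on the two card examples, the following sketch enumerates every equally likely ordered pair of draws and compares P(B|A) with P(B) under each sampling scheme. The helper name `spade_probs` and the suit-only representation of the deck are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from itertools import permutations, product

# A 52-card deck represented only by suit: 13 cards in each of 4 suits.
deck = [suit for suit in "SHDC" for _ in range(13)]

def spade_probs(draws):
    """Return (P(A), P(B), P(A and B)) over equally likely ordered pairs.

    A = first card is a spade, B = second card is a spade.
    """
    draws = list(draws)
    n = len(draws)
    p_a = Fraction(sum(first == "S" for first, _ in draws), n)
    p_b = Fraction(sum(second == "S" for _, second in draws), n)
    p_ab = Fraction(sum(first == "S" and second == "S"
                        for first, second in draws), n)
    return p_a, p_b, p_ab

# Without replacement: 52 * 51 ordered pairs of distinct cards.
without = spade_probs(permutations(deck, 2))
# With replacement: 52 * 52 ordered pairs.
with_repl = spade_probs(product(deck, repeat=2))

for label, (p_a, p_b, p_ab) in [("without replacement", without),
                                ("with replacement", with_repl)]:
    p_b_given_a = p_ab / p_a
    print(label, "P(A and B) =", p_ab, "P(B|A) =", p_b_given_a, "P(B) =", p_b,
          "independent:", p_b_given_a == p_b)
```

Without replacement, P(B|A) = 12/51 differs from P(B) = 13/52, so the draws are dependent; with replacement the two agree.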
Example 1: Probability Review
Suppose that a medical test has a 92% chance of detecting a disease if
the person has it and a 94% chance of correctly indicating that the
person does not have the disease when they actually don’t.
Suppose that it is known that 10% of the population have the disease.
What is the probability that a randomly selected person will test
positive?
Tree diagram (positive branches only):
Population
  Have disease (0.1) -> Positive (0.92): P(Have disease and Positive) = 0.1 * 0.92 = 0.092
  Don’t have disease (0.9) -> Positive (0.06): P(Don’t have disease and Positive) = 0.9 * 0.06 = 0.054
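The two "Positive" branches of the tree are disjoint, so their probabilities can be added to answer the question. A minimal sketch using the figures from Example 1 follows. The variable names, including `sensitivity` and `specificity`, are labels introduced here, and the last step applies Bayes's rule as an extra that the example itself does not ask for.

```python
# Figures stated in Example 1.
p_disease = 0.10       # P(Have disease)
sensitivity = 0.92     # P(Positive | Have disease)
specificity = 0.94     # P(Negative | Don't have disease)

# Joint probabilities along the two "Positive" branches of the tree.
p_pos_and_disease = p_disease * sensitivity               # 0.1 * 0.92
p_pos_and_healthy = (1 - p_disease) * (1 - specificity)   # 0.9 * 0.06

# Total probability of a positive test: sum of the disjoint branches.
p_positive = p_pos_and_disease + p_pos_and_healthy

# Bayes's rule: probability of disease given a positive test.
p_disease_given_positive = p_pos_and_disease / p_positive

print(f"P(Positive) = {p_positive:.3f}")                              # 0.146
print(f"P(Have disease | Positive) = {p_disease_given_positive:.3f}")  # 0.630
```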
The induced probability function
• The random variable 𝑋 defines a new sample space that is
situated in the real numbers
• We use our probability function 𝑃, defined on the original
sample space 𝑆, to obtain an induced probability function $P_X$ for
the random variable:
• $P_X(X = x_i) = P(\{s_j \in S : X(s_j) = x_i\})$
• The above just says that the probability of getting $x_i$ equals the
total probability of all the original outcomes that get mapped to $x_i$
• We call the new probability function the probability distribution
of the random variable
Example of induced probability function
A basketball player shoots three free throws. The random
variable X represents the number of baskets successfully made.
The tree lists all the possible outcomes in terms of
hits and misses. The sample space has 8 elements.
Outcomes grouped by the value of X (H = success, M = failure):

Value of X        0      1                2                3
Outcomes          MMM    HMM, MHM, MMH    HHM, HMH, MHH    HHH
Probability P_X   1/8    3/8              3/8              1/8
What is the probability that the player successfully makes fewer than three baskets?
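The induced distribution for this example can be rebuilt by summing the probabilities of the original outcomes that map to each value of X. This is a sketch that assumes, as the probabilities listed for X imply, that all 8 outcomes are equally likely. The names `sample_space`, `P`, and `P_X` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Sample space: every sequence of hit (H) or miss (M) over three shots.
sample_space = ["".join(shots) for shots in product("HM", repeat=3)]

# Assumption: each of the 8 outcomes has probability 1/8.
P = {s: Fraction(1, 8) for s in sample_space}

def X(outcome):
    """Random variable: number of baskets made in the outcome."""
    return outcome.count("H")

# Induced distribution: P_X(x) is the total probability of all
# outcomes s with X(s) = x.
P_X = defaultdict(Fraction)
for s, p in P.items():
    P_X[X(s)] += p

# Fewer than three baskets means X is 0, 1, or 2.
p_fewer_than_three = sum(p for x, p in P_X.items() if x < 3)

print(dict(sorted(P_X.items())))
print("P(X < 3) =", p_fewer_than_three)   # 7/8
```

The same answer follows from the complement: P(X < 3) = 1 − P(X = 3) = 1 − 1/8 = 7/8.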
Assigning probabilities: intervals of outcomes
• The probability distribution for a continuous random variable is represented by
a density curve
• For a curve to represent the probability distribution of a
continuous random variable (a probability density function, or pdf),
the total area beneath the curve must equal 1.
[Figure: density curve of a normal distribution]
Probability of an event with a continuous
random variable
• An event for a continuous random variable is an interval or a union of
intervals
• The probability of any event is given by the area under the density
curve for the interval of values of 𝑋 that make up the event
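To make "probability as area" concrete, the sketch below approximates the area under a density curve with the trapezoid rule and compares it with the exact value from the cumulative distribution function. The standard normal density and the interval from −1 to 1 are illustrative choices made here, and `area_under` is a hypothetical helper. `statistics.NormalDist` is in the Python standard library (3.8+).

```python
from statistics import NormalDist

def area_under(pdf, x1, x2, steps=10_000):
    """Approximate the area under pdf between x1 and x2 (trapezoid rule)."""
    h = (x2 - x1) / steps
    inner = sum(pdf(x1 + i * h) for i in range(1, steps))
    return h * (0.5 * pdf(x1) + inner + 0.5 * pdf(x2))

Z = NormalDist(mu=0, sigma=1)   # illustrative choice of density

# Event: -1 <= X <= 1. Its probability is the area over that interval.
p_event = area_under(Z.pdf, -1, 1)
p_exact = Z.cdf(1) - Z.cdf(-1)

# The area under (essentially) the entire curve should be 1.
total_area = area_under(Z.pdf, -10, 10)

print(f"area over [-1, 1] = {p_event:.4f}, from the cdf = {p_exact:.4f}")
print(f"total area = {total_area:.4f}")
```

For a union of disjoint intervals, the areas over the separate intervals are added.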
Example of continuous distribution
• Consider the experiment of asking a random number generator
in software to produce a random number between 0 and 1
• The sample space is S = {all numbers between 0 and 1}
• All possible outcomes are equally likely. The results of many
trials are represented by the density curve of the uniform
distribution.
The total area under the density curve always equals 1.
Example of continuous distribution -
continued
Probabilities are computed as areas.
P(0.3 ≤ X ≤ 0.7) = 0.4
P(X < 0.5 or X > 0.8) = 0.5 + 0.2 = 0.7
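Example of continuous distribution -
continued
Formula: for X uniform on the interval from a to b,
P(X ≤ k) = (k − a) / (b − a) for a ≤ k ≤ b.
Here a = 0 and b = 1, so P(X ≤ k) = k. A single point has probability 0, so P(X = k) = 0.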
P(X < 0.5 or X > 0.8) = P(X < 0.5) + P(X > 0.8) = 1 − P(0.5 < X < 0.8) = 0.7
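The uniform-distribution probabilities above can be reproduced as interval lengths. In this sketch, `uniform_prob` is a hypothetical helper that returns the length of the interval's overlap with [a, b] divided by the total length b − a.

```python
def uniform_prob(lo, hi, a=0.0, b=1.0):
    """P(lo <= X <= hi) for X uniform on [a, b]."""
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)   # overlap length / total length

# P(0.3 <= X <= 0.7)
p_middle = uniform_prob(0.3, 0.7)

# P(X < 0.5 or X > 0.8): add the two disjoint intervals ...
p_tails = uniform_prob(0.0, 0.5) + uniform_prob(0.8, 1.0)
# ... or take the complement of the gap between them.
p_tails_complement = 1 - uniform_prob(0.5, 0.8)

print(f"{p_middle:.1f} {p_tails:.1f} {p_tails_complement:.1f}")   # 0.4 0.7 0.7
```

Because a single point has zero area under a density curve, strict and non-strict inequalities give the same probabilities here.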
Continuous random variable and population
distribution
The shaded area under a density
curve shows the proportion, or %, of
individuals in a population with
values of X between x1 and x2.