Introduction to Probability Theory
Axioms of Probability
2.1. Sample Space and Events
• We will have a sample space, denoted S (sometimes Ω, or U ) that consists of all possible outcomes
from an experiment.
Example 1:
∗ Experiment: Roll two dice.
∗ Sample space: S is the set of all ordered pairs made up of the numbers one through six,
S = {(i, j) : i, j = 1, . . . , 6}, which has 36 points.
Example 2:
∗ Experiment: Toss a coin twice
∗ S = {HH, HT, TH, TT}
Example 3:
∗ Experiment: Count the number of accidents a randomly chosen person has before they
turn 18.
· S = {0, 1, 2, . . . }
Others:
∗ Let S be the possible orders in which 5 horses finish in a horse race;
∗ Let S be the possible prices of some stock at closing time today, S = [0, ∞);
∗ The age at which someone dies, S = [0, ∞).
• Events: An event A is a subset of S. We use the notation A ⊂ S to mean that A is a
subset of S.
A ∪ B: the points in S that are in A OR B OR BOTH.
A ∩ B: the points in both A AND B (you may also see AB).
A^c is the complement of A: the points in S that are NOT in A (you may also see A′).
These extend to events A1, . . . , An: ∪_{i=1}^n Ai and ∩_{i=1}^n Ai.
• Example 1: Roll two dice.
Examples of events:
E = the two dice come up even and equal = {(2, 2), (4, 4), (6, 6)}.
F = the sum of the two dice is 8 = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}.
E ∪ F = {(2, 2), (2, 6), (3, 5), (4, 4), (5, 3), (6, 2), (6, 6)}.
E ∩ F = {(4, 4)}.
F^c is the set of the 31 other outcomes, those not in {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}.
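The set operations in this example can be checked by brute-force enumeration. The following is a minimal sketch; the variable names are illustrative and not part of the notes.

```python
from itertools import product

# Sample space for two dice: all ordered pairs (i, j) with i, j in 1..6.
S = set(product(range(1, 7), repeat=2))

# E: both dice even and equal; F: the sum is 8.
E = {(i, j) for (i, j) in S if i == j and i % 2 == 0}
F = {(i, j) for (i, j) in S if i + j == 8}

union = E | F          # E ∪ F
intersection = E & F   # E ∩ F
F_complement = S - F   # F^c, taken relative to S

print(len(S), sorted(union), intersection, len(F_complement))
```

Python's built-in `set` type maps directly onto the union, intersection, and complement notation used above.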
• Example 2: S = [0, ∞), the age at which someone dies.
Event A = the person dies before reaching 30.
∗ A = [0, 30).
Interpret A^c = [30, ∞):
∗ The person dies at age 30 or later.
B = (15, 45). Work out A ∪ B, A ∩ B, and so on.
• Properties: Events also obey commutative, associative, and distributive laws.
• What is A ∪ A^c? It is S.
• DeMorgan's Law:
(A ∪ B)^c = A^c ∩ B^c. Try to draw a picture.
(A ∩ B)^c = A^c ∪ B^c.
This works for general A1, . . . , An: (∪_{i=1}^n Ai)^c = ∩_{i=1}^n Ai^c and (∩_{i=1}^n Ai)^c = ∪_{i=1}^n Ai^c.
• The empty set ∅ = {} is the set that has nothing in it.
• A and B are disjoint if A ∩ B = ∅.
In probability we say that events A and B are mutually exclusive if they are disjoint;
mutually exclusive means the same thing as disjoint.
2.2. AXIOMS OF PROBABILITY 12
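\[ P\Big(\bigcup_{i=1}^{n} A_i\Big) = P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{n} P(A_i) + \sum_{i=n+1}^{\infty} P(\emptyset) = \sum_{i=1}^{n} P(A_i) + 0 = \sum_{i=1}^{n} P(A_i), \]
where A_i = ∅ for i > n.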
1 = P(S) = P(E) + P(E^c),
hence P(E^c) = 1 − P(E).
(d) If E ⊂ F, then write F = E ∪ (F ∩ E^c). Since this union is disjoint,
P(F) = P(E) + P(F ∩ E^c) ≥ P(E).
Similarly, E ∪ F = E ∪ (E^c ∩ F) is a disjoint union, so
P(E ∪ F) = P(E) + P(E^c ∩ F).
Now write F (with picture) as F = (E ∩ F) ∪ (E^c ∩ F) and use disjointness:
P(F) = P(E ∩ F) + P(E^c ∩ F) ⟹ P(E^c ∩ F) = P(F) − P(E ∩ F),
so that
P(E ∪ F) = P(E) + P(E^c ∩ F)
= P(E) + P(F) − P(E ∩ F),
as needed.
• Example: UConn basketball is playing Kentucky this year.
The home game has a .5 chance of a win.
The away game has a .4 chance of a win.
There is a .3 chance that UConn wins both games.
What is the probability that UConn loses both games?
Answer.
∗ Let A1 and A2 be the events that UConn wins the home game and the away game, so that P(A1) = .5, P(A2) = .4 and P(A1 ∩ A2) = .3.
∗ We want to find P(A1^c ∩ A2^c). Simplify as much as we can:
P(A1^c ∩ A2^c) = P((A1 ∪ A2)^c), by DeMorgan's Law
= 1 − P(A1 ∪ A2), by Proposition 1c.
∗ Using P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2),
P(A1 ∪ A2) = .5 + .4 − .3 = .6.
Hence P(A1^c ∩ A2^c) = 1 − .6 = .4, as needed.
2.3. EQUALLY LIKELY OUTCOMES 15
• In many experiments, a probability space consists of finitely many points, all equally
likely.
A basic example is tossing a fair coin: P(H) = P(T) = 1/2.
Fair die: P(i) = 1/6 for i = 1, . . . , 6.
• In this case, from Axiom 3 we have that
P(E) = (number of outcomes in E) / (number of outcomes in S).
• Example 1: What is the probability that if we roll 2 dice, the sum is 7?
Answer: There are 36 total outcomes, of which 6 have a sum of 7:
∗ E = "sum is 7" = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}. Since they are all equally
likely, the probability is P(E) = 6/(6 · 6) = 1/6.
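The counting formula for equally likely outcomes translates directly into code: list the sample space, filter the favorable outcomes, and divide. A small sketch (names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes in the event E = "sum is 7".
favorable = [o for o in outcomes if sum(o) == 7]

p_sum_7 = Fraction(len(favorable), len(outcomes))
print(p_sum_7)  # 1/6
```

Using `Fraction` keeps the answer exact instead of a rounded float.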
• Example 2: If 3 balls are randomly drawn from a bowl containing 6 white and 5 black balls,
what is the probability that one ball is white and the other two are black?
Method 1: (regard as an ordered selection)
\[ P(E) = \frac{WBB + BWB + BBW}{11 \cdot 10 \cdot 9} = \frac{6\cdot 5\cdot 4 + 5\cdot 6\cdot 4 + 5\cdot 4\cdot 6}{990} = \frac{120 + 120 + 120}{990} = \frac{4}{11}. \]
Method 2: (regard as an unordered set of drawn balls)
\[ P(E) = \frac{(\text{1 white})(\text{2 black})}{\binom{11}{3}} = \frac{\binom{6}{1}\binom{5}{2}}{\binom{11}{3}} = \frac{4}{11}. \]
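Both methods can be cross-checked numerically. The sketch below evaluates the binomial-coefficient formula of Method 2 and compares it with a brute-force enumeration of all unordered draws; the ball labelling is an assumption made only for the enumeration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Method 2: choose 1 of 6 white and 2 of 5 black, out of all 3-subsets of 11 balls.
p_formula = Fraction(comb(6, 1) * comb(5, 2), comb(11, 3))

# Brute force: label the balls 0..10 (first six white) and enumerate every draw.
balls = ["W"] * 6 + ["B"] * 5
draws = list(combinations(range(11), 3))
hits = sum(1 for d in draws if sorted(balls[i] for i in d) == ["B", "B", "W"])
p_enumerated = Fraction(hits, len(draws))

print(p_formula, p_enumerated)
```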
• We can always choose which way to regard our experiments.
• Example 3: A committee of 5 is to be selected from a group of 6 men and 9 women. What is the probability
that it consists of 3 men and 2 women?
Answer: Easy:
\[ \frac{\text{men} \cdot \text{women}}{\text{all}} = \frac{\binom{6}{3}\binom{9}{2}}{\binom{15}{5}} = \frac{240}{1001}. \]
• Example 4: Seven balls are randomly withdrawn from an urn that contains 12 red, 16 blue, and
18 green balls.
(b) Find the probability that at least 2 red balls are withdrawn.
Ans: Let E be this event. Then P(E) = 1 − P(E^c), that is, P(at least 2 red) = 1 − P(drawing 0 or 1 red balls).
Now there are 16 + 18 = 34 non-red balls, so
\[ P(\text{drawing 0 or 1 red balls}) = \frac{\binom{34}{7}}{\binom{46}{7}} + \frac{\binom{12}{1}\binom{34}{6}}{\binom{46}{7}}. \]
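The complement trick can be evaluated and sanity-checked against the direct sum over 2, 3, ..., 7 red balls. This is a sketch with illustrative variable names:

```python
from fractions import Fraction
from math import comb

total = comb(46, 7)  # all ways to draw 7 of the 46 balls

# Complement: 0 red (all 7 from the 34 non-red) or exactly 1 red.
p_zero_red = Fraction(comb(34, 7), total)
p_one_red = Fraction(comb(12, 1) * comb(34, 6), total)
p_at_least_two_red = 1 - (p_zero_red + p_one_red)

# Direct computation: sum over k = 2..7 red balls.
p_direct = Fraction(sum(comb(12, k) * comb(34, 7 - k) for k in range(2, 8)), total)

print(float(p_at_least_two_red))
```

The two routes agree exactly, which illustrates why computing P(E^c) is often less work than summing many cases.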
• Explanation of poker/playing cards: ranks, suits, etc.
There are 52 cards in a standard deck of playing cards. A poker hand consists of five
cards. There are 4 suits: hearts, spades, diamonds, and clubs (♥♠♦♣). The suits diamonds
and hearts are red while clubs and spades are black. In each suit there are 13 ranks: the
numbers 2, 3, . . . , 10, the face cards Jack, Queen, King, and the Ace (not a face card).
• Example 5: What is the probability that in a poker hand (5 cards out of 52) we get exactly 4 of
a kind?
Answer: Consider 4 aces and 1 king: the number of hands AAAAK is $\binom{4}{4}\binom{4}{1}$. But JJJJ3 has the same
probability.
∗ Thus there are 13 ways to pick the first rank, and 12 ways to pick the second rank
different birthday from the first two people is 363/365. So the answer is
\[ P(\text{at least 2 people share a birthday}) = 1 - P(\text{everyone has a different birthday}) \]
\[ = 1 - \frac{365}{365}\cdot\frac{364}{365}\cdot\frac{363}{365}\cdots\frac{365-31}{365} = 1 - 1\cdot\frac{364}{365}\cdot\frac{363}{365}\cdots\frac{334}{365} \approx 0.7533. \]
Really high!
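The product above is easy to evaluate in a loop. The sketch below assumes 32 people, since the factors run from 365/365 down to 334/365 (32 factors), and assumes 365 equally likely birthdays; the function name is illustrative.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

print(round(p_shared_birthday(32), 4))
```

Calling the same function with other group sizes shows how quickly the probability grows; it first exceeds 1/2 at 23 people.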
CHAPTER 3
Independence
3.1. Independent Events
P (E ∩ F ) = P (E) P (F ) .
• Example 1: Suppose you flip two coins.
The event that you get heads on the second coin is independent of the event that you get tails
on the first.
This is why: Let A_t be the event of getting tails on the first coin and B_h the event
of getting heads on the second coin, and assume we have fair coins (although this is not
necessary). Listing all outcomes,
P(A_t ∩ B_h) = 1/4,
P(A_t) P(B_h) = (1/2)(1/2) = 1/4.
• Example 2: Experiment: Draw a card from an ordinary deck of cards.
Let A = draw an ace, S = draw a spade.
∗ These are independent events: the rank and the suit of the card don't affect each
other. To see this using the definition, we compute
∗ P(A) P(S) = (1/13)(1/4) = 1/52,
∗ while P(A ∩ S) = 1/52, since there is only 1 ace of spades.
Proof. Draw a Venn diagram to help with the computation, but note that
P(E ∩ F^c) = P(E) − P(E ∩ F)
= P(E) − P(E) P(F)
= P(E) (1 − P(F))
= P(E) P(F^c).
• Remark: Independence and mutual exclusivity are two different things!
S7 = {sum is 7}
A4 = {first die is a 4}
B3 = {second die is a 3}
Are the events S7, A4, B3 independent?
∗ Compute
P(S7 ∩ A4 ∩ B3) = P({(4, 3)}) = 1/36,
but
P(S7) P(A4) P(B3) = (6/36)(1/6)(1/6) = 1/(36 · 6),
so the three events are not independent.
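Enumeration makes the distinction concrete: each pair of these events satisfies the product rule, yet the triple does not. A minimal sketch, with helper names that are illustrative only:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # two-dice sample space

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for o in S if event(o)), len(S))

s7 = lambda o: o[0] + o[1] == 7   # sum is 7
a4 = lambda o: o[0] == 4          # first die is a 4
b3 = lambda o: o[1] == 3          # second die is a 3

# Check the product rule for each pair of events.
pairwise = all(
    prob(lambda o, e=e, f=f: e(o) and f(o)) == prob(e) * prob(f)
    for e, f in [(s7, a4), (s7, b3), (a4, b3)]
)

# Compare the triple intersection with the triple product.
triple = prob(lambda o: s7(o) and a4(o) and b3(o))
product_of_three = prob(s7) * prob(a4) * prob(b3)

print(pairwise, triple, product_of_three)
```

This is why the general definition of independence requires the product rule for every subcollection, not just for pairs.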
• Remark: This generalizes to events A1, . . . , An. We say events A1, . . . , An are independent if for
all subcollections i1, . . . , ir ∈ {1, . . . , n} we have that
\[ P\Big(\bigcap_{j=1}^{r} A_{i_j}\Big) = \prod_{j=1}^{r} P\big(A_{i_j}\big). \]
• Example:
An urn contains 10 balls: 4 red and 6 blue.
A second urn contains 16 red balls and an unknown number of blue balls.
A single ball is drawn from each urn. The probability that both balls are the same color is
0.44.
Question: Calculate the number of blue balls in the second urn.
Solution: Let Ri = event that a red ball is drawn from urn i and let Bi = event that a blue
ball is drawn from urn i.
∗ Let x be the number of blue balls in urn 2.
∗ Note that drawing from urn 1 is independent of drawing from urn 2. They are
completely different urns! One shouldn't affect the other.
∗ Then
\[ .44 = P\big((R_1 \cap R_2) \cup (B_1 \cap B_2)\big) = P(R_1 \cap R_2) + P(B_1 \cap B_2) \]
\[ = P(R_1)P(R_2) + P(B_1)P(B_2), \text{ by independence} \]
\[ = \frac{4}{10}\cdot\frac{16}{x+16} + \frac{6}{10}\cdot\frac{x}{x+16}. \]
∗ Multiplying through by 10(x + 16) gives 4.4(x + 16) = 64 + 6x, so x = 4.
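The equation for x can also be solved by a direct search over candidate counts. The sketch below uses exact fractions; the search bound of 1000 and the function name are arbitrary assumptions for illustration.

```python
from fractions import Fraction

def p_same_color(x):
    """P(both balls have the same color) when urn 2 holds 16 red and x blue balls."""
    p_r1, p_b1 = Fraction(4, 10), Fraction(6, 10)
    p_r2, p_b2 = Fraction(16, 16 + x), Fraction(x, 16 + x)
    # Draws from the two urns are independent, so probabilities multiply.
    return p_r1 * p_r2 + p_b1 * p_b2

target = Fraction(44, 100)
solutions = [x for x in range(0, 1000) if p_same_color(x) == target]
print(solutions)
```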
∗ This tells you that the slopes are constant. What does that tell you about p(x)? It's a
line!
· Thus we must have p(x) = x/200.
∗ Thus p(50) = 1/4.
• Example (A variation of Gambler's ruin)
Problem: Suppose we are in the same situation, but you are allowed to go arbitrarily far into
debt. Let p(x) be the probability you ever get to $200. What is a formula for p(x)?
∗ Answer: Just as before, p(x) = (1/2) p(x + 1) + (1/2) p(x − 1), so p(x) is linear.
∗ But now all we know is that p(200) = 1, that p is linear, and that its domain is (−∞, 200).
∗ Draw a graph: the slope p′(x) can't be negative, or else we would have
p(x) > 1 for x ∈ (−∞, 200).
· The slope can't be positive, or else we would get p(x) < 0 for sufficiently negative x.
∗ Thus we must have that p(x) ≡ constant. Hence p(x) = 1 for all x ∈ (−∞, 200).
∗ Sol: So we are certain to get to $200 if we can go into debt.
Method 2:
∗ There is nothing special about the figure 200. Another way of seeing this
is to compute, as above, the probability of getting to 200 before −M and then let
M → ∞.
· We would get that p(x) is a line with p(−M) = 0 and p(200) = 1, so that
\[ p(x) - 0 = \frac{1 - 0}{200 - (-M)}\,\big(x - (-M)\big), \]
and letting M → ∞ we see that p(x) = (x + M)/(200 + M) → 1.
• Example: Experiment: Roll 10 dice.
What is the probability that exactly 4 twos will show if you roll 10 dice?
Answer: The rolls are independent. The probability that the 1st, 2nd, 3rd, and 10th dice will
show a two and the other 6 will not is (1/6)^4 (5/6)^6.
Independence is used here: the probability is (1/6)(1/6)(1/6)(5/6)(5/6)(5/6)(5/6)(5/6)(5/6)(1/6). Note that the event
that the 10th, 9th, 8th, and 7th dice will show a two and the other 6 will not has the same
probability.
So to answer our original question, we take (1/6)^4 (5/6)^6 and multiply it by the number of ways
of choosing 4 dice out of 10 to be the ones showing the twos. There are $\binom{10}{4}$ ways to do
this, so the answer is
\[ \binom{10}{4}\Big(\frac{1}{6}\Big)^{4}\Big(\frac{5}{6}\Big)^{6}. \]
• This is an example of Bernoulli trials, or the Binomial distribution.
If we have n independent trials, where the probability of success is p, then the probability that
there are k successes in n trials is
\[ \binom{n}{k} p^{k} (1-p)^{n-k}. \]
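The binomial formula is short enough to write as a function and apply to the dice question (n = 10 rolls, k = 4 twos, p = 1/6). This is a sketch; `binomial_pmf` is an illustrative name, not a library call.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 4 twos when rolling 10 fair dice.
p_four_twos = binomial_pmf(4, 10, Fraction(1, 6))
print(float(p_four_twos))
```

Passing a `Fraction` for p keeps the result exact; passing a float works as well if only a decimal value is needed.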
CHAPTER 4
\[ P(E \mid F) = \frac{P(E \cap F)}{P(F)}. \]
Now P(E | F) is read "the probability of E given F".
• Note that P(E ∩ F) = P(E | F) P(F)!
• This is the conditional probability that E occurs given that F has already occurred!
• Remark: Suppose P(E | F) = P(E), i.e. knowing F doesn't help predict E. Then this implies
that E and F are independent of each other. Rearranging P(E | F) = P(E ∩ F)/P(F) = P(E), we see that
P(E ∩ F) = P(E) P(F).
• Example 1: Experiment: Roll two dice.
(a) What is the probability the sum is 8?
∗ Solution: Note that A = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)},
so we know P(A) = 5/36.
(b) What is the probability that the sum is 8 given that the first die shows a 3? (In other
words, find P(A | B).)
∗ Solution: Let B = {first die shows three}.
∗ P(A ∩ B) = P({(3, 5)}) = 1/36 is the probability that the first die shows a 3 and the sum is
8.
∗ Finally we can compute
P(A | B) = P(sum is 8 | 1st is a 3) = (1/36)/(1/6) = 1/6.
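The conditional probability can be computed two ways by enumeration: from the definition P(A ∩ B)/P(B), or by counting inside the reduced sample space B. A small sketch with illustrative names:

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))
A = [o for o in S if sum(o) == 8]   # sum is 8
B = [o for o in S if o[0] == 3]     # first die shows a 3
A_and_B = [o for o in A if o in B]

# Definition: P(A | B) = P(A ∩ B) / P(B).
p_definition = Fraction(len(A_and_B), len(S)) / Fraction(len(B), len(S))

# Reduced sample space: count outcomes of A among the 6 outcomes of B.
p_reduced = Fraction(len(A_and_B), len(B))

print(p_definition, p_reduced)
```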
• Remark: When computing P(E | F), sometimes it's easier to work with the reduced sample space
F ⊂ S.
4.1. CONDITIONAL PROBABILITIES 23
To compute P(sum is 8 | 1st is a 3),
we could have worked in the smaller sample space {1st is a 3} = {(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6)}.
Since only (3, 5) begins with a 3 and has a sum of 8, the probability is 1/6.
\[ P(\text{Union} \mid \text{Not Monteith}) = \frac{P(U \cap M^c)}{P(M^c)} = \frac{P(U)}{1 - P(M)}, \quad \text{since } U \subset M^c, \]
\[ = \frac{4/10}{6/10} = \frac{2}{3}. \]
• Example 4: Suppose that Annabelle and Bobby each draw 13 cards from a standard deck of 52.
Given that Annabelle has exactly two aces, what is the probability that Bobby has exactly one ace?
Solution: Let A be the event "Annabelle has exactly two aces," and let B be the event "Bobby has
exactly one ace." Again, we want P(B | A), so we calculate P(A) and P(A ∩ B). Annabelle
could have any of $\binom{52}{13}$ possible hands. Of these hands, $\binom{4}{2}\binom{48}{11}$ will have exactly
two aces, so
\[ P(A) = \frac{\binom{4}{2}\binom{48}{11}}{\binom{52}{13}}. \]
Now the number of ways in which Annabelle can have a certain hand and Bobby can have a
certain hand is $\binom{52}{13}\binom{39}{13}$, and the number of ways in which A and B can both occur
is $\binom{4}{2}\binom{48}{11}\binom{2}{1}\binom{37}{12}$, so
\[ P(A \cap B) = \frac{\binom{4}{2}\binom{48}{11}\binom{2}{1}\binom{37}{12}}{\binom{52}{13}\binom{39}{13}}. \]
Therefore,
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{\dfrac{\binom{4}{2}\binom{48}{11}\binom{2}{1}\binom{37}{12}}{\binom{52}{13}\binom{39}{13}}}{\dfrac{\binom{4}{2}\binom{48}{11}}{\binom{52}{13}}} = \frac{\binom{2}{1}\binom{37}{12}}{\binom{39}{13}}. \]
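The cancellation in the last step can be confirmed with exact arithmetic. The sketch below evaluates P(A), P(A ∩ B), and their ratio using `math.comb`; variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# P(A): Annabelle's 13 cards contain exactly 2 of the 4 aces.
p_a = Fraction(comb(4, 2) * comb(48, 11), comb(52, 13))

# P(A ∩ B): additionally, Bobby's 13 cards (from the remaining 39)
# contain exactly 1 of the 2 remaining aces.
p_a_and_b = Fraction(
    comb(4, 2) * comb(48, 11) * comb(2, 1) * comb(37, 12),
    comb(52, 13) * comb(39, 13),
)

p_b_given_a = p_a_and_b / p_a
print(p_b_given_a, float(p_b_given_a))
```

The simplified form is exactly what the reduced sample space gives: once Annabelle's hand is fixed, Bobby draws 13 cards from 39 that contain 2 aces.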
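• Note that since P(B | A) = P(A ∩ B)/P(A), we have P(A ∩ B) = P(A) P(B | A).
In general: If E1, . . . , En are events then
\[ P(E_1 \cap \cdots \cap E_n) = P(E_1)\,P(E_2 \mid E_1)\,P(E_3 \mid E_1 \cap E_2)\cdots P(E_n \mid E_1 \cap \cdots \cap E_{n-1}). \]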
• Example 5:
Experiment: Suppose an urn has 5 white balls and 7 black balls. Each ball that is selected is
returned to the urn along with an additional ball of the same color. Suppose we draw 3 balls.
Part (a): What is the probability that you get 3 white balls?
∗ Then
P(A ∩ C) = P(C) P(A | C) = (1/2) · (1/7) = 1/14.
• Example 7: A total of 500 married couples are polled about salaries:
Wife Husband makes less than 25,000 Husband makes more than 25,000
• Sometimes it's easier to compute a probability once we know something has or has not happened.
• Note that we can compute,
P(E) = P(E ∩ F) + P(E ∩ F^c)
= P(E | F) P(F) + P(E | F^c) P(F^c)
= P(E | F) P(F) + P(E | F^c) (1 − P(F)).
• This formula is called the Law of Total Probability:
P(E) = P(E | F) P(F) + P(E | F^c) (1 − P(F)).
• The following example illustrates the types of problems in this section.
• Example 1: An insurance company believes the following:
The probability that an accident-prone person has an accident within a year is .4.
The probability that a non-accident-prone person has an accident within a year is .2.
30% of the population is accident prone.
Part (a): Find P(A1), where A1 = {new policy holder will have an accident within a year}.
∗ Let A = {policy holder IS accident prone}.
\[ P(A \mid A_1) = \frac{P(A \cap A_1)}{P(A_1)} = \frac{P(A)\,P(A_1 \mid A)}{.26} = \frac{(.3)(.4)}{.26} = \frac{6}{13}. \]
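Both parts of the insurance example follow from the numbers given in the problem statement. A minimal sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction

p_prone = Fraction(3, 10)            # P(A): policy holder is accident prone
p_acc_given_prone = Fraction(4, 10)  # P(A1 | A)
p_acc_given_not = Fraction(2, 10)    # P(A1 | A^c)

# Law of Total Probability: split on accident prone vs. not.
p_accident = p_acc_given_prone * p_prone + p_acc_given_not * (1 - p_prone)

# Conditional probability of being accident prone given an accident.
p_prone_given_accident = p_acc_given_prone * p_prone / p_accident

print(p_accident, p_prone_given_accident)
```

The value .26 in the denominator above is exactly the total probability computed on the first step.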
• In general:
In Part (a) we had to break a probability into cases: If F1, . . . , Fn are mutually exclusive
events that together make up everything, S = ∪_{i=1}^n Fi, then
\[ P(E) = \sum_{i=1}^{n} P(E \mid F_i)\,P(F_i). \]
∗ This is called the Law of Total Probability.
In Part (b), we wanted to find the probability of a separate conditional event:
\[ P(F_j \mid E) = \frac{P(E \mid F_j)\,P(F_j)}{\sum_{i=1}^{n} P(E \mid F_i)\,P(F_i)}. \]
∗ This is known as Bayes's Formula.
∗ Note that the denominator of Bayes's formula is the Law of Total Probability.
• Example2: Suppose the test for HIV is
98% accurate in both directions
0.5% of the population is HIV positive.
4.2. BAYES'S FORMULA 27
Question: If someone tests positive, what is the probability they actually are HIV positive?
Solution: Let T+ = {tests positive}, T− = {tests negative}, while + = {actually HIV positive},
− = {actually negative}.
∗ Want
\[ P(+ \mid T_+) = \frac{P(+ \cap T_+)}{P(T_+)} = \frac{P(T_+ \mid +)\,P(+)}{P(T_+ \mid +)\,P(+) + P(T_+ \mid -)\,P(-)} \]
\[ = \frac{(.98)(.005)}{(.98)(.005) + (.02)(.995)} \approx 19.8\%. \]
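The same computation can be wrapped in a small function so the effect of the base rate is easy to explore. This sketch reads "98% accurate in both directions" as P(T+ | +) = P(T− | −) = 0.98, matching the computation above; the function and parameter names are illustrative.

```python
def posterior_positive(sensitivity, specificity, prevalence):
    """P(actually positive | tests positive) via Bayes's formula."""
    true_pos = sensitivity * prevalence                # P(T+ | +) P(+)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(T+ | -) P(-)
    return true_pos / (true_pos + false_pos)

p = posterior_positive(0.98, 0.98, 0.005)
print(f"{p:.1%}")
```

Raising the prevalence argument shows that the low answer is driven by how rare the condition is, not by the quality of the test.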
• Example 3: Suppose
30% of the women in a class received an A on the test,
25% of the men (everyone else) received an A,
60% of the class are women.
Question: Given that a person chosen at random received an A, what is the probability this
person is a woman?
∗ Solution: Let A be the event that a student receives an A. Let W = being a woman,
M = not a woman. Want
\[ P(W \mid A) = \frac{P(A \mid W)\,P(W)}{P(A \mid W)\,P(W) + P(A \mid M)\,P(M)}, \text{ by Bayes's formula} \]
\[ = \frac{.3\,(.6)}{.3\,(.6) + .25\,(.4)} = \frac{.18}{.28} \approx .64. \]
• (General Bayes's Theorem) Here's one with more than two possibilities:
• Example 4: Suppose a factory has Machines I, II, III producing iPhones.
Machines I, II, III produce 2%, 1%, and 3% defective iPhones, respectively.
Out of total production, Machine I makes 35% of all iPhones, II makes 25%, and III makes 40%.
One iPhone is selected at random from the factory.
Part (a): What is the probability that the selected iPhone is defective?
\[ P(III \mid D) = \frac{P(III)\,P(D \mid III)}{P(D)} = \frac{(.4)(.03)}{215/10{,}000} = \frac{120}{215}. \]
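With three machines, the Law of Total Probability and Bayes's formula become a sum and a ratio over a small table. The sketch below stores the production shares and defect rates from the problem in dictionaries (the names are illustrative):

```python
from fractions import Fraction

# Share of total production and defect rate for each machine.
share = {"I": Fraction(35, 100), "II": Fraction(25, 100), "III": Fraction(40, 100)}
defect_rate = {"I": Fraction(2, 100), "II": Fraction(1, 100), "III": Fraction(3, 100)}

# Law of Total Probability: P(D) = sum of P(D | machine) P(machine).
p_defective = sum(share[m] * defect_rate[m] for m in share)

# Bayes's formula: P(machine | D) for every machine.
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

print(p_defective, posterior["III"])
```

Computing the posterior for all three machines at once also gives a quick check, since the three values must sum to 1.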
• Example 5: In a multiple choice test, a student either knows the answer or randomly guesses the
answer to a question.
Let m = number of choices in a question.
Let p = the probability that the student knows the answer to a question.
Question: What is the probability that the student actually knew the answer, given that the
student answers correctly?
Solution:
Let K = {knows the answer} and C = {answers correctly}. Then
\[ P(K \mid C) = \frac{P(C \mid K)\,P(K)}{P(C \mid K)\,P(K) + P(C \mid K^c)\,P(K^c)} = \frac{1 \cdot p}{1 \cdot p + \frac{1}{m}(1 - p)} = \frac{mp}{1 + (m - 1)p}. \]
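The algebraic simplification in the last step can be spot-checked by evaluating both forms for a few values of m and p. This is a sketch; both function names are illustrative.

```python
from fractions import Fraction

def p_knew_given_correct(m, p):
    """Bayes's formula with P(C | K) = 1 and P(C | K^c) = 1/m."""
    return (1 * p) / (1 * p + Fraction(1, m) * (1 - p))

def closed_form(m, p):
    """Simplified expression mp / (1 + (m - 1)p)."""
    return m * p / (1 + (m - 1) * p)

# Example: 4 choices and a student who knows half the answers.
print(p_knew_given_correct(4, Fraction(1, 2)))  # 4/5
```

As m grows, a correct answer becomes stronger evidence that the student knew it, because a lucky guess becomes less likely.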