Lecture 1
Lecture 1 Outline
A little bit of history
Founding fathers of probability
Probability theory began in 17th-century France, in the correspondence between two great French
mathematicians, Blaise Pascal and Pierre de Fermat.
Why is probability important for CS?
Back to history
It is said that de Méré had been betting that, in four rolls of a die, at least one six would
turn up. He was winning consistently.
In order to get more people to play, he changed the game to a bet that, in 24 rolls of two
dice, a pair of sixes would turn up.
It is claimed that de Méré was losing money in the long run with 24 rolls and felt that 25 rolls
were necessary to make the game favorable.
So, he asked Blaise Pascal to explain that “mystery” to him.
For the first bet, the probability that de Méré wins in four rolls is 1 − (5/6)^4 ≈ 0.518,
which is greater than 1/2.
Similarly, for the second bet, with 24 rolls, the probability that de Méré wins is
1 − (35/36)^24 ≈ 0.491,
and for 25 rolls it is
1 − (35/36)^25 ≈ 0.506.
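These values are easy to check numerically; a quick Python sketch (variable names are illustrative):

p_four_rolls = 1 - (5/6)**4      # first bet: at least one six in 4 rolls of one die
p_24_rolls   = 1 - (35/36)**24   # second bet: at least one double six in 24 rolls of two dice
p_25_rolls   = 1 - (35/36)**25   # the same bet with 25 rolls
print(round(p_four_rolls, 3), round(p_24_rolls, 3), round(p_25_rolls, 3))   # 0.518 0.491 0.506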
What is a probability problem?
Probability theory is a mathematical framework for reasoning about uncertainty.
Every probability problem involves some sort of randomized experiment, process, or
game.
A randomized experiment/process is an experiment/process whose outcome is not known in
advance.
Each such problem involves two distinct challenges:
1 How do we model the situation mathematically?
– List of all possible outcomes (sample space);
– Describe event or events of interest;
– Probability law: describes our beliefs about which outcomes are more likely to occur than
others.
– Probability laws have to obey certain basic properties (axioms).
2 How do we solve the resulting mathematical problem?
Based on the number of possible outcomes, probability problems are classified into
– Discrete probability problems;
– Continuous probability problems.
Discrete Probability. Outcomes
Consider chance experiments with a finite number of possible outcomes: ω1 , ω2 , . . . , ωn .
Example
Roll a die (which is an experiment) and the possible outcomes are: {1, 2, 3, 4, 5, 6}
corresponding to the side that turns up.
Example
Toss a coin (another experiment) with possible outcomes: H (heads) and T (tails).
Example
Outcomes of two tosses of a coin (experiment) are: {HH, HT , TH, TT }.
Example
Rolling a pair of dice (experiment) has possible outcomes of the form
(1, 1), (1, 2), (1, 3), . . . , (3, 3), (3, 4), . . . , (5, 6), (6, 6).
Discrete Probability. Sample space
Definition
Suppose we have an experiment whose outcome depends on chance. The set of all possible
outcomes of a chance experiment is called the sample space of the experiment. Usually it is
denoted by Ω.
The outcomes of a sample space must be
– Collectively exhaustive;
– Mutually exclusive.
Model requirement: the sample space should be at the “right” level of detail. We have some
freedom in how detailed our description of the sample space is, and the question is:
How much detail are we going to include?
Definition
The elements of a sample space are called outcomes.
Any subset of a sample space is defined to be an event.
Probabilities are assigned to events.
Discrete Probability. Random variables
Frequently we want to refer to the outcome of an experiment numerically.
Example
Suppose we want a mathematical expression for the sum of three rolls of a six-sided die.
To do this, we could let X1 , X2 and X3 represent the values of the outcomes of the three
rolls, and then we could write the expression
Y = X1 + X2 + X3
for the sum of the three rolls. The Xi ’s and Y are so-called random variables.
Definition
A random variable is an expression whose value is (or depends on) the outcome of a
particular experiment.
Just as in the case of other types of variables in mathematics, random variables can take on
different values.
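A minimal Python sketch of this idea, assuming a fair six-sided die (the helper roll_die is illustrative): each run of the experiment gives the random variables X1, X2, X3 and Y = X1 + X2 + X3 concrete values.

import random

def roll_die():
    # one roll of a fair six-sided die
    return random.randint(1, 6)

x1, x2, x3 = roll_die(), roll_die(), roll_die()   # observed values of X1, X2, X3
y = x1 + x2 + x3                                  # observed value of Y = X1 + X2 + X3
print(x1, x2, x3, "Y =", y)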
Discrete Probability
Example
A die is rolled once. We let X denote the outcome of this experiment. Then the sample
space for this experiment is the 6-element set
Ω = {1, 2, 3, 4, 5, 6},
where each outcome i, for i = 1, . . . , 6, corresponds to the number of dots on the face which
turns up. The event
E = {2, 4, 6}
corresponds to the statement that the result of the roll is an even number.
The event E can also be described by saying that X is even.
Definition
If the sample space is either finite or countably infinite, the experiment, the sample space and
the associated random variables are said to be discrete.
Probability Axioms
1 How do we model the situation mathematically?
– List of all possible outcomes (sample space);
– Describe event or events of interest;
– Probability law: describes our beliefs about which outcomes are more likely to occur than
others.
– Probability laws have to obey certain basic properties (axioms).
Next we shall assign probabilities.
Axioms
1 Nonnegativity: P(E) ≥ 0 for any event E.
2 Normalization: P(Ω) = 1.
3 Additivity: If E ∩ F = ∅, then P(E ∪ F) = P(E) + P(F).
Probability Axioms
Axioms
1 Nonnegativity: P(E) ≥ 0 for any event E.
2 Normalization: P(Ω) = 1.
3 Additivity: If E ∩ F = ∅, then P(E ∪ F) = P(E) + P(F).
1 = P(Ω) = P(E ∪ E^c) = P(E) + P(E^c) ≥ P(E).
Therefore, for any event E we have 0 ≤ P(E) ≤ 1. From the same relation we get
P(E^c) = 1 − P(E).
Note that the third axiom needs some strengthening!
P(E ∪ F ∪ G) = P((E ∪ F) ∪ G) = P(E ∪ F) + P(G) = P(E) + P(F) + P(G)
for any disjoint sets E, F and G.
Probability Axioms
Theorem
If A1, . . . , An are pairwise disjoint subsets of Ω (i.e., no two of the Ai have an element in
common), then
P(A1 ∪ A2 ∪ . . . ∪ An) = P(A1) + P(A2) + · · · + P(An).
Suppose we have a probabilistic experiment with discrete sample space Ω and an event
E = {ω1, ω2, . . . , ωk}, which is a finite subset of Ω, E ⊂ Ω. What is P(E)?
P(E) = P({ω1, ω2, . . . , ωk})
     = P({ω1} ∪ {ω2} ∪ . . . ∪ {ωk})
     = P({ω1}) + P({ω2}) + · · · + P({ωk})
     = P(ω1) + P(ω2) + · · · + P(ωk).
Discrete Probability
Definition
Consider an experiment with finite sample space Ω and let X be a random variable
associated with this experiment. A distribution function (also called a probability mass
function, or pmf) for X is a real-valued function m whose domain is Ω and which satisfies:
1. m(ω) ≥ 0 for all ω ∈ Ω;
2. ∑_{ω∈Ω} m(ω) = 1.
For any event E ⊆ Ω, the probability of E is then
P(E) = ∑_{ω∈E} m(ω).
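A minimal Python sketch of this definition for a fair-die pmf (the names m and prob are illustrative, not part of the lecture):

from fractions import Fraction

# pmf of a fair die: m(omega) = 1/6 for every outcome omega in Omega = {1, ..., 6}
m = {omega: Fraction(1, 6) for omega in range(1, 7)}

assert all(p >= 0 for p in m.values())   # condition 1: m(omega) >= 0
assert sum(m.values()) == 1              # condition 2: the values of m sum to 1

def prob(event):
    # P(E) = sum of m(omega) over omega in E
    return sum(m[omega] for omega in event)

print(prob({2, 4, 6}))   # 1/2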
Discrete Probability
Example
For rolling a die,
m(i) = 1/6, i = 1, 2, . . . , 6.
Clearly, it is a distribution function. Then
P(E) = P({2, 4, 6}) = ∑_{ω∈{2,4,6}} m(ω) = m(2) + m(4) + m(6) = 3/6 = 1/2.
Discrete Probability
Consider rolling a pair of dice. Sample space is
Ω = {(i, j) | 1 ≤ i ≤ 6, 1 ≤ j ≤ 6, i, j ∈ Z}.
Let X be the result of the first roll, and Y the result of the second. Both are random
variables associated with this experiment. We can represent the outcomes in table form.
                          Y = Second roll
                  1      2      3      4      5      6
             1  (1,1)  (1,2)  (1,3)  (1,4)  (1,5)  (1,6)
             2  (2,1)  (2,2)  (2,3)  (2,4)  (2,5)  (2,6)
X = First    3  (3,1)  (3,2)  (3,3)  (3,4)  (3,5)  (3,6)
    roll     4  (4,1)  (4,2)  (4,3)  (4,4)  (4,5)  (4,6)
             5  (5,1)  (5,2)  (5,3)  (5,4)  (5,5)  (5,6)
             6  (6,1)  (6,2)  (6,3)  (6,4)  (6,5)  (6,6)
What probability law should we assign?
Every possible outcome is equally likely, so we assign to every outcome the same
probability: m(ω) = 1/36 for any ω ∈ Ω.
Discrete Probability
Using the table above, compute:
P((X, Y) = (1, 3) or (X, Y) = (4, 5)) = ?
P(X = 2) = ?
P(Y is even) = ?
P(X + Y ≥ 9) = ?
P(min(X, Y) = 3) = ?
Discrete Probability
P((X, Y) = (1, 3) or (X, Y) = (4, 5)) = 1/36 + 1/36 = 2/36.
P(X = 2) = 6/36.
P(Y is even) = 18/36.
P(X + Y ≥ 9) = 10/36.
P(min(X, Y) = 3) = 7/36.
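These answers can be checked by brute-force enumeration of the 36 equally likely outcomes; a short Python sketch (the helper prob is illustrative):

from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # the 36 equally likely outcomes (x, y)

def prob(predicate):
    # P(E) for the event E = {(x, y) : predicate(x, y)} under m(omega) = 1/36
    favorable = sum(1 for x, y in omega if predicate(x, y))
    return Fraction(favorable, len(omega))

print(prob(lambda x, y: (x, y) in {(1, 3), (4, 5)}))   # 1/18  (= 2/36)
print(prob(lambda x, y: x == 2))                       # 1/6   (= 6/36)
print(prob(lambda x, y: y % 2 == 0))                   # 1/2   (= 18/36)
print(prob(lambda x, y: x + y >= 9))                   # 5/18  (= 10/36)
print(prob(lambda x, y: min(x, y) == 3))               # 7/36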
Discrete Probability
Example
Consider tossing a coin twice. There are several ways to record the outcomes of this
experiment:
1 Record the two tosses, in the order they occurred:
Ω1 = {HH, HT , TH, TT }.
2 Record the outcomes by simply recording the number of heads that appeared:
Ω2 = {0, 1, 2}.
3 Record the two outcomes, without regard to the order in which they occurred:
Ω3 = {HH, HT , TT }.
Discrete Probability
Example (Contd.)
Assume that all four outcomes are equally likely, and define the distribution function m (ω ) by
m(HH) = m(HT) = m(TH) = m(TT) = 1/4.
Let E = {HH, HT, TH}. Then
P(E) = m(HH) + m(HT) + m(TH) = 1/4 + 1/4 + 1/4 = 3/4.
Similarly, if F = {HH, HT}, then
P(F) = m(HH) + m(HT) = 1/4 + 1/4 = 1/2.
Properties of probability
Theorem
Let A1 , . . . , An be pairwise disjoint events with Ω = A1 ∪ . . . ∪ An and let E be any event.
Then
P(E) = ∑_{i=1}^{n} P(E ∩ Ai).
Corollary
For any two events E and F, P(E) = P(E ∩ F) + P(E ∩ F^c).
Theorem
If E and F are any two events, then
P (E ∪ F ) = P (E ) + P (F ) − P (E ∩ F ).
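A quick Python sanity check of both results on the two-dice experiment above; the events E = {X + Y ≥ 9} and F = {X = 6} are chosen purely for illustration and are not part of the lecture.

from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))      # two dice: 36 equally likely outcomes

def P(event):
    # uniform probability law: P(E) = |E| / |Omega|
    return Fraction(len(event), len(omega))

E = {(x, y) for (x, y) in omega if x + y >= 9}   # illustrative event
F = {(x, y) for (x, y) in omega if x == 6}       # illustrative event
Fc = omega - F                                   # complement of F

assert P(E) == P(E & F) + P(E & Fc)              # corollary: P(E) = P(E ∩ F) + P(E ∩ F^c)
assert P(E | F) == P(E) + P(F) - P(E & F)        # inclusion-exclusion
print(P(E | F))                                  # 1/3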
Tree Diagrams
Tree diagrams are a very useful tool for solving discrete probability problems.
Consider three tosses of a coin.
[Tree diagram: starting from Start, the 1st, 2nd and 3rd tosses each branch into H and T; the
eight leaves are the outcomes ω1, . . . , ω8 = HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.]
Tree diagrams
Example (Contd.)
The sample space is
Ω = {ω1, ω2, . . . , ω8} = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
Let E be the event that at least one head turns up, and let F be the event that exactly one
pair of either heads or tails turns up.
Compute P(E) and P(F).
Tree diagrams
[Tree diagram as above, with each branch labelled with probability 1/2; every outcome
(path) ω1, . . . , ω8 therefore has probability 1/8.]
P(E) = 1 − P(E^c) = 1 − P({TTT}) = 1 − 1/8 = 7/8.
Let F be the event that either a pair of heads or a pair of tails turns up.
Let F1 be the event that exactly one pair of heads turns up,
and F2 the event that exactly one pair of tails turns up. Then F = F1 ∪ F2, and by the formula
P(F) = P(F1) + P(F2) − P(F1 ∩ F2)
     = P({HHT, HTH, THH}) + P({TTH, THT, HTT}) − P(∅)
     = 3/8 + 3/8 − 0 = 3/4.
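The same two probabilities can be checked in Python by enumerating the eight equally likely paths of the tree (a sketch; the set comprehensions define E and F as in the example above):

from fractions import Fraction
from itertools import product

omega = ["".join(t) for t in product("HT", repeat=3)]   # HHH, HHT, ..., TTT

def P(event):
    # each of the 8 paths in the tree has probability 1/8
    return Fraction(len(event), len(omega))

E = {w for w in omega if "H" in w}                                # at least one head
F = {w for w in omega if w.count("H") == 2 or w.count("T") == 2}  # exactly one pair

print(P(E), P(F))   # 7/8 3/4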
Tree diagrams
Compute the probability that either the first outcome is a head or the second outcome is a tail.
[Tree diagram as above: three tosses, each branch labelled 1/2, with leaves ω1, . . . , ω8 =
HHH, HHT, HTH, HTT, THH, THT, TTH, TTT.]
Example (Contd.)
Probability that either the 1st outcome is a head or the 2nd outcome is a tail.
Let A = {1st outcome is a head} = {ω1, ω2, ω3, ω4},
and B = {2nd outcome is a tail} = {ω3, ω4, ω7, ω8}.
By looking at the paths in the tree, we see that
P(A) = P(B) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2,
A ∩ B = {ω3, ω4},
P(A ∩ B) = 1/8 + 1/8 = 1/4,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 1/2 + 1/2 − 1/4 = 3/4.
Basic steps in solving discrete probability problems
Lecture 1 Summary
Probability Joke