Chapter 2: Belief, Probability, and Exchangeability. Lecture 1: Probability, Bayes' Theorem, Distributions
MSU-STT-465: Summer-20B
Axioms of Probability
P1 Contradictions and tautologies: 0 = P(not H | H) ≤ P(F | H) ≤ P(H | H) = 1;
P2 Addition rule: P(F ∪ G | H) = P(F | H) + P(G | H), if F ∩ G = ∅;
P3 Multiplication rule: P(F ∩ G | H) = P(F | H) P(G | F ∩ H),
where ∅ is the empty set.
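As a quick sanity check (the code is mine, not part of the slides), the snippet below verifies P2 and P3 numerically on a fair six-sided die, conditioning on the full sample space H:

```python
# Verify P2 and P3 on a fair die; probabilities are uniform over the faces.
from fractions import Fraction

H = {1, 2, 3, 4, 5, 6}          # conditioning event: the whole sample space

def p(A, given):
    """P(A | given) under a uniform distribution on the die faces."""
    return Fraction(len(A & given), len(given))

# P2 (addition rule): F and G disjoint
F, G = {2, 4}, {6}
assert p(F | G, H) == p(F, H) + p(G, H)           # 3/6 == 2/6 + 1/6

# P3 (multiplication rule): works for overlapping events too
F2, G2 = {2, 4, 6}, {4, 5, 6}
assert p(F2 & G2, H) == p(F2, H) * p(G2, F2 & H)  # 2/6 == 3/6 * 2/3
print("P2 and P3 hold for this example.")
```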
Note that the probability axioms and their properties are common to both the Bayesian and the frequentist interpretation of probability.
Consider the set H, which is the “set of all possible truths.” Partition H into disjoint subsets {H1, . . . , Hk}, where exactly one subset contains the truth. Statistically, H is called the sample space, the set of possible outcomes of a random experiment, and each Hi ⊆ H is called an event. We can assign probabilities to whether each of these sets contains the truth. First, some event in H must be true, so that P(H) = 1.
(iii) Rule (ii) says that the total probability of an event E is the sum of its probabilities across the partition of truths: P(E) = Σi P(E ∩ Hi).
$$
P(H_i \mid E) = \frac{P(H_i \cap E)}{P(E)}
             = \frac{P(H_i \cap E)}{\sum_{j=1}^{n} P(E \cap H_j)}
             = \frac{\overbrace{P(E \mid H_i)}^{\text{likelihood}}\;\overbrace{P(H_i)}^{\text{prior}}}{\sum_{j=1}^{n} P(E \mid H_j)\, P(H_j)}.
$$
$$
P(H_1 \mid E) = \frac{P(E \mid H_1)\, P(H_1)}{P(E \mid H_1)\, P(H_1) + P(E \mid H_2)\, P(H_2)}
             = \frac{(0.05)(0.98)}{(0.05)(0.98) + (0.99)(0.02)}
             = \frac{49}{68.8} \approx 0.71.
$$
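A quick numerical check of this computation (the code and variable names are mine; the slides do not spell out what H1, H2, and E denote):

```python
# Posterior probabilities P(H_i | E) from the priors and likelihoods above.
priors = [0.98, 0.02]                # P(H1), P(H2)
likelihoods = [0.05, 0.99]           # P(E | H1), P(E | H2)

# Denominator P(E) by the total probability rule
evidence = sum(l * p for l, p in zip(likelihoods, priors))   # 0.0688
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]

print(posteriors[0])                 # P(H1 | E) ≈ 0.712
```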
$$
\frac{P(H_i \mid E)}{P(H_j \mid E)} = \frac{P(E \cap H_i)/P(E)}{P(E \cap H_j)/P(E)}
                                   = \frac{P(E \mid H_i)}{P(E \mid H_j)} \times \frac{P(H_i)}{P(H_j)}
                                   = \text{Bayes factor} \times \text{prior odds}.
$$
So Bayes' rule does not tell us what our prior beliefs should be; it tells us how they should change after the data are observed.
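The same numbers redone in odds form (again a sketch of mine): the posterior odds are the Bayes factor times the prior odds, and the posterior probability can be recovered from the posterior odds.

```python
# Posterior odds = Bayes factor * prior odds, with the numbers from the example.
prior_odds = 0.98 / 0.02                       # P(H1) / P(H2) = 49
bayes_factor = 0.05 / 0.99                     # P(E | H1) / P(E | H2)
posterior_odds = bayes_factor * prior_odds     # ≈ 2.475

# Convert odds back to a probability: odds / (1 + odds)
print(posterior_odds / (1 + posterior_odds))   # P(H1 | E) ≈ 0.712
```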
F and G are said to be conditionally independent given H if P(F ∩ G | H) = P(F | H) P(G | H). That is, knowing F does not change our belief about G when H is known.
Example 1
Consider F = {patient is a smoker}, G = {patient has lung cancer}, and H = {smoking causes lung cancer}. If we know H is true, then learning that the patient has lung cancer changes our belief in F; that is, F and G are dependent given H. What if H is not true? (See the sketch below.)
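A toy numerical illustration (the numbers are my own, not from the slides): F and G can be dependent given H yet independent given not-H.

```python
# Two conditional joint distributions over (F, G), each summing to 1.
import itertools

# Given H: joint does NOT factor into its marginals (0.15 != 0.4 * 0.2)
p_given_H = {(1, 1): 0.15, (1, 0): 0.25, (0, 1): 0.05, (0, 0): 0.55}

# Given not-H: joint built as a product of marginals, hence independent
p_given_notH = {(f, g): (0.4 if f else 0.6) * (0.2 if g else 0.8)
                for f, g in itertools.product([0, 1], repeat=2)}

def is_independent(joint):
    pf = sum(p for (f, _), p in joint.items() if f == 1)   # P(F = 1 | .)
    pg = sum(p for (_, g), p in joint.items() if g == 1)   # P(G = 1 | .)
    return abs(joint[(1, 1)] - pf * pg) < 1e-12

print("given H:", is_independent(p_given_H))         # False: dependent
print("given not-H:", is_independent(p_given_notH))  # True: independent
```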
Random variables and their distributions (Contd.)
Discrete RVs
P(Y ∈ A or Y ∈ B) = P(Y ∈ A ∪ B) = P(Y ∈ A) + P(Y ∈ B), provided A ∩ B = ∅.
Continuous RVs
Remark 0.1
Given the joint density p (y1 , y2 ), we can calculate marginal and conditional
densities {p (y1 ), p (y2 ), p (y1 | y2 ), p (y2 | y1 )}. Also, given p (y1 ) and
p (y2 | y1 ), we can reconstruct the joint distribution.
so that
$$
\Pr(Y_1 \in A,\ Y_2 \in B) = \sum_{y_1 \in A} \int_{y_2 \in B} p(y_1)\, p(y_2 \mid y_1)\, dy_2
                           = \int_{y_2 \in B} \Big\{ \sum_{y_1 \in A} p(y_1, y_2) \Big\}\, dy_2.
$$
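A numerical check of this identity for one mixed pair (the model is my own choice and assumes SciPy is available): Y1 ∈ {0, 1} with P(Y1 = 1) = 0.3, and Y2 | Y1 = y1 ~ Normal(y1, 1).

```python
# Both sides of Pr(Y1 in A, Y2 in B) for a discrete Y1 and continuous Y2.
from scipy.integrate import quad
from scipy.stats import norm

p_y1 = {0: 0.7, 1: 0.3}     # marginal pmf of Y1
A = {1}                     # event for Y1
B = (0.5, 2.0)              # interval event for Y2

# Left: sum over y1 in A of p(y1) * integral over B of p(y2 | y1)
lhs = sum(p_y1[y1] * (norm.cdf(B[1], loc=y1) - norm.cdf(B[0], loc=y1))
          for y1 in A)

# Right: integral over B of { sum over y1 in A of p(y1, y2) }
rhs, _ = quad(lambda y2: sum(p_y1[y1] * norm.pdf(y2, loc=y1) for y1 in A),
              B[0], B[1])

print(lhs, rhs)             # both ≈ 0.160
```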
$$
p(\theta \mid y) = \frac{p(\theta)\, p(y \mid \theta)}{\int p(\theta)\, p(y \mid \theta)\, d\theta}
                 = \frac{p(\theta)\, p(y \mid \theta)}{p(y)},
$$
where p(y) is called the marginal density of Y.
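A minimal sketch of Bayes' rule for densities on a grid (the model here is assumed by me, not taken from the slides): a Uniform(0, 1) prior on θ and a Binomial(n = 10, θ) likelihood with observed y = 7, whose exact posterior is Beta(8, 4).

```python
# Grid approximation of p(theta | y) = p(theta) p(y | theta) / p(y).
import numpy as np
from scipy.stats import binom, beta

theta = np.linspace(0.0, 1.0, 2001)
step = theta[1] - theta[0]

prior = np.ones_like(theta)               # p(theta) = 1 on [0, 1]
likelihood = binom.pmf(7, 10, theta)      # p(y | theta) with y = 7, n = 10

# p(y) = integral of p(theta) p(y | theta) dtheta, via a Riemann sum
p_y = np.sum(prior * likelihood) * step   # ≈ 1/11
posterior = prior * likelihood / p_y

# Compare the grid posterior with the exact Beta(8, 4) density at theta = 0.5
idx = int(np.argmin(np.abs(theta - 0.5)))
print(posterior[idx], beta.pdf(theta[idx], 8, 4))   # both ≈ 1.289
```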
Bayes Rule for Densities
Proof.
Let p(y) be the marginal density of Y. Then the joint density of (Y, θ) is p(y, θ) = p(θ) p(y | θ), so that
$$
p(\theta \mid y) = \frac{p(y, \theta)}{p(y)}
                 = \frac{p(y, \theta)}{\int p(y, \theta)\, d\theta}
                 = \frac{p(\theta)\, p(y \mid \theta)}{\int p(\theta)\, p(y \mid \theta)\, d\theta}
                 = \frac{p(\theta)\, p(y \mid \theta)}{p(y)}.
$$