
L04 BayesRule

The document explains Bayes' Rule and its application in probability theory, using examples such as a trick coin and random variables. It details the derivation of Bayes' Rule, joint probabilities, and conditional probabilities through various examples, including a cold and runny nose scenario. The document emphasizes how conditional probabilities can indicate relationships between events.

Uploaded by

Ed Z
Bayes’ Rule

Foundations of Data Analysis

February 3, 2022
Brain Teaser: Trick Coin

I have four coins. Three are normal, with heads on one
side and tails on the other. One is a trick coin with heads
on both sides. I pick one coin at random and flip it. If it
shows heads, what is the probability that it is the trick coin?
Bayes’ Rule

Lets us “flip” a conditional:

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$
Deriving Bayes’ Rule
Multiplication rule:

P(A ∩ B) = P(A | B)P(B)

P(B ∩ A) = P(B | A)P(A)


But A ∩ B = B ∩ A, so the two left-hand sides are equal, and therefore:

P(B | A)P(A) = P(A | B)P(B)

Dividing both sides by P(A) gives us:

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}$$
Trick Coin Example
A = “heads”, B = “trick coin”
P(A | B) = 1.0
P(B) = 0.25

$$P(A) = P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c) = 1.0 \times 0.25 + 0.5 \times 0.75 = \frac{5}{8}$$

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)} = \frac{1.0 \times 0.25}{5/8} = \frac{2}{5} = 0.4$$
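The trick coin calculation can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `bayes_posterior` and its parameter names are illustrative assumptions. It uses exact fractions so the result matches 2/5 without rounding.

```python
from fractions import Fraction


def bayes_posterior(prior, likelihood, likelihood_given_complement):
    """Return P(B | A) given P(B), P(A | B), and P(A | B^c).

    Hypothetical helper for illustration only.
    """
    # Law of total probability: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
    evidence = likelihood * prior + likelihood_given_complement * (1 - prior)
    # Bayes' Rule: P(B|A) = P(A|B)P(B) / P(A)
    return likelihood * prior / evidence


# A = "heads", B = "trick coin"
p_trick = Fraction(1, 4)               # P(B): one of four coins
p_heads_given_trick = Fraction(1)      # P(A | B): both sides are heads
p_heads_given_normal = Fraction(1, 2)  # P(A | B^c): fair coin

posterior = bayes_posterior(p_trick, p_heads_given_trick, p_heads_given_normal)
print(posterior)  # 2/5
```

Swapping in a different prior (for example, one trick coin among ten) only requires changing `p_trick`.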
Random Variables

Definition
A random variable is a function defined on a sample
space, Ω. Notation: X : Ω → ℝ

- A random variable is neither random nor a variable.
- Just think of a random variable as assigning a
  number to every possible outcome.
- For example, in a coin flip, we might assign “tails”
  as 0 and “heads” as 1:

X(T) = 0, X(H) = 1
Dice Example

Let (Ω, F, P) be the probability space for rolling a pair of
dice, and let X be the random variable that gives the
sum of the numbers on the two dice. So,

X[(1, 2)] = 3, X[(4, 4)] = 8, X[(6, 5)] = 11
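To make the "random variable is a function" idea concrete, here is a small sketch in Python (not from the original slides). The names `omega`, `X`, and `pmf` are illustrative, and the tally at the end assumes all 36 ordered outcomes are equally likely, which the slide does not state explicitly.

```python
from collections import Counter
from itertools import product

# Sample space: all ordered pairs (first die, second die)
omega = list(product(range(1, 7), repeat=2))


def X(outcome):
    """Random variable: the sum of the numbers on the two dice."""
    return outcome[0] + outcome[1]


print(X((1, 2)), X((4, 4)), X((6, 5)))  # 3 8 11

# Count how many outcomes map to each value of X.
# Under the equal-likelihood assumption, P(X = x) = pmf[x] / 36.
pmf = Counter(X(s) for s in omega)
print(pmf[7])  # 6
```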


Even Simpler Example

Most of the time the random variable X will just be the
identity function. For example, if the sample space is the
real line, Ω = ℝ, the identity function

X : ℝ → ℝ,
X(s) = s

is a random variable.
Defining Events via Random Variables

Setting a real-valued random variable to a value or range
of values defines an event.

[X = x] = {s ∈ Ω : X(s) = x}
[X < x] = {s ∈ Ω : X(s) < x}
[a < X < b] = {s ∈ Ω : a < X(s) < b}
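The set definitions above translate directly into set comprehensions. The sketch below (an illustration, not from the slides) reuses the two-dice sum as the random variable and assumes equally likely outcomes; `event` and `prob` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

# Sample space for two dice; X is the sum of the two faces.
omega = set(product(range(1, 7), repeat=2))


def X(s):
    return s[0] + s[1]


def event(predicate):
    """Return {s in omega : predicate(X(s))}, i.e. the event the condition defines."""
    return {s for s in omega if predicate(X(s))}


def prob(ev):
    """Probability of an event, assuming every outcome is equally likely."""
    return Fraction(len(ev), len(omega))


eq_7 = event(lambda v: v == 7)        # [X = 7]
lt_4 = event(lambda v: v < 4)         # [X < 4]
between = event(lambda v: 3 < v < 6)  # [3 < X < 6]

print(sorted(lt_4))  # [(1, 1), (1, 2), (2, 1)]
print(prob(eq_7), prob(lt_4), prob(between))  # 1/6 1/12 7/36
```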
Joint Probabilities
Two binary random variables:
C = 1 if I have a cold, 0 if not
R = 1 if I have a runny nose, 0 if not

Event [C = 1]: “I have a cold”


Event [R = 1]: “I have a runny nose”

Joint event
[C = 1] ∩ [R = 1]: “I have a cold and a runny nose”

Notation for joint probabilities:

P(C = 1, R = 1) = P([C = 1] ∩ [R = 1])


Cold Example: Probability Tables

Two binary random variables:


C = 1 if I have a cold, 0 if not
R = 1 if I have a runny nose, 0 if not

Joint probabilities:

         C = 0   C = 1
R = 0    0.50    0.05
R = 1    0.20    0.25
Cold Example: Marginals

         C = 0   C = 1
R = 0    0.50    0.05
R = 1    0.20    0.25

Marginals:

P(R = 0) = 0.55, P(R = 1) = 0.45

P(C = 0) = 0.70, P(C = 1) = 0.30
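The marginals are just row and column sums of the joint table. A minimal sketch follows, assuming the table is stored as a dictionary keyed by `(c, r)`; the names `joint`, `p_C`, and `p_R` are illustrative. Exact fractions avoid floating-point noise in the sums.

```python
from fractions import Fraction

# Joint probabilities P(C = c, R = r), keyed by (c, r), from the table above.
joint = {
    (0, 0): Fraction("0.50"),
    (1, 0): Fraction("0.05"),
    (0, 1): Fraction("0.20"),
    (1, 1): Fraction("0.25"),
}

# Marginalize: sum out the other variable.
p_C = {c: sum(p for (ci, _), p in joint.items() if ci == c) for c in (0, 1)}
p_R = {r: sum(p for (_, ri), p in joint.items() if ri == r) for r in (0, 1)}

print(float(p_R[0]), float(p_R[1]))  # 0.55 0.45
print(float(p_C[0]), float(p_C[1]))  # 0.7 0.3
```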


Cold Example: Conditional Probabilities
         C = 0   C = 1   P(R)
R = 0    0.50    0.05    0.55
R = 1    0.20    0.25    0.45
P(C)     0.7     0.3

Conditional Probabilities:

$$P(C = 0 \mid R = 0) = \frac{P(C = 0, R = 0)}{P(R = 0)} = \frac{0.50}{0.55} \approx 0.91$$

$$P(C = 1 \mid R = 1) = \frac{P(C = 1, R = 1)}{P(R = 1)} = \frac{0.25}{0.45} \approx 0.56$$
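The two conditional probabilities can be reproduced by dividing a joint entry by the matching marginal. This is a sketch under the same assumed dictionary layout, keyed by `(c, r)`; `p_c_given_r` is a hypothetical helper name.

```python
from fractions import Fraction

# Joint probabilities P(C = c, R = r), keyed by (c, r).
joint = {
    (0, 0): Fraction("0.50"),
    (1, 0): Fraction("0.05"),
    (0, 1): Fraction("0.20"),
    (1, 1): Fraction("0.25"),
}


def p_c_given_r(c, r):
    """P(C = c | R = r) = P(C = c, R = r) / P(R = r)."""
    p_r = sum(p for (_, ri), p in joint.items() if ri == r)
    return joint[(c, r)] / p_r


print(round(float(p_c_given_r(0, 0)), 2))  # 0.91
print(round(float(p_c_given_r(1, 1)), 2))  # 0.56
```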
Cold Example
         C = 0   C = 1   P(R)
R = 0    0.50    0.05    0.55
R = 1    0.20    0.25    0.45
P(C)     0.7     0.3

Remember (writing P(C) for P(C = 1) and P(R) for P(R = 1)):

P(C) = 0.3
P(C | R) = 0.56

What if I didn’t give you the full table, but just:

P(R | C) = 0.83 > P(R) = 0.45

What can you say about the increase

P(C | R) > P(C)?
Cold Example
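Notice, having a cold increases my chance of a runny
nose by the factor

$$\frac{P(R \mid C)}{P(R)} = \frac{0.83}{0.45} \approx 1.85$$

How does this ratio change if I flip the conditional?

$$\frac{P(C \mid R)}{P(C)} = \frac{P(C \cap R)}{P(R)\,P(C)} = \frac{P(R \mid C)}{P(R)} \approx 1.85$$

It does not change: a runny nose increases my chance of having a cold by the same factor.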
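The symmetry of this ratio can be confirmed from the joint table. The sketch below is an illustration, not part of the slides; the variable names are assumptions, and the table is again keyed by `(c, r)`. It computes both ratios exactly and shows they coincide.

```python
from fractions import Fraction

# Joint probabilities P(C = c, R = r), keyed by (c, r).
joint = {
    (0, 0): Fraction("0.50"),
    (1, 0): Fraction("0.05"),
    (0, 1): Fraction("0.20"),
    (1, 1): Fraction("0.25"),
}

p_c = joint[(1, 0)] + joint[(1, 1)]  # P(C = 1) = 0.30
p_r = joint[(0, 1)] + joint[(1, 1)]  # P(R = 1) = 0.45
p_cr = joint[(1, 1)]                 # P(C = 1, R = 1) = 0.25

ratio_r = (p_cr / p_c) / p_r  # P(R | C) / P(R)
ratio_c = (p_cr / p_r) / p_c  # P(C | R) / P(C)

print(ratio_r, ratio_c)          # 50/27 50/27
print(round(float(ratio_r), 2))  # 1.85
```

Both ratios reduce to P(C ∩ R) / (P(C) P(R)), which is why they agree for any joint table with nonzero marginals.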
