Lecture 2: Probability

Probability

P(a | b) = P(a ∧ b) / P(b)

P(a ∧ b) = P(b)P(a | b)
P(a ∧ b) = P(a)P(b | a)
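
To make the product rule concrete, here is a minimal Python sketch (the numbers are illustrative, not from the slides):

```python
# Product rule: P(a ∧ b) = P(b) * P(a | b), checked with example numbers.
p_b = 0.4          # P(b)
p_a_given_b = 0.2  # P(a | b)

p_a_and_b = p_b * p_a_given_b
print(p_a_and_b)         # 0.08 = P(a ∧ b)
print(p_a_and_b / p_b)   # 0.2  = P(a | b), recovered from the joint
```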
probability distribution
P(Flight) = ⟨0.6, 0.3, 0.1⟩
independence
the knowledge that one event occurs does
not affect the probability of the other event
independence
P(a ∧ b) = P(a)P(b | a)
independence
P(a ∧ b) = P(a)P(b)
independence
Two dice rolled together are independent:
P(die1 = 6 ∧ die2 = 6) = P(die1 = 6) P(die2 = 6) = 1/6 · 1/6 = 1/36
independence
But one die cannot show two different faces on the same roll, so the events are not independent:
P(die1 = 6 ∧ die1 = 4) ≠ P(die1 = 6) P(die1 = 4) = 1/6 · 1/6 = 1/36
independence
The conditional form gives the right answer:
P(die1 = 6 ∧ die1 = 4) = P(die1 = 6) P(die1 = 4 | die1 = 6) = 1/6 · 0 = 0
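
A small enumeration over two fair dice confirms both slides; this is a sketch I am adding, with the faces 6 and 4 chosen only for illustration:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Independent: the joint equals the product of the marginals (1/36).
assert prob(lambda o: o[0] == 6 and o[1] == 6) \
       == prob(lambda o: o[0] == 6) * prob(lambda o: o[1] == 6)

# Not independent: one die cannot show 6 and 4 at once, so the joint is 0,
# not the 1/36 the product of marginals would suggest.
assert prob(lambda o: o[0] == 6 and o[0] == 4) == 0
```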
Bayes' Rule
P(a ∧ b) = P(b) P(a | b)
P(a ∧ b) = P(a) P(b | a)

so P(a) P(b | a) = P(b) P(a | b)
Bayes' Rule

P(b | a) = P(b) P(a | b) / P(a)
Bayes' Rule

P(b | a) = P(a | b) P(b) / P(a)
Given clouds in the morning (AM), what's the probability of rain in the afternoon (PM)?

• 80% of rainy afternoons start with cloudy mornings.
• 40% of days have cloudy mornings.
• 10% of days have rainy afternoons.
P(rain | clouds) = P(clouds | rain) P(rain) / P(clouds)

= (0.8)(0.1) / (0.4)

= 0.2
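
The same computation as a small Python sketch (the function name bayes is my own, not from the lecture):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' rule: P(a | b) = P(b | a) * P(a) / P(b)."""
    return p_b_given_a * p_a / p_b

# a = rainy afternoon, b = cloudy morning, using the slide's numbers.
print(bayes(p_b_given_a=0.8, p_a=0.1, p_b=0.4))  # 0.2
```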
Knowing

P(cloudy morning | rainy afternoon)

we can calculate

P(rainy afternoon | cloudy morning)


Knowing

P(visible effect | unknown cause)

we can calculate

P(unknown cause | visible effect)


Knowing

P(medical test result | disease)

we can calculate

P(disease | medical test result)


Knowing

P(blurry text | counterfeit bill)

we can calculate

P(counterfeit bill | blurry text)


Joint Probability
AM
C = cloud   C = ¬cloud
0.4         0.6

PM
R = rain   R = ¬rain
0.1        0.9

AM rows, PM columns:

             R = rain   R = ¬rain
C = cloud      0.08       0.32
C = ¬cloud     0.02       0.58
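
One way to work with this table in code (a sketch; the dictionary encoding and names are mine): conditioning on C = cloud just selects a row of the joint and renormalizes it.

```python
# The joint distribution above, encoded as a dict.
joint = {
    ("cloud", "rain"): 0.08, ("cloud", "no rain"): 0.32,
    ("no cloud", "rain"): 0.02, ("no cloud", "no rain"): 0.58,
}

# P(R | C = cloud) is proportional to the C = cloud row of the joint.
row = {r: p for (c, r), p in joint.items() if c == "cloud"}
alpha = 1 / sum(row.values())   # normalization constant
print({r: alpha * p for r, p in row.items()})  # ≈ {'rain': 0.2, 'no rain': 0.8}
```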
Probability Rules
Negation

P(¬a) = 1 − P(a)
Inclusion-Exclusion

P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
Marginalization

P(a) = P(a, b) + P(a, ¬b)
Marginalization

P(X = xi) = ∑j P(X = xi, Y = yj)
Marginalization
R = rain R = ¬rain
C = cloud 0.08 0.32
C = ¬cloud 0.02 0.58

P(C = cloud)
= P(C = cloud, R = rain) + P(C = cloud, R = ¬rain)
= 0.08 + 0.32
= 0.40
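
The same marginalization in code, reusing the joint dictionary sketched above:

```python
# P(C = cloud) = Σ over values of R of P(C = cloud, R)
p_cloud = sum(p for (c, _), p in joint.items() if c == "cloud")
print(p_cloud)  # 0.08 + 0.32 = 0.4
```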
Conditioning

P(a) = P(a | b) P(b) + P(a | ¬b) P(¬b)
Conditioning

P(X = xi) = ∑j P(X = xi | Y = yj) P(Y = yj)
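
Conditioning can be checked against the same table (a sketch; the conditional probabilities below are derived from the joint above):

```python
p_cloud = 0.4
p_rain_given_cloud = 0.08 / 0.4       # = 0.2
p_rain_given_not_cloud = 0.02 / 0.6   # ≈ 0.033

# P(rain) = P(rain | cloud) P(cloud) + P(rain | ¬cloud) P(¬cloud)
p_rain = (p_rain_given_cloud * p_cloud
          + p_rain_given_not_cloud * (1 - p_cloud))
print(p_rain)  # 0.1, matching the marginal from the joint table
```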
Bayesian Networks
Bayesian network
data structure that represents the
dependencies among random variables
Bayesian network
• directed graph
• each node represents a random variable
• arrow from X to Y means X is a parent of Y
• each node X has probability distribution
P(X | Parents(X))
Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}

Rain {none, light, heavy}

none   light   heavy
0.7    0.2     0.1

Rain {none, light, heavy}
Maintenance {yes, no}

R       yes   no
none    0.4   0.6
light   0.2   0.8
heavy   0.1   0.9

Rain {none, light, heavy}
Maintenance {yes, no}
Train {on time, delayed}

R       M     on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5

Maintenance {yes, no}
Train {on time, delayed}
Appointment {attend, miss}

T         attend   miss
on time     0.9     0.1
delayed     0.6     0.4

Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
Computing Joint Probabilities

Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
P(light)

P(light)
Computing Joint Probabilities

Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
P(light, no)

P(light) P(no | light)


Computing Joint Probabilities

Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
P(light, no, delayed)

P(light) P(no | light) P(delayed | light, no)


Computing Joint Probabilities

Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
P(light, no, delayed, miss)

P(light) P(no | light) P(delayed | light, no) P(miss | delayed)
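
A sketch of this chain-rule product in Python; the dictionary encoding of the CPTs is my own, storing the yes/on-time/attend probabilities and computing their complements:

```python
P_rain = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_maint_yes = {"none": 0.4, "light": 0.2, "heavy": 0.1}    # P(M = yes | R)
P_on_time = {("none", "yes"): 0.8, ("none", "no"): 0.9,
             ("light", "yes"): 0.6, ("light", "no"): 0.7,
             ("heavy", "yes"): 0.4, ("heavy", "no"): 0.5}  # P(T = on time | R, M)
P_attend = {"on time": 0.9, "delayed": 0.6}                # P(A = attend | T)

# P(light, no, delayed, miss)
#   = P(light) P(no | light) P(delayed | light, no) P(miss | delayed)
p = (P_rain["light"]
     * (1 - P_maint_yes["light"])          # P(no | light) = 0.8
     * (1 - P_on_time[("light", "no")])    # P(delayed | light, no) = 0.3
     * (1 - P_attend["delayed"]))          # P(miss | delayed) = 0.4
print(p)  # 0.2 * 0.8 * 0.3 * 0.4 = 0.0192
```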


Inference
Inference

• Query X: variable for which to compute distribution


• Evidence variables E: observed variables for event e
• Hidden variables Y: non-evidence, non-query variables.

• Goal: Calculate P(X | e)


Rain {none, light, heavy}
Maintenance {yes, no}
Train {on time, delayed}
Appointment {attend, miss}

P(Appointment | light, no)
= α P(Appointment, light, no)
= α [P(Appointment, light, no, on time)
    + P(Appointment, light, no, delayed)]

Inference by Enumeration

P(X | e) = α P(X, e) = α ∑y P(X, e, y)

X is the query variable.


e is the evidence.
y ranges over values of hidden variables.
α normalizes the result.
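
A sketch of enumeration for the query worked above, assuming the CPT dictionaries from the earlier sketch are in scope:

```python
def joint_prob(r, m, t, a):
    """P(r, m, t, a) by the chain rule over the network."""
    p = P_rain[r]
    p *= P_maint_yes[r] if m == "yes" else 1 - P_maint_yes[r]
    p *= P_on_time[(r, m)] if t == "on time" else 1 - P_on_time[(r, m)]
    p *= P_attend[t] if a == "attend" else 1 - P_attend[t]
    return p

# Sum out the hidden variable (Train), then normalize with α.
unnormalized = {a: sum(joint_prob("light", "no", t, a)
                       for t in ("on time", "delayed"))
                for a in ("attend", "miss")}
alpha = 1 / sum(unnormalized.values())
print({a: alpha * p for a, p in unnormalized.items()})
# ≈ {'attend': 0.81, 'miss': 0.19}
```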
Approximate Inference
Sampling
Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
R = none

Rain {none, light, heavy}
none   light   heavy
0.7    0.2     0.1

R = none
M = yes

Rain {none, light, heavy}
Maintenance {yes, no}

R       yes   no
none    0.4   0.6
light   0.2   0.8
heavy   0.1   0.9

R = none
M = yes
T = on time

Rain {none, light, heavy}
Maintenance {yes, no}
Train {on time, delayed}

R       M     on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5

R = none
M = yes
T = on time
A = attend

Maintenance {yes, no}
Train {on time, delayed}
Appointment {attend, miss}

T         attend   miss
on time     0.9     0.1
delayed     0.6     0.4

R = none
M = yes
T = on time
A = attend

Eight samples:

R = light, M = no,  T = on time, A = miss
R = light, M = yes, T = delayed, A = attend
R = none,  M = no,  T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = heavy, M = no,  T = delayed, A = miss
R = light, M = no,  T = on time, A = attend

P(Train = on time) ?
6 of the 8 samples have T = on time, so we estimate
P(Train = on time) ≈ 6/8 = 0.75.

P(Rain = light | Train = on time) ?
Discarding the 2 samples where T ≠ on time leaves 6 samples; 2 of those have R = light, so we estimate
P(Rain = light | Train = on time) ≈ 2/6 ≈ 0.33.
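
Both estimates can be reproduced in code. This sketch assumes the CPT dictionaries from earlier are in scope; generate_sample is a hypothetical helper name:

```python
import random

def generate_sample():
    """Sample the network top-down, each variable given its parents."""
    r = random.choices(list(P_rain), weights=list(P_rain.values()))[0]
    m = "yes" if random.random() < P_maint_yes[r] else "no"
    t = "on time" if random.random() < P_on_time[(r, m)] else "delayed"
    a = "attend" if random.random() < P_attend[t] else "miss"
    return {"R": r, "M": m, "T": t, "A": a}

samples = [generate_sample() for _ in range(10_000)]
# Estimate P(Train = on time) by the fraction of samples where it holds.
print(sum(s["T"] == "on time" for s in samples) / len(samples))  # ≈ 0.79
```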
Rejection Sampling
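
For conditional queries, rejection sampling discards samples that disagree with the evidence; a sketch building on generate_sample above:

```python
# Keep only samples consistent with the evidence T = on time.
consistent = [s for s in samples if s["T"] == "on time"]

# Among the survivors, count how often Rain = light.
p = sum(s["R"] == "light" for s in consistent) / len(consistent)
print(p)  # ≈ 0.17 ≈ P(Rain = light | Train = on time)
```

The drawback: every rejected sample is wasted work, which is what likelihood weighting avoids.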
Likelihood Weighting
Likelihood Weighting

• Start by fixing the values for evidence variables.
• Sample the non-evidence variables using conditional probabilities in the Bayesian network.
• Weight each sample by its likelihood: the probability of all of the evidence.
P(Rain = light | Train = on time) ?
Rain
{none, light, heavy}

Maintenance
{yes, no}

Train
{on time, delayed}

Appointment
{attend, miss}
R = light
T = on time

Rain {none, light, heavy}
none   light   heavy
0.7    0.2     0.1

R = light
M = yes
T = on time

Rain {none, light, heavy}
Maintenance {yes, no}

R       yes   no
none    0.4   0.6
light   0.2   0.8
heavy   0.1   0.9

R = light
M = yes
T = on time

Rain {none, light, heavy}
Maintenance {yes, no}
Train {on time, delayed}

R       M     on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5

R = light
M = yes
T = on time
A = attend

Maintenance {yes, no}
Train {on time, delayed}
Appointment {attend, miss}

T         attend   miss
on time     0.9     0.1
delayed     0.6     0.4

R = light
M = yes
T = on time
A = attend

Weight this sample by the probability of the evidence:
P(T = on time | R = light, M = yes) = 0.6

R       M     on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5
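
A sketch of likelihood weighting for this query, assuming the CPT dictionaries from earlier are in scope (weighted_sample is a hypothetical name):

```python
import random

def weighted_sample():
    """Fix the evidence T = on time; sample the rest; weight by the evidence."""
    r = random.choices(list(P_rain), weights=list(P_rain.values()))[0]
    m = "yes" if random.random() < P_maint_yes[r] else "no"
    weight = P_on_time[(r, m)]   # P(T = on time | r, m), the evidence likelihood
    a = "attend" if random.random() < P_attend["on time"] else "miss"
    return {"R": r, "M": m, "T": "on time", "A": a}, weight

pairs = [weighted_sample() for _ in range(10_000)]
total = sum(w for _, w in pairs)
p = sum(w for s, w in pairs if s["R"] == "light") / total
print(p)  # ≈ 0.17, matching the rejection-sampling estimate
```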
Uncertainty over Time
Xt: Weather at time t
Markov assumption
the assumption that the current state
depends on only a finite fixed number of
previous states
Markov Chain
Markov chain
a sequence of random variables where the
distribution of each variable follows the
Markov assumption
Transition Model

Tomorrow (Xt+1), given Today (Xt):

Xt      sun    rain
sun     0.8    0.2
rain    0.3    0.7

X0 X1 X2 X3 X4
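
Simulating the chain takes a few lines of Python (a sketch; the sun/rain state names follow the table above):

```python
import random

transition = {"sun": {"sun": 0.8, "rain": 0.2},
              "rain": {"sun": 0.3, "rain": 0.7}}

state = "sun"        # X0
chain = [state]
for _ in range(4):   # X1 .. X4
    nxt = transition[state]
    state = random.choices(list(nxt), weights=list(nxt.values()))[0]
    chain.append(state)
print(chain)  # e.g. ['sun', 'sun', 'sun', 'rain', 'rain']
```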
Sensor Models
Hidden State          Observation
robot's position      robot's sensor data
words spoken          audio waveforms
user engagement       website or app analytics
weather               umbrella
Hidden Markov Models
Hidden Markov Model
a Markov model for a system with hidden
states that generate some observed event
Sensor Model

Observation (Et), given State (Xt):

Xt      umbrella   no umbrella
sun       0.2          0.8
rain      0.9          0.1

sensor Markov assumption
the assumption that the evidence variable depends only on the corresponding state
X0 X1 X2 X3 X4

E0 E1 E2 E3 E4
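
A sketch of sampling from this HMM, using the transition and sensor tables above; each hidden state Xt emits an observation Et before the chain steps forward:

```python
import random

transition = {"sun": {"sun": 0.8, "rain": 0.2},
              "rain": {"sun": 0.3, "rain": 0.7}}
sensor = {"sun": {"umbrella": 0.2, "no umbrella": 0.8},
          "rain": {"umbrella": 0.9, "no umbrella": 0.1}}

state = "sun"        # X0
for t in range(5):   # X0 .. X4, emitting E0 .. E4
    emit = sensor[state]
    obs = random.choices(list(emit), weights=list(emit.values()))[0]
    print(f"X{t} = {state}, E{t} = {obs}")
    nxt = transition[state]
    state = random.choices(list(nxt), weights=list(nxt.values()))[0]
```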

You might also like