BN Lecture 2
Knowledge Representation
• Andreas Sauter
• Dec. 2023
• (Content adapted from Erman Acar)
Overview
1. Foundations
Degrees of Belief, Belief Dynamics, Independence, Bayes Theorem, Marginalization
2. Bayesian Networks
Graphs and their Independencies, Bayesian Networks, d-Separation
Bayesian Networks
Motivation, Formal Definition
d-Separation
d-Blocked Paths, d-Separation, Pruning for d-Separation
We will consider only directed acyclic graphs (DAGs) in the context of BNs.
Basic Definitions
A directed path is a path in which all edges are directed towards the end node.
Department of Computer Science
The parents of a node 𝑊 are the set of variables which have a directed edge
to 𝑊.
• Sequence if 𝜋 = 𝐴 → 𝑊 → 𝐵
• Fork if 𝜋 = 𝐴 ← 𝑊 → 𝐵
• Collider if 𝜋 = 𝐴 → 𝑊 ← 𝐵
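As a small illustration, the three segment types can be told apart from the edge directions alone. The edge-set representation below (a pair (u, v) meaning u → v) is an assumption for this sketch, not the lecture's notation:

```python
def triple_type(a, w, b, edges):
    """Classify the path segment a - w - b as sequence, fork, or collider,
    assuming a - w - b really is a segment of a path in the DAG `edges`."""
    if (a, w) in edges and (b, w) in edges:
        return "collider"   # a -> w <- b
    if (w, a) in edges and (w, b) in edges:
        return "fork"       # a <- w -> b
    return "sequence"       # a -> w -> b  or  b -> w -> a

print(triple_type("A", "W", "B", {("A", "W"), ("W", "B")}))  # -> sequence
```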
(Figure: example graphs illustrating a path, a directed path, a fork, a collider, and a sequence.)
We know that joint probability tables are useful for reasoning about beliefs.
Unfortunately, representing this table needs 2ⁿ rows for 𝑛 variables even in the simplest, binary case.
BNs acknowledge the fact that independence forms a significant aspect of beliefs
and that it can be elicited relatively easily using the language of graphs.
A Bayesian network over a DAG 𝐺 defines the joint distribution as
Pr(𝑋₁, …, 𝑋ₙ) = ∏ᵢ Pr(𝑋ᵢ | Parents(𝑋ᵢ)),
where Pr is the joint distribution over all variables and Parents(𝑋ᵢ) denotes the
parents of 𝑋ᵢ w.r.t. the graph 𝐺.
For an instantiation 𝑥₁, …, 𝑥ₙ we can re-write the chain rule to compute its probability as
Pr(𝑥₁, …, 𝑥ₙ) = ∏ᵢ Pr(𝑥ᵢ | 𝐩ᵢ), where 𝐩ᵢ is the corresponding instantiation of Parents(𝑋ᵢ).
Example (alarm network with edges 𝐵 → 𝐴, 𝐸 → 𝐴, 𝐸 → 𝑅, 𝐴 → 𝐶):
Pr(𝑥_𝐵, 𝑥_𝐸, 𝑥_𝐴, 𝑥_𝑅, 𝑥_𝐶) = Pr(𝑥_𝐵) · Pr(𝑥_𝐸) · Pr(𝑥_𝐴 | 𝑥_𝐵, 𝑥_𝐸) · Pr(𝑥_𝑅 | 𝑥_𝐸) · Pr(𝑥_𝐶 | 𝑥_𝐴).
With the CPTs of the example network, one such instantiation evaluates to 0.0216; for the fully
negated instantiation,
Pr(¬𝑏, ¬𝑒, ¬𝑎, ¬𝑟, ¬𝑐) = Pr(¬𝑏) · Pr(¬𝑒) · Pr(¬𝑎 | ¬𝑏, ¬𝑒) · Pr(¬𝑟 | ¬𝑒) · Pr(¬𝑐 | ¬𝑎).
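To make the factorization concrete, here is a minimal sketch for the alarm network (edges 𝐵 → 𝐴, 𝐸 → 𝐴, 𝐸 → 𝑅, 𝐴 → 𝐶). All CPT numbers below are illustrative placeholders, not the lecture's values:

```python
# Illustrative CPTs for the alarm network (placeholder numbers).
pr_b = {True: 0.01, False: 0.99}                      # Pr(B)
pr_e = {True: 0.1, False: 0.9}                        # Pr(E)
pr_a = {(True, True): 0.95, (True, False): 0.94,
        (False, True): 0.29, (False, False): 0.001}   # Pr(A=true | B, E)
pr_r = {True: 0.9, False: 0.001}                      # Pr(R=true | E)
pr_c = {True: 0.8, False: 0.05}                       # Pr(C=true | A)

def joint(b, e, a, r, c):
    """Probability of one full instantiation via the chain rule for BNs:
    Pr(b, e, a, r, c) = Pr(b) Pr(e) Pr(a|b,e) Pr(r|e) Pr(c|a)."""
    p_a = pr_a[(b, e)] if a else 1 - pr_a[(b, e)]
    p_r = pr_r[e] if r else 1 - pr_r[e]
    p_c = pr_c[a] if c else 1 - pr_c[a]
    return pr_b[b] * pr_e[e] * p_a * p_r * p_c

print(joint(False, False, False, False, False))
```

Note that each factor is just a lookup in one CPT, so the whole product takes linear time in the number of variables.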
Some intuition:
An alarm directly triggers a call from the neighbour.
One piece of information a DAG can give us about independence is the Markov property:
every variable is independent of its non-descendants given its parents.
Example:
{𝐶} ⫫ {𝐵, 𝐸, 𝑅} | {𝐴}
{𝑅} ⫫ {𝐴, 𝐵, 𝐶} | {𝐸}
{𝐴} ⫫ {𝑅} | {𝐵, 𝐸}
{𝐵} ⫫ {𝐸, 𝑅} | ∅
{𝐸} ⫫ {𝐵} | ∅
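The listed independencies can be generated mechanically: for each variable, collect its non-descendants and condition on its parents. A sketch, assuming an edge-set representation of the example DAG (a pair (u, v) meaning u → v):

```python
# Example DAG: B -> A, E -> A, E -> R, A -> C.
edges = {("B", "A"), ("E", "A"), ("E", "R"), ("A", "C")}
nodes = {"A", "B", "C", "E", "R"}

def parents(x):
    return {u for (u, v) in edges if v == x}

def descendants(x):
    """All nodes reachable from x along directed edges."""
    found, stack = set(), [x]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in found:
                found.add(b)
                stack.append(b)
    return found

def markov_independence(x):
    """Non-descendants that x is independent of, given its parents."""
    return nodes - {x} - descendants(x) - parents(x)

for x in sorted(nodes):
    print(f"{{{x}}} ⫫ {sorted(markov_independence(x))} | {sorted(parents(x))}")
```

Running this reproduces exactly the five independencies above, one per variable.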
This property is also often used in the Hidden Markov Model (HMM), which has
many applications in reinforcement learning, NLP, etc.
Although all independencies of the Markov property are encoded by the DAG,
the DAG implies even more independencies.
Intuitively, if observing 𝑌 does not influence our belief in 𝑋, then learning 𝑋 does
not influence our belief in 𝑌 either.
This is especially useful when calculating a joint probability distribution with the
chain rule.
Example: if {𝑋} ⫫ {𝑌} | 𝑍, then Pr(𝑥 | 𝑦, 𝑧) = Pr(𝑥 | 𝑧), so the corresponding
chain-rule factor can drop 𝑦 from its conditioning set.
We have seen that deriving new independencies from the Markov property can
be cumbersome.
Luckily, there is a graphical test called d-separation which captures the same
independencies as the rules described before.
Recall the three special path segments described before: sequence, fork, collider.
A sequence or fork with middle node 𝑊 is d-blocked by 𝑍 iff 𝑊 ∈ 𝑍; a collider with
middle node 𝑊 is d-blocked by 𝑍 iff neither 𝑊 nor any descendant of 𝑊 is in 𝑍.
In general, a path is d-blocked iff it contains at least one d-blocked sequence, fork,
or collider.
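These case distinctions translate directly into a test for a single path. A sketch, again assuming the edge-set DAG representation ((u, v) meaning u → v):

```python
def descendants(x, edges):
    """All nodes reachable from x along directed edges."""
    found, stack = set(), [x]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in found:
                found.add(b)
                stack.append(b)
    return found

def d_blocked(path, Z, edges):
    """True iff the node list `path` is d-blocked by the conditioning set Z."""
    for a, w, b in zip(path, path[1:], path[2:]):
        if (a, w) in edges and (b, w) in edges:
            # collider: blocks unless w or one of its descendants is in Z
            if w not in Z and not (descendants(w, edges) & Z):
                return True
        elif w in Z:
            # sequence or fork: blocks exactly when w is observed
            return True
    return False
```

For the alarm network, the path 𝐵 → 𝐴 ← 𝐸 is d-blocked by ∅ (the collider 𝐴 is unobserved) but not by {𝐴}.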
(Figure: two example paths checked against a conditioning set; the first is d-blocked (Yes!), the second remains open (No!).)
d-Separation
𝑋 and 𝑌 are d-separated by 𝑍 iff every path between a node in 𝑋 and a node in 𝑌
is d-blocked by 𝑍. Intuitively this means that there is no way information can flow
between 𝑋 and 𝑌 when we condition on 𝑍.
(Figure: four d-separation queries checked on the example network; the first, third, and fourth hold (Yes!), the second does not (No!).)
Paths between sets of nodes can be exponentially many. The following method
guarantees that d-separation can be decided in linear time in the graph size.
Recall the pruning rules:
• Repeatedly delete every leaf node 𝑊 ∉ 𝑋 ∪ 𝑌 ∪ 𝑍,
• Delete all edges outgoing from nodes in 𝑍.
Then 𝑋 and 𝑌 are d-separated by 𝑍 iff they are disconnected in the pruned graph.
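The pruning test can be sketched as follows, assuming the edge-set DAG representation ((u, v) meaning u → v): prune leaves outside 𝑋 ∪ 𝑌 ∪ 𝑍, cut edges leaving 𝑍, then check undirected connectivity.

```python
def d_separated(nodes, edges, X, Y, Z):
    """Decide whether X and Y are d-separated by Z via graph pruning."""
    nodes, edges = set(nodes), set(edges)
    keep = set(X) | set(Y) | set(Z)
    changed = True
    while changed:  # repeatedly delete leaf nodes not in X ∪ Y ∪ Z
        changed = False
        for n in list(nodes - keep):
            if not any(u == n for (u, _) in edges):  # n has no children
                edges = {(u, v) for (u, v) in edges if v != n}
                nodes.discard(n)
                changed = True
    edges = {(u, v) for (u, v) in edges if u not in Z}  # cut edges out of Z
    # X and Y are d-separated iff no undirected path connects them now
    seen, frontier = set(X), set(X)
    while frontier:
        nxt = {w for (u, v) in edges for w in (u, v)
               if (u in frontier or v in frontier) and w not in seen}
        seen |= nxt
        frontier = nxt
    return not (seen & set(Y))
```

On the alarm network, for instance, this reports that {𝐶} and {𝑅} are d-separated by {𝐴}, while {𝐵} and {𝐸} are not d-separated by {𝐴} (conditioning on the collider 𝐴 connects them).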
(Figure: two d-separation queries checked on the pruned graph; both hold (Yes!).)