
The material in this presentation belongs to St. Francis Institute of Technology and is solely for educational purposes.

Distribution and modification of the content are prohibited.

Probabilistic Graphical Models


CSDLO5011
2024-25

Subject Incharge
Dr. Bidisha Roy
Associate Professor
Room No. 401
email: [email protected]


Module 2: Bayesian Network Model and Inference


Directed Graph Models


Topics to be covered
❑ Bayesian Network - Exploiting Independence Properties
❑ Naïve Bayes Model
❑ Bayesian Network Model
❑ Reasoning Patterns
❑ Basic Independencies in Bayesian Networks
❑ Bayesian Network Semantics
❑ Graphs and Distributions
❑ Modelling: Picking Variables, Picking Structure, Picking Probabilities
❑ D-separation

What?
❑ A Bayesian Network (BN), or Bayesian Model, is a directed PGM that represents a set of random variables and their dependencies using a DAG
❑ It simplifies the representation of probabilistic relationships between random variables


A Bayesian Network
A Bayesian network is made up of:
1. A Directed Acyclic Graph
   [Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

2. A set of tables for each node in the graph

   P(A)
     A = false: 0.6
     A = true:  0.4

   P(B | A)
     A = false, B = false: 0.01
     A = false, B = true:  0.99
     A = true,  B = false: 0.70
     A = true,  B = true:  0.30

   P(C | B)
     B = false, C = false: 0.4
     B = false, C = true:  0.6
     B = true,  C = false: 0.9
     B = true,  C = true:  0.1

   P(D | B)
     B = false, D = false: 0.02
     B = false, D = true:  0.98
     B = true,  D = false: 0.05
     B = true,  D = true:  0.95

A Directed Acyclic Graph

Each node in the graph is a random variable.

A node X is a parent of another node Y if there is an arrow from node X to node Y, e.g., A is a parent of B.

Informally, an arrow from node X to node Y means X has a direct influence on Y.

   [Figure: DAG with edges A → B, B → C, B → D]

A Set of Tables for Each Node



Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node.

The parameters are the probabilities in these conditional probability tables (CPTs).

   [Figure: the DAG A → B, B → C, B → D, with the CPTs P(A), P(B | A), P(C | B), P(D | B) from the previous slide attached to their nodes]

Bayesian Networks
❑ Two important properties:
1. Encodes the conditional independence relationships
between the variables in the graph structure
2. Is a compact representation of the joint probability
distribution over the variables


Bayesian Networks…. Another Example


The Joint Probability Distribution

We can compute the joint probability distribution over all the variables X1, …, Xn in the Bayesian network using the formula:

    P(X1 = x1, …, Xn = xn) = ∏_{i=1}^{n} P(Xi = xi | Parents(Xi))

where Parents(Xi) denotes the values of the parents of node Xi with respect to the graph.


Example
Suppose you want to calculate P(A = true, B = true, C = true, D = true).

Using the CPTs of the A → B, B → C, B → D network:

P(A = true, B = true, C = true, D = true)
  = P(A = true) × P(B = true | A = true) × P(C = true | B = true) × P(D = true | B = true)
  = 0.4 × 0.3 × 0.1 × 0.95
  = 0.0114
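The factorization can also be checked with a few lines of code. Below is a minimal Python sketch (not part of the slides) that stores the four CPTs as nested dictionaries and multiplies them out according to the chain rule; names such as P_B_given_A are our own.

# CPTs of the A -> B, B -> C, B -> D network from the slides, stored as nested dicts
P_A = {True: 0.4, False: 0.6}
P_B_given_A = {False: {False: 0.01, True: 0.99}, True: {False: 0.70, True: 0.30}}
P_C_given_B = {False: {False: 0.40, True: 0.60}, True: {False: 0.90, True: 0.10}}
P_D_given_B = {False: {False: 0.02, True: 0.98}, True: {False: 0.05, True: 0.95}}

def joint(a, b, c, d):
    """P(A=a, B=b, C=c, D=d) as the product of each node's CPT entry given its parents."""
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c] * P_D_given_B[b][d]

print(joint(True, True, True, True))   # 0.4 * 0.3 * 0.1 * 0.95 = 0.0114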

Another Example


Inference

❑ Using a Bayesian network to compute probabilities is called inference
❑ In general, inference involves queries of the form: P(X | E)
   E = the evidence variable(s)
   X = the query variable(s)


Inference Example
Suppose we know that A = true. Which is more probable: C = true or D = true?

For this we need to compute P(C = t | A = t) and P(D = t | A = t). Let us compute the first one:

    P(C = t | A = t) = P(A = t, C = t) / P(A = t)
                     = [ Σ_{b,d} P(A = t, B = b, C = t, D = d) ] / P(A = t)

(The network and CPTs are the same as in the earlier example: A → B, B → C, B → D with P(A), P(B | A), P(C | B), P(D | B).)
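As a sanity check, the sum over the hidden variables B and D can be carried out by brute-force enumeration. This is a small sketch of that calculation, not part of the slides; it reuses the same CPT dictionaries as the earlier sketch.

from itertools import product

# Same CPTs as before (A -> B, B -> C, B -> D)
P_A = {True: 0.4, False: 0.6}
P_B_given_A = {False: {False: 0.01, True: 0.99}, True: {False: 0.70, True: 0.30}}
P_C_given_B = {False: {False: 0.40, True: 0.60}, True: {False: 0.90, True: 0.10}}
P_D_given_B = {False: {False: 0.02, True: 0.98}, True: {False: 0.05, True: 0.95}}

def joint(a, b, c, d):
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c] * P_D_given_B[b][d]

# P(C = true | A = true): sum out B and D, then divide by P(A = true)
p_c = sum(joint(True, b, True, d) for b, d in product([False, True], repeat=2)) / P_A[True]

# P(D = true | A = true): sum out B and C
p_d = sum(joint(True, b, c, True) for b, c in product([False, True], repeat=2)) / P_A[True]

print(p_c, p_d)   # 0.45 and 0.971 -> D = true is the more probable of the two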


Another Inference Example


If the sprinkler was on, what is the probability that the grass was wet?

If the grass was wet, what is the probability that it was cloudy?


Another Inference Example


Suppose Mary has called to tell you that your burglar alarm went off. Should you call the police?

Suppose both Mary and John call. What is the probability of a burglary?


Reasoning Patterns in Bayes Nets


❑ Causal reasoning: The query is “downstream” of the evidence
❑ What is the chance that we get a phone call from our neighbor given that there was a
burglary?
❑ Diagnostic or evidential reasoning: To infer the probability of upstream
events conditioned on downstream events
❑ Given that the alarm went off, what is the chance that there was a burglary?
❑ Intercausal reasoning or explaining away: When evidence is observed for
one cause, it explains the observed effect, thereby reducing the need to
attribute the effect to the other cause.
❑ Suppose your neighbor calls and informs you that your alarm went off. You are worried
that there was a burglary. Then, you hear on the radio that there was an earthquake,
and you’re relieved because you figure the earthquake probably set off the alarm.

Naïve Baye’s Model


❑ A probabilistic classifier based on Bayes' theorem, with an assumption of independence among features
❑ All attributes are independent given the value of the class variable, i.e., conditional independence:

    P(X1 = x1, …, Xn = xn, C) = P(C) ∏_{i=1}^{n} P(Xi = xi | C)

❑ It makes the naïve assumption that the random variables Xi are independent given C

Naïve Baye’s Model as a Classifier


❑ Prediction Process
❑ Calculate the prior probabilities P(C)
❑ Calculate the likelihoods P(Xi | C) for each feature
❑ Compute the posterior probability: use Bayes' theorem to find P(C | X) for each class
❑ Select the class: choose the class with the highest posterior probability


Naïve Baye’s Classifier Example


The class Gender has two values: Male (M) and Female (F).

    Name      Gender
    Drew      Male
    Claudia   Female
    Drew      Female
    Drew      Female
    Alberto   Male
    Karin     Female
    Nina      Female
    Sergio    Male

Given that I come across a person named Drew (D), classify the person as Male or Female.

    P(M | D) = P(D | M) P(M) / P(D) = (1/3 × 3/8) / (3/8) = 0.33
    P(F | D) = P(D | F) P(F) / P(D) = (2/5 × 5/8) / (3/8) = 0.67

Drew is more likely to be classified as Female.
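The same prediction process (priors from class counts, likelihoods from per-class counts, then Bayes' theorem) can be written as a short Python sketch. This is only an illustration of the calculation above; the data list and helper names are ours, not from the slides.

from collections import Counter

# The eight (name, gender) training examples from the slide
data = [("Drew", "M"), ("Claudia", "F"), ("Drew", "F"), ("Drew", "F"),
        ("Alberto", "M"), ("Karin", "F"), ("Nina", "F"), ("Sergio", "M")]

class_counts = Counter(g for _, g in data)          # M: 3, F: 5  -> priors P(C)
name_counts = Counter((g, n) for n, g in data)      # per-class name counts -> likelihoods P(name | C)

def posterior(name):
    """P(C | name) for each class C, via prior * likelihood, then normalization."""
    scores = {c: (class_counts[c] / len(data)) *                # P(C)
                 (name_counts[(c, name)] / class_counts[c])     # P(name | C)
              for c in class_counts}
    total = sum(scores.values())                                # equals P(name)
    return {c: s / total for c, s in scores.items()}

print(posterior("Drew"))   # {'M': 0.33, 'F': 0.67} -> classify Drew as Female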

Naïve Baye’s Classifier Example


P(PT = Yes) = 9/14 = 0.64,  P(PT = No) = 5/14 = 0.36

    Outlook      PT = Yes   PT = No
    Sunny        2/9        3/5
    Overcast     4/9        0
    Rainy        3/9        2/5

    Temperature  PT = Yes   PT = No
    Hot          2/9        2/5
    Mild         4/9        2/5
    Cool         3/9        1/5

    Humidity     PT = Yes   PT = No
    High         3/9        4/5
    Normal       6/9        1/5

    Windy        PT = Yes   PT = No
    True         3/9        3/5
    False        6/9        2/5

Given (Outlook = Sunny, Temperature = Cool, Humidity = High, Windy = True), will a person Play Tennis?

    score(Yes) = P(Yes) × P(Sunny | Yes) × P(Cool | Yes) × P(High | Yes) × P(True | Yes)
               = 0.64 × 2/9 × 3/9 × 3/9 × 3/9 ≈ 0.0053
    score(No)  = P(No) × P(Sunny | No) × P(Cool | No) × P(High | No) × P(True | No)
               = 0.36 × 3/5 × 1/5 × 4/5 × 3/5 ≈ 0.0207

Since score(No) > score(Yes), the prediction is that the person will not play tennis.
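A small script makes the same comparison explicit. This sketch simply hard-codes the likelihoods read off the tables above and compares the two unnormalized scores; the dictionary names are our own.

# Priors and per-feature likelihoods taken from the tables above
prior = {"Yes": 9 / 14, "No": 5 / 14}
likelihood = {
    "Yes": {"Sunny": 2 / 9, "Cool": 3 / 9, "High": 3 / 9, "True": 3 / 9},
    "No":  {"Sunny": 3 / 5, "Cool": 1 / 5, "High": 4 / 5, "True": 3 / 5},
}

# Query: Outlook = Sunny, Temperature = Cool, Humidity = High, Windy = True
x = ["Sunny", "Cool", "High", "True"]

scores = {}
for c in ("Yes", "No"):
    score = prior[c]
    for value in x:
        score *= likelihood[c][value]   # multiply in P(feature value | class)
    scores[c] = score

print(scores)                           # Yes ~ 0.0053, No ~ 0.0207
print(max(scores, key=scores.get))      # "No": the person is predicted not to play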

Applications
https://www.youtube.com/watch?v=z8K-598fqSo
https://www.youtube.com/watch?v=fOK9DiKUGYs


D-separation
❑ A criterion for deciding, from a given causal/conditional graph, whether a set of variables X is independent of another set of variables Y, given a third set Z
❑ A variable or set of variables X is d-separated from Y given Z if every (undirected) path between X and Y is blocked given Z
❑ Associates "dependence" with connectedness (the existence of a connecting path) and "independence" with unconnectedness or separation
❑ Helps in simplifying the network by identifying independence relationships, which reduces computational complexity


D-separation
❑ Direct Connection between X and Y
❑X and Y are correlated regardless of
any evidence about any other variables
❑If X and Y are directly connected we
can get examples where they influence
each other regardless of Z


D-separation
❑ Indirect Connection between X and Y
(four cases)
❑Indirect causal effect
❑Indirect evidential effect
❑Common cause
❑Common effect


D-separation
❑ Indirect Causal Effect (X → Z → Y)
❑Cause X cannot influence effect Y if Z
observed
❑Observed Z blocks influence
❑If Grade is observed then Intelligence (I) does not influence Letter (L)
❑Intelligence influences Letter if Grade is
unobserved


D-separation
❑ Indirect Evidential Effect (Y → Z → X)
❑Evidence X can influence Y via Z only if Z is
unobserved
❑Observed Z blocks the influence
❑If Grade unobserved, Letter influences
assessment of Intelligence
❑Dependency is a symmetric notion
❑If X ⊥ Y does not hold, then Y ⊥ X does not hold either


D-separation
❑ Common Cause (X ← Z → Y)
❑X can influence Y if and only if Z is not
observed
❑Observed Z blocks the influence
❑Grade is correlated with SAT score
❑But if Intelligence is observed then
SAT provides no additional
information


D-separation
❑ Common Effect (V-Structure) (X → Z ← Y)
❑Influence cannot flow on the trail X → Z ← Y if Z is not observed
❑Observed Z enables the influence
❑Opposite to the previous 3 cases (where observed Z blocks)
❑When G is not observed, I and D are independent
❑When G is observed, I and D are correlated
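This behaviour is easy to verify numerically. The sketch below uses hypothetical CPT values (the numbers are illustrative only, not taken from the slides) for the V-structure D → G ← I, and enumerates the joint distribution to show that I and D are marginally independent but become dependent once G is observed.

from itertools import product

# Hypothetical CPTs for the V-structure D -> G <- I (illustrative values only)
p_i = {1: 0.3, 0: 0.7}                        # P(Intelligence)
p_d = {1: 0.4, 0: 0.6}                        # P(Difficulty)
p_g1 = {(0, 0): 0.20, (0, 1): 0.05,           # P(G = 1 | I, D), keyed by (i, d)
        (1, 0): 0.90, (1, 1): 0.50}

def joint(i, d, g):
    pg = p_g1[(i, d)]
    return p_i[i] * p_d[d] * (pg if g == 1 else 1 - pg)

def prob(query, evidence):
    """P(query | evidence); both are dicts over a subset of {'I', 'D', 'G'}."""
    def total(fixed):
        return sum(joint(i, d, g) for i, d, g in product([0, 1], repeat=3)
                   if all({'I': i, 'D': d, 'G': g}[k] == v for k, v in fixed.items()))
    return total({**query, **evidence}) / total(evidence)

print(prob({'I': 1}, {}))                      # P(I=1)            = 0.30
print(prob({'I': 1}, {'D': 1}))                # P(I=1 | D=1)      = 0.30 -> I and D independent
print(prob({'I': 1}, {'G': 1}))                # P(I=1 | G=1)      ~ 0.69
print(prob({'I': 1}, {'G': 1, 'D': 1}))        # P(I=1 | G=1, D=1) ~ 0.81 -> observing G couples I and D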

D-separation – How to determine


❑Given the query Xi ⊥ Xj / {Xk1, Xk2, …, Xkn}
❑Shade all evidence nodes
❑For all (undirected) paths between Xi and Xj
❑Check whether the path is active (unblocked given the observed evidence):
❑If any path is active:
❑ Xi and Xj are not independent given the evidence
❑If, at this point, all paths have been checked and found inactive (blocked by the observed evidence):
❑Xi ⊥ Xj / {Xk1, Xk2, …, Xkn} is true, i.e., Xi and Xj are conditionally independent given the evidence


D-separation – Active Path


❑A path is active if each triple is active:
❑Causal Chain A → B → C where B is unobserved
❑Common Cause A ← B → C where B is unobserved
❑Common Effect (V-structure) A → B ← C where B or one of its descendants is observed
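The procedure on the previous slide, combined with these triple rules, fits in a few dozen lines of Python. The following is a minimal brute-force sketch (our own code, not from the slides): it enumerates all undirected paths and checks every triple, which is enough for small graphs such as the student example (Difficulty, Intelligence, Grade, SAT, Letter).

def descendants(graph, node):
    """All nodes reachable from `node` along directed edges."""
    seen, stack = set(), [node]
    while stack:
        for child in graph.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def undirected_paths(graph, start, goal):
    """All simple paths between start and goal, ignoring edge direction."""
    adj = {}
    for u, children in graph.items():
        for v in children:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    paths, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            paths.append(path)
            continue
        for nxt in adj.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return paths

def triple_active(graph, a, b, c, evidence):
    """Apply the three triple rules above to the consecutive nodes a - b - c."""
    a_to_b = b in graph.get(a, [])
    c_to_b = b in graph.get(c, [])
    if a_to_b and c_to_b:                     # common effect a -> b <- c
        return b in evidence or bool(descendants(graph, b) & evidence)
    return b not in evidence                  # causal chain or common cause

def d_separated(graph, x, y, evidence):
    """True iff x and y are d-separated given `evidence` in the DAG `graph`."""
    evidence = set(evidence)
    for path in undirected_paths(graph, x, y):
        triples = zip(path, path[1:], path[2:])
        if all(triple_active(graph, a, b, c, evidence) for a, b, c in triples):
            return False                      # an active (unblocked) path exists
    return True

# Student network: Difficulty -> Grade <- Intelligence -> SAT, Grade -> Letter
g = {"D": ["G"], "I": ["G", "S"], "G": ["L"]}
print(d_separated(g, "I", "L", {"G"}))   # True: observed Grade blocks the chain I -> G -> L
print(d_separated(g, "G", "S", {"I"}))   # True: observed Intelligence blocks the common cause
print(d_separated(g, "D", "I", set()))   # True: the collider at G is inactive when unobserved
print(d_separated(g, "D", "I", {"L"}))   # False: observing a descendant of G activates the collider

Note that a direct edge between two nodes gives a path with no triples, so the all(...) check is vacuously true and such nodes are never d-separated, matching the "direct connection" case discussed earlier.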


D-separation
❑Is V ⊥ Z? Not d-separated (not conditionally independent)
❑Is (V ⊥ Z)/T? d-separated (conditionally independent), as T blocks the active path
❑Is (U ⊥ V)? d-separated, as all paths are blocked (due to a common effect)
❑Is (U ⊥ V)/W? Not d-separated, as the collider W becomes active when observed
❑Is (U ⊥ V)/X? d-separated, as both paths are inactive
❑Is (U ⊥ V)/Y? Not d-separated, as the first path becomes active due to Y

D-separation
❑Is (U ⊥ V)/Z? d-separated, as both paths are inactive
❑Is (W ⊥ X)? Not d-separated, as W ← V → X is active
❑Is (X ⊥ T)/V? d-separated, as both paths are inactive
❑Is (X ⊥ W)/U? Not d-separated, as W ← V → X is active
❑Is (Y ⊥ Z)? Not d-separated, as Y ← X ← V → T → Z is active
❑Is (Y ⊥ Z)/T? d-separated, as both paths are inactive

D-separation
❑Is (Y ⊥ Z)/T? Not d-separated, as one path is active
❑Is (Y ⊥ Z)/V? d-separated, as both paths are inactive
❑Is (W ⊥ Z)/V? d-separated, as both paths are inactive
❑Is (U ⊥ Z)? d-separated, as both paths are inactive
❑Is (U ⊥ Z)/Y? Not d-separated, as the first path is active

https://www.youtube.com/watch?v=i0CGsHhjISU&t=628s

Next…
❑Local Probabilistic Models
❑Inference and Variable Elimination
