
Bayesian Networks

Philipp Koehn

6 April 2017



Outline 1

● Bayesian Networks

● Parameterized distributions

● Exact inference

● Approximate inference



2

bayesian networks



Bayesian Networks 3

● A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions

● Syntax
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ “directly influences”)
– a conditional distribution for each node given its parents:
P(Xi ∣ Parents(Xi))

● In the simplest case, conditional distribution represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values



Example 4

● Topology of network encodes conditional independence assertions:

● Weather is independent of the other variables

● Toothache and Catch are conditionally independent given Cavity



Example 5

● I’m at work, neighbor John calls to say my alarm is ringing, but neighbor Mary
doesn’t call. Sometimes it’s set off by minor earthquakes.
Is there a burglar?

● Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls

● Network topology reflects “causal” knowledge


– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call



Example 6



Compactness 7

● A conditional probability table for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values

● Each row requires one number p for Xi = true (the number for Xi = false is just 1 − p)

● If each variable has no more than k parents, the complete network requires O(n ⋅ 2^k) numbers

● I.e., grows linearly with n, vs. O(2^n) for the full joint distribution

● For burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31)
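
● As a cross-check on the O(n ⋅ 2^k) count, a minimal Python sketch tallying the burglary net's parameters from its parent counts:

# One independent number per CPT row: a Boolean node with k Boolean parents needs 2^k numbers
parent_counts = {"Burglary": 0, "Earthquake": 0, "Alarm": 2, "JohnCalls": 1, "MaryCalls": 1}

network_numbers = sum(2 ** k for k in parent_counts.values())   # 1 + 1 + 4 + 2 + 2
full_joint_numbers = 2 ** len(parent_counts) - 1                # 2^5 - 1

print(network_numbers, full_joint_numbers)                      # 10 31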



Global Semantics 8

● Global semantics defines the full joint distribution as the product of the local
conditional distributions:
P(x1, . . . , xn) = ∏_{i=1}^{n} P(xi ∣ parents(Xi))

● E.g., P (j ∧ m ∧ a ∧ ¬b ∧ ¬e)

= P (j∣a)P (m∣a)P (a∣¬b, ¬e)P (¬b)P (¬e)


= 0.9 × 0.7 × 0.001 × 0.999 × 0.998
≈ 0.00063
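
● The same product can be evaluated directly; a minimal Python sketch using only the CPT entries quoted above:

# P(j, m, a, ¬b, ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e)
p_j_given_a     = 0.9
p_m_given_a     = 0.7
p_a_given_nb_ne = 0.001
p_not_b         = 0.999
p_not_e         = 0.998

joint = p_j_given_a * p_m_given_a * p_a_given_nb_ne * p_not_b * p_not_e
print(round(joint, 6))   # 0.000628, i.e. ≈ 0.00063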



Local Semantics 9

● Local semantics: each node is conditionally independent of its nondescendants given its parents

● Theorem: Local semantics ⇔ global semantics



Markov Blanket 10

● Each node is conditionally independent of all others given its Markov blanket: parents + children + children’s parents



Constructing Bayesian Networks 11

● Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics
1. Choose an ordering of variables X1, . . . , Xn
2. For i = 1 to n
   add Xi to the network
   select parents from X1, . . . , Xi−1 such that
   P(Xi ∣ Parents(Xi)) = P(Xi ∣ X1, . . . , Xi−1)

● This choice of parents guarantees the global semantics:

P(X1, . . . , Xn) = ∏_{i=1}^{n} P(Xi ∣ X1, . . . , Xi−1) (chain rule)
                 = ∏_{i=1}^{n} P(Xi ∣ Parents(Xi)) (by construction)



Example 12

● Suppose we choose the ordering M, J, A, B, E

● P (J∣M ) = P (J)?



Example 13

● Suppose we choose the ordering M, J, A, B, E

● P (J∣M ) = P (J)? No
● P (A∣J, M ) = P (A∣J)? P (A∣J, M ) = P (A)?



Example 14

● Suppose we choose the ordering M, J, A, B, E

● P (J∣M ) = P (J)? No
● P (A∣J, M ) = P (A∣J)? P (A∣J, M ) = P (A)? No
● P (B∣A, J, M ) = P (B∣A)?
● P (B∣A, J, M ) = P (B)?



Example 15

● Suppose we choose the ordering M, J, A, B, E

● P (J∣M ) = P (J)? No
● P (A∣J, M ) = P (A∣J)? P (A∣J, M ) = P (A)? No
● P (B∣A, J, M ) = P (B∣A)? Yes
● P (B∣A, J, M ) = P (B)? No
● P (E∣B, A, J, M ) = P (E∣A)?
● P (E∣B, A, J, M ) = P (E∣A, B)?



Example 16

● Suppose we choose the ordering M, J, A, B, E

● P (J∣M ) = P (J)? No
● P (A∣J, M ) = P (A∣J)? P (A∣J, M ) = P (A)? No
● P (B∣A, J, M ) = P (B∣A)? Yes
● P (B∣A, J, M ) = P (B)? No
● P (E∣B, A, J, M ) = P (E∣A)? No
● P (E∣B, A, J, M ) = P (E∣A, B)? Yes



Example 17

● Deciding conditional independence is hard in noncausal directions


● (Causal models and conditional independence seem hardwired for humans!)
● Assessing conditional probabilities is hard in noncausal directions
● Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed



Example: Car Diagnosis 18

● Initial evidence: car won’t start


● Testable variables (green), “broken, so fix it” variables (orange)
● Hidden variables (gray) ensure sparse structure, reduce parameters



Example: Car Insurance 19



Compact Conditional Distributions 20

● CPT grows exponentially with number of parents; CPT becomes infinite with continuous-valued parent or child

● Solution: canonical distributions that are defined compactly

● Deterministic nodes are the simplest case: X = f(Parents(X)) for some function f

● E.g., Boolean functions: NorthAmerican ⇔ Canadian ∨ US ∨ Mexican

● E.g., numerical relationships among continuous variables:

∂Level/∂t = inflow + precipitation − outflow − evaporation



Compact Conditional Distributions 21

● Noisy-OR distributions model multiple noninteracting causes


– parents U1 . . . Uk include all causes (can add leak node)
– independent failure probability qi for each cause alone
Ô⇒ P(X ∣ U1 . . . Uj , ¬Uj+1 . . . ¬Uk) = 1 − ∏_{i=1}^{j} qi

Cold  Flu  Malaria  P(Fever)  P(¬Fever)

F     F    F        0.0       1.0
F     F    T        0.9       0.1
F     T    F        0.8       0.2
F     T    T        0.98      0.02 = 0.2 × 0.1
T     F    F        0.4       0.6
T     F    T        0.94      0.06 = 0.6 × 0.1
T     T    F        0.88      0.12 = 0.6 × 0.2
T     T    T        0.988     0.012 = 0.6 × 0.2 × 0.1

● Number of parameters linear in number of parents
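
● A minimal Python sketch of the noisy-OR rule; the per-cause failure probabilities q_Cold = 0.6, q_Flu = 0.2, q_Malaria = 0.1 are the ones behind the table above:

# Noisy-OR: P(¬Fever | active causes) = product of the active causes' failure probabilities
q = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}

def p_fever(active_causes):
    """P(Fever = true) given the set of causes that are present."""
    p_no_fever = 1.0
    for cause in active_causes:
        p_no_fever *= q[cause]
    return 1.0 - p_no_fever

print(p_fever([]))                          # 0.0
print(p_fever(["Flu", "Malaria"]))          # 0.98
print(p_fever(["Cold", "Flu", "Malaria"]))  # 0.988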



Hybrid (Discrete+Continuous) Networks 22

● Discrete (Subsidy? and Buys?); continuous (Harvest and Cost)

● Option 1: discretization—possibly large errors, large CPTs
● Option 2: finitely parameterized canonical families

● 1) Continuous variable, discrete+continuous parents (e.g., Cost)
  2) Discrete variable, continuous parents (e.g., Buys?)



Continuous Child Variables 23

● Need one conditional density function for child variable given continuous
parents, for each possible assignment to discrete parents

● Most common is the linear Gaussian model, e.g.,:

P(Cost = c ∣ Harvest = h, Subsidy? = true)
= N(a_t h + b_t, σ_t)(c)
= (1 / (σ_t √(2π))) exp( −(1/2) ((c − (a_t h + b_t)) / σ_t)^2 )
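
● A minimal Python sketch of this conditional density; the parameters a_t, b_t, σ_t below are illustrative placeholders, not values from the slides:

import math

def linear_gaussian(c, h, a_t, b_t, sigma_t):
    """Density of Cost = c given Harvest = h when Cost ~ N(a_t * h + b_t, sigma_t^2)."""
    z = (c - (a_t * h + b_t)) / sigma_t
    return math.exp(-0.5 * z * z) / (sigma_t * math.sqrt(2 * math.pi))

# Hypothetical parameters: cost falls by 0.5 per unit of harvest around a base price of 10
print(linear_gaussian(5.0, 10.0, a_t=-0.5, b_t=10.0, sigma_t=1.0))   # ≈ 0.399 (the peak of the density)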



Continuous Child Variables 24

● All-continuous network with LG distributions Ô⇒ full joint distribution is a multivariate Gaussian

● Discrete+continuous LG network is a conditional Gaussian network, i.e., a multivariate Gaussian over all continuous variables for each combination of discrete variable values



Discrete Variable w/ Continuous Parents 25

● Probability of Buys? given Cost should be a “soft” threshold:

● Probit distribution uses integral of Gaussian:

Φ(x) = ∫_{−∞}^{x} N(0, 1)(u) du
P(Buys? = true ∣ Cost = c) = Φ((−c + µ)/σ)



Why the Probit? 26

● It’s sort of the right shape

● Can view as hard threshold whose location is subject to noise



Discrete Variable 27

● Sigmoid (or logit) distribution also used in neural networks:


P(Buys? = true ∣ Cost = c) = 1 / (1 + exp(−2(−c + µ)/σ))

● Sigmoid has similar shape to probit but much longer tails:
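
● A minimal Python sketch comparing the two soft thresholds; µ and σ are illustrative placeholders, and Φ is computed from the error function via Φ(x) = ½(1 + erf(x/√2)):

import math

MU, SIGMA = 10.0, 2.0   # hypothetical threshold location and noise scale

def probit_buys(c):
    x = (-c + MU) / SIGMA
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))          # Phi(x)

def logit_buys(c):
    return 1.0 / (1.0 + math.exp(-2 * (-c + MU) / SIGMA))    # sigmoid

for c in (4, 8, 10, 12, 16):
    print(c, round(probit_buys(c), 4), round(logit_buys(c), 4))
# Both fall from ≈1 to ≈0 around c = µ, but the logit decays more slowly in the tails.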



28

inference



Inference Tasks 29

● Simple queries: compute posterior marginal P(Xi ∣ E = e)
e.g., P(NoGas ∣ Gauge = empty, Lights = on, Starts = false)

● Conjunctive queries: P(Xi, Xj ∣E = e) = P(Xi∣E = e)P(Xj ∣Xi, E = e)

● Optimal decisions: decision networks include utility information;
probabilistic inference required for P(outcome ∣ action, evidence)

● Value of information: which evidence to seek next?

● Sensitivity analysis: which probability values are most critical?

● Explanation: why do I need a new starter motor?



Inference by Enumeration 30

● Slightly intelligent way to sum out variables from the joint without actually
constructing its explicit representation

● Simple query on the burglary network


P(B∣j, m)
= P(B, j, m)/P (j, m)
= αP(B, j, m)
= α ∑e ∑a P(B, e, a, j, m)

● Rewrite full joint entries using product of CPT entries:


P(B∣j, m)
= α ∑e ∑a P(B)P (e)P(a∣B, e)P (j∣a)P (m∣a)
= αP(B) ∑e P (e) ∑a P(a∣B, e)P (j∣a)P (m∣a)

● Recursive depth-first enumeration: O(n) space, O(d^n) time



Enumeration Algorithm 31

function ENUMERATION-ASK(X, e, bn) returns a distribution over X
inputs: X, the query variable
        e, observed values for variables E
        bn, a Bayesian network with variables {X} ∪ E ∪ Y
Q(X) ← a distribution over X, initially empty
for each value xi of X do
    extend e with value xi for X
    Q(xi) ← ENUMERATE-ALL(VARS[bn], e)
return NORMALIZE(Q(X))

function ENUMERATE-ALL(vars, e) returns a real number
if EMPTY?(vars) then return 1.0
Y ← FIRST(vars)
if Y has value y in e
    then return P(y ∣ Pa(Y)) × ENUMERATE-ALL(REST(vars), e)
    else return ∑y P(y ∣ Pa(Y)) × ENUMERATE-ALL(REST(vars), ey)
         where ey is e extended with Y = y
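
● A minimal Python rendering of the same algorithm for the burglary network; the CPT entries not shown in this text (e.g. P(a ∣ b, e) = 0.95) follow the usual textbook figure and should be read as assumed:

# variable -> (parents, CPT); the CPT maps a tuple of parent values to P(variable = true | parents)
BURGLARY_NET = {
    "B": ((), {(): 0.001}),
    "E": ((), {(): 0.002}),
    "A": (("B", "E"), {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001}),
    "J": (("A",), {(True,): 0.90, (False,): 0.05}),
    "M": (("A",), {(True,): 0.70, (False,): 0.01}),
}
VARS = ["B", "E", "A", "J", "M"]   # topological order

def prob(var, value, event):
    """P(var = value | parents(var)), read off the CPT."""
    parents, cpt = BURGLARY_NET[var]
    p_true = cpt[tuple(event[p] for p in parents)]
    return p_true if value else 1.0 - p_true

def enumerate_all(variables, event):
    if not variables:
        return 1.0
    first, rest = variables[0], variables[1:]
    if first in event:
        return prob(first, event[first], event) * enumerate_all(rest, event)
    return sum(prob(first, v, event) * enumerate_all(rest, dict(event, **{first: v}))
               for v in (True, False))

def enumeration_ask(query, evidence):
    dist = {value: enumerate_all(VARS, dict(evidence, **{query: value}))
            for value in (True, False)}
    total = sum(dist.values())
    return {value: p / total for value, p in dist.items()}   # normalize

print(enumeration_ask("B", {"J": True, "M": True}))
# ≈ {True: 0.284, False: 0.716}, i.e. P(b | j, m) ≈ 0.28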



Evaluation Tree 32

● Enumeration is inefficient: repeated computation
e.g., computes P(j∣a)P(m∣a) for each value of e



Inference by Variable Elimination 33

● Variable elimination: carry out summations right-to-left,
storing intermediate results (factors) to avoid recomputation

P(B ∣ j, m)
= α P(B) ∑e P(e) ∑a P(a∣B, e) P(j∣a) P(m∣a)
  (the five terms are the factors for B, E, A, J, and M)
= α P(B) ∑e P(e) ∑a P(a∣B, e) P(j∣a) f_M(a)
= α P(B) ∑e P(e) ∑a P(a∣B, e) f_J(a) f_M(a)
= α P(B) ∑e P(e) ∑a f_A(a, b, e) f_J(a) f_M(a)
= α P(B) ∑e P(e) f_ĀJM(b, e) (sum out A)
= α P(B) f_ĒĀJM(b) (sum out E)
= α f_B(b) × f_ĒĀJM(b)



Variable Elimination Algorithm 34

function ELIMINATION-ASK(X, e, bn) returns a distribution over X
inputs: X, the query variable
        e, evidence specified as an event
        bn, a belief network specifying joint distribution P(X1, . . . , Xn)
factors ← [ ]; vars ← REVERSE(VARS[bn])
for each var in vars do
    factors ← [MAKE-FACTOR(var, e) ∣ factors]
    if var is a hidden variable then factors ← SUM-OUT(var, factors)
return NORMALIZE(POINTWISE-PRODUCT(factors))
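
● A minimal sketch of the two factor operations the algorithm relies on, pointwise product and summing out a variable; the (variable list, table) representation is illustrative, not the book's exact data structure:

from itertools import product

# A factor is (variables, table), where table maps a tuple of Boolean values
# (one per variable, in order) to a number.
def pointwise_product(f1, f2):
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    table = {}
    for values in product([True, False], repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        table[values] = t1[tuple(row[v] for v in vars1)] * t2[tuple(row[v] for v in vars2)]
    return (out_vars, table)

def sum_out(var, factor):
    vars_, table = factor
    i = vars_.index(var)
    out = {}
    for values, p in table.items():
        key = values[:i] + values[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return (vars_[:i] + vars_[i + 1:], out)

# Toy demonstration with f_J(a) and f_M(a) from the burglary query
# (0.05 and 0.01 are the usual textbook entries, assumed here):
f_J = (["A"], {(True,): 0.90, (False,): 0.05})
f_M = (["A"], {(True,): 0.70, (False,): 0.01})
print(sum_out("A", pointwise_product(f_J, f_M)))   # ([], {(): 0.6305})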



Irrelevant Variables 35

● Consider the query P(JohnCalls ∣ Burglary = true)

P(J∣b) = α P(b) ∑e P(e) ∑a P(a∣b, e) P(J∣a) ∑m P(m∣a)

Sum over m is identically 1; M is irrelevant to the query

● Theorem 1: Y is irrelevant unless Y ∈ Ancestors({X} ∪ E)

● Here
– X = JohnCalls, E = {Burglary}
– Ancestors({X} ∪ E) = {Alarm, Earthquake}
⇒ MaryCalls is irrelevant

● Compare this to backward chaining from the query in Horn clause KBs



Irrelevant Variables 36

● Definition: moral graph of Bayes net: marry all parents and drop arrows

● Definition: A is m-separated from B by C iff separated by C in the moral graph

● Theorem 2: Y is irrelevant if m-separated from X by E

● For P(JohnCalls ∣ Alarm = true), both Burglary and Earthquake are irrelevant



Complexity of Exact Inference 37

● Singly connected networks (or polytrees)


– any two nodes are connected by at most one (undirected) path
– time and space cost of variable elimination are O(d^k n)

● Multiply connected networks


– can reduce 3SAT to exact inference Ô⇒ NP-hard
– equivalent to counting 3SAT models Ô⇒ #P-complete



38

approximate inference



Inference by Stochastic Simulation 39

● Basic idea
– Draw N samples from a sampling distribution S
– Compute an approximate posterior probability P̂
– Show this converges to the true probability P

● Outline
– Sampling from an empty network
– Rejection sampling: reject samples disagreeing with evidence
– Likelihood weighting: use evidence to weight samples
– Markov chain Monte Carlo (MCMC): sample from a stochastic process
whose stationary distribution is the true posterior



Sampling from an Empty Network 40

function PRIOR-SAMPLE(bn) returns an event sampled from bn
inputs: bn, a belief network specifying joint distribution P(X1, . . . , Xn)
x ← an event with n elements
for i = 1 to n do
    xi ← a random sample from P(Xi ∣ parents(Xi))
         given the values of Parents(Xi) in x
return x
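
● A minimal Python sketch of PRIOR-SAMPLE on the sprinkler network used in the following slides; CPT entries not printed in this text (e.g. P(Sprinkler = true ∣ Cloudy = true) = 0.1) are the standard textbook values and are assumed here:

import random

# variable -> (parents, P(variable = true | parent values)), in topological order
SPRINKLER = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(True,): 0.10, (False,): 0.50}),
    "Rain":      (("Cloudy",), {(True,): 0.80, (False,): 0.20}),
    "WetGrass":  (("Sprinkler", "Rain"), {(True, True): 0.99, (True, False): 0.90,
                                          (False, True): 0.90, (False, False): 0.00}),
}

def prior_sample(bn):
    """Sample one complete event from the network's prior distribution."""
    event = {}
    for var, (parents, cpt) in bn.items():      # dicts preserve the topological order (Python 3.7+)
        p_true = cpt[tuple(event[p] for p in parents)]
        event[var] = random.random() < p_true
    return event

print(prior_sample(SPRINKLER))
# e.g. {'Cloudy': True, 'Sprinkler': False, 'Rain': True, 'WetGrass': True},
# an event this process generates with probability 0.5 × 0.9 × 0.8 × 0.9 = 0.324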



Examples 41–47

(Figures: a step-by-step sampling run through the network, one variable at a time.)


Sampling from an Empty Network 48

● Probability that PRIOR-SAMPLE generates a particular event

S_PS(x1 . . . xn) = ∏_{i=1}^{n} P(xi ∣ parents(Xi)) = P(x1 . . . xn)
i.e., the true prior probability

● E.g., S_PS(t, f, t, t) = 0.5 × 0.9 × 0.8 × 0.9 = 0.324 = P(t, f, t, t)

● Let N_PS(x1 . . . xn) be the number of samples generated for event x1, . . . , xn

● Then we have lim_{N→∞} P̂(x1, . . . , xn) = lim_{N→∞} N_PS(x1, . . . , xn)/N
                                            = S_PS(x1, . . . , xn)
                                            = P(x1 . . . xn)

● That is, estimates derived from PRIOR-SAMPLE are consistent

● Shorthand: P̂ (x1, . . . , xn) ≈ P (x1 . . . xn)



Rejection Sampling 49

● P̂(X∣e) estimated from samples agreeing with e

function REJECTION-SAMPLING(X, e, bn, N) returns an estimate of P(X ∣ e)
local variables: N, a vector of counts over X, initially zero
for j = 1 to N do
    x ← PRIOR-SAMPLE(bn)
    if x is consistent with e then
        N[x] ← N[x] + 1 where x is the value of X in x
return NORMALIZE(N[X])

● E.g., estimate P(Rain ∣ Sprinkler = true) using 100 samples
27 samples have Sprinkler = true
Of these, 8 have Rain = true and 19 have Rain = false

● P̂(Rain ∣ Sprinkler = true) = NORMALIZE(⟨8, 19⟩) = ⟨0.296, 0.704⟩

● Similar to a basic real-world empirical estimation procedure
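
● A minimal Python sketch of rejection sampling, reusing SPRINKLER and prior_sample from the sketch after the PRIOR-SAMPLE slide:

from collections import Counter

def rejection_sampling(query, evidence, bn, n):
    """Estimate P(query | evidence) by discarding samples that disagree with the evidence."""
    counts = Counter()
    for _ in range(n):
        sample = prior_sample(bn)               # from the earlier sketch
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query]] += 1
    total = sum(counts.values()) or 1           # guard against rejecting everything
    return {value: counts[value] / total for value in (True, False)}

print(rejection_sampling("Rain", {"Sprinkler": True}, SPRINKLER, 10000))
# roughly {True: 0.3, False: 0.7}; about 70% of the samples are thrown away here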



Analysis of Rejection Sampling 50

● P̂(X∣e) = α N_PS(X, e) (algorithm defn.)
          = N_PS(X, e)/N_PS(e) (normalized by N_PS(e))
          ≈ P(X, e)/P(e) (property of PRIOR-SAMPLE)
          = P(X∣e) (defn. of conditional probability)

● Hence rejection sampling returns consistent posterior estimates

● Problem: hopelessly expensive if P (e) is small

● P (e) drops off exponentially with number of evidence variables!



Likelihood Weighting 51

● Idea: fix evidence variables, sample only nonevidence variables, and weight each sample by the likelihood it accords the evidence

function LIKELIHOOD-WEIGHTING(X, e, bn, N) returns an estimate of P(X ∣ e)
local variables: W, a vector of weighted counts over X, initially zero
for j = 1 to N do
    x, w ← WEIGHTED-SAMPLE(bn, e)
    W[x] ← W[x] + w where x is the value of X in x
return NORMALIZE(W[X])

function WEIGHTED-SAMPLE(bn, e) returns an event and a weight
x ← an event with n elements; w ← 1
for i = 1 to n do
    if Xi has a value xi in e
        then w ← w × P(Xi = xi ∣ parents(Xi))
        else xi ← a random sample from P(Xi ∣ parents(Xi))
return x, w
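
● A minimal Python sketch of both functions, again assuming the SPRINKLER definition from the earlier prior-sampling sketch:

import random
from collections import defaultdict

def weighted_sample(bn, evidence):
    """Fix evidence variables, sample the rest, and weight by the evidence likelihood."""
    event, weight = {}, 1.0
    for var, (parents, cpt) in bn.items():
        p_true = cpt[tuple(event[p] for p in parents)]
        if var in evidence:
            event[var] = evidence[var]
            weight *= p_true if evidence[var] else 1.0 - p_true
        else:
            event[var] = random.random() < p_true
    return event, weight

def likelihood_weighting(query, evidence, bn, n):
    totals = defaultdict(float)
    for _ in range(n):
        event, weight = weighted_sample(bn, evidence)
        totals[event[query]] += weight
    norm = sum(totals.values())
    return {value: totals[value] / norm for value in (True, False)}

print(likelihood_weighting("Rain", {"Sprinkler": True, "WetGrass": True}, SPRINKLER, 10000))
# ≈ {True: 0.32, False: 0.68} for the standard sprinkler CPTs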



Likelihood Weighting Example 52

w = 1.0



Likelihood Weighting Example 53

w = 1.0



Likelihood Weighting Example 54

w = 1.0



Likelihood Weighting Example 55

w = 1.0 × 0.1



Likelihood Weighting Example 56

w = 1.0 × 0.1



Likelihood Weighting Example 57

w = 1.0 × 0.1



Likelihood Weighting Example 58

w = 1.0 × 0.1 × 0.99 = 0.099



Likelihood Weighting Analysis 59

● Sampling probability for WEIGHTED-SAMPLE is

S_WS(z, e) = ∏_{i=1}^{l} P(zi ∣ parents(Zi))

● Note: pays attention to evidence in ancestors only
Ô⇒ somewhere “in between” prior and posterior distribution

● Weight for a given sample z, e is

w(z, e) = ∏_{i=1}^{m} P(ei ∣ parents(Ei))

● Weighted sampling probability is

S_WS(z, e) w(z, e)
= ∏_{i=1}^{l} P(zi ∣ parents(Zi)) ∏_{i=1}^{m} P(ei ∣ parents(Ei))
= P(z, e) (by standard global semantics of network)

● Hence likelihood weighting returns consistent estimates
but performance still degrades with many evidence variables
because a few samples have nearly all the total weight



Approximate Inference using MCMC 60

● “State” of network = current assignment to all variables


● Generate next state by sampling one variable given Markov blanket
Sample each variable in turn, keeping evidence fixed

function MCMC-ASK(X, e, bn, N) returns an estimate of P(X ∣ e)
local variables: N[X], a vector of counts over X, initially zero
                 Z, the nonevidence variables in bn
                 x, the current state of the network, initially copied from e
initialize x with random values for the variables in Z
for j = 1 to N do
    for each Zi in Z do
        sample the value of Zi in x from P(Zi ∣ mb(Zi))
              given the values of MB(Zi) in x
        N[x] ← N[x] + 1 where x is the value of X in x
return NORMALIZE(N[X])

● Can also choose a variable to sample at random each time
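
● A minimal Gibbs-sampling sketch for the sprinkler query on the following slides, reusing SPRINKLER from the earlier sketch; each nonevidence variable is resampled from its Markov blanket using the formula given on the Markov Blanket Sampling slide:

import random
from collections import Counter

def markov_blanket_score(bn, var, value, state):
    """Unnormalized P(var = value | mb(var)): P(var | parents) × Π_children P(child | its parents)."""
    state = dict(state, **{var: value})
    def p(v):
        parents, cpt = bn[v]
        p_true = cpt[tuple(state[q] for q in parents)]
        return p_true if state[v] else 1.0 - p_true
    score = p(var)
    for child, (parents, _) in bn.items():
        if var in parents:
            score *= p(child)
    return score

def gibbs_ask(query, evidence, bn, n):
    state = dict(evidence)
    nonevidence = [v for v in bn if v not in evidence]
    for v in nonevidence:                       # random initial state for nonevidence variables
        state[v] = random.choice([True, False])
    counts = Counter()
    for _ in range(n):
        for v in nonevidence:
            p_t = markov_blanket_score(bn, v, True, state)
            p_f = markov_blanket_score(bn, v, False, state)
            state[v] = random.random() < p_t / (p_t + p_f)
        counts[state[query]] += 1
    total = sum(counts.values())
    return {value: counts[value] / total for value in (True, False)}

print(gibbs_ask("Rain", {"Sprinkler": True, "WetGrass": True}, SPRINKLER, 20000))
# wanders over the four states and converges towards ≈ {True: 0.32, False: 0.68}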



The Markov Chain 61

● With Sprinkler = true, WetGrass = true, there are four states:

● Wander about for a while, average what you see



MCMC Example 62

● Estimate P(Rain ∣ Sprinkler = true, WetGrass = true)

● Sample Cloudy or Rain given its Markov blanket, repeat.
Count number of times Rain is true and false in the samples.

● E.g., visit 100 states
31 have Rain = true, 69 have Rain = false

● P̂(Rain ∣ Sprinkler = true, WetGrass = true)
= NORMALIZE(⟨31, 69⟩) = ⟨0.31, 0.69⟩

● Theorem: chain approaches stationary distribution:
long-run fraction of time spent in each state is exactly
proportional to its posterior probability



Markov Blanket Sampling 63

● Markov blanket of Cloudy is Sprinkler and Rain

● Markov blanket of Rain is Cloudy, Sprinkler, and WetGrass

● Probability given the Markov blanket is calculated as follows:

P(x′i ∣ mb(Xi)) = P(x′i ∣ parents(Xi)) ∏_{Zj ∈ Children(Xi)} P(zj ∣ parents(Zj))

● Easily implemented in message-passing parallel systems, brains

● Main computational problems


– difficult to tell if convergence has been achieved
– can be wasteful if Markov blanket is large:
P (Xi∣mb(Xi)) won’t change much (law of large numbers)



Summary 64

● Bayes nets provide a natural representation for (causally induced) conditional independence
● Topology + CPTs = compact representation of joint distribution
● Generally easy for (non)experts to construct
● Canonical distributions (e.g., noisy-OR) = compact representation of CPTs
● Continuous variables Ô⇒ parameterized distributions (e.g., linear Gaussian)
● Exact inference by variable elimination
– polytime on polytrees, NP-hard on general graphs
– space = time, very sensitive to topology
● Approximate inference by LW, MCMC
– LW does poorly when there is lots of (downstream) evidence
– LW, MCMC generally insensitive to topology
– Convergence can be very slow with probabilities close to 1 or 0
– Can handle arbitrary combinations of discrete and continuous variables

