
ADVANCED ARTIFICIAL INTELLIGENCE (18AI71)

Module 3

Dr. Navaneeth Bhaskar, Associate Professor, CSE (Data Science), SCEM Mangalore

Representing Knowledge in an Uncertain Domain - Bayesian Networks


• A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies using a directed acyclic graph. It is also called a Bayes network, belief network, or Bayesian model.
• Bayesian networks are probabilistic because they are built from probability distributions, and they use probability theory for prediction and anomaly detection.
• Real-world applications are probabilistic in nature, and a Bayesian network lets us represent the relationships between multiple events. It can be used in various tasks including prediction, anomaly detection, diagnostics, automated insight, reasoning, time-series prediction, and decision making under uncertainty.
• Bayesian networks can be used for building models from data and experts' opinions, and a Bayesian network consists of two parts:
o A directed acyclic graph (DAG)
o Tables of conditional probabilities (CPTs)
• The generalized form of a Bayesian network that represents and solves decision problems under uncertain knowledge is known as an influence diagram.
• A Bayesian network graph is made up of nodes and arcs (directed links), where:

• Each node corresponds to a random variable, which can be continuous or discrete.
• Arcs (directed arrows) represent the causal relationships or conditional dependencies between random variables. These directed links connect pairs of nodes in the graph.
• A directed link indicates that one node directly influences the other; the absence of a directed link between two nodes means there is no direct dependence between them.
• In the diagram, A, B, C, and D are random variables represented by the nodes of the network graph.
• If node B is connected to node A by a directed arrow from A to B, then node A is called the parent of node B.
• Node C is independent of node A.

• A simple Bayesian network in which Weather is independent of the other three variables (Cavity, Toothache, and Catch).


• Toothache and Catch are conditionally independent, given Cavity. This is indicated by the absence of a link between Toothache and Catch.
• It also indicates that Cavity is the direct cause of Toothache and Catch, whereas no direct causal
relationship exists between Toothache and Catch.
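
In symbols, these independence statements give the standard factorization of the joint distribution for this example:

P(Toothache, Catch | Cavity) = P(Toothache | Cavity) · P(Catch | Cavity)
P(Weather, Cavity, Toothache, Catch) = P(Weather) · P(Cavity) · P(Toothache | Cavity) · P(Catch | Cavity)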

Example 1:

• You have installed a new burglar alarm at your home to detect burglary. The alarm reliably detects a burglary, but it also responds to minor earthquakes. You have two neighbours, John and Mary, who have taken responsibility to inform you when they hear the alarm. John always calls when he hears the alarm, but he sometimes confuses the telephone ringing with the alarm and calls then too. Mary, on the other hand, likes to listen to loud music, so she sometimes misses the alarm.
• Problem: Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both John and Mary called you.
• List of all events occurring in this network:
o Burglary (B)
o Earthquake (E)
o Alarm (A)
o John calls (J)
o Mary calls (M)

• The probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both John and Mary called you, is:

P(J, M, A, ¬B, ¬E) = P(J | A) · P(M | A) · P(A | ¬B, ¬E) · P(¬B) · P(¬E)

With the CPT values used in the standard textbook version of this example (P(J | A) = 0.90, P(M | A) = 0.70, P(A | ¬B, ¬E) = 0.001, P(B) = 0.001, P(E) = 0.002), this evaluates to 0.90 × 0.70 × 0.001 × 0.999 × 0.998 ≈ 0.00063.
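
A minimal Python sketch of this calculation; the CPT numbers are the standard textbook values assumed above, since the notes do not reproduce the tables:

# Joint probability P(J, M, A, ~B, ~E) for the burglary network.
# All CPT values below are assumptions from the standard textbook example.
P_B = 0.001                      # P(Burglary)
P_E = 0.002                      # P(Earthquake)
P_A = {(True, True): 0.95,       # P(Alarm = True | Burglary, Earthquake)
       (True, False): 0.94,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}  # P(JohnCalls = True | Alarm)
P_M = {True: 0.70, False: 0.01}  # P(MaryCalls = True | Alarm)

# Chain rule over the network: each variable conditioned on its parents.
p = P_J[True] * P_M[True] * P_A[(False, False)] * (1 - P_B) * (1 - P_E)
print(f"P(J, M, A, ~B, ~E) = {p:.6f}")  # ~0.000628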

Exact Inference in Bayesian Networks


• Exact inference in Bayesian networks refers to the process of calculating the exact probability
distribution of certain variables in the network given evidence (observed values) and the model's
structure.
• Bayesian networks are graphical models that represent probabilistic relationships among a set of
variables using a directed acyclic graph (DAG). Nodes in the graph represent variables, and edges
represent probabilistic dependencies between them.
In Bayesian networks, two main types of inference are often performed:

• Probability Query (Marginal Inference): This involves calculating the marginal probability of a variable
or a set of variables given evidence. The goal is to determine the likelihood of certain events occurring
based on observed data.
• MAP Query (Maximum A Posteriori): This involves finding the most probable assignment of values to
a set of variables given evidence. It aims to identify the most likely explanation for the observed data.

Variable elimination algorithm for inference in Bayesian networks


• Variable elimination (VE) is a simple and general exact inference algorithm in probabilistic graphical
models, such as Bayesian networks and Markov random fields.
• It can be used for inference of maximum a posteriori (MAP) state or estimation of conditional or
marginal distributions over a subset of variables.
• The algorithm has exponential worst-case time complexity, but it can be efficient in practice for low-treewidth graphs if a good elimination order is used.
• Start with the Bayesian network. Identify the query variable(s) for which you want to compute
probabilities.
• Identify the evidence variables and their observed values.
• Create a factor for each conditional probability distribution in the network. A factor is a table representing the conditional probabilities of a variable given its parents.
• Incorporate the observed evidence into the factors: restrict each factor to the entries consistent with the observed values of the evidence variables.
• Choose an order for eliminating variables. A good variable ordering can significantly reduce
computation time.
• For each variable X in the elimination order: multiply together all factors involving X, then sum X out by marginalizing over its values.

• Summing out produces a new factor over the remaining variables, τ(Y) = Σ_x Π_i φ_i(x, Y); this operation eliminates the variable X by summing over all its possible values.
• Repeat the elimination process for each variable in the chosen order until only factors involving the
query variables remain.
• Multiply the remaining factors to get the unnormalized distribution over the query variables.
• Normalize the resulting distribution to obtain the final probabilities. (A minimal code sketch of these steps appears below.)
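
A minimal Python sketch of these steps on a tiny illustrative chain network A → B → C (all variables Boolean). The network, variable names, and numbers are invented for illustration; they are not from the notes:

from itertools import product

class Factor:
    def __init__(self, scope, table):
        self.scope = tuple(scope)  # variable names, in order
        self.table = table         # {assignment tuple: probability}

def multiply(f, g):
    """Pointwise product of two factors over the union of their scopes."""
    scope = f.scope + tuple(v for v in g.scope if v not in f.scope)
    table = {}
    for assign in product([False, True], repeat=len(scope)):
        a = dict(zip(scope, assign))
        table[assign] = (f.table[tuple(a[v] for v in f.scope)]
                         * g.table[tuple(a[v] for v in g.scope)])
    return Factor(scope, table)

def sum_out(f, var):
    """Eliminate var from a factor by summing over its values."""
    i = f.scope.index(var)
    scope = tuple(v for v in f.scope if v != var)
    table = {}
    for assign, p in f.table.items():
        key = assign[:i] + assign[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return Factor(scope, table)

# CPTs as factors: P(A), P(B | A), P(C | B)
fA = Factor(["A"], {(False,): 0.7, (True,): 0.3})
fB = Factor(["A", "B"], {(False, False): 0.9, (False, True): 0.1,
                         (True, False): 0.2, (True, True): 0.8})
fC = Factor(["B", "C"], {(False, False): 0.6, (False, True): 0.4,
                         (True, False): 0.1, (True, True): 0.9})

# Eliminate A, then B, leaving the marginal P(C).
f = sum_out(multiply(fA, fB), "A")  # factor over B
f = sum_out(multiply(f, fC), "B")   # factor over C
print({c[0]: round(p, 4) for c, p in f.table.items()})  # {False: 0.445, True: 0.555}

With a good elimination order, every intermediate factor stays small; this is why variable elimination is efficient on low-treewidth graphs even though its worst-case complexity is exponential.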

Enumeration algorithm for answering queries on Bayesian networks
• The Enumeration algorithm is a simple but exact method for answering probability queries in
Bayesian networks.
• It is based on the principle of systematically enumerating all possible combinations of values for the
variables in the network, considering the evidence observed.
• The algorithm calculates the probability of interest by summing up the joint probabilities of the
relevant combinations.
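
For a query variable X, evidence e, and hidden variables Y, the algorithm computes P(X | e) = α Σ_y P(X, e, y), where each joint probability is a product of CPT entries and α is a normalizing constant. Below is a minimal sketch on the same illustrative chain A → B → C used above, computing P(C | A = True):

# Inference by enumeration on the illustrative chain A -> B -> C.
def joint(a, b, c):
    """Joint probability P(a, b, c) as a product of CPT entries."""
    pa = 0.3 if a else 0.7
    pb = (0.8 if b else 0.2) if a else (0.1 if b else 0.9)
    pc = (0.9 if c else 0.1) if b else (0.4 if c else 0.6)
    return pa * pb * pc

# Sum the joint over the hidden variable B for each value of the query C.
unnorm = {c: sum(joint(True, b, c) for b in (False, True)) for c in (False, True)}
alpha = 1.0 / sum(unnorm.values())
print({c: round(alpha * p, 4) for c, p in unnorm.items()})  # {False: 0.2, True: 0.8}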

Rejection-sampling algorithm for answering queries in Bayesian network

• Rejection sampling is a general method for producing samples from a hard-to-sample distribution, given an easy-to-sample distribution.
• First, it generates samples from the prior distribution specified by the network. Then, it rejects all
those that do not match the evidence.
• The biggest problem with rejection sampling is that it rejects a large fraction of the samples. The fraction of samples consistent with the evidence e drops exponentially as the number of evidence variables grows, so the procedure is simply unusable for complex problems.
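
A minimal sketch on the same illustrative chain, estimating P(C = True | A = True) by sampling complete assignments from the prior and keeping only those consistent with the evidence:

# Rejection sampling on the illustrative chain A -> B -> C.
import random

def sample_chain():
    """Draw one complete sample (a, b, c) from the prior distribution."""
    a = random.random() < 0.3
    b = random.random() < (0.8 if a else 0.1)
    c = random.random() < (0.9 if b else 0.4)
    return a, b, c

random.seed(0)
kept = [c for a, b, c in (sample_chain() for _ in range(100_000)) if a]
print("accepted:", len(kept), "estimate:", sum(kept) / len(kept))  # ~0.8

Only about 30% of the samples are accepted here, because P(A = True) = 0.3; with several evidence variables the acceptance rate collapses, which is exactly the weakness described above.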

Likelihood-weighting algorithm for inference in Bayesian networks
• Likelihood weighting is a form of importance sampling where the variables are sampled in the order
defined by a belief network, and evidence is used to update the weights.
• The weights reflect the probability that a sample would not be rejected.
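
A minimal sketch on the same illustrative chain, again estimating P(C = True | A = True). The evidence variable is clamped rather than sampled, and each sample carries a weight equal to the probability of the evidence given its (sampled) parents:

# Likelihood weighting on the illustrative chain A -> B -> C.
import random

def weighted_sample():
    w = 0.3                    # A is evidence: weight by P(A = True)
    b = random.random() < 0.8  # sample B from P(B | A = True)
    c = random.random() < (0.9 if b else 0.4)
    return c, w

random.seed(0)
samples = [weighted_sample() for _ in range(100_000)]
num = sum(w for c, w in samples if c)
den = sum(w for _, w in samples)
print("estimate:", num / den)  # ~0.8

Because the only evidence variable here is a root, every sample gets the same weight; with evidence deeper in the network, the weights vary from sample to sample and no sample is ever discarded.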

Approximate Inference in Bayesian Networks


• Approximate inference in Bayesian networks is employed when exact inference becomes
computationally infeasible due to the complexity of the network or the size of the dataset.
• The goal is to estimate probabilities and make predictions without explicitly computing the entire
probability distribution.
• Various methods for approximate inference exist, and they can be broadly categorized into sampling-
based methods and variational methods.
Markov Chain Monte Carlo (MCMC):

• MCMC methods, such as the Metropolis-Hastings algorithm and Gibbs sampling, involve iteratively
sampling values for the unobserved variables given the observed evidence.
• These methods aim to explore the space of possible configurations and converge to the true
distribution.
Importance Sampling:

• Importance sampling involves sampling from an easily sampled distribution (the proposal
distribution) and re-weighting the samples to approximate the target distribution.
• The challenge is to choose a proposal distribution that is easy to sample from and has significant
overlap with the target distribution.
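
A minimal sketch of the idea outside the network setting, estimating E_p[X²] = 1 for a standard normal target p using a wider normal proposal q (both densities chosen purely for illustration):

# Self-normalized importance sampling: weight each proposal sample by p(x)/q(x).
import math, random

def p(x):
    """Target density: standard normal N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def q(x):
    """Proposal density: N(0, 2^2), wider than the target."""
    return math.exp(-x * x / 8) / (2 * math.sqrt(2 * math.pi))

random.seed(0)
xs = [random.gauss(0, 2) for _ in range(100_000)]
ws = [p(x) / q(x) for x in xs]
est = sum(w * x * x for w, x in zip(ws, xs)) / sum(ws)
print("E_p[X^2] ~", round(est, 3))  # close to 1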

Variational Inference:

• Variational inference formulates the inference problem as an optimization task. It introduces a family
of approximating distributions and seeks the distribution within that family that is closest to the true
posterior.
• The optimization involves minimizing the Kullback–Leibler (KL) divergence from the approximating distribution to the true posterior distribution.
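
In symbols (a standard formulation, stated here since the notes give only the description): with latent variables z, evidence e, and a family Q of tractable distributions, variational inference solves

q* = argmin over q in Q of KL(q(z) || p(z | e)), where KL(q || p) = Σ_z q(z) log [ q(z) / p(z | e) ]

Since p(z | e) is exactly what we cannot compute, one equivalently maximizes the evidence lower bound (ELBO):

ELBO(q) = E_q[log p(z, e)] − E_q[log q(z)] = log p(e) − KL(q(z) || p(z | e))

Maximizing the ELBO therefore minimizes the KL divergence without ever evaluating the intractable posterior.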
Expectation-Maximization (EM):

• EM is an iterative optimization algorithm that alternates between an E-step, where the expected
value of the unobserved variables is calculated given the observed data and current parameters, and
an M-step, where the parameters are updated to maximize the expected log-likelihood.
• EM is often used for learning the parameters of the Bayesian network when some variables are
unobserved.
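
In symbols (a standard formulation, stated here since the notes give only the description): with observed data X, unobserved variables Z, and parameters θ, iteration t computes

E-step: Q(θ | θ^(t)) = E over Z ~ p(Z | X, θ^(t)) of [ log p(X, Z | θ) ]
M-step: θ^(t+1) = argmax_θ Q(θ | θ^(t))

Each iteration is guaranteed not to decrease the observed-data log-likelihood log p(X | θ).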

Gibbs sampling algorithm for approximate inference in Bayesian networks


• Gibbs sampling is an algorithm for generating a sequence of samples from the joint probability distribution of a set of random variables. The purpose of such a sequence is to approximate the joint distribution (as with a histogram) or to compute an integral (such as an expected value).
• Gibbs sampling is applicable when the joint distribution is not known explicitly, but the conditional
distribution of each variable is known.
• The Gibbs sampling algorithm is used to generate an instance from the distribution of each variable
in turn, conditional on the current values of the other variables.
• It can be shown that the sequence of samples comprises a Markov chain, and the stationary
distribution of that Markov chain is just the sought-after joint distribution.
• Gibbs sampling is particularly well-adapted to sampling the posterior distribution of a Bayesian
network, since Bayesian networks are typically specified as a collection of conditional distributions.
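
A minimal Python sketch on the same illustrative chain A → B → C, with the evidence C = True clamped and the hidden variables A and B resampled in turn from their conditionals given the current values of the others:

# Gibbs sampling on the illustrative chain A -> B -> C with evidence C = True.
import random

P_A = 0.3
P_B = {True: 0.8, False: 0.1}  # P(B = True | A)
P_C = {True: 0.9, False: 0.4}  # P(C = True | B)

def bernoulli(p_true, p_false):
    """Sample True with probability p_true / (p_true + p_false)."""
    return random.random() < p_true / (p_true + p_false)

random.seed(0)
a, b = True, True              # arbitrary starting state; C stays clamped to True
count_a, n, burn_in = 0, 200_000, 1_000
for i in range(n):
    # P(A | b) is proportional to P(A) * P(b | A); C is independent of A given B.
    pa_t = P_A * (P_B[True] if b else 1 - P_B[True])
    pa_f = (1 - P_A) * (P_B[False] if b else 1 - P_B[False])
    a = bernoulli(pa_t, pa_f)
    # P(B | a, C = True) is proportional to P(B | a) * P(C = True | B).
    pb_t = P_B[a] * P_C[True]
    pb_f = (1 - P_B[a]) * P_C[False]
    b = bernoulli(pb_t, pb_f)
    if i >= burn_in:           # discard early samples before the chain mixes
        count_a += a
print("P(A = True | C = True) ~", count_a / (n - burn_in))  # ~0.43

After the burn-in period the samples behave like draws from P(A, B | C = True), so averaging the indicator of A estimates the posterior marginal (about 0.43 for these numbers).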

Important Questions
1. Write the variable elimination algorithm and rejection-sampling algorithm for inference in Bayesian
networks.
2. Write the likelihood-weighting algorithm for inference in Bayesian networks and explain the working
of the algorithm.
3. Explain the Gibbs sampling algorithm for approximate inference in Bayesian networks.
4. Explain the Enumeration algorithm for answering queries on Bayesian networks.
5. What is exact inference in Bayesian networks?
6. Explain the use of Approximate Inference in Bayesian Networks.
7. We have a bag of three biased coins a, b, and c with probabilities of coming up heads of 20%, 60%,
and 80%, respectively. One coin is drawn randomly from the bag (with equal likelihood of drawing
each of the three coins), and then the coin is flipped three times to generate the outcomes X1, X2,
and X3.
Draw the Bayesian network corresponding to this setup and define the necessary CPTs.
Calculate which coin was most likely to have been drawn from the bag if the observed flips come out
heads twice and tails once.
8. A patient has a disease N. Physicians measure the value of a parameter P to monitor the development of the disease. The parameter can take one of the following values: {low, medium, high}. The value of P is a result of the patient's unobservable condition/state S, where S can be {good, poor}. The state changes between two consecutive days in one fifth of cases. If the patient is in good condition, the value of P is rather low (of 10 sample measurements, 5 are low, 3 medium, and 2 high), while if the patient is in poor condition, the value is rather high (of 10 measurements, 3 are low, 3 medium, and 4 high). On arrival at the hospital on day 0, the patient's condition was unknown, i.e., Pr(S0 = good) = 0.5.
Draw the transition and sensor model of the dynamic Bayesian network modelling the domain under
consideration.
Calculate the probability that the patient is in good condition on day 2, given low P values on days 1 and 2.
Can you determine the most likely patient state sequence on days 0, 1, and 2 without any additional computations? Justify.
