AI Unit 2
SYLLABUS:
Acting under uncertainty – Bayesian inference – Naïve Bayes
models. Probabilistic reasoning – Bayesian networks – exact
inference in BN – approximate inference in BN – causal networks.
PART A
1. Define uncertainty and list the causes of uncertainty.
Uncertainty:
• In knowledge representation, the rule A→B means that if A is true then B is true. In a situation where we are not sure whether A is true or not, we cannot express this statement; this situation is called uncertainty.
• To represent such uncertain knowledge, uncertain reasoning or probabilistic reasoning is used.
Causes of uncertainty:
The following are some common causes of uncertainty in the real world:
1. Information obtained from unreliable sources
2. Experimental errors
3. Equipment faults
4. Temperature variation
5. Climate change
7. In a class, 70% of the students like English and 40% of the students like both English and Mathematics. What percentage of the students who like English also like Mathematics?
Solution:
Let A be the event that a student likes Mathematics and B be the event that a student likes English.
P(A|B) = P(A ∧ B) / P(B) = 0.4 / 0.7 ≈ 0.57
Hence, 57% of the students who like English also like Mathematics.
11. Consider two events: A (it will rain tomorrow) and B (the sun will shine tomorrow).
• Use Bayes’ theorem to compute the posterior probability of each event, given today's observed weather conditions. For example, if today is sunny:
P(A|sunny) = P(sunny|A) * P(A) / P(sunny)
P(B|sunny) = P(sunny|B) * P(B) / P(sunny)
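A minimal Python sketch of this Bayes'-rule calculation; the prior, likelihood and evidence values below are assumed purely for illustration (they are not given in the question):

```python
# Bayes' theorem: P(A | sunny) = P(sunny | A) * P(A) / P(sunny)
# All numbers below are illustrative assumptions, not values from the question.
p_rain = 0.3              # prior P(A): it will rain tomorrow
p_sunny_given_rain = 0.2  # likelihood P(sunny today | rain tomorrow)
p_sunny = 0.6             # evidence P(sunny today)

posterior_rain = p_sunny_given_rain * p_rain / p_sunny
print(f"P(rain tomorrow | sunny today) = {posterior_rain:.3f}")  # 0.100
```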
Local Semantics
PART B
1. Explain in detail about acting under uncertainty.
Agents almost never have access to the whole truth about their environment. Agents must, therefore, act under uncertainty.
Consider a simple rule for dental diagnosis:
Toothache ⇒ Cavity
The problem is that this rule is wrong. Not all patients with toothaches have cavities; some of them have gum disease, an abscess, or one of several other problems:
Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ...
Unfortunately, in order to make the rule true, we have to add an almost unlimited list of possible causes. We could try turning the rule into a causal rule:
Cavity ⇒ Toothache
But this rule is not right either; not all cavities cause pain. The only way to fix the rule is to make it logically exhaustive: to augment the left-hand side with all the qualifications required for a cavity to cause a toothache. Even then, for the purposes of diagnosis, one must also take into account the possibility that the patient might have a toothache and a cavity that are unconnected. Trying to use first-order logic to cope with a domain like medical diagnosis thus fails for three main reasons:
• Laziness: It is too much work to list the complete set of antecedents or consequents needed to ensure an exceptionless rule, and too hard to use such rules.
• Theoretical ignorance: Medical science has no complete theory for the domain.
• Practical ignorance: Even if we know all the rules, we might be uncertain about a particular patient because not all the necessary tests have been or can be run.
The agent's knowledge can therefore provide at best a degree of belief in the relevant sentences; for example, the agent might believe that a patient with a toothache has a cavity with probability 0.8.
That is, we expect that out of all the situations that are indistinguishable from the current situation as far as the agent's knowledge goes, the patient will have a cavity in 80% of them. This belief could be derived from statistical data (80% of the toothache patients seen so far have had cavities), from some general rules, or from a combination of evidence sources.
The 80% summarizes those cases in which all the factors needed for a cavity
to cause a toothache are present and other cases in which the patient has both
toothache and cavity but the two are unconnected. The missing 20% summarizes
all the other possible causes of toothache that we are too lazy or ignorant to confirm
or deny.
1.2.3 Probability:
• Probability can be defined as the chance that an uncertain event will occur.
• The value of probability always lies between 0 and 1:
o 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
o P(A) = 0 indicates total uncertainty in an event A (the event will not occur).
o P(A) = 1 indicates total certainty in an event A (the event will surely occur).
• Formula to find the probability of an uncertain event:
P(A) = Number of desired outcomes / Total number of outcomes
1.2.4.1 Example:
In a class, 70% of the students like English and 40% of the students like both English and Mathematics. What percentage of the students who like English also like Mathematics?
Solution:
Let A be the event that a student likes Mathematics and B be the event that a student likes English.
By the definition of conditional probability,
P(A|B) = P(A ∧ B) / P(B) = 0.4 / 0.7 ≈ 0.57
Hence, 57% of the students who like English also like Mathematics.
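A minimal Python sketch of this conditional-probability calculation (the variable names are only illustrative):

```python
# Conditional probability: P(Math | English) = P(Math and English) / P(English)
p_english = 0.70           # P(B): student likes English
p_english_and_math = 0.40  # P(A and B): student likes both subjects

p_math_given_english = p_english_and_math / p_english
print(f"P(Math | English) = {p_math_given_english:.2%}")  # about 57%
```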
2. Explain in detail about Bayesian inference and Naive Bayes Model or Naive
Bayes Theorem or Bayes Rule.
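As an illustration of how Bayes' rule is applied in a naive Bayes classifier, here is a minimal Python sketch; the tiny outlook/play dataset and every count in it are assumed purely for illustration:

```python
from collections import Counter, defaultdict

# Toy training data (illustrative only): each row is (outlook, play_tennis)
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
        ("rain", "yes"), ("rain", "yes"), ("sunny", "yes"),
        ("overcast", "yes"), ("rain", "no")]

# Estimate the prior P(class) and the likelihood P(feature | class) from counts.
class_counts = Counter(label for _, label in data)
feature_counts = defaultdict(Counter)
for outlook, label in data:
    feature_counts[label][outlook] += 1

def posterior(outlook):
    """Return P(class | outlook), proportional to P(outlook | class) * P(class)."""
    scores = {}
    for label, n_label in class_counts.items():
        prior = n_label / len(data)
        likelihood = feature_counts[label][outlook] / n_label
        scores[label] = likelihood * prior
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

print(posterior("sunny"))  # e.g. {'no': 0.67, 'yes': 0.33}
```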
Global Semantics
▪ The global semantics defines the full joint distribution as the product of the local conditional distributions:
P(x1, ..., xn) = Π P(xi | Parents(Xi)), for i = 1, ..., n
Local Semantics
▪ Each node is conditionally independent of its nondescendants given its parents.
Markov Blanket
▪ Each node is conditionally independent of all others given its
Markov blanket: parents + children + children’s parents
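A minimal Python sketch that computes a node's Markov blanket from a parent list; the small example graph is the burglary network described below and is purely illustrative:

```python
# Directed graph given as {node: list of parents} (illustrative example).
parents = {
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "DavidCalls": ["Alarm"], "SophiaCalls": ["Alarm"],
}

def markov_blanket(node):
    """Markov blanket = parents + children + children's other parents."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])
    blanket.discard(node)
    return blanket

print(markov_blanket("Alarm"))  # {'Burglary', 'Earthquake', 'DavidCalls', 'SophiaCalls'}
```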
3.4 Example:
Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably responds to a burglary but also sometimes responds to minor earthquakes. Harry has two neighbors, David and Sophia, who have taken responsibility for informing Harry at work when they hear the alarm. David always calls Harry when he hears the alarm, but sometimes he gets confused with the phone ringing and calls then too. On the other hand, Sophia likes to listen to loud music, so sometimes she misses the alarm. Here we would like to compute the probability of the burglary alarm.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary nor an earthquake has occurred, and both David and Sophia have called Harry.
Solution:
• The Bayesian network for the above problem is given in figure 2.2. The network structure shows that Burglary and Earthquake are the parent nodes of Alarm and directly affect the probability of the alarm going off, whereas David's and Sophia's calls depend on the alarm probability.
Let us take the observed probabilities for the Burglary and Earthquake components:
• P(B=True) = 0.002, which is the probability of a burglary.
• P(B=False) = 0.998, which is the probability of no burglary.
• P(E=True) = 0.001, which is the probability of a minor earthquake.
• P(E=False) = 0.999, which is the probability that an earthquake has not occurred.
From the formula of the joint distribution, the problem statement can be written as the following probability expression:
P(S, D, A, ¬B, ¬E) = P(S|A) × P(D|A) × P(A|¬B ∧ ¬E) × P(¬B) × P(¬E)
= 0.75 × 0.91 × 0.001 × 0.998 × 0.999
= 0.00068045
Hence, a Bayesian network can answer any query about the domain
by using Joint distribution.
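A minimal Python sketch of this joint-probability calculation; it uses only the CPT entries appearing in the numerical computation above (the remaining table entries are not needed for this particular query):

```python
# Joint probability for: Sophia calls, David calls, alarm sounds,
# no burglary, no earthquake.
p_not_b = 0.998                # P(¬B)
p_not_e = 0.999                # P(¬E)
p_a_given_not_b_not_e = 0.001  # P(A | ¬B, ¬E)
p_d_given_a = 0.91             # P(D | A)
p_s_given_a = 0.75             # P(S | A)

joint = p_s_given_a * p_d_given_a * p_a_given_not_b_not_e * p_not_b * p_not_e
print(f"P(S, D, A, not B, not E) = {joint:.8f}")  # ≈ 0.00068045
```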
2. Biomonitoring:
a. This involves the use of indicators to quantify the concentration of
chemicals in the human body.
3. Information retrieval:
a. Bayesian networks assist in information retrieval for research,
which is a constant process of extracting information from
databases.
4. Image processing:
a. A form of signal processing, image processing uses mathematical
operations to convert images into digital format.
5. Gene regulatory network:
a. A Bayesian network is an algorithm that can be applied to gene
regulatory networks in order to make predictions about the effects
of genetic variations on cellular phenotypes.
b. Gene regulatory networks are a set of mathematical equations that
describe the interactions between genes, proteins, and metabolites.
c. They are used to study how genetic variations affect the
development of a cell or organism.
6. Turbo code:
a. Turbo codes are a type of error correction code capable of achieving
very high data rates and long distances between error correcting
nodes in a communications system.
b. They have been used in satellites, space probes, deep-space
missions, military communications systems, and civilian wireless
communication systems, including WiFi and 4G LTE cellular
telephone systems.
7. Document classification:
a. The main task is to assign a document to one or more classes. This can be done manually or algorithmically. Since manual effort takes too much time, algorithmic classification is used to complete the task quickly and effectively.
4. Explain in detail about Bayesian Inference and its type Exact Inference
with suitable example.
The basic task for any probabilistic inference system is to compute the
posterior probability distribution for a set of query variables, given some observed
event-that is, some assignment of values to a set of evidence variables. We will use
the notation X to denote the query variable; E denotes the set of evidence variables E1, ..., Em, and e is a particular observed event; Y denotes the nonevidence variables Y1, ..., Yl (sometimes called the hidden variables). Thus, the complete set of variables is X = {X} ∪ E ∪ Y. A typical query asks for the posterior probability distribution P(X|e).
Inference by enumeration
Any conditional probability can be computed by summing terms from the full joint distribution. More specifically, a query P(X|e) can be answered using the following equation, which we repeat here for convenience:
P(X|e) = α P(X, e) = α Σy P(X, e, y)
Consider the query P(Burglary | JohnCalls = true, MaryCalls = true). The hidden variables for this query are Earthquake and Alarm. Using initial letters for the variables in order to shorten the expressions, we have
P(b | j, m) = α Σe Σa P(b, j, m, e, a) = α Σe Σa P(b) P(e) P(a | b, e) P(j | a) P(m | a)
The structure of this computation is shown in the figure above. Using the numbers from the figure, we obtain P(b | j, m) = α × 0.00059224. The corresponding computation for ¬b yields α × 0.0014919; hence
P(B | j, m) = α ⟨0.00059224, 0.0014919⟩ ≈ ⟨0.284, 0.716⟩
That is, the chance of a burglary, given calls from both neighbors, is about 28%.
The evaluation process for the expression in Equation is shown as an expression
tree in Figure.
If the number of parents of each node is bounded by a constant, then the complexity will also be linear in the number of nodes. These results hold for any ordering consistent with the topological ordering of the network.
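A minimal Python sketch of inference by enumeration on the burglary network; the CPT values below are assumed (the standard textbook figures), since the tables themselves are not reproduced in these notes, and with them the query reproduces the roughly 28% result quoted above:

```python
import itertools

# Burglary-network CPTs (standard textbook values, assumed here for illustration).
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)

def joint(b, e, a, j, m):
    """Full joint probability P(b, e, a, j, m) as a product of CPT entries."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

def query_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m) by summing out the hidden variables."""
    unnorm = {}
    for b in (True, False):
        unnorm[b] = sum(joint(b, e, a, j, m)
                        for e, a in itertools.product((True, False), repeat=2))
    total = sum(unnorm.values())
    return {b: p / total for b, p in unnorm.items()}

print(query_burglary(True, True))  # P(b | j, m) ≈ 0.284
```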
• A network with 2 nodes (fire icon and smoke icon) and 1 edge (arrow
pointing from fire to smoke).
• This network can be interpreted as either a Bayesian network or a causal network.
• The key distinction, however, arises in how we interpret this network.
• For a Bayesian network, we view the nodes as variables and the arrow
as a conditional probability, namely the probability of smoke given
information about fire.
• When interpreting this as a causal network, we still view nodes as
variables, however, the arrow indicates a causal connection.
• In this case, both interpretations are valid. However, if we were to flip the
edge direction, the causal network interpretation would be invalid, since
smoke does not cause fire.
1. The do-operator
• The do-operator is a mathematical representation of a physical
intervention.
• If the model starts with Z → X → Y, we simulate an intervention on X by deleting all the incoming arrows to X and manually setting X to some value x_0. Figure 2.4 shows an example of the do-operator, and a small sketch follows below.
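A minimal Python sketch of the do-operator on the chain Z → X → Y, using a toy linear structural model whose coefficients and noise terms are purely illustrative:

```python
import random

random.seed(0)

def sample(do_x=None):
    """Draw one sample from the toy model Z -> X -> Y.

    If do_x is given, the incoming arrow Z -> X is cut and X is set to do_x,
    which is exactly the graph surgery performed by the do-operator.
    """
    z = random.gauss(0, 1)
    x = 2 * z + random.gauss(0, 1) if do_x is None else do_x
    y = 3 * x + random.gauss(0, 1)
    return z, x, y

# Observational distribution P(Y) vs interventional distribution P(Y | do(X = 1)).
obs = [sample()[2] for _ in range(10_000)]
intv = [sample(do_x=1.0)[2] for _ in range(10_000)]
print(f"E[Y]           ≈ {sum(obs) / len(obs):.2f}")   # about 0
print(f"E[Y | do(X=1)] ≈ {sum(intv) / len(intv):.2f}")  # about 3
```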
2. Confounding
A simple example of confounding is shown in figure 2.5 below.
We begin with a random sampling process for a Bayes net that has no evidence
associated with it. The idea is to sample each variable in turn, in topological order.
The probability distribution from which the value is sampled is conditioned on the
values already assigned to the variable’s parents. (Because we sample in topological
order, the parents are guaranteed to have values already.) This algorithm is shown
in the figure. Applying it to the network with the ordering Cloudy, Sprinkler, Rain, WetGrass, we might produce a random event as follows:
[Cloudy = true, Sprinkler = false, Rain = true, WetGrass = true]
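A minimal Python sketch of prior sampling on the sprinkler network; the CPT values are the usual textbook ones and are assumed here, since the tables are not reproduced in these notes:

```python
import random

# Sprinkler-network CPTs (standard textbook values, assumed for illustration).
P_CLOUDY = 0.5
P_SPRINKLER = {True: 0.1, False: 0.5}                    # P(Sprinkler=true | Cloudy)
P_RAIN = {True: 0.8, False: 0.2}                         # P(Rain=true | Cloudy)
P_WETGRASS = {(True, True): 0.99, (True, False): 0.90,
              (False, True): 0.90, (False, False): 0.0}  # P(WetGrass=true | S, R)

def prior_sample():
    """Sample each variable in topological order, conditioned on its parents."""
    cloudy = random.random() < P_CLOUDY
    sprinkler = random.random() < P_SPRINKLER[cloudy]
    rain = random.random() < P_RAIN[cloudy]
    wet_grass = random.random() < P_WETGRASS[(sprinkler, rain)]
    return cloudy, sprinkler, rain, wet_grass

print(prior_sample())  # e.g. (True, False, True, True)
```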
Let P̂(X|e) be the estimated distribution that the algorithm returns; this distribution is computed by normalizing N_PS(X, e), the vector of sample counts for each value of X where the sample agrees with the evidence e:
P̂(X|e) = α N_PS(X, e) ≈ P(X|e)
Consider the query P(Rain | Sprinkler = true, WetGrass = true) applied to the network in the figure. The evidence variables Sprinkler and WetGrass are fixed to their observed values, and the hidden variables Cloudy and Rain are initialized randomly, let us say to true and false respectively. Thus, the initial state is [true, true, false, true]. Now the following steps are executed repeatedly:
1. Cloudy is sampled, given the current values of its Markov blanket variables: in this case, we sample from P(Cloudy | Sprinkler = true, Rain = false). Suppose the result is Cloudy = false. Then the new current state is [false, true, false, true].
2. Rain is sampled, given the current values of its Markov blanket variables: in this case, we sample from P(Rain | Cloudy = false, Sprinkler = true, WetGrass = true). Suppose this yields Rain = true. The new current state is [false, true, true, true].
Each state visited during this process is a sample that contributes to the estimate for the query variable Rain. If the process visits 20 states where Rain is true and 60 states where Rain is false, then the answer to the query is NORMALIZE(⟨20, 60⟩) = ⟨0.25, 0.75⟩.
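A minimal Python sketch of Gibbs sampling for this query; the sprinkler-network CPT values are the assumed textbook figures, as in the prior-sampling sketch above:

```python
import random

random.seed(1)

# Sprinkler-network CPTs (standard textbook values, assumed for illustration).
P_C = 0.5
P_S = {True: 0.1, False: 0.5}                     # P(Sprinkler=true | Cloudy)
P_R = {True: 0.8, False: 0.2}                     # P(Rain=true | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}  # P(WetGrass=true | S, R)

def bernoulli(p_true, p_false):
    """Sample True with probability p_true / (p_true + p_false)."""
    return random.random() < p_true / (p_true + p_false)

def p_cloudy(c, sprinkler, rain):
    """Unnormalized P(Cloudy=c | its Markov blanket: Sprinkler, Rain)."""
    ps = P_S[c] if sprinkler else 1 - P_S[c]
    pr = P_R[c] if rain else 1 - P_R[c]
    return (P_C if c else 1 - P_C) * ps * pr

def p_rain(r, cloudy, sprinkler, wet_grass):
    """Unnormalized P(Rain=r | its Markov blanket: Cloudy, Sprinkler, WetGrass)."""
    pw = P_W[(sprinkler, r)] if wet_grass else 1 - P_W[(sprinkler, r)]
    return (P_R[cloudy] if r else 1 - P_R[cloudy]) * pw

def gibbs_rain(n_samples=100_000, sprinkler=True, wet_grass=True):
    """Estimate P(Rain=true | Sprinkler, WetGrass) by Gibbs sampling."""
    cloudy, rain = True, False  # arbitrary initial state
    rain_true = 0
    for _ in range(n_samples):
        cloudy = bernoulli(p_cloudy(True, sprinkler, rain),
                           p_cloudy(False, sprinkler, rain))
        rain = bernoulli(p_rain(True, cloudy, sprinkler, wet_grass),
                         p_rain(False, cloudy, sprinkler, wet_grass))
        rain_true += rain
    return rain_true / n_samples

print(f"P(Rain=true | s, w) ≈ {gibbs_rain():.3f}")  # close to the exact value, about 0.32
```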