Aies Unit 3
Prepared By:
Mrs.P. BHAVANI, Asst.Prof / CSE
Mr.S.KUMARAKRISHNAN, Asst.Prof / CSE
SYLLABUS – UNIT 3
Basic Probability Notations – Bayes Rule and its Applications – Bayesian Networks – Hidden Markov Models –
Kalman Filters, Dempster-Shafer Theory.
2 MARKS
14. What is a Hidden Markov Model?
Hidden Markov Models (HMMs) are a class of probabilistic graphical model that allow us to predict a
sequence of unknown (hidden) variables from a set of observed variables. A simple example of an HMM is
predicting the weather (hidden variable) based on the type of clothes that someone wears (observed).
15. What are the different types of Hidden Markov Models?
Hidden Markov models (HMMs) have been extensively used in biological sequence analysis, for a variety of
problems in molecular biology. Three widely used types are profile-HMMs, pair-HMMs, and
context-sensitive HMMs.
16. How do hidden Markov models work?
The Hidden Markov Model (HMM) is a relatively simple way to model sequential data. A hidden Markov
model implies that the Markov Model underlying the data is hidden or unknown to you. More specifically, you
only know observational data and not information about the states.
17. What is Kalman filter in AI?
A Kalman Filter is an algorithm that takes data inputs from multiple sources and estimates unknown variables,
despite a potentially high level of signal noise.
18. What is the Kalman filter used for?
Kalman filters are used to optimally estimate the variables of interest when they can't be measured directly,
but an indirect measurement is available. They are also used to find the best estimate of states by combining
measurements from various sensors in the presence of noise.
19. What is Dempster-Shafer theory in artificial intelligence?
Often used as a method of sensor fusion, Dempster–Shafer theory is based on two ideas: obtaining degrees of
belief for one question from subjective probabilities for a related question, and Dempster's rule for combining
such degrees of belief when they are based on independent items of evidence.
20. What is Dempster-Shafer theory? Compare it with Bayesian reasoning.
Dempster-Shafer theory is a generalization of Bayesian reasoning in which belief mass may be assigned to
sets of outcomes rather than only to single outcomes, so uncertainty and ignorance can be represented
explicitly; unlike Bayesian reasoning, it does not require a complete prior probability distribution.
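The combination rule mentioned above can be made concrete with a short sketch. Below is a minimal Python
illustration of Dempster's rule of combination; the two-element frame of discernment and the sensor mass
values are illustrative assumptions, not part of the notes.

# Minimal sketch of Dempster's rule of combination.
# Subsets of the frame of discernment are represented as frozensets.

def combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb  # mass that falls on the empty set
    # Normalize by the total non-conflicting mass (1 - K)
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Frame of discernment: burglary happened (B) or not (N)
B, N = frozenset("B"), frozenset("N")
BN = B | N  # "don't know" -- mass left uncommitted

sensor1 = {B: 0.6, BN: 0.4}            # evidence from one sensor
sensor2 = {B: 0.5, N: 0.2, BN: 0.3}    # evidence from a second, independent sensor

print(combine(sensor1, sensor2))  # belief in B rises, some mass stays uncommitted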
5 MARKS
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the concept of probability to
indicate the uncertainty in knowledge.
In probabilistic reasoning, we combine probability theory with logic to handle the uncertainty.
We use probability in probabilistic reasoning because it provides a way to handle the uncertainty that is the
result of someone's laziness and ignorance.
In the real world, there are many scenarios where the certainty of something is not confirmed, such as "It
will rain today," "the behaviour of someone in some situation," or "a match between two teams or two players."
These are probable sentences: we can assume that they will happen, but we cannot be sure, so here we
use probabilistic reasoning.
Need of probabilistic reasoning in AI:
When there are unpredictable outcomes.
When the specifications or possibilities of predicates become too large to handle.
When an unknown error occurs during an experiment.
In probabilistic reasoning, there are two ways to solve problems with uncertain knowledge: Bayes' rule and
Bayesian statistics.
As probabilistic reasoning uses probability and related terms, let us first understand some common terms
before going further:
Probability:
Probability can be defined as the chance that an uncertain event will occur. It is the numerical measure of
the likelihood that an event will occur. The value of probability always lies between 0 and 1:
0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
P(A) = 0 indicates total uncertainty in an event A.
P(A) = 1 indicates total certainty in an event A.
Disadvantages:
The computational effort is high, as we have to deal with 2^n sets.
Bayes' theorem:
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning, which determines the probability
of an event with uncertain knowledge.
In probability theory, it relates the conditional probability and marginal probabilities of two random events.
Bayes' theorem was named after the British mathematician Thomas Bayes. The Bayesian inference is an
application of Bayes' theorem, which is fundamental to Bayesian statistics.
Bayes' theorem allows updating the probability prediction of an event by observing new information of the
real world.
Example: If cancer corresponds to one's age then by using Bayes' theorem, we can determine the
probability of cancer more accurately with the help of age.
Bayes' theorem can be derived using the product rule and the conditional probability of event A given event
B:
From the product rule we can write:
P(A ⋀ B) = P(A|B) P(B)
Similarly, the probability of event B given event A:
P(A ⋀ B) = P(B|A) P(A)
Equating the right-hand sides of both equations, we get:
P(A|B) = P(B|A) P(A) / P(B) ...... (a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern
AI systems for probabilistic inference.
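A minimal Python sketch of equation (a), using the cancer/age example above; all numbers below are
illustrative assumptions, not real statistics.

def bayes(p_b_given_a, p_a, p_b):
    """Equation (a): P(A|B) = P(B|A) * P(A) / P(B)"""
    return p_b_given_a * p_a / p_b

p_cancer = 0.01            # prior P(cancer), assumed
p_old_given_cancer = 0.80  # likelihood P(age > 65 | cancer), assumed
p_old = 0.20               # evidence P(age > 65), assumed

# Posterior P(cancer | age > 65) = 0.80 * 0.01 / 0.20 = 0.04
print(bayes(p_old_given_cancer, p_cancer, p_old))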
10 MARKS
Given the available evidence, A25 will get me there on time with probability 0.04.
(Fuzzy logic handles degree of truth, NOT uncertainty; e.g., WetGrass is true to degree 0.2.)
These are not claims of a "probabilistic tendency" in the current situation (but might be learned from past
experience of similar situations).
Probabilistic Reasoning
Using logic to represent and reason about the world, we can encode knowledge as facts and rules, like the
following:
bird(tweety).          % tweety is a bird
fly(X) :- bird(X).     % anything that is a bird can fly
We can also use a theorem prover to reason about the world and deduce new facts, e.g.,
?- fly(tweety).
Yes
However, this often does not work outside of toy domains: non-tautologous, certain rules are hard to find. A
way to handle knowledge representation in real problems is to extend logic with certainty factors. In other
words, replace
IF condition THEN fact
with
IF condition with certainty x THEN fact with certainty f(x)
Unfortunately, we cannot really adapt logical inference to probabilistic inference, since the latter is not
context-free. Replacing rules with conditional probabilities makes inferencing simpler.
Replace smoking -> lung cancer
or
lots of conditions, smoking -> lung cancer
with
P(lung cancer | smoking) = 0.6
Uncertainty is represented explicitly and quantitatively within probability theory, a formalism that has been
developed over centuries.
A probabilistic model describes the world in terms of a set S of possible states - the sample space. We don’t
know the true state of the world, so we (somehow) come up with a probability distribution over S which gives
the probability of any state being the true one.
The world is usually described by a set of variables or attributes. Consider the probabilistic model of a fictitious
medical expert system. The 'world' is described by 8 binary-valued variables:
Visit to Asia? A
Tuberculosis? T
Either tub. or lung cancer? E
Lung cancer? L
Smoking? S
Bronchitis? B
Dyspnoea? D
Positive X-ray? X
We have 2^8 = 256 possible states or configurations, and so 256 probabilities to find.
Review of Probability Theory
The primitives in probabilistic reasoning are random variables, just like the primitives in Propositional
Logic are propositions.
A random variable is not in fact a variable, but a function from a sample space S to another space,
often the real numbers. For example, let the random variable Sum (representing the outcome of two die throws)
be defined thus:
Sum(die1, die2) = die1 + die2
Each random variable has an associated probability distribution determined by the underlying distribution on
the sample space.
Continuing our example: P(Sum = 2) = 1/36, P(Sum = 3) = 2/36, ..., P(Sum = 12) = 1/36
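A short Python sketch can recover this distribution directly from the sample space of 36 equally likely
outcomes:

from collections import Counter

# Count how many of the 36 equally likely (die1, die2) outcomes map to each Sum value
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
for s, c in sorted(counts.items()):
    print(f"P(Sum = {s}) = {c}/36")   # P(Sum = 2) = 1/36, ..., P(Sum = 12) = 1/36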
Consider the probabilistic model of the fictitious medical expert system mentioned before. The sample space
is described by 8 binary valued variables.
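A quick Python sketch (the variable names are chosen here purely for illustration) confirms the size of this
state space:

from itertools import product

variables = ["A", "T", "E", "L", "S", "B", "D", "X"]   # the 8 binary variables
states = list(product([False, True], repeat=len(variables)))
print(len(states))   # 256 -- one probability needed per configuration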
Syntax:
In the simplest case, a conditional distribution is represented as a conditional probability table (CPT), giving
the distribution over Xi for each combination of parent values.
Example
I'm at work; neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the
alarm is set off by minor earthquakes. Is there a burglar?
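A minimal Python enumeration sketch for this example follows. The network structure is Burglary and
Earthquake causing Alarm, which causes JohnCalls and MaryCalls; the CPT numbers used are the usual
textbook values, which the notes above do not list, so treat them as assumptions.

from itertools import product

P_B = {True: 0.001, False: 0.999}                     # P(Burglary)
P_E = {True: 0.002, False: 0.998}                     # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}    # P(Alarm=true | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(JohnCalls=true | A)
P_M = {True: 0.70, False: 0.01}                       # P(MaryCalls=true | A)

def joint(b, e, a, j, m):
    """Full joint probability of one configuration, via the chain rule."""
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

# P(Burglary | JohnCalls = true, MaryCalls = false), by enumeration
num = sum(joint(True, e, a, True, False) for e, a in product([True, False], repeat=2))
den = sum(joint(b, e, a, True, False) for b, e, a in product([True, False], repeat=3))
print(num / den)  # roughly 0.005: a single call is weak evidence of a burglary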
A Hidden Markov Model is a temporal probabilistic model in which the state of the process is described by a
single discrete random variable.
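As a concrete illustration, here is a minimal Python forward-filtering sketch for an umbrella/weather HMM;
the transition and sensor probabilities are illustrative assumptions.

states = ["Rain", "Sunny"]
prior = {"Rain": 0.5, "Sunny": 0.5}
trans = {"Rain": {"Rain": 0.7, "Sunny": 0.3},          # transition model (assumed)
         "Sunny": {"Rain": 0.3, "Sunny": 0.7}}
emit = {"Rain": {"umbrella": 0.9, "no_umbrella": 0.1},  # sensor model (assumed)
        "Sunny": {"umbrella": 0.2, "no_umbrella": 0.8}}

def forward(observations):
    """Filtered belief over the hidden state after each observation."""
    belief = dict(prior)
    for obs in observations:
        # Predict: push the belief through the transition model,
        # then update: weight each state by the evidence likelihood.
        belief = {s: emit[s][obs] * sum(trans[p][s] * belief[p] for p in states)
                  for s in states}
        z = sum(belief.values())
        belief = {s: v / z for s, v in belief.items()}  # normalize
    return belief

print(forward(["umbrella", "umbrella"]))  # Rain is now much more likely (about 0.88)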
A Kalman Filter is an algorithm that takes data inputs from multiple sources and estimates unknown variables,
despite a potentially high level of signal noise. Often used in navigation and control technology, the Kalman Filter
has the advantage of being able to predict unknown values more accurately than if individual predictions are made
using singular methods of measurement.
• If the noise is Gaussian: the Kalman filter minimizes the mean square error of the estimated parameters.
• If the noise is NOT Gaussian: the Kalman filter is still the best linear estimator, but nonlinear estimators
may be better.
• Gauss-Markov theorem: optimal among all linear, unbiased estimators.
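The update/predict cycle can be illustrated with a one-dimensional Python sketch that estimates a constant
true value from noisy scalar readings; all numbers below are illustrative assumptions.

def kalman_1d(measurements, meas_var, init_mean=0.0, init_var=1000.0):
    """Return the sequence of (mean, variance) estimates after each reading."""
    mean, var = init_mean, init_var
    history = []
    for z in measurements:
        # Update step: blend the prediction and the measurement by their precisions.
        k = var / (var + meas_var)        # Kalman gain
        mean = mean + k * (z - mean)
        var = (1 - k) * var
        # Predict step: the state is static here, so only a little process noise.
        var += 0.0001
        history.append((mean, var))
    return history

readings = [5.1, 4.9, 5.2, 5.0, 4.8]      # noisy sensor readings around a true value of 5
for mean, var in kalman_1d(readings, meas_var=0.25):
    print(f"estimate = {mean:.3f}, variance = {var:.4f}")

Note how the variance shrinks with every reading: the filter grows more confident as it fuses more noisy
measurements, which is exactly the behaviour described in the paragraph above.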