Unit III: Reasoning Under Uncertainty: Logics of Non-Monotonic Reasoning - Implementation-Basic
1. What is uncertainty? Explain.
Uncertainty
Problems:
"A25 will get me there on time if there's no accident on the bridge and it doesn't rain and my tires remain intact, etc." (Here A25 stands for the plan of leaving for the airport 25 minutes before the flight.)
A monotonic logic cannot handle various reasoning tasks such as reasoning by default
(consequences may be derived only because of a lack of evidence to the contrary), abductive
reasoning (consequences are deduced only as the most likely explanations), and some important
approaches to reasoning about knowledge.
Default reasoning
An example of a default assumption is that the typical bird flies. As a result, if a given
animal is known to be a bird, and nothing else is known, it can be assumed to be able to fly.
The default assumption must however be retracted if it is later learned that the considered
animal is a penguin. This example shows that a logic that models default reasoning should not
be monotonic.
Logics formalizing default reasoning can be roughly divided into two categories: logics
able to deal with arbitrary default assumptions (default logic, defeasible logic/defeasible
reasoning, and answer set programming) and logics that formalize the specific default
assumption that facts which are not known to be true can be assumed false by default
(the closed world assumption and circumscription).
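As a rough illustration of this retract-on-new-knowledge behaviour, here is a minimal Python sketch (the fact names and the can_fly helper are hypothetical, introduced only for this example):

# Minimal sketch of default reasoning: a default conclusion is drawn,
# then retracted when new knowledge (an exception) is learned.
def can_fly(facts):
    if "penguin" in facts:          # a known exception overrides the default
        return False
    if "bird" in facts:             # default rule: birds typically fly
        return True
    return None                     # nothing is known either way

facts = {"bird"}                    # Tweety is known to be a bird
print(can_fly(facts))               # True  (default conclusion)
facts.add("penguin")                # we later learn Tweety is a penguin
print(can_fly(facts))               # False (the earlier conclusion is retracted)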
Abductive reasoning
Abductive reasoning is the process of deriving the most likely explanations of the
known facts. An abductive logic should not be monotonic because the most likely
explanations are not necessarily correct.
For example, the most likely explanation for seeing wet grass is that it rained;
however, this explanation has to be retracted when learning that the real cause of the grass
being wet was a sprinkler. Since the old explanation (it rained) is retracted because of the
addition of a piece of knowledge (a sprinkler was active), any logic that models explanations
is non-monotonic.
Reasoning about knowledge
If a logic includes formulae that mean that something is not known, this logic should not
be monotonic. Indeed, learning something that was previously not known leads to the removal of
the formula specifying that this piece of knowledge is not known. This second change (a removal
caused by an addition) violates the condition of monotonicity. A logic for reasoning about
knowledge is the autoepistemic logic.
Belief revision
Belief revision is the process of changing beliefs to accommodate a new belief that might be
inconsistent with the old ones. In the assumption that the new belief is correct, some of the old
ones have to be retracted in order to maintain consistency. This retraction in response to an
addition of a new belief makes any logic for belief revision non-monotonic. The belief
revision approach is an alternative to paraconsistent logics, which tolerate inconsistency rather
than attempting to remove it.
(Fuzzy logic handles degree of truth, NOT uncertainty; e.g., WetGrass is true to degree 0.2.)
Probability statements, by contrast, are degrees of belief; they are not claims of a "probabilistic
tendency" in the current situation (but might be learned from past experience of similar situations).
Probabilistic Reasoning
Using logic to represent and reason about the world, we can represent knowledge
with facts and rules like the following:
bird(tweety).        % Tweety is a bird.
fly(X) :- bird(X).   % Anything that is a bird can fly.
We can also use a theorem prover to reason about the world and deduce new facts about the
world, e.g.,
?- fly(tweety).
Yes
However, this often does not work outside of toy domains: non-tautologous, certain rules
are hard to find. A way to handle knowledge representation in real problems is to extend logic by
using certainty factors. In other words, replace
IF condition THEN fact
with
IF condition with certainty x THEN fact with certainty f(x)
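As a minimal sketch of this idea (the propagation function f(x) = x * rule_cf is an assumed MYCIN-style choice, not prescribed by the text):

# Hypothetical certainty-factor rule: the conclusion's certainty is the
# condition's certainty attenuated by the certainty attached to the rule.
def apply_rule(condition_cf, rule_cf):
    return condition_cf * rule_cf

# IF smoking (certainty 0.8) THEN lung_damage, with rule certainty 0.6:
print(apply_rule(0.8, 0.6))   # 0.48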
Unfortunately, we cannot really adapt logical inference to probabilistic inference,
since the latter is not context-free. Replacing rules with conditional probabilities makes
inference simpler.
Replace smoking -> lung cancer
or
lots of conditions, smoking -> lung cancer
with
P(lung cancer | smoking) = 0.6
Uncertainty is represented explicitly and quantitatively within probability theory,
a formalism that has been developed over centuries.
A probabilistic model describes the world in terms of a set S of possible states -
the sample space. We don't know the true state of the world, so we (somehow) come up
with a probability distribution over S which gives the probability of any state being the
true one.
The world is usually described by a set of variables or attributes. Consider the
probabilistic model of a fictitious medical expert system. The 'world' is described by 8
binary valued variables:
Visit to Asia? A
Tuberculosis? T
Either tub. or lung cancer? E
Lung cancer? L
Smoking? S
Bronchitis? B
Dyspnoea? D
Positive X-ray? X
We have 2^8 = 256 possible states or configurations and so 256 probabilities to find.
Review of Probability Theory
The primitives in probabilistic reasoning are random variables, just as the primitives in
propositional logic are propositions.
A random variable is not in fact a variable, but a function from a sample space S to
another space, often the real numbers. For example, let the random variable Sum (representing
outcome of two die throws) be defined thus:
Sum(die1, die2) = die1 +die2
Each random variable has an associated probability distribution determined by the underlying
distribution on the sample space
Continuing our example: P(Sum = 2) = 1/36,
P(Sum = 3) = 2/36, . . . , P(Sum = 12) = 1/36
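The following short Python sketch derives these values by enumerating the 36 equally likely outcomes (a plain enumeration, purely for illustration):

# Distribution of Sum(die1, die2) = die1 + die2 over the 36 outcomes.
from collections import Counter

counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
for s in range(2, 13):
    print(f"P(Sum = {s}) = {counts[s]}/36")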
Consider the probabilistic model of the fictitious medical expert system mentioned before. The
sample space is described by the same 8 binary valued variables listed above.
There are 2^8 = 256 events in the sample space. Each event is determined by a joint instantiation
of all of the variables.
Each of the random variables {A, T, E, L, S, B, D, X} has its own distribution, determined by
the underlying joint distribution. This is known as the marginal distribution. For example, the
distribution for L is denoted P(L), and this distribution is defined by the two probabilities
P(L = f) and P(L = t). For example,
P(L = f)
= P(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = f)
+ P(A = f, T = f,E = f,L = f, S = f,B = f,D = f,X = t)
+ P(A = f, T = f,E = f,L = f, S = f,B = f,D = t,X = f)
+ . . .
+ P(A = t, T = t, E = t, L = f, S = t, B = t, D = t, X = t)
P(L) is an example of a marginal distribution.
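A small Python sketch of this marginalization (the joint table here is filled with arbitrary made-up numbers, since the document does not give the actual 256 probabilities):

# Marginalising P(L) out of a joint distribution over 8 binary variables.
import itertools, random

variables = ["A", "T", "E", "L", "S", "B", "D", "X"]
states = list(itertools.product([False, True], repeat=len(variables)))

random.seed(0)
weights = [random.random() for _ in states]
total = sum(weights)
joint = {s: w / total for s, w in zip(states, weights)}   # probabilities sum to 1

L_index = variables.index("L")
p_L_false = sum(p for s, p in joint.items() if not s[L_index])
print("P(L = f) =", round(p_L_false, 4))
print("P(L = t) =", round(1 - p_L_false, 4))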
Bayesian networks
A Bayesian (belief) network represents a joint distribution compactly: each node is a random
variable with a conditional distribution given its parents in the network.
Syntax: In the simplest case, each conditional distribution is represented as a conditional
probability table (CPT) giving the distribution over Xi for each combination of parent values.
The topology of the network encodes conditional independence assertions.
Example: I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't
call. Sometimes the alarm is set off by minor earthquakes. Is there a burglar?
Compactness
Deciding conditional independence is hard in noncausal directions (causal models and conditional
independence seem hardwired for humans!), which is one reason networks are usually built along
causal directions; such networks tend to be compact, since each node then needs only a few parents.
Continuous variables
In a linear Gaussian (LG) model, the mean of Cost varies linearly with Harvest and the variance is
fixed. Linear variation is unreasonable over the full range, but works OK if the likely range of
Harvest is narrow. An all-continuous network with LG distributions has a full joint distribution
that is a multivariate Gaussian. A discrete + continuous LG network is a conditional Gaussian
network, i.e., a multivariate Gaussian over the continuous variables for each combination of values
of the discrete variables.
Inference by enumeration
A slightly intelligent way to sum out variables from the joint without actually constructing its
explicit representation.
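The sketch below illustrates inference by enumeration on the burglary/alarm story above. The CPT numbers are the usual textbook-style illustrative values, assumed here for the example rather than taken from this document:

# Inference by enumeration for the burglary network.
P_B = 0.001                                      # P(Burglary = true)
P_E = 0.002                                      # P(Earthquake = true)
P_A = {(True, True): 0.95, (True, False): 0.94,  # P(Alarm = true | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                  # P(JohnCalls = true | Alarm)
P_M = {True: 0.70, False: 0.01}                  # P(MaryCalls = true | Alarm)

def joint(b, e, a, j, m):
    # Chain rule, using the conditional independences encoded by the network.
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def query_burglary(j, m):
    # Sum out the hidden variables (Earthquake, Alarm) for each value of
    # Burglary, then normalise over the two values.
    vals = [True, False]
    unnorm = [sum(joint(b, e, a, j, m) for e in vals for a in vals) for b in vals]
    z = sum(unnorm)
    return {b: p / z for b, p in zip(vals, unnorm)}

# John calls, Mary does not call:
print(query_burglary(j=True, m=False))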
Dempster–Shafer theory
In a narrow sense, the term Dempster–Shafer theory refers to the original conception of
the theory by Dempster and Shafer. However, it is more common to use the term in the wider
sense of the same general approach, as adapted to specific kinds of situations. In particular, many
authors have proposed different rules for combining evidence, often with a view to handling
conflicts in evidence better.
Dempster–Shafer theory is based on two ideas: obtaining degrees of belief for one
question from subjective probabilities for a related question, and Dempster's rule for combining
such degrees of belief when they are based on independent items of evidence.
These degrees of belief may or may not have the mathematical properties of probabilities;
how much they differ depends on how closely the two questions are related. Put another way, it
is a way of representing epistemic plausibilities, but it can yield answers that contradict those
arrived at using probability theory.
In essence, the degree of belief in a proposition depends primarily upon the number of
answers (to the related questions) containing the proposition, and the subjective probability of
each answer. Also contributing are the rules of combination that reflect general assumptions
about the data.
Dempster–Shafer theory assigns its masses to all of the non-empty subsets of the set of entities
that compose a system.
belief ≤ plausibility.
Belief in a hypothesis is constituted by the sum of the masses of all sets enclosed by it (i.e., the
sum of the masses of all subsets of the hypothesis).
It is the amount of belief that directly supports a given hypothesis at least in part, forming a
lower bound. Belief (usually denoted Bel) measures the strength of the evidence in favor of a set
of propositions. It ranges from 0 (indicating no evidence) to 1 (denoting certainty).
Plausibility is 1 minus the sum of the masses of all sets whose intersection with the
hypothesis is empty. It is an upper bound on the possibility that the hypothesis could be true, i.e.
it "could possibly be the true state of the system" up to that value, because there is only so much
evidence that contradicts that hypothesis.
In the classic example of a cat in a closed box that may be alive or dead, the remaining mass of
0.3 (the gap between the 0.5 supporting evidence on the one hand, and the 0.2 contrary evidence
on the other) is "indeterminate," meaning that the cat could either
be dead or alive. This interval represents the level of uncertainty based on the evidence in your
system.
The null hypothesis is set to zero by definition (it corresponds to "no solution").
The orthogonal hypotheses "Alive" and "Dead" have masses of 0.2 and 0.5,
respectively. This could correspond to "Live/Dead Cat Detector" signals, which have
respective reliabilities of 0.2 and 0.5. Finally, the all-encompassing "Either" hypothesis
(which simply acknowledges there is a cat in the box) picks up the slack so that the sum
of the masses is 1.
The belief for the "Alive" and "Dead" hypotheses matches their corresponding
masses because they have no subsets other than themselves; belief for "Either" consists of the
sum of all three masses (Either, Alive, and Dead) because "Alive" and "Dead" are each subsets of
"Either". The "Alive" plausibility is 1 − m(Dead) and the "Dead" plausibility is 1 − m(Alive).
Finally, the "Either" plausibility sums m(Alive) + m(Dead) + m(Either). The universal
hypothesis ("Either") will always have 100% belief and plausibility; it acts as a checksum of sorts.
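A short Python sketch of these calculations (hypotheses are represented as frozensets; the masses are the ones used above):

# Belief and plausibility for the cat example.
#   Bel(H) = sum of masses of all non-empty subsets of H
#   Pl(H)  = sum of masses of all sets that intersect H
ALIVE, DEAD = "Alive", "Dead"
m = {frozenset({ALIVE}): 0.2,
     frozenset({DEAD}): 0.5,
     frozenset({ALIVE, DEAD}): 0.3}        # "Either" picks up the slack

def bel(h):
    return sum(p for s, p in m.items() if s <= h)

def pl(h):
    return sum(p for s, p in m.items() if s & h)

for h in (frozenset({ALIVE}), frozenset({DEAD}), frozenset({ALIVE, DEAD})):
    print(sorted(h), "belief =", round(bel(h), 3), "plausibility =", round(pl(h), 3))
# Alive:  belief 0.2, plausibility 0.5 (= 1 - m(Dead))
# Dead:   belief 0.5, plausibility 0.8 (= 1 - m(Alive))
# Either: belief 1.0, plausibility 1.0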
Here is a somewhat more elaborate example where the behavior of belief and plausibility begins
to emerge. We're looking through a variety of detector systems at a single faraway signal light,
which can only be coloured in one of three colours (red, yellow, or green).
Combining beliefs
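Since the document does not reproduce the detector table for this example, here is a minimal Python sketch of Dempster's rule of combination itself, with two illustrative (made-up) mass functions over {red, yellow, green}:

# Dempster's rule of combination for two independent mass functions.
#   K      = sum of m1(B) * m2(C) over all B, C with empty intersection
#   m12(A) = (1 / (1 - K)) * sum of m1(B) * m2(C) over all B, C with B ∩ C = A
def combine(m1, m2):
    combined, conflict = {}, 0.0
    for b, p1 in m1.items():
        for c, p2 in m2.items():
            inter = b & c
            if inter:
                combined[inter] = combined.get(inter, 0.0) + p1 * p2
            else:
                conflict += p1 * p2                 # conflicting evidence K
    return {a: p / (1.0 - conflict) for a, p in combined.items()}

R, Y, G = "red", "yellow", "green"
m1 = {frozenset({R}): 0.35, frozenset({R, Y}): 0.25, frozenset({R, Y, G}): 0.40}
m2 = {frozenset({Y}): 0.30, frozenset({G}): 0.20, frozenset({R, Y, G}): 0.50}
print(combine(m1, m2))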
Fuzzy logic
Fuzzy logic has been extended to handle the concept of partial truth, where the
truth value may range between completely true and completely false. Furthermore,
when linguistic variables are used, these degrees may be managed by specific functions.
Overview
The reasoning in fuzzy logic is similar to human reasoning. It allows for approximate
values and inferences as well as incomplete or ambiguous data (fuzzy data) as opposed to only
relying on crisp data (binary yes/no choices). Fuzzy logic is able to process incomplete data and
provide approximate solutions to problems that other methods find difficult to solve.
Degrees of truth
Fuzzy logic and probabilistic logic are mathematically similar – both have truth values
ranging between 0 and 1 – but conceptually distinct, due to different interpretations—see
interpretations of probability theory. Fuzzy logic corresponds to "degrees of truth", while
probabilistic logic corresponds to "probability, likelihood"; as these differ, fuzzy logic and
probabilistic logic yield different models of the same real-world situations.
Both degrees of truth and probabilities range between 0 and 1 and hence may seem
similar at first. For example, let a 100 ml glass contain 30 ml of water. Then we may consider
two concepts: Empty and Full. The meaning of each of them can be represented by a certain
fuzzy set. Then one might define the glass as being 0.7 empty and 0.3 full.
Figure: fuzzy-logic temperature. In this image, the meanings of the expressions cold, warm, and hot
are represented by functions mapping a temperature scale. A point on that scale has three "truth
values", one for each of the three functions. The vertical line in the image represents a particular
temperature that the three arrows (truth values) gauge. Since the red arrow points to zero, this
temperature may be interpreted as "not hot". The orange arrow (pointing at 0.2) may describe it as
"slightly warm" and the blue arrow (pointing at 0.8) "fairly cold".
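A small Python sketch of such membership functions (the breakpoints 20 and 25 and the linear shapes are illustrative assumptions, not values given in the text):

# Illustrative membership functions for cold, warm and hot.
def cold(t):
    return max(0.0, min(1.0, (20.0 - t) / 15.0))

def hot(t):
    return max(0.0, min(1.0, (t - 25.0) / 15.0))

def warm(t):
    # whatever degree is left between the two extremes
    return max(0.0, 1.0 - cold(t) - hot(t))

t = 8.0
print(round(cold(t), 2), round(warm(t), 2), round(hot(t), 2))   # 0.8 0.2 0.0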
Linguistic variables
A linguistic variable such as age may have a value such as young or its antonym old.
However, the great utility of linguistic variables is that they can be modified via linguistic hedges
applied to primary terms. The linguistic hedges can be associated with certain functions.
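As a sketch of how hedges can be modelled as functions (the concentration/dilation operators below are conventional choices, assumed here rather than taken from the text):

# Linguistic hedges as functions applied to a primary fuzzy term.
import math

def very(membership):             # concentration: makes the term stricter
    return membership ** 2

def somewhat(membership):         # dilation: makes the term looser
    return math.sqrt(membership)

young = 0.8                       # degree to which some age counts as "young"
print(very(young))                # 0.64  -> "very young"
print(round(somewhat(young), 3))  # 0.894 -> "somewhat young"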