Module 4

The document discusses uncertainty and methods for reasoning under uncertainty. It covers sources of uncertainty like incomplete knowledge, imprecise language, and conflicting information. It introduces probability theory and Bayesian networks as quantitative methods to represent uncertainty. It discusses how probability, combined with representing preferences as utilities, allows for rational decision making by choosing the option with the highest expected utility. The maximum expected utility principle provides a standard for rational behavior under uncertainty.

Uploaded by

GUNEET SURA

Uncertain Knowledge and Reasoning
By Dr. Sonali Patil
Reasoning Under Uncertainty
• Uncertainty
• Sources of uncertainty
• Methods to handle Uncertainty
• Probability Theory
• Uncertainty and Rational Decisions
• Basic Probability Notations
• Probability Axioms
• Independence
• Bayes Rule
• Bayesian Networks
Uncertainty
Reasoning under uncertainty.
• There are different types of uncertainty. In what ways can we deal with them?
• The doorbell problem:
• The doorbell rang at 12 o'clock at midnight.
• Questions to answer:
• Was someone at the door?
• Mohan was sleeping in the room. Did Mohan wake up when the doorbell rang?
• The known fact is that the doorbell rang at 12 o'clock at midnight. Let us place the propositions in logical form.
Uncertainty
• Given Doorbell, can we infer AtDoor(x) from the rule AtDoor(x) → Doorbell?
• Deductive reasoning / normal implication (p implies q: if p is true, q is necessarily true; if p is false, q may or may not be true) does not license this inference.
• Abductive reasoning does: p implies q, and we observe q to be true, so we infer p. This is right most of the time, but not always, because there are other (though rare) possible causes:
• A short circuit
• Wind
• A dog or other animal pressed the button
Uncertainty
• Given Doorbell, can we say Wake(Mohan), given the rule Doorbell → Wake(Mohan)?
• Using deductive reasoning: yes, if the rule is always true.
• However, it may not always hold (Mohan may be tired and in a sound sleep).
Hence, we cannot answer either question with certainty.
Uncertainty

Planning Example
• Let action A(t) denote leaving for the airport t minutes before the
flight
– For a given value of t, will A(t) get me there on time?
• Problems:
– Partial observability (roads, other drivers’ plans, etc.)
– Noisy sensors (traffic reports)
– Uncertainty in action outcomes (flat tire, etc.)
– Immense complexity of modelling and predicting traffic

Uncertainty
• Diagnosis always involves uncertainty.
• E.g., dental diagnosis (toothache):
Toothache → Cavity
This rule is wrong, as not all people with toothaches have a cavity; the pain may be due to other problems:
Toothache → Cavity ∨ GumProblem ∨ Abscess ∨ …
To complete the list, we would have to add an almost unlimited list of possible problems.
The causal rule
Cavity → Toothache
is not right either: not all cavities cause pain.
Uncertainty
• Trying to cope with domains like medical diagnosis using logic fails for 3 main reasons:
 Laziness:
It is too hard to list all the antecedents and consequents needed to ensure an exception-less rule, and too hard to use such rules.
 Theoretical ignorance:
Domains like medical science have no complete theory.
 Practical ignorance:
Even if all the rules are known, we may be uncertain about a particular patient; not all tests have been or can be run.
Uncertainty
• Problems like the doorbell and diagnosis examples are very common in the real world.
• In AI, we need to reason under such circumstances.
• We solve such problems by properly modelling uncertainty and imprecision and by developing appropriate reasoning techniques.
Sources of uncertainty
• Implications may be weak
• Imprecise language, like "often", "rarely", "sometimes"
• We need to quantify these terms as frequencies
• We need to design rules for reasoning with these frequencies
• Precise information (input) may be too complex
• Too many antecedents or consequents
• Incomplete knowledge
• We may not know or be able to guess all the possible antecedents or consequents
• E.g., the bell rang due to some spooky reason

Sources of uncertainty
• Conflicting Information
• A patient with complicated symptoms may get differing diagnoses from two different doctors, if the symptoms do not point to a very obvious disease

• Propagation of uncertainties
• When uncertain knowledge is propagated in the absence of known interdependencies, the uncertainty of the conclusions increases
Methods of handling
Uncertainty
• Fuzzy Logic
• Extends traditional 2-valued logic to a continuous logic (truth values from 0 to 1)
• Although it was developed early on to handle natural-language ambiguities such as "you are very tall", it has been applied more successfully to device controllers
• Probabilistic Reasoning
• Uses probabilities as part of the data and applies Bayes' theorem or its variants to reason about what is most likely
• Hidden Markov Models
• A variant of probabilistic reasoning in which internal states are not observable (hence "hidden")
• Certainty Factors and Qualitative Fuzzy Logics
• More ad hoc (non-formal) approaches that may be more flexible, or at least more human-like (e.g., the MYCIN expert system)
• Neural Networks
Uncertainty tradeoffs
• Bayesian networks: Nice theoretical properties combined with efficient reasoning make BNs very popular; limited expressiveness and knowledge-engineering challenges may limit their use
• Non-monotonic logic: Represents commonsense reasoning, but can be computationally very expensive
• Certainty factors: Not semantically well founded
• Fuzzy reasoning: Semantics are unclear (fuzzy!), but it has proved very useful in commercial applications
Probability Theory
• Deals with degrees of belief
• Provides a way of summarizing the uncertainty that comes from our laziness and ignorance, thereby solving the qualification problem (specifying all exceptions)
• A90 will get us to the airport on time, as long as the car doesn't break down or run out of gas, there is no accident, no accident on the bridge, the plane doesn't leave early, no meteorite hits the car, and so on
• Toothache → Cavity with probability 0.8
• That is, the probability that the patient has a cavity, given that she has a toothache, is 0.8
Probability Theory
• Consider previous statement: “The probability that the
patient has a cavity, given that she has a toothache is 0.8”
• If we later learn that patient has a history of gum disease we
can say “The probability that the patient has a cavity, given
that she has a toothache and a history of gum disease, is 0.4”
• If further we gather evidence, we can say “The probability that
the patient has a cavity, given all we know now, is almost zero”
• The above three statements do not contradict each other; each is a separate assertion about a different knowledge state
Uncertainty and rational
decisions
• Say A90 has a 92% chance of catching our flight. Is it a rational choice? Not necessarily.
• A180 has a higher probability of arriving on time. If it is vital not to miss the flight, it is worth risking the longer wait at the airport.
• A1440 almost guarantees arriving on time, but I'd have to stay overnight at the airport (an intolerable wait, and perhaps an unpleasant diet of airport food).
• To make choices, an agent must have preferences between the different possible outcomes of the various plans.
• Utility Theory is used to represent and reason with preferences.
Uncertainty and rational
decisions
• Utility Theory
• Every state has a degree of usefulness, or utility, to an agent, and the agent will prefer states with higher utility.
• The utility of a state is relative to the agent.
• Ex.: consider the state in which White has checkmated Black in chess. The utility is high for the agent playing White but low for the agent playing Black.
• A utility function can account for any set of preferences: quirky or typical, noble or perverse.
Uncertainty and rational
decisions
• Decision Theory
• Preferences, as expressed by utilities, are combined with probabilities in a general theory of rational decisions:
Decision Theory = Probability Theory + Utility Theory
• Maximum Expected Utility (MEU)
• An agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all possible outcomes of the action. This is the principle of MEU.
• Here the term "expected" is not vague: it is the average or statistical mean of the outcomes, weighted by the probability of each outcome.
• The basic difference between a decision-theoretic agent and other agents is that the former's belief state represents not just the possibilities for world states but also their probabilities.
Uncertainty and rational
decisions summary
• Rational behavior:
• For each possible action, identify the possible
outcomes
• Compute the probability of each outcome
• Compute the utility of each outcome
• Compute the probability-weighted (expected) utility
over possible outcomes for each action
• Select the action with the highest expected utility
(principle of Maximum Expected Utility)
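The steps above can be sketched in a few lines of Python. The actions, outcome probabilities, and utilities below are hypothetical toy numbers for the airport example, not values from the slides:

```python
# Minimal sketch of the Maximum Expected Utility principle.
# Probabilities and utilities are made-up illustrative values.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def meu_action(actions):
    """Pick the action whose probability-weighted utility is highest."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

# Leaving for the airport 90 vs. 180 minutes early (toy numbers):
actions = {
    "A90":  [(0.92, 100), (0.08, -1000)],   # make flight / miss flight
    "A180": [(0.98, 60),  (0.02, -1000)],   # longer wait, lower utility
}
print(meu_action(actions))  # A180
```

Even though A90 has a decent chance of success, the heavy penalty for missing the flight makes A180 the MEU choice.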

Basic Probability Notation
• A random variable is a variable whose possible values are numerical outcomes of a random experiment.
• It is a function that associates a unique numerical value with every outcome of an experiment.
• Its value varies with every trial of the experiment.
• It describes an outcome that cannot be determined in advance.
• It may be Boolean, discrete, or continuous.
• Ex.: roll of a die, number of emails received in a day, etc.
• The sample space S of the random variable X is the set of all possible worlds.
• The possible worlds are mutually exclusive and exhaustive (at a time exactly one outcome occurs, and all possible outcomes are in S).
• Tossing a coin: S = {H, T}
• Tossing two coins simultaneously: S = {HH, HT, TH, TT}
• Rolling a die: S = {1, 2, 3, 4, 5, 6}
Basic Probability Notation

• An atomic event is a complete specification of the state of the world about which the agent is uncertain.
• E.g., Cavity and Toothache give four distinct atomic events:
Cavity = False ∧ Toothache = True
Cavity = True ∧ Toothache = True
Cavity = False ∧ Toothache = False
Cavity = True ∧ Toothache = False
Basic Probability Notation
• The sample space is denoted by Ω (upper-case omega) and elements of the sample space are denoted by ω (lower-case omega).
• P(ω) → the probability of occurrence of ω.
• Probabilistic assertions and queries are not about particular possible worlds, but about sets of them.
• E.g.: the two dice add up to 11; doubles are rolled; picking an ace from a pack of cards; number of emails > 100 in a day; etc.
• These sets are called events. An event is a subset of Ω.
• Events are described by propositions in common language.
Basic Probability Notation
• The probability associated with a proposition is defined to be the sum of the probabilities of the worlds in which the proposition holds.
• Let ϕ be getting an odd number after rolling a die:
S = {1, 2, 3, 4, 5, 6}, ϕ = {1, 3, 5}
P(Odd) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
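The summation above can be written directly in Python, as a small sketch of the die example:

```python
# Probability of a proposition as the sum of the probabilities of the
# worlds in which it holds (the fair-die example from the slide).
from fractions import Fraction

sample_space = {w: Fraction(1, 6) for w in range(1, 7)}  # fair die

def prob(proposition):
    """Sum P(w) over all worlds w where the proposition holds."""
    return sum(p for w, p in sample_space.items() if proposition(w))

print(prob(lambda w: w % 2 == 1))  # 1/2
```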

Unconditional or Prior
Probabilities
• Degree of belief in a proposition in the absence of any other information/evidence.
• P(Fever) = 0.1
• The probability that the patient has a fever is 0.1 (in the absence of any other information).
• A die is rolled: P(odd) and P(even) indicate the probability of getting an odd number and the probability of getting an even number on the rolled die, respectively. Both of these are prior probabilities.
• When a pair of dice is rolled simultaneously, there are 36 possible outcomes. P(Doubles) and P(Total=15) are prior probabilities.
Unconditional or Prior
Probabilities
• The random variables Fever, Doubles, Odd, Even are discrete random variables, as they take a finite number of distinct values.
• Boolean random variables have the values true or false, e.g. P(cavity).
• A continuous random variable is a random variable that takes infinitely many distinct values.
• Ex.: P(Temp = x) = Uniform[18C, 26C](x)
• Expresses that the temperature is distributed uniformly between 18 and 26 degrees.
• This is called a probability density function.
Conditional or Posterior
Probabilities
• Let A be an event in the world and B be another event. Suppose that events A and B are not mutually exclusive, but occur conditionally on the occurrence of each other. The probability that event A will occur if event B occurs is called the conditional probability. Conditional probability is denoted mathematically as p(A|B), in which the vertical bar represents GIVEN, and the complete probability expression is interpreted as "the conditional probability of event A occurring given that event B has occurred":

p(A|B) = (the number of times A and B can occur) / (the number of times B can occur)

Conditional or Posterior
Probabilities
• The number of times A and B can occur together, or the probability that both A and B will occur, is called the joint probability of A and B. It is represented mathematically as p(A∩B). The number of ways B can occur is the probability of B, p(B), and thus:

p(A|B) = p(A∩B) / p(B)

• The equation for the conditional probability can also be written in the form:

p(A∩B) = p(A|B) × p(B)   (Product Rule)

• Similarly, the conditional probability of event B occurring given that event A has occurred equals p(B|A) = p(A∩B) / p(A), i.e.:

p(A∩B) = p(B|A) × p(A)   (Product Rule)
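As a quick check, the product rule can be verified numerically on a small joint distribution; the four joint probabilities below are made-up illustrative values:

```python
# Checking the product rule p(A∩B) = p(A|B)·p(B) on a toy joint
# distribution over two Boolean events (hypothetical numbers).
from fractions import Fraction

joint = {
    (True, True):   Fraction(1, 8),   # p(A ∩ B)
    (True, False):  Fraction(3, 8),
    (False, True):  Fraction(1, 4),
    (False, False): Fraction(1, 4),
}

p_B = sum(p for (a, b), p in joint.items() if b)   # marginal p(B) = 3/8
p_A_and_B = joint[(True, True)]                     # p(A ∩ B) = 1/8
p_A_given_B = p_A_and_B / p_B                       # p(A|B) = 1/3

assert p_A_given_B * p_B == p_A_and_B               # product rule holds
print(p_A_given_B)  # 1/3
```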
Probability Axioms
• All probabilities are between 0 and 1:
0 ≤ P(A) ≤ 1
• Necessarily true propositions have probability 1 and necessarily false propositions have probability 0:
P(true) = 1 and P(false) = 0
• Probability of a disjunction (inclusion-exclusion principle):
P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
• These axioms are often called Kolmogorov's axioms.
Probability Axioms
• From the axioms we can derive other properties.
• P(A ∨ B) = P(A) + P(B) − P(A ∧ B); substitute B = ¬A:
P(A ∨ ¬A) = P(A) + P(¬A) − P(A ∧ ¬A)
1 = P(A) + P(¬A) − 0
P(¬A) = 1 − P(A)
P(A) = 1 − P(¬A)
• If A and B are mutually exclusive: P(A ∨ B) = P(A) + P(B), and more generally
P(e1 ∨ e2 ∨ e3 ∨ … ∨ en) = P(e1) + P(e2) + P(e3) + … + P(en)
• The probability of a proposition a is equal to the sum of the probabilities of the atomic events in which a holds. With e(a) the set of atomic events in which a holds:
P(a) = Σ_{ei ∈ e(a)} P(ei)
Inference using Full Joint Distribution
Probability distribution P(Cavity, Toothache):

            Toothache   ¬Toothache
Cavity        0.04        0.06
¬Cavity       0.01        0.89

Sum of all entries = 1
P(Cavity) = 0.04 + 0.06 = 0.1 (using the axioms)
P(Cavity ∨ Toothache) = 0.04 + 0.01 + 0.06 = 0.11
P(Cavity | Toothache) = P(Cavity ∧ Toothache) / P(Toothache)
 = 0.04 / (0.04 + 0.01)
 = 0.8
• Exercise: obtain P(¬Cavity), P(Toothache), P(¬Toothache), P(Cavity | ¬Toothache), P(¬Cavity | Toothache), P(¬Cavity | ¬Toothache).
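The table lookups above can be sketched in Python, using the four joint entries from the table:

```python
# Inference from the full joint distribution P(Cavity, Toothache).
joint = {
    (True,  True):  0.04,   # Cavity ∧ Toothache
    (True,  False): 0.06,   # Cavity ∧ ¬Toothache
    (False, True):  0.01,   # ¬Cavity ∧ Toothache
    (False, False): 0.89,   # ¬Cavity ∧ ¬Toothache
}

def marginal_cavity():
    """P(Cavity): sum the entries where Cavity is true."""
    return sum(p for (c, t), p in joint.items() if c)

def cond_cavity_given_toothache():
    """P(Cavity | Toothache) = P(Cavity ∧ Toothache) / P(Toothache)."""
    p_toothache = sum(p for (c, t), p in joint.items() if t)
    return joint[(True, True)] / p_toothache

print(round(marginal_cavity(), 3))               # 0.1
print(round(cond_cavity_given_toothache(), 3))   # 0.8
```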
Inference using Full Joint
Distribution
P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
 = (0.108 + 0.012) / (0.108 + 0.012 + 0.016 + 0.064)
 = 0.6
• Observe that P(cavity | toothache) + P(¬cavity | toothache) = 0.6 + 0.4 = 1, as it should be.
• 1/P(toothache) remains constant no matter which value of Cavity we calculate. Such constants in probability calculations are called normalization constants.
Independence

• Independence is a simplifying modelling assumption.
• Consider the joint distribution over the variables
P(Weather, Toothache, Catch, Cavity)
• If Weather is independent of the dental variables, the joint can be decomposed as
P(Weather = cloudy, toothache, catch, cavity) = P(Weather = cloudy) × P(toothache, catch, cavity)
How to verify Independence?

• Given a joint distribution P1(T, W), how do we verify whether T and W are independent?
• Build the marginals for each of the variables. Here there are two variables, so two marginals.
• Calculate another distribution P2(T, W) as P(T) × P(W).
• If P1(T, W) = P2(T, W), then T and W are independent.
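A minimal sketch of this check, using a made-up joint P1(T, W) that happens to be independent:

```python
# Verifying independence: build marginals from a joint P1(T, W) and
# compare against the product distribution P2(T, W) = P(T)·P(W).
# The joint below is a hypothetical example, chosen to be independent.
from fractions import Fraction
from itertools import product

P1 = {
    ("hot",  "sun"):  Fraction(3, 10),
    ("hot",  "rain"): Fraction(1, 5),
    ("cold", "sun"):  Fraction(3, 10),
    ("cold", "rain"): Fraction(1, 5),
}

# Marginals P(T) and P(W):
P_T, P_W = {}, {}
for (t, w), p in P1.items():
    P_T[t] = P_T.get(t, 0) + p
    P_W[w] = P_W.get(w, 0) + p

# Product distribution P2(T, W) = P(T)·P(W):
P2 = {(t, w): P_T[t] * P_W[w] for t, w in product(P_T, P_W)}
print(P1 == P2)  # True → T and W are independent
```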
Bayesian or Bayes Rule
• Let A be an event in the world and B be another event. From the product rule:
P(A ∧ B) = P(A|B) × P(B)
P(A ∧ B) = P(B|A) × P(A)
• The left-hand sides are the same. Equating the right-hand sides of both equations yields Bayes' rule:
P(A|B) = P(B|A) × P(A) / P(B)
P(B|A) = P(A|B) × P(B) / P(A)
where:
p(A|B) is the conditional probability that event A occurs given that event B has occurred;
p(B|A) is the conditional probability of event B occurring given that event A has occurred;
p(A) is the probability of event A occurring;
p(B) is the probability of event B occurring.
Bayesian or Bayes Rule
• If the occurrence of event A depends on only two mutually exclusive events, B and NOT B, we obtain:
p(A) = p(A|B) × p(B) + p(A|¬B) × p(¬B)
where ¬ is the logical function NOT.
Similarly,
p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)
Substituting this equation into Bayes' rule yields:

p(A|B) = p(B|A) × p(A) / [ p(B|A) × p(A) + p(B|¬A) × p(¬A) ]
Bayesian or Bayes Rule
• Bayes' rule expressed in terms of hypotheses and evidence looks like this:

p(H|E) = p(E|H) × p(H) / [ p(E|H) × p(H) + p(E|¬H) × p(¬H) ]

where:
p(H) is the prior probability of hypothesis H being true;
p(E|H) is the probability that hypothesis H being true will result in evidence E;
p(¬H) is the prior probability of hypothesis H being false;
p(E|¬H) is the probability of finding evidence E even when hypothesis H is false.
Example: Bayes’ rule
• Disease: meningitis.
• It causes the patient to have a stiff neck 50% of the time.
• The prior probability that a patient has meningitis is 1/50000.
• The prior probability that a patient has a stiff neck is 1/20.
• Let s be stiff neck and m be meningitis:
P(s|m) = 0.5
P(m) = 1/50000
P(s) = 1/20
P(m|s) = P(s|m) P(m) / P(s)
 = (0.5 × 1/50000) / (1/20) = 0.0002
• So we expect only 1 in 5000 patients with a stiff neck to have meningitis.
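The same calculation in Python, using exact fractions with the slide's numbers:

```python
# The meningitis example computed with Bayes' rule.
from fractions import Fraction

p_s_given_m = Fraction(1, 2)      # P(stiff neck | meningitis)
p_m = Fraction(1, 50000)          # prior P(meningitis)
p_s = Fraction(1, 20)             # prior P(stiff neck)

p_m_given_s = p_s_given_m * p_m / p_s   # Bayes' rule
print(p_m_given_s)  # 1/5000
```

Exact fractions make the "1 in 5000" reading immediate, where 1/5000 = 0.0002.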
Conditional Independence

Conditional Independence & Chain Rule
What are Bayesian Networks good for?
 Diagnosis: P(cause | symptom) = ?
 Prediction: P(symptom | cause) = ?
 Classification: max over classes of P(class | data)
 Decision-making (given a cost function)
 Application areas: medicine, speech recognition, bioinformatics, text classification, computer troubleshooting, the stock market
Why learn Bayesian networks?
 Combining domain expert knowledge with data
 Efficient representation and inference
 Incremental learning
 Handling missing data, e.g. a record like <1.3 2.8 ?? 0 1>
 Learning causal relationships
Probabilistic reasoning- Bayesian network
• A Bayesian network is a systematic way to represent independence relationships explicitly.
• It is a data structure that represents the dependencies among variables.
• It is a directed graph in which each node is annotated with quantitative probability information.
Specification of a Bayesian network
1. A set of random variables makes up the nodes.
2. A set of directed links or arrows connects pairs of nodes: X → Y means X is the parent of Y.
3. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)).
4. The graph has no directed cycles.
Topology
• The nodes and links specify the conditional independence relationships.
• Variables: Weather, Toothache, Catch, Cavity.
Example: Traffic
Example: Traffic II
Example: Alarm Network

Semantics of Bayesian network

• There are two ways to understand the semantics:
1. See the network as a representation of the joint probability distribution.
2. View it as an encoding of a collection of conditional-independence statements.
Probabilities in Bayes Nets

Joint distribution for this Bayes net:
P(Cavity, Toothache, Catch) = P(Cavity) × P(Toothache | Cavity) × P(Catch | Cavity)
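As a sketch, this factorization can be evaluated numerically. The CPT values below are hypothetical illustrative numbers, since the slide's table is not reproduced in this export:

```python
# Joint probability from the Cavity Bayes net factorization
# P(Cavity, Toothache, Catch) = P(Cavity)·P(Toothache|Cavity)·P(Catch|Cavity).
# CPT numbers are made-up illustrative values.

p_cavity = {True: 0.2, False: 0.8}
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache = T | Cavity)
p_catch_given = {True: 0.9, False: 0.2}       # P(catch = T | Cavity)

def joint(cavity, toothache, catch):
    """Multiply one CPT entry per node, conditioned on its parent."""
    p = p_cavity[cavity]
    p *= p_toothache_given[cavity] if toothache else 1 - p_toothache_given[cavity]
    p *= p_catch_given[cavity] if catch else 1 - p_catch_given[cavity]
    return p

# The eight joint entries must sum to 1:
total = sum(joint(c, t, k) for c in (True, False)
            for t in (True, False) for k in (True, False))
print(round(total, 10))  # 1.0
```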
Example
• Calculate the probability that the alarm sounds but neither a burglary nor an earthquake has occurred, and both John and Mary call.
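A sketch of this calculation, assuming the CPT values commonly used for this burglary network in the AI literature (P(B)=0.001, P(E)=0.002, P(A|¬b,¬e)=0.001, P(J|a)=0.90, P(M|a)=0.70); the slide's own figure is not reproduced here, so these numbers are an assumption:

```python
# P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a)·P(m|a)·P(a|¬b,¬e)·P(¬b)·P(¬e),
# assuming the commonly used CPT values for this network.
p_b, p_e = 0.001, 0.002
p_a_given_not_b_not_e = 0.001
p_j_given_a, p_m_given_a = 0.90, 0.70

p = p_j_given_a * p_m_given_a * p_a_given_not_b_not_e * (1 - p_b) * (1 - p_e)
print(round(p, 6))  # 0.000628
```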
Node ordering
Example: Flip Coins

For independent coin flips, P(X1=h, X2=h, X3=t, X4=h) = P(X1=h) × P(X2=h) × P(X3=t) × P(X4=h)

Example: Burglar Alarm

Example: Traffic

P(+r, −t) = P(+r) × P(−t | +r) = ¼ × ¼ = 1/16


Self-learn: causal vs. diagnostic models. Is a causal model better? Why?


Causality

Summary

