Quantifying Uncertainty

The document presents a lecture on quantifying uncertainty in artificial intelligence, focusing on decision-making under uncertainty, basic probability notation, and the use of Bayes' Rule. It discusses the challenges of handling uncertainty in real-world scenarios, such as partial observability and nondeterminism, and introduces concepts like belief states and contingency plans. Additionally, it emphasizes the importance of rational decision-making and the application of probability theory to represent uncertainty in various contexts, including medical diagnosis.

Uploaded by

Gaming with Joel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views80 pages

Quantifying Uncertainty

The document presents a lecture on quantifying uncertainty in artificial intelligence, focusing on decision-making under uncertainty, basic probability notation, and the use of Bayes' Rule. It discusses the challenges of handling uncertainty in real-world scenarios, such as partial observability and nondeterminism, and introduces concepts like belief states and contingency plans. Additionally, it emphasizes the importance of rational decision-making and the application of probability theory to represent uncertainty in various contexts, including medical diagnosis.

Uploaded by

Gaming with Joel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 80

Slide 1

Quantifying Uncertainty
Introduction to Artificial Intelligence
Hamdi Abdurhman, PhD

401892 LECTURE 6 11

Slide 2

Last Time

• Knowledge-Based Agents
• Wumpus World
• Logic
• Propositional Logic: A Very Simple Logic

Slide 3

Today
• Acting under Uncertainty
• Basic Probability Notation
• Inference Using Full Joint Distributions
• Independence
• Bayes’ Rule and Its Use
• Naive Bayes Models
• The Wumpus World Revisited

Slide 4

Introduction to Uncertainty
• Real World Uncertainty:
• Partial observability, nondeterminism, and adversaries.
• Agents may not know current or future states.
• Handling Uncertainty:
• Belief state: Represents possible world states.
• In partially observable environments, agents can't determine their exact state.
• In nondeterministic environments, agents use belief states to account for multiple
possible state transitions.
• Contingency plan: Handles every possible sensor observation.
• In partially observable and nondeterministic environments, the solution to a problem is
no longer a sequence, but rather a conditional plan (sometimes called contingency plan
or a strategy)

Slide 5

Drawbacks of Contingency Plans


• Despite its many virtues, this approach has significant drawbacks:
• With partial information, an agent must consider every possible eventuality,
no matter how unlikely. This leads to impossibly large and complex belief-
state representations
• A correct contingent plan that handles every eventuality can grow arbitrarily
large and must consider arbitrarily unlikely contingencies.
• Sometimes there is no plan that is guaranteed to achieve the goal—yet the
agent must act. It must have some way to compare the merits of plans that
are not guaranteed.

Slide 6

Uncertainty
• Uncertainty is everywhere. Consider the following proposition.
• A_t: Leaving t minutes before the flight will get me to the airport on time.
• Problems:
1. Partial observability (road state, other drivers' plans, etc.)
2. Noisy sensors (radio traffic reports)
3. Uncertainty in action outcomes (a flat tire, etc.)
4. Immense complexity of modelling and predicting traffic

Slide 7

Example -
Automated Taxi

Slide 8

Rational Decision-Making
• Performance Measure:
• Timeliness, avoiding unproductive waits, avoiding speeding tickets.
• Comparing Plans:
• Plan A90: Maximizes performance measure based on knowledge.
• Plan A180: Increases belief in success but has trade-offs.
• Rational Decision:
• Depends on goal importance and likelihood of achievement.
• Expected to maximize performance based on environmental knowledge.

Slide 9

A Visit to the Dentist


• We'll use medical/dental diagnosis examples extensively
• Our new prototype problem: does a dental patient have a cavity or not?
• The process of diagnosis always involves uncertainty, and this leads to difficulty with
logical representations (propositional logic examples):
1. Toothache ⇒ Cavity
2. Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ …
3. Cavity ⇒ Toothache
1. is just wrong, since other things also cause toothaches
2. would need to list all possible causes
3. tries a causal rule, but it is not always the case that cavities cause toothaches, and
fixing the rule requires making it logically exhaustive
Slide 10

Representation for Diagnosis


• Logic is not sufficient for medical diagnosis, due to
• Our laziness: it's too hard to list all possible antecedents or consequents to make the
rule have no exceptions
• Our theoretical ignorance: generally, there is no complete theory of the domain, no
complete model
• Our practical ignorance: even if the rules were complete, in any particular case it's
impractical or impossible to do all the necessary tests, to have all relevant evidence
• The example relationship between toothache & cavities is not a logical
consequence in either direction
• Instead, knowledge of the domain provides a degree of belief in diagnostic
sentences & the way to represent this is with probability theory

Slide 11

Epistemological Commitment
• Ontological commitment
• What a representational language assumes about the nature of reality - logic
& probability theory agree in this, that facts do or do not hold
• Epistemological commitment
• The possible states of knowledge
• For logic, sentences are true/false/unknown
• For probability theory, there's a numerical degree of belief in sentences, between 0
(certainly false) and 1 (certainly true)

Slide 12

Knowledge representation
Language                                  | Main elements (Ontological Commitment) | Assignments (Epistemological Commitment)
------------------------------------------|----------------------------------------|-----------------------------------------
Propositional logic                       | Facts                                  | T, F, unknown
First-order logic                         | Facts, objects, relations              | T, F, unknown
Temporal logic                            | Facts, objects, relations, times       | T, F, unknown
Temporal constraint satisfaction problems | Time points                            | Time intervals
Fuzzy logic                               | Set membership                         | Degree of truth
Probability theory                        | Facts                                  | Degree of belief

• The first three do not represent uncertainty, while the last three do.

Slide 13

The Qualification Problem


• For a logical representation
• The success of a plan can't be inferred because of all the conditions that could
interfere but can't be deduced not to happen (this is the qualification
problem)
• Probability is a way of dealing with the qualification problem by numerically
summarizing the uncertainty that derives from laziness &/or ignorance
• Returning to the toothache & cavity problem
• In the real world, the patient either does or does not have a cavity
• A probabilistic agent makes statements with respect to the knowledge state, & these
may change as the state of knowledge changes
• For example, an agent initially may believe there's an 80% chance (probability 0.8) that
the patient with the toothache has a cavity, but subsequently revises that as additional
evidence is available

Slide 14

Rational Decisions
• Making choices among plans/actions when the probabilities of their
success differ
• This requires additional knowledge of preferences among outcomes
• This is the domain of utility theory: every state has a degree of
utility/usefulness to the agent & the agent will prefer those with higher utility
• Utilities are specific to an agent, to the extent that they can even encompass perverse or
altruistic preferences

Slide 15

Rational Decisions
• Making choices among plans/actions when the probabilities of their
success differ
• We can combine preferences (utilities) with probabilities to get a general
theory of rational decisions: decision theory
• A rational agent chooses actions to yield the highest expected utility
averaged over all possible outcomes of the action
• This is the maximum expected utility (MEU) principle
• Expected = average of the possible outcomes of an action weighted by their probabilities
• Choice of action = the one with highest expected utility
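The MEU principle can be sketched in a few lines of Python. The action names, probabilities, and utilities below are purely illustrative (not from the lecture): each action leads to outcomes with known probabilities and utilities, and the agent picks the action whose probability-weighted average utility is highest.

```python
# Hypothetical airport-departure example: the numbers are made up for illustration.
# Each action maps to a list of (probability, utility) outcome pairs.
actions = {
    "leave_90_min_early":  [(0.95, 100), (0.05, -1000)],   # usually on time, small risk of missing the flight
    "leave_180_min_early": [(0.999, 70), (0.001, -1000)],  # almost surely on time, but a long wait
}

def expected_utility(outcomes):
    # Average of the outcome utilities, weighted by their probabilities.
    return sum(p * u for p, u in outcomes)

# MEU principle: choose the action with the highest expected utility.
best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # leave_180_min_early (EU 68.93 vs. 45.0)
```

Note how the large negative utility of missing the flight dominates the comparison even though its probability is small.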

Slide 16

1.2 Uncertainty and Rational Decisions:


Decision-Theoretic Agent Structure

• Belief state:
• Reflects the history of percepts to date.
• Includes probabilities of possible world states.
• Probabilistic Predictions:
• Agent predicts action outcomes probabilistically.
• Selects actions with the highest expected utility.

Figure 1. A decision-theoretic agent that selects rational actions [1].

Slide 17

2. Basic Probability Notation


Represent and use probabilistic information.
Traditional presentations are informal, written by and for mathematicians.
Our approach is tailored for AI and linked to formal logic.

Slide 18

2.1 What Probabilities are About


• Sample Space:
• Set of all possible worlds (Ω).
• Each ω is a particular configuration or outcome.
• Example: Rolling two dice gives 36 possible worlds.
▪ Sample space (Ω): { (1,1), (1,2), ..., (6,6) }
▪ Specific outcome (ω): (5,6).
• Probability Model:
• Assigns numerical probability P(ω) to each possible world.
• Axioms: 0 ≤ P(ω) ≤ 1 for every ω, and Σ_{ω ∈ Ω} P(ω) = 1 ……(Equations 1)
• Example:
• Fair dice: Each world (1,1), (1,2), ..., (6,6) has P(ω)=1/36.
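The two-dice sample space and its probability model can be checked directly in Python (a minimal sketch; the variable names are ours):

```python
from itertools import product

# Sample space Omega: all 36 possible worlds for rolling two dice.
omega = list(product(range(1, 7), repeat=2))

# Probability model for fair dice: each world gets P(w) = 1/36.
P = {w: 1 / 36 for w in omega}

print(len(omega))       # 36
print((5, 6) in omega)  # True: a specific outcome ω
# The axioms hold: every P(w) lies in [0, 1] and they sum to 1.
print(abs(sum(P.values()) - 1) < 1e-9)  # True
```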

Slide 19

Events and Propositions


• Assertions & queries in probabilistic reasoning
• Events:
• Sets of possible worlds.
• Example: Probability of two dice adding up to 11.
• Propositions:
• Correspond to sets of possible worlds in logic.
• Probability of a proposition: Sum of probabilities of worlds where it holds.
• Use the Greek letter φ (phi) for a proposition
• Equation (2): P(φ) = Σ_{ω ∈ φ} P(ω)
• Example:
• Rolling dice: P(Total = 11) = P((5,6)) + P((6,5)) = 1/36 + 1/36 = 2/36 = 1/18
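Equation (2) says a proposition's probability is the sum over the worlds where it holds; the dice example can be verified with a small helper (a sketch; `prob` is our name for it):

```python
from itertools import product

# Fair two-dice probability model: 36 equally likely worlds.
P = {w: 1 / 36 for w in product(range(1, 7), repeat=2)}

def prob(phi):
    # Equation (2): sum the probabilities of the worlds where phi holds.
    return sum(p for w, p in P.items() if phi(w))

# P(Total = 11): only the worlds (5,6) and (6,5) qualify.
p_total_11 = prob(lambda w: w[0] + w[1] == 11)
print(abs(p_total_11 - 1 / 18) < 1e-9)  # True: 2/36 = 1/18
```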

Slide 20

Unconditional Probabilities
• Prior Probabilities:
o Also called "unconditional probabilities" or "priors."
o Degree of belief in propositions without additional information.
o When rolling fair dice, if we assume that each die is fair and the rolls don't
interfere with each other:
o The set of possible worlds: (1,1), (1,2), (1,3), ..., (2,1), (2,2), ..., (6,5), (6,6)
o P(ω) = 1/36 for each possible world

Slide 21

Conditional Probabilities
• Posterior Probabilities:
• Also called "conditional probabilities" or "posteriors."
• The probability of a certain event happening, given the effect of another
event(called evidence).
• For example, the first die may be already showing 5 and we are waiting for
the other die to settle down
• In that case, we are interested in the probability of the other die given the
first one is 5
• Example: P(doubles ∣ Die1 = 5).
• Notation:
• P(A∣B) is read as "Probability of A given B."
Slide 22

Understanding Conditional Probability


• Example Context:
• Prior probability: P(cavity) = 0.2 for a regular checkup.
• Conditional probability: P(cavity | toothache) = 0.6 for a toothache.
• Valid but Not Useful:
• P(cavity) = 0.2 remains valid but less useful after observing a toothache.
• Decisions should condition on all observed evidence.
• Difference from Logical Implication:
• P(cavity |toothache) = 0.6 does not mean “If toothache, then cavity with probability
0.6”.
• It means “If toothache and no further information, then cavity with probability 0.6”.
• Further evidence (e.g., the dentist finds no cavity) updates the probability:
P(cavity | toothache ∧ ¬cavity) = 0.

Slide 23

Defining Conditional Probability


• Mathematical Definition:
• For propositions a and b:
• Equation (3): P(a | b) = P(a ∧ b) / P(b), defined whenever P(b) > 0.
• Example: P(doubles | Die1 = 5) = P(doubles ∧ Die1 = 5) / P(Die1 = 5).
• Product rule:
• Equation 3 can be written in a different form called the product rule:
• Equation (4): P(a ∧ b) = P(a | b) P(b)
• Easier to Remember:
• For a and b to be true:
• b must be true.
• a must be true given b.
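Equations (3) and (4) can be checked numerically on the dice model (a sketch under the same fair-dice assumptions; helper names are ours):

```python
from itertools import product

P = {w: 1 / 36 for w in product(range(1, 7), repeat=2)}

def prob(phi):
    # Probability of a proposition: sum over the worlds where it holds.
    return sum(p for w, p in P.items() if phi(w))

def cond_prob(a, b):
    # Equation (3): P(a | b) = P(a AND b) / P(b), defined when P(b) > 0.
    return prob(lambda w: a(w) and b(w)) / prob(b)

def doubles(w):
    return w[0] == w[1]

def die1_is_5(w):
    return w[0] == 5

p = cond_prob(doubles, die1_is_5)
print(abs(p - 1 / 6) < 1e-9)  # True: only (5,5) among the six worlds with Die1 = 5

# Equation (4), the product rule: P(a AND b) = P(a | b) * P(b).
lhs = prob(lambda w: doubles(w) and die1_is_5(w))
print(abs(lhs - p * prob(die1_is_5)) < 1e-12)  # True
```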

Slide 24

Random Variables
• Factored representation of possible worlds: sets of
(variable, value) pairs
• Variables in probability theory: random variables
• Domain: the set of possible values a variable can take on
• Names begin with an uppercase letter (e.g., Weather, Die1).
• A random variable is a function mapping from possible worlds (Ω) to a range of values.
• Example Ranges:
• Weather: {sunny, rain, cloudy, snow}, Die1: {1, ..., 6}, Odd: {true, false}
• Value Naming Conventions:
• Lowercase for values (e.g., ∑x P(X = x) sums over the values of X).
• Boolean variable ranges: {true, false} or {0, 1} (Bernoulli distribution).
Slide 25

Variable Ranges and Propositions


• Ranges can be sets of arbitrary tokens.
• Examples of Ranges:
• Age: {juvenile, teen, adult}.
• Weather: {sun, rain, cloud, snow}.
• Abbreviations:
• A = true: simply a.
• A = false: ¬a.
• Using Values for Propositions:
• When no ambiguity is possible, it is common to use a value by itself to stand for the
proposition that a particular variable has that value;
• Example: sun can stand for Weather = sun.
• Infinite Ranges:
• Discrete (e.g., integers) or continuous (e.g., reals).

Slide 26

Combining Propositions and Probability


Distributions
• Elementary Propositions:
• Combine using propositional logic connectives.
• Example: cavity ∧ ¬toothache.
• Conjunction Notation:
• Common to use a comma (e.g., P(cavity | ¬toothache, teen)).
• Probability Distributions:
• Bold P is used as notational shorthand to
• list the probabilities of all possible values of a variable.
• Example: P(Weather) = ⟨0.6, 0.1, 0.29, 0.01⟩ for {sun, rain, cloud, snow}.
• P(Weather = sun) = 0.6
• The bold P indicates that the result is a vector of numbers,
• and we assume a predefined order (sun, rain, cloud, snow) on the range of Weather.
• We say that the P statement defines a probability distribution for the random variable Weather.
• We can use a similar shorthand for conditional distributions.
• Conditional Distributions:
• P(X | Y) gives the values of P(X = xi | Y = yj) for each i, j pair.
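The bold-P vector notation can be sketched in code as a value-to-probability map, using the slide's Weather example and its assumed order (sun, rain, cloud, snow):

```python
# The bold-P vector notation as code: P(Weather) maps each value in the
# assumed order (sun, rain, cloud, snow) to its probability.
P_weather = {"sun": 0.6, "rain": 0.1, "cloud": 0.29, "snow": 0.01}

# The distribution sums to 1, as the axioms require.
assert abs(sum(P_weather.values()) - 1.0) < 1e-9

# P(Weather = sun) selects one component of the vector.
print(P_weather["sun"])  # 0.6
```
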

Slide 27

Continuous Variable and Probability Density


Functions (PDFs)
• Infinite Values: Continuous variables have infinitely many possible values, making it impossible to
list out the entire distribution as a vector.
• Probability Density Function (PDF): Instead of exact probabilities, we use PDFs to describe the
likelihood of a random variable taking on a specific value
• Uniform Distribution Example:
• The probability density from 18°C to 26°C is uniform:
• P(NoonTemp = x) = Uniform(18C, 26C)(x) = 1/8 per degree centigrade, for 18°C ≤ x ≤ 26°C (and 0 elsewhere).

• Probability Density P(x):
• P(x) = lim(dx→0) P(x ≤ X ≤ x + dx) / dx
• For NoonTemp: the density is 0.125 per °C everywhere on the interval and 0 outside it.
Slide 28

Continuous Variable and Probability Density


Functions (PDFs) cont.
• Probabilities vs. Densities: Probabilities are unitless, while densities have units
(e.g., reciprocal degrees centigrade).
• Understanding Densities:
• A probability density of 1/8 per °C means there is a 100% chance the temperature will be within the
8°C range (18°C to 26°C).
• The density function varies with units. For example, the same temperature range in degrees
Fahrenheit (18°C = 64.4°F, 26°C = 78.8°F) has a width of 14.4°F and a density of 1/14.4 ≈ 0.069 per °F.
• Implications:
• The exact probability of NoonTemp being 20.18°C is zero because it is an exact point, not an
interval.
• The density function provides a way to understand probabilities over intervals, not specific
points.
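The units point can be verified numerically: the same uniform NoonTemp distribution has different density values in °C and °F, yet the total probability mass over the interval is 1 either way.

```python
# A numeric check of the units point: the same uniform NoonTemp
# distribution has different density values in °C and °F, yet the total
# probability over the interval is 1 in both unit systems.
low_c, high_c = 18.0, 26.0
density_c = 1.0 / (high_c - low_c)         # 0.125 per °C

to_f = lambda c: c * 9 / 5 + 32
low_f, high_f = to_f(low_c), to_f(high_c)  # 64.4°F, 78.8°F
density_f = 1.0 / (high_f - low_f)         # 1/14.4 ≈ 0.069 per °F

assert abs(density_c * (high_c - low_c) - 1.0) < 1e-12
assert abs(density_f * (high_f - low_f) - 1.0) < 1e-12
print(round(density_c, 3), round(density_f, 3))  # 0.125 0.069
```
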

Slide 29

Distribution Notation
• for distributions on multiple variables
o we use commas between the variables: so P(Weather, Cavity) denotes the
probabilities of all combinations of values of the 2 variables
o for discrete random variables we can use a tabular representation, in this
case yielding a 4x2 table of probabilities; this gives the joint probability
distribution of Weather & Cavity
o it tabulates the probabilities for all combinations of values

Slide 30

Distribution Notation
• for distributions on multiple variables
o the notation also allows mixing variables & values
▪ P(sunny, Cavity) is just a 2-vector of probabilities
o the distribution notation, P, allows compact expressions
▪ for example, here are the product rules for all possible combinations of Weather &
Cavity
▪ P(Weather, Cavity) = P(Weather | Cavity)P(Cavity)
• the distribution notation summarizes what otherwise would be 8 separate equations, each of
the form P(Weather = sun ∧ Cavity = true) = P(Weather = sun | Cavity = true) P(Cavity = true), etc.
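The compact distribution equation can be sketched by expanding it into its scalar instances. The numbers below are illustrative assumptions for the sketch (the slides do not give a joint Weather/Cavity table), and the variable names are our own:

```python
# The single distribution equation P(Weather, Cavity) = P(Weather | Cavity) P(Cavity)
# expanded into its 8 scalar instances. The numbers are illustrative
# assumptions, not values given in the slides.
weathers = ["sun", "rain", "cloud", "snow"]
P_cavity = {True: 0.2, False: 0.8}
# Assumed P(Weather | Cavity); each conditional distribution sums to 1.
P_w_given_c = {
    True:  {"sun": 0.6, "rain": 0.1, "cloud": 0.29, "snow": 0.01},
    False: {"sun": 0.6, "rain": 0.1, "cloud": 0.29, "snow": 0.01},
}

# The product rule applied elementwise fills in the 4x2 joint table.
P_joint = {(w, c): P_w_given_c[c][w] * P_cavity[c]
           for c in P_cavity for w in weathers}

assert abs(sum(P_joint.values()) - 1.0) < 1e-9
print(round(P_joint[("sun", True)], 3))  # 0.12
```
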

Slide 31

Full joint Distribution (FJD)


• now we fill in some details
o of the semantics of the probability of a proposition as the sum of probabilities
for the possible worlds in which it holds
▪ possible worlds are analogous to those in propositional logic
▪ each possible world is specified by an assignment of values to all of the random variables
under consideration
o for the random variables Cavity, Toothache & Weather there are 16 possible
worlds (2x2x4) & the value of a given proposition is determined in the same
recursive fashion as for formulas in propositional logic

Slide 32

Full joint Distribution


• semantics of a proposition
o the probability model is determined by the joint distribution for all the
random variables: the full joint probability distribution
▪ for the Cavity, Toothache, Weather domain, the notation is:
▪ P(Cavity, Toothache, Weather)
▪ this can be represented as a 2x2x4 table
o given the definition of the probability of a proposition as a sum over possible
worlds, the full joint distribution allows calculating the probability of any
proposition over its variables by summing entries in the FJD

Slide 33

Probability Axioms
• We can derive
o some additional relationships for degrees of belief among logically related propositions,
from the axioms (equations 1 and 2) and some algebraic manipulation.
o For example, P(¬a) = 1 − P(a), the relationship between the probability of a
proposition & its negation
o and also the axiom (eq. 5) for the probability of a disjunction,
referred to as the inclusion-exclusion principle
o Equation 1: 0 ≤ P(ω) ≤ 1 for every ω, and ∑ω∈Ω P(ω) = 1
o Equation 5: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
o Together, equations 1 and 5 are referred to as Kolmogorov’s axioms, in honor of the
Russian mathematician Andrey Kolmogorov, who showed how to build up the rest of
probability theory from them, including issues related to handling continuous variables.
Slide 34

Is Probability the Answer?


• Historically
o there's been a debate over whether probabilities are the only viable
mechanism for describing degrees of belief
o the degree of belief in a proposition can be reformulated as betting odds for
establishing amounts of wagers on outcomes of events
o Bruno de Finetti (1931, 1993) proved that if an agent's set of degrees of belief
is inconsistent with the probability axioms, then, when formulated as bets
on outcomes of events, there is a combination of bets by an opposing agent
that will cause the agent to lose money every time

Slide 35

Rationality & Probability Axioms


• apparently then
o no rational agent will have beliefs that violate the axioms of probability
▪ a common rebuttal to this argument is that betting is a poor metaphor & the agent could
just refuse to bet
▪ which itself is countered by pointing out that betting is just a model for the decision-
making that goes on, inevitably, all the time
o other authors have constructed similar arguments to support those of Bruno
de Finetti
o furthermore, in the "real world", AI reasoning systems based on probability
have been highly successful

Slide 36

Don't Mess with the Probability Axioms


• From Figure 12.2 AIMA 4e
o Evidence for the rationality of probability

Figure 2. Agent 1's inconsistent beliefs allow Agent 2 to set up bets to guarantee Agent 1 loses, independent of outcome of a and b

o So, for example, Agent 1's degree of belief in a is 0.4, so will bet "against" it &
pay 6 to Agent 2 if a is the outcome, receive 4 from Agent 2 if it is not, and so
on

Slide 37

Inference Using Full Joint Distributions


• Using the full joint distributions for inference
o We use FJD as the KB from which answers to all questions may be derived.
o Here's the FJD for the Toothache, Cavity, Catch domain of 3 Boolean variables
o The FJD is a 2×2×2 table of probabilities

Figure 3. A full joint distribution for the Toothache, Cavity, Catch world.

o As required by the axioms, the probabilities sum to 1.0


o When available, the FJD gives a direct means of calculating the probability of
any proposition
o Just sum the probabilities for all the possible worlds in which the proposition is true

Slide 38

Inference Using Full Joint Distributions


• An example of using the FJD for inference

o To calculate: P(cavity ∨ toothache)
o cavity ∨ toothache holds for 6 possible worlds
o The corresponding sum is 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28.
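This summation over possible worlds can be sketched directly in code. The table values below are assumed to be AIMA's standard Toothache/Cavity/Catch numbers, which the slide's Figure 3 reproduces (they are consistent with the 0.28 result):

```python
# Inference by enumeration over the full joint distribution. The table
# values are AIMA's standard Toothache/Cavity/Catch numbers (the slide's
# Figure 3).
fjd = {
    # (toothache, cavity, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}
assert abs(sum(fjd.values()) - 1.0) < 1e-9  # axioms: entries sum to 1

def prob(event):
    # P(proposition) = sum over the possible worlds in which it holds
    return sum(p for world, p in fjd.items() if event(*world))

# cavity ∨ toothache holds in 6 of the 8 worlds; the sum is 0.28.
print(round(prob(lambda t, cav, cat: cav or t), 3))  # 0.28
```
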

Slide 39

Inference Using Full Joint Distributions


• Using the FJD for inference

• A common task is to state the distribution over a single variable or a subset of
variables: sum over the other variables to get the unconditional or marginal
probability.
• For example, P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2
• The terminology for this is: “marginalization” or “summing out”
• It takes the other variables out of the equation
• For sets of variables Y and Z: P(Y) = ∑z P(Y, Z = z)
• ∑z means to sum over all the possible combinations of values of the set of
variables Z
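Marginalization is a one-line sum in code. The joint values below are assumed to be AIMA's standard table (the slide's Figure 3):

```python
# Marginalization ("summing out"): P(Cavity) from the full joint
# distribution. Joint values follow AIMA's standard table (the slide's
# Figure 3).
fjd = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def marginal_cavity(value):
    # P(Cavity = value) = sum over all values of Toothache and Catch
    return sum(p for (t, cav, cat), p in fjd.items() if cav == value)

print(round(marginal_cavity(True), 3), round(marginal_cavity(False), 3))  # 0.2 0.8
```
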

Slide 40

Inference Using Full Joint Distributions


• Using the FJD for inference
• A variant considers conditional probabilities instead of joint probabilities and uses the
product rule; this is referred to as conditioning:
• P(Y) = ∑z P(Y | z) P(z)
• The common scenario is to want conditional probabilities of some variable given
evidence about others
• Use the product rule (Equation 3), P(a ∧ b) = P(a | b) P(b), to get an expression in terms of
unconditional probabilities, then sum appropriately in the FJD.
• For example: the probability of a cavity, given evidence of a toothache
• P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache)
= (0.108 + 0.012) / (0.108 + 0.012 + 0.016 + 0.064) = 0.12 / 0.2 = 0.6

Slide 41

Inference Using Full Joint Distributions


• as a check we might
• compute the probability of no cavity, given a toothache
• P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / 0.2 = 0.4

• as they should, the probabilities sum to 1.0

• we note that P(toothache) is the denominator for both, & as part of the calculation of both
values for Cavity, can be viewed as a normalization constant for the distribution
• both terms have P(toothache) as denominator, ensuring they sum to 1
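The check can be run in code: both conditional probabilities share the denominator P(toothache) and sum to 1. Joint values are assumed to be AIMA's standard table (the slide's Figure 3):

```python
# Conditioning check: P(cavity | toothache) and P(¬cavity | toothache)
# share the denominator P(toothache) and sum to 1. Joint values follow
# AIMA's standard table (the slide's Figure 3).
fjd = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(event):
    # sum over the worlds (toothache, cavity, catch) where the event holds
    return sum(p for world, p in fjd.items() if event(*world))

p_toothache = prob(lambda t, cav, cat: t)  # 0.2, the shared denominator
p_cav_given_t = prob(lambda t, cav, cat: cav and t) / p_toothache
p_nocav_given_t = prob(lambda t, cav, cat: not cav and t) / p_toothache

assert abs(p_cav_given_t + p_nocav_given_t - 1.0) < 1e-9   # they sum to 1
print(round(p_cav_given_t, 3), round(p_nocav_given_t, 3))  # 0.6 0.4
```
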

Slide 42

Normalization Constant
• note that P(toothache) was the denominator
• for calculating both conditional probabilities
• it functions as a normalization constant for the distribution
P(Cavity | toothache), ensuring the probabilities add to 1
• in AIMA, this constant is denoted by α, and we use it to mean a normalizing
constant, where the probabilities must add to 1
• since the sum for the distribution must be 1, we can just sum the raw values
obtained and then use 1/sum for α
• this may make calculations simpler, and might even allow them when some
probability assessment (such as P(toothache)) is not available

Slide 43

Normalization Constant
• an example of using the normalization constant α

• P(Cavity | toothache) = P(Cavity, toothache) / P(toothache)
= α P(Cavity, toothache)
= α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
= α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
= α ⟨0.12, 0.08⟩
= ⟨0.6, 0.4⟩
• since the probabilities must add to 1.0, the calculation can be done without knowing P(toothache), just
normalizing at the end
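The α trick can be sketched directly: sum the raw joint entries for each value of Cavity, then normalize, never computing P(toothache) explicitly. Joint values are assumed to be AIMA's standard table (the slide's Figure 3):

```python
# The α trick: sum raw joint entries for each value of Cavity, then
# normalize, without computing P(toothache) explicitly. Joint values
# follow AIMA's standard table (the slide's Figure 3).
raw = {
    True:  0.108 + 0.012,  # P(cavity, toothache, catch) + P(cavity, toothache, ¬catch)
    False: 0.016 + 0.064,  # the same two entries for ¬cavity
}
alpha = 1.0 / sum(raw.values())  # 1 / 0.2 = 5.0
P_cavity_given_toothache = {v: alpha * p for v, p in raw.items()}
print(round(P_cavity_given_toothache[True], 3),
      round(P_cavity_given_toothache[False], 3))  # 0.6 0.4
```
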

Slide 44

Generalization of Inference
• given a query, the generalized version of the process for a conditional
probability distribution is:
• for a single variable X (Cavity in the preceding example), let E be the list of
evidence variables (just Toothache in the example), e the list of observed
values for them, and Y the unobserved variables (Catch in the example)
• the query P(X | e) is calculated by summing out over the unobserved
variables:
• Equation (9): P(X | e) = α P(X, e) = α ∑y P(X, e, y)
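Equation (9) can be sketched as a small enumeration procedure. The joint values are assumed to be AIMA's standard table (the slide's Figure 3); the helper name `enumerate_query` is our own, not from the slides:

```python
# A sketch of Equation (9), P(X | e) = α Σ_y P(X, e, y), by direct
# enumeration over the full joint distribution. Joint values follow
# AIMA's standard table (the slide's Figure 3).
# Variable order in each world tuple: (Toothache, Cavity, Catch).
fjd = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def enumerate_query(fjd, x_index, evidence):
    """Return P(X | e): fix the evidence variables, sum out the rest,
    then normalize with alpha = 1 / (sum of the raw values)."""
    raw = {}
    for world, p in fjd.items():
        if all(world[i] == v for i, v in evidence.items()):
            raw[world[x_index]] = raw.get(world[x_index], 0.0) + p
    alpha = 1.0 / sum(raw.values())
    return {x: alpha * p for x, p in raw.items()}

# P(Cavity | Toothache = true): X is index 1, evidence fixes index 0.
dist = enumerate_query(fjd, x_index=1, evidence={0: True})
print({k: round(v, 3) for k, v in dist.items()})  # {True: 0.6, False: 0.4}
```
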

Slide 45

Inference for Probability


• given the full joint distribution & Equation 9
• we can answer all probability queries for discrete variables
• are we left with any unresolved issues?
• well, given n variables, and d as an upper bound on the number of values, the full
joint distribution table size & the corresponding processing of it are O(d^n), exponential in n
• since n might be 100 or more for real problems, this is often simply not
practical
• as a result, the FJD is not the implementation of choice for real
systems, but functions more as the theoretical reference point
(analogous to the role of truth tables for propositional logic)
• the next sections we look at are foundational for developing practical
systems
Slide 46

Independence

Slide 47

Independence
• consider a new version of our example domain
• now defined in terms of 4 random variables: Toothache, Catch, Cavity & Weather,
where Weather has 4 values (sunny, rain, cloudy, snow)
• so it has a FJD with 2 × 2 × 2 × 4 = 32
entries
• one way to display it would be as four tables, 1 for each value of
Weather
• how are they related?
• for example: how is P(toothache, catch, cavity, cloudy) related to P(toothache, catch, cavity)?

Slide 48

Independence
• in the 4-variable domain
• what is the relationship between P(toothache, catch, cavity, cloudy) and P(toothache, catch, cavity)?
• given what we know about relating probabilities (the product rule)
• P(toothache, catch, cavity, cloudy) = P(cloudy | toothache, catch, cavity) P(toothache, catch, cavity)
• but we "know" that dental problems don't influence the weather
• P(cloudy | toothache, catch, cavity) = P(cloudy)
• & we know weather doesn't seem to influence dental variables
• so
• P(toothache, catch, cavity, cloudy) = P(cloudy) P(toothache, catch, cavity)
• & similarly for each entry in P(Toothache, Catch, Cavity, Weather)
• thus the 32-element table for 4 variables reduces to an 8-element table & a 4-element table
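As a sketch of this factoring (a minimal Python example; the probability values are illustrative textbook-style numbers, not given on this slide), the 32-entry joint can be rebuilt from an 8-entry dental table and a 4-entry Weather table:

```python
# Hypothetical 8-entry joint P(Toothache, Catch, Cavity); illustrative
# values that sum to 1, keyed by (toothache, catch, cavity).
p_dental = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, False): 0.144, (False, False, False): 0.576,
}
# 4-entry distribution P(Weather); again illustrative values.
p_weather = {"sunny": 0.6, "rain": 0.1, "cloudy": 0.29, "snow": 0.01}

# Because Weather is independent of the dental variables, every entry of
# the 32-entry full joint is just a product of the two smaller tables.
p_full = {(w, t, c, cav): p_weather[w] * p_dental[(t, c, cav)]
          for w in p_weather for (t, c, cav) in p_dental}

assert len(p_full) == 32
assert abs(sum(p_full.values()) - 1.0) < 1e-9
```

Storing 8 + 4 = 12 numbers instead of 32 is exactly the reduction described above.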

Slide 49

Independence

• The property of independence
• also called marginal independence or absolute independence
• notationally, in terms of propositions or random variables, it is:
• P(a ∧ b) = P(a) P(b), equivalently P(a|b) = P(a) or P(b|a) = P(b)
• P(X, Y) = P(X) P(Y)
• from our knowledge of the domain, we can simplify the full joint distribution, dividing
variables into independent subsets with separate distributions
• as an example, for the Dentistry-Weather domain:
• P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)

Figure 4. Two examples of factoring a large joint distribution into smaller distributions, using
absolute independence. (a) Weather and dental problems are independent. (b) Coin flips are
independent.

Slide 50

Independence

• Absolute independence
• while very powerful for simplifying probability
representation & inference, absolute independence is
unfortunately rare
• though, for example, for n independent coin tosses:
P(C_1, ..., C_n), the full joint distribution with 2^n
entries, becomes n single-variable
distributions P(C_i)
• this is an artificial example, and the converse is more
likely the case for real domains
• that is, within a large domain like dentistry there are
likely dozens of diseases & hundreds of symptoms, all
interrelated

Slide 52

Bayes’ Rule and Its Use

Slide 53

Bayes’ Rule and Its Use


• From the product rule, for propositions a & b: P(a ∧ b) = P(a|b) P(b)
• This expresses the probability of both a and b happening in terms of the conditional
probability P(a|b) and the probability of b.
• Bayes’ Rule
• Now, let’s also express P(a ∧ b) in another way, by switching the roles of a and b:
• P(a ∧ b) = P(b|a) P(a)
• Since P(a ∧ b) is the same regardless of the order, we can set these two expressions equal to each
other:
• P(a|b) P(b) = P(b|a) P(a)
• To derive Bayes’ Rule, we solve for P(a|b) by dividing both sides by P(b):
• P(a|b) = P(b|a) P(a) / P(b)
• This simple equation allows us to update the probability of an event based on new evidence.
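The derivation above can be sanity-checked numerically; the probabilities below are arbitrary made-up values:

```python
# Pick any P(b) and P(a|b), build P(a ∧ b) via the product rule,
# then recover P(a|b) again using Bayes' rule.
p_b = 0.3
p_a_given_b = 0.5
p_a_given_not_b = 0.2   # needed only to compute P(a) by marginalization

p_a_and_b = p_a_given_b * p_b                             # product rule
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)     # marginalize over b
p_b_given_a = p_a_and_b / p_a                             # product rule, other order

# Bayes' rule: P(a|b) = P(b|a) P(a) / P(b)
recovered = p_b_given_a * p_a / p_b
assert abs(recovered - p_a_given_b) < 1e-12
```

Whatever values are chosen, recovering P(a|b) via Bayes' rule reproduces the number we started from.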

Slide 54

Bayes’ Rule and Its Use Cont.


• in the general case of multivalued variables, in distribution form:
• P(Y|X) = P(X|Y) P(Y) / P(X)
• representing the set of equations, each for specific values of the variables
• & finally, a version indicating conditionalizing on background evidence e:
• P(Y|X, e) = P(X|Y, e) P(Y|e) / P(X|e)

Slide 55

Bayes’ Rule
• Bayes' rule
• is the basis of most AI systems of probabilistic inference
• it allows us to compute the single term P(b|a) in terms of three terms: P(a|b),
P(b), and P(a)
• finding diagnostic probability from causal probability:
• P(cause|effect) = P(effect|cause) P(cause) / P(effect)
• P(effect|cause) specifies the relationship in the causal direction
• P(cause|effect) describes the diagnostic direction
• in the medical domain, it is common to have conditional probabilities on
causal relationships
• P(symptoms | disease)

Slide 56

Bayes’ Rule
• Bayes’ rule: a medical example
• here's a medical domain example
• a patient presents with a stiff neck, a known symptom of the disease meningitis
• the physician "knows" the prior probabilities of stiff neck (P(s) = 0.01) & meningitis (P(m) =
0.00002, i.e. 1 in 50,000)
• in addition, the physician knows that 70% of patients with meningitis have a stiff neck: P(s|m)
= 0.7
• P(m|s) = 0.7 × 0.00002 / 0.01
• = 0.0014

Slide 57

Bayes’ Rule Example


• Bayes’ rule & the meningitis example
P(m|s) = P(s|m) P(m) / P(s)
= 0.7 × 0.00002 / 0.01
= 0.0014
• So we should expect only 1 in 700 patients with a stiff neck to have meningitis,
reflecting the much higher prior probability of stiff neck than of meningitis
• Note: normalization can be applied when using Bayes' Rule
• P(Y|X) = α P(X|Y) P(Y)
• where α is a normalization constant so entries in P(Y|X) sum to 1
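A minimal sketch of the meningitis computation, including the normalized form (P(s|¬m) is not given on the slide, so it is back-derived here from P(s) purely for illustration):

```python
p_s = 0.01          # prior probability of a stiff neck
p_m = 0.00002       # prior probability of meningitis (1 in 50,000)
p_s_given_m = 0.7   # 70% of meningitis patients have a stiff neck

# Bayes' rule: P(m|s) = P(s|m) P(m) / P(s)  ->  about 1 in 700
p_m_given_s = p_s_given_m * p_m / p_s
assert abs(p_m_given_s - 0.0014) < 1e-9

# Normalized form: P(M|s) = alpha * [P(s|m)P(m), P(s|~m)P(~m)].
# P(s|~m) is back-derived so the numbers stay consistent with P(s) = 0.01.
p_s_given_not_m = (p_s - p_s_given_m * p_m) / (1 - p_m)
unnorm = [p_s_given_m * p_m, p_s_given_not_m * (1 - p_m)]
alpha = 1 / sum(unnorm)
# normalization reproduces the same posterior without using P(s) directly
assert abs(alpha * unnorm[0] - p_m_given_s) < 1e-12
```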

Slide 58

Bayes’ Rule: n Evidence Variables


• Bayes’ rule & the dental diagnosis: scaling up
• for the combining of evidence from multiple sources/variables, how does use
of Bayes' Rule scale up, compared to using the FJD?
• the sample problem:
• what does the dentist conclude about a cavity when the patient has a
toothache & the probe catches in the sore tooth?
• Equation (16):
• P(Cavity | toothache ∧ catch) = α P(toothache ∧ catch | Cavity) P(Cavity)
• there's not an issue with just 2 sources, but if there are n evidence variables, then we have
2^n possible combinations of observed values & we need to know the
conditional probabilities for each (no better than needing the full joint
distribution)
Slide 59

Bayes’ Rule: n Evidence Variables


• Bayes' rule & the dental diagnosis: scaling up
• we return to the idea of independence
• in the example, Toothache & Catch are not absolutely independent, but are independent given either the
presence or absence of a cavity (each is caused by the cavity, but otherwise they are independent)
• expressing the conditional independence given Cavity we get
• Equation (17): P(toothache ∧ catch | Cavity) = P(toothache | Cavity) P(catch | Cavity)
• (16): P(Cavity | toothache ∧ catch) = α P(toothache ∧ catch | Cavity) P(Cavity)
• substituting (17) into (16) yields the following, reflecting the conditional independence of
Toothache and Catch
• P(Cavity | toothache ∧ catch) = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
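A small sketch of this combination of evidence (the conditional probabilities are assumed illustrative values, not taken from the slide):

```python
# Illustrative numbers: prior P(Cavity) and the per-symptom conditionals
# that conditional independence lets us store separately.
p_cavity = {True: 0.2, False: 0.8}
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given = {True: 0.9, False: 0.2}       # P(catch | Cavity)

# P(Cavity | toothache ∧ catch)
#   = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
unnorm = {c: p_toothache_given[c] * p_catch_given[c] * p_cavity[c]
          for c in (True, False)}
alpha = 1 / sum(unnorm.values())
posterior = {c: alpha * v for c, v in unnorm.items()}

# the two entries of the posterior distribution sum to 1
assert abs(sum(posterior.values()) - 1.0) < 1e-12
```

With these illustrative numbers the two observed symptoms push the posterior probability of a cavity to roughly 0.87, well above the 0.2 prior.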

Slide 60

Conditional Independence
• the general form of the conditional independence rule
• here are the most general form & the one for the dental diagnosis domain
• P(X, Y | Z) = P(X | Z) P(Y | Z)
• (Eq.19): P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
• conditional independence also allows decomposition
• for the dental problem, algebraically, given (Eq.19), we have
• P(Toothache, Catch, Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)

Slide 61

Conditional Independence
• implications of the conditional independence rule
• we decompose the original large table, which has 2^3 − 1 = 7 independent
entries, into 3 smaller tables
• 2 of the tables are of the form P(T|C) with 2 rows, each of which must sum to 1, so each row has 1
independent number (2 per table)
• 1 table with 1 row for the prior distribution P(C), having 1 more independent number
• for our Toothache, Catch, Cavity domain, we've gone from 7 to 5 independent
values in total, a small gain for a small problem
• but if there were n symptoms, all conditionally independent given Cavity, the size of the
resulting representation would be linear in n instead of exponential

Slide 62

Conditional Independence
• summary: conditional independence
• allows scaling up to real problems since the representational complexity can
go from exponential to linear
• is more often applicable than absolute independence assertions
• yields this net gain: the decomposition of large domains into weakly
connected subsets
• is illustrated in a prototypical way by the dental domain: one cause influences
multiple effects, which are conditionally independent, given that cause

Slide 63

Conditional Independence
• summary: conditional independence
• with multiple effects, which are conditionally independent given the cause,
the full joint distribution then is rewritten as
• P(Cause, Effect_1, ..., Effect_n) = P(Cause) Π_i P(Effect_i | Cause)
• this is called the naïve Bayes model
• it makes the simplifying assumption that all effects are conditionally
independent
• it is naïve in that it is applied to many problems even though the effect variables
are not precisely conditionally independent given the cause variable
• nevertheless, such systems often work well in practice
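A minimal naive Bayes sketch of the model above; storage is linear in the number of effects (one prior plus one conditional per effect), and all numbers are invented for illustration:

```python
def naive_bayes_posterior(prior, cond, observed):
    """P(Cause | observed effects) under the naive Bayes model.

    prior[c]    = P(Cause = c)
    cond[i][c]  = P(Effect_i = true | Cause = c)
    observed[i] = True/False observation of Effect_i
    """
    unnorm = {}
    for c, p in prior.items():
        # multiply in P(Effect_i | Cause) for each observed effect
        for i, obs in enumerate(observed):
            p *= cond[i][c] if obs else (1 - cond[i][c])
        unnorm[c] = p
    alpha = 1 / sum(unnorm.values())   # normalize so entries sum to 1
    return {c: alpha * p for c, p in unnorm.items()}

# two symptoms, both observed present (illustrative values)
prior = {True: 0.2, False: 0.8}
cond = [{True: 0.6, False: 0.1}, {True: 0.9, False: 0.2}]
post = naive_bayes_posterior(prior, cond, [True, True])
assert abs(sum(post.values()) - 1.0) < 1e-12
```

Adding an (n+1)-th symptom only adds one more 2-entry conditional table, which is the linear scaling claimed above.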

Slide 64

The Wumpus World Revisited

Slide 65

The Wumpus World Revisited


• recall the Wumpus World agent
• the agent explores the grid world to grab the gold while attempting to avoid being
eaten by the Wumpus or falling into a bottomless Pit
• we used propositional logic for representation & inference
• now we'll explore an example
• that uses probability in Wumpus World
• we'll simplify by restricting our WW hazards only to Pits
• recall that
1. the percept of a breeze in a square indicates a pit in a neighbouring square
2. the logical representation allowed some conclusions about whether a square was
safe but not a quantitative measure of risk if not absolutely safe
• the "is it safe" problem can be reformulated to use our new probability
tools

Slide 66

The Wumpus World Revisited


• the world
• incomplete information about the presence of Pits leads
to uncertainty, & the agent should choose the best next
move
• Figure 5 shows a situation in which each of the three
unvisited but reachable squares—[1,3], [2,2], and [3,1]—
might contain a pit
• Our aim is to calculate the probability that each of the
three squares contains a pit (for this example, we ignore
the Wumpus and the gold)

Figure 5 [1]. (a) After finding a breeze in both [1,2] and [2,1], the agent is
stuck—there is no safe place to explore. (b) Division of the squares into
Known, Frontier, and Other, for a query about [1,3].

Slide 67

The Wumpus World Revisited


• The relevant properties of the Wumpus world are that:
1. a pit causes breezes in all neighbouring squares
2. each square other than [1,1] contains a pit with probability 0.2
• The first step is to identify the set of random variables we need;
here are the random variables in the problem:
1. one Boolean variable P_i,j for each square, which is true iff [i,j] contains a
pit
2. one Boolean variable B_i,j per observed square, true iff [i,j] is breezy; we include these
variables only for the observed squares—in this case [1,1], [1,2], [2,1]—
so we include only B_1,1, B_1,2, B_2,1 in the probability model
• The next step is specifying the full joint distribution

Slide 68

Probabilities in Wumpus World


• We begin with the full joint distribution
• P(P_1,1, ..., P_4,4, B_1,1, B_1,2, B_2,1)
• Applying the product rule yields
• P(B_1,1, B_1,2, B_2,1 | P_1,1, ..., P_4,4) P(P_1,1, ..., P_4,4)
• 1st term: the conditional probability of a breeze configuration given a pit
configuration
• its values are 1 if the breezes are adjacent to the pits, 0 otherwise
• 2nd term: the prior probability of a pit configuration
• pits are placed randomly, independent of each other, with probability 0.2 for any square, so
• Equation 22: P(P_1,1, ..., P_4,4) = Π_i,j P(P_i,j)
• for a particular configuration with exactly n pits, the probability is 0.2^n × 0.8^(16−n)
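The product-form prior (Equation 22) can be sanity-checked by enumeration: over all 2^16 pit configurations of the 4×4 grid, the probabilities 0.2^n × 0.8^(16−n) must sum to 1.

```python
from itertools import product

P_PIT = 0.2
N_SQUARES = 16  # 4x4 grid

def prior(config):
    """Prior of one pit configuration (a tuple of 16 zeros/ones):
    pits are independent, so the prior is 0.2^n * 0.8^(16-n)."""
    n = sum(config)
    return P_PIT ** n * (1 - P_PIT) ** (N_SQUARES - n)

# enumerate all 2**16 = 65536 configurations and check normalization
total = sum(prior(cfg) for cfg in product((0, 1), repeat=N_SQUARES))
assert abs(total - 1.0) < 1e-9
```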

Slide 69

Probabilities in Wumpus World


• in the example, we have observed evidence
• a breeze or not in each visited square + no pit in any
visited square, abbreviated as b & known:
• b = ¬b_1,1 ∧ b_1,2 ∧ b_2,1
• known = ¬p_1,1 ∧ ¬p_1,2 ∧ ¬p_2,1
• an example query concerns the safety of other squares:
what's the probability of a pit at [1,3], given the evidence
so far? that is, P(P_1,3 | known, b)
• we could answer by summing over cells in the FJD

Slide 70

Probabilities in Wumpus World


• to use summation over the FJD
• let Unknown be the set of variables for squares other than Known & [1,3]
• so from (Equation 9) we have
P(P1,3 | known, b) = α Σunknown P(P1,3, unknown, known, b)
• that is, we can just sum over the entries in the Full Joint Distribution
• but with 12 unknown squares we have 2^12 = 4096 terms in the summation,
so the calculation is exponential in the number of squares
• so we'll need to simplify from insight about independence
• we note: not all unknown squares are equally relevant to the query

Slide 71

Probabilities in Wumpus World


• since summations over the FJD are exponential
• we need to simplify, given insight about independence
• to begin, we note that not all unknown squares are equally relevant to the
query
• first, some terminology about partitioning the pit variables
• frontier are those pit variables (besides the query variable) neighbouring the visited
squares
• other are the remaining pit variables
• with this partition, we see that the observed breezes are conditionally
independent of the other variables, given the known, frontier & query
variables

Slide 72

Probabilities in Wumpus World


• using conditional independence:
P(b | P1,3, known, unknown) = P(b | P1,3, known, frontier)
• note that the figures use Frontier to name the relevant squares neighbouring the visited
squares ([2,2] & [3,1])
• then we'll need to manipulate our query into a form where we can use this
• the query: P(P1,3 | known, b)
• the world: the 4×4 grid with visited squares [1,1], [1,2] & [2,1] (figure not reproduced)

Slide 73

Using Conditional Independence


• using the conditional independence simplification
• the query, from Eq. 3:
P(P1,3 | known, b) = α Σunknown P(P1,3, known, b, unknown)
• then by the product rule:
= α Σunknown P(b | P1,3, known, unknown) P(P1,3, known, unknown)
• then partitioning unknown into frontier & other:
= α Σfrontier Σother P(b | known, P1,3, frontier, other) P(P1,3, known, frontier, other)
• then using the conditional independence of b from other, given known, P1,3 &
frontier (& so dropping other from the first term):
= α Σfrontier Σother P(b | known, P1,3, frontier) P(P1,3, known, frontier, other)
• since the first term now does not depend on other, move the summation inward:
= α Σfrontier P(b | known, P1,3, frontier) Σother P(P1,3, known, frontier, other)

Slide 74

Using Conditional Independence


• Manipulating the query to get efficient computation
• we began with
P(P1,3 | known, b) = α Σunknown P(P1,3, known, b, unknown)
• so far, we have
= α Σfrontier P(b | known, P1,3, frontier) Σother P(P1,3, known, frontier, other)
• using independence as in (Equ. 22) to factor the prior term:
= α Σfrontier P(b | known, P1,3, frontier) Σother P(P1,3) P(known) P(frontier) P(other)
• then reorder the terms:
= α P(known) P(P1,3) Σfrontier P(b | known, P1,3, frontier) P(frontier) Σother P(other)
• fold P(known) into the normalizing constant, & use Σother P(other) = 1:
P(P1,3 | known, b) = α′ P(P1,3) Σfrontier P(b | known, P1,3, frontier) P(frontier)
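The final expression can be transcribed almost literally into code. A sketch under the same assumptions as before (pit prior 0.2, frontier = {[2,2], [3,1]}, breezes observed at [1,2] and [2,1]); `b_consistent` and `query_prob` are hypothetical names, not from the text:

```python
PIT = 0.2   # assumed pit prior per square

def b_consistent(p13, p22, p31):
    # P(b | known, P1,3, frontier) is 1 iff every observed breeze is explained
    breeze_12 = p13 or p22          # a pit in [1,3] or [2,2] explains the breeze at [1,2]
    breeze_21 = p22 or p31          # a pit in [2,2] or [3,1] explains the breeze at [2,1]
    return breeze_12 and breeze_21

def query_prob():
    # alpha' * P(P1,3) * sum over frontier of P(b | ...) * P(frontier)
    weight = {}
    for p13 in (True, False):
        s = sum((PIT if p22 else 1 - PIT) * (PIT if p31 else 1 - PIT)
                for p22 in (True, False) for p31 in (True, False)
                if b_consistent(p13, p22, p31))
        weight[p13] = (PIT if p13 else 1 - PIT) * s
    return weight[True] / (weight[True] + weight[False])   # normalization plays the role of alpha'

print(round(query_prob(), 2))   # ≈ 0.31
```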

Slide 75

Probabilities in Wumpus World


• using conditional independence & independence
• has yielded an expression with just 4 terms in the summation over the frontier variables,
eliminating the other squares:
P(P1,3 | known, b) = α′ P(P1,3) Σfrontier P(b | known, P1,3, frontier) P(frontier)
• the term P(b | known, P1,3, frontier) is 1 when the frontier is consistent with the breeze
observations, 0 otherwise
• so to get each value of P(P1,3 | known, b) we sum over the logical models for the frontier
variables that are consistent with the known facts
• this figure shows the models & the associated priors
Figure 6. Consistent models for the frontier variables P2,2 and P3,1, showing P(frontier) for each model: (a) three models with P1,3 = true showing two or three pits, and (b) two models with P1,3 = false showing one or two pits [1].
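The figure's model counts can be checked by enumerating the four frontier assignments; a sketch assuming a 0.2 pit prior and that each observed breeze ([1,2], [2,1]) must be explained by a pit in the query square [1,3] or the frontier squares [2,2]/[3,1]:

```python
PIT = 0.2   # assumed pit prior per square

# collect P(frontier) for every frontier model consistent with the breezes,
# grouped by the value of the query variable P1,3
consistent_models = {True: [], False: []}
for p13 in (True, False):
    for p22 in (True, False):
        for p31 in (True, False):
            # a model is consistent when both observed breezes are explained
            if (p13 or p22) and (p22 or p31):
                prior = (PIT if p22 else 1 - PIT) * (PIT if p31 else 1 - PIT)
                consistent_models[p13].append(prior)

for p13, priors in consistent_models.items():
    # three models with P1,3 = true, two with P1,3 = false, matching the figure
    print(f"P1,3={p13}: {len(priors)} models, P(frontier) = {[round(p, 2) for p in priors]}")
```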

Slide 76

Probabilities in Wumpus World


• using conditional independence & independence
• has yielded an expression with just 4 terms in the summation over the
frontier variables, eliminating the other squares
Figure 6. Consistent models for the frontier variables P2,2 and P3,1, showing P(frontier) for each model: (a) three models with P1,3 = true showing two or three pits, and (b) two models with P1,3 = false showing one or two pits [1].

Slide 77

Using Conditional Independence


• note that P1,3 & P3,1 are symmetric
• so by symmetry, [3,1] would contain a pit about 31% of the time:
P(P3,1 | known, b) = ⟨0.31, 0.69⟩
• & by a similar calculation, [2,2] can be shown to contain a pit with about 0.86
probability:
P(P2,2 | known, b) = ⟨0.86, 0.14⟩
• it is clear to the probabilistic agent where not to go next
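Both posteriors can be verified with one small enumeration over the three pit variables that matter ([1,3], [2,2], [3,1]); a sketch assuming the 0.2 pit prior, with `posterior_pit` a hypothetical helper name:

```python
from itertools import product

PIT = 0.2   # assumed pit prior per square

def posterior_pit(query):
    # enumerate the query square plus the squares that can explain the breezes
    squares = [(1, 3), (2, 2), (3, 1)]
    weight = {True: 0.0, False: 0.0}
    for bits in product((True, False), repeat=3):
        pits = dict(zip(squares, bits))
        breeze_12 = pits[(1, 3)] or pits[(2, 2)]   # breeze observed at [1,2]
        breeze_21 = pits[(2, 2)] or pits[(3, 1)]   # breeze observed at [2,1]
        if breeze_12 and breeze_21:                # keep only consistent models
            p = 1.0
            for sq in squares:
                p *= PIT if pits[sq] else 1 - PIT
            weight[pits[query]] += p
    return weight[True] / (weight[True] + weight[False])

print(round(posterior_pit((3, 1)), 2))   # ≈ 0.31, matching [1,3] by symmetry
print(round(posterior_pit((2, 2)), 2))   # ≈ 0.86
```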

Slide 78

Probability in Wumpus World


• the logical agent & the probabilistic agent
• strictly logical inference can only yield known safe / known unsafe / unknown
• the probabilistic agent knows which move is relatively safer, & which
relatively more dangerous
• for efficient probabilistic solutions we can use independence & conditional
independence among variables to simplify the summations involved
• fortunately, these often match our natural understanding of how the problem should be
decomposed

Slide 79

Reference
• [1] Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern
Approach (4th Edition). Pearson.

Slide 80

Next Class
• Introduction to Machine Learning

