Probabilistic Reasoning

- The document discusses probabilistic reasoning and Bayesian networks. - It introduces probability, interpretations of probabilities, Bayesian networks, probabilistic inference, and learning in probabilistic models. - Bayesian networks allow modeling probabilistic relationships between variables and performing probabilistic inference.

Uploaded by

Ana Laura Malta Rendohl

Probabilistic Reasoning - Bayesian Networks
Prof. Dr. Paulo André L. de Castro
www.comp.ita.br/~pauloac
Sala 110, IEC-ITA
Summary
• Introduction and Review of Probability

• Interpretations of Probabilities

• Bayesian Networks or Belief Networks

• Probabilistic Inference

• Learning in Probabilistic models

• Simplified Models: Naïve Bayes and Noisy-OR

What makes the world uncertain
(partially observed or stochastic)?
1. Ignorance. The limits of our knowledge lead us to be uncertain
about many things. Does our poker opponent have a flush or
is she bluffing?

2. Physical randomness or indeterminism. Even if we know
everything that we might care to know about a coin and how we impart
spin to it when we toss it, there will remain an inescapable
degree of uncertainty about whether it will land heads or tails.
• A strong determinist might claim otherwise, that it would be
possible to calculate... but such a view is for the foreseeable future a
mere act of scientistic faith. We are all practical indeterminists.

3. Vagueness. Many of the predicates we employ appear to be vague.
It is often unclear whether to classify a bird as big or small, a
human as brave or not, a thought as knowledge or opinion.
Example 1: Breast Cancer
• Suppose a woman has a 1% chance of having breast cancer.
A clinic offers a cancer test with a 20% false-positive rate and a
10% false-negative rate, i.e. 10% of the women with cancer will get a
negative result. Hence 90% of the women with cancer will get a
positive result. A patient at the clinic tested positive for cancer.
What is the probability that she really has cancer?
• Since there is only a 20% chance of a false positive, it must be
80%, right?
No! $P(\text{Cancer} \mid \text{Pos}) \neq 1 - P(\text{Pos} \mid \neg\text{Cancer})$. By Bayes' rule:

$$P(\text{Cancer} \mid \text{Pos}) = \frac{P(\text{Pos} \mid \text{Cancer})\,P(\text{Cancer})}{P(\text{Pos})} = \frac{P(\text{Pos} \mid \text{Cancer})\,P(\text{Cancer})}{P(\text{Pos} \mid \text{Cancer})\,P(\text{Cancer}) + P(\text{Pos} \mid \neg\text{Cancer})\,P(\neg\text{Cancer})}$$

$$= \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.2 \times 0.99} \approx 0.043$$
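The arithmetic of Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `posterior` and its parameter names are illustrative choices, not part of the original slides.

```python
def posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """Bayes' rule for a binary hypothesis and a positive test result."""
    # P(Pos) by summing over the two cases (hypothesis true / false).
    evidence = p_pos_given_h * prior + p_pos_given_not_h * (1.0 - prior)
    return p_pos_given_h * prior / evidence


# Numbers from the example: 1% prior, 90% true positives, 20% false positives.
p_cancer_given_pos = posterior(prior=0.01, p_pos_given_h=0.9, p_pos_given_not_h=0.2)
print(round(p_cancer_given_pos, 3))  # 0.043
```

The posterior stays low because the 1% prior is small compared with the 20% false-positive rate.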
Example 2: People vs Collins

• In 1964 in Los Angeles, an interracial couple was convicted of
robbery, largely on the grounds that they matched a highly
improbable profile, a profile which witnesses had reported. The two
robbers were reported to be:
• A man with a moustache
• Who was black and had a beard
• And a woman with a ponytail
• Who was blonde
• The couple was interracial
• And were driving a yellow car

• The prosecution suggested that these features had the
following probabilities of being observed in LA at the time:
1. A man with a moustache - 1/4
2. Who was black and had a beard - 1/10
3. And a woman with a ponytail - 1/10
4. Who was blonde - 1/3
5. The couple was interracial - 1/1000
6. And were driving a yellow car - 1/10
Example 2: People vs Collins – cont.
• The prosecution called a mathematics instructor from a state
university, who apparently testified that the “product rule”
could be applied. So the probability of the evidence (e) being
observed for a couple that is not guilty (¬h, where h is the
hypothesis that the couple is guilty) would be:

$$P(e \mid \neg h) = \prod_i P(e_i \mid \neg h) = 1/12{,}000{,}000$$

• The prosecution stated that, given the evidence, the
probability that the couple was innocent was no more than
1/12,000,000.
• The jury convicted them.

• Is the probability estimate correct?

Example 2: People vs Collins – cont.
• Is the probability estimate (1/12,000,000) correct?
• No!

• The product rule does not apply in this case!

• The observations are not independent!

• P(h|e) is not equal to 1 - P(e|¬h)!

• All right, what is the probability of the couple being guilty, then,
given all this data?
Example 2: People vs Collins – cont.
• The pieces of evidence are NOT independent!
1. A man with a moustache - 1/4
2. Who was black and had a beard - 1/10
3. And a woman with a ponytail - 1/10
4. Who was blonde - 1/3
5. The couple was interracial - 1/1000
6. And were driving a yellow car - 1/10
• Feature 2 implies feature 1, and features 2, 3 and 4 together imply feature 5 (to a fair
approximation). A better estimate is therefore

$$P(e \mid \neg h) = P(e_2 \mid \neg h)\,P(e_3 \mid \neg h)\,P(e_4 \mid \neg h)\,P(e_6 \mid \neg h) = 1/3000$$

• Furthermore, P(h|e) is not equal to 1 - P(e|¬h); it has to be obtained with Bayes' inversion and sum-out.

• We do not have P(e|h) and P(h)...

• If the couple is guilty, what are the chances that the evidence
would be observed, i.e. how do we estimate P(e|h)?
• That is a hard question, but, feeling generous towards the prosecution,
let's say it is 100%.
• Now we are missing the prior probability of a random couple
being guilty of the robbery, P(h). The most plausible
approach to estimating it is to count the couples in LA
and give them an equal prior probability.
• Let's say there are 1,625,000 eligible males and as many
females in the Los Angeles area, so:

$$P(h \mid e) = \frac{P(e \mid h)\,P(h)}{P(e \mid h)\,P(h) + P(e \mid \neg h)\,P(\neg h)} = \frac{P(h)}{P(h) + P(e \mid \neg h)\,P(\neg h)}$$

$$= \frac{1/1625000}{1/1625000 + (1 - 1/1625000)/3000} \approx 0.002$$
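The three numbers in this example (1/12,000,000, 1/3000 and roughly 0.002) can be reproduced with exact fractions. The sketch below assumes, as the slides do, P(e|h) = 1 and a uniform prior of 1/1,625,000; the variable and function names are illustrative only.

```python
from fractions import Fraction

# Probabilities quoted by the prosecution, indexed as in the numbered list.
features = {
    1: Fraction(1, 4),     # man with a moustache
    2: Fraction(1, 10),    # black man with a beard
    3: Fraction(1, 10),    # woman with a ponytail
    4: Fraction(1, 3),     # blonde woman
    5: Fraction(1, 1000),  # interracial couple
    6: Fraction(1, 10),    # yellow car
}


def product(keys):
    """Multiply the probabilities of the selected features."""
    result = Fraction(1)
    for key in keys:
        result *= features[key]
    return result


naive = product([1, 2, 3, 4, 5, 6])  # treats all six features as independent
better = product([2, 3, 4, 6])       # drops the features implied by the others

prior = Fraction(1, 1_625_000)       # P(h): equal prior over eligible couples
p_e_given_h = Fraction(1)            # generous assumption: P(e | h) = 1
post = p_e_given_h * prior / (p_e_given_h * prior + better * (1 - prior))

print(naive, better, round(float(post), 3))
```

Even with the generous assumption P(e|h) = 1, the posterior probability of guilt is about 0.2%, nowhere near the near-certainty the prosecution claimed.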
Brief Review of Statistics and Probability
For more references on probability and statistics, see:
Devore, J. L. Probability and Statistics for Engineering and the Sciences. 6. ed. Southbank: Thomson, 2004.
Ross, S. M. Introduction to Probability and Statistics for Engineers and Scientists. 2. ed. Harcourt: Academic Press, 1999.
McClave; Benson; Sincich. Statistics for Business and Economics. 1998.
Statistics and Probability [Short] Review

• A random variable has a domain (a set of values) and associates
each value with a probability. This function is called
a probability distribution.
• In the continuous case, the term probability density function is used.
• There are many classical distributions: Normal (Gaussian), Uniform,
Binomial, Poisson, Exponential, etc.

• P(A) – prior probability

• Example:
• Variable Weather = {sunny, cloudy, rainy}
• P(Weather) is the probability distribution
• P(Weather) = <0.7; 0.2; 0.1>
• P(Weather=sunny) = 0.7
• P(Weather=rainy) = 0.1
• or
• P(sunny) = 0.7, P(rainy) = 0.1
Two examples of continuous distributions
• Normal Distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

• μ is the mean
• σ is the standard deviation

• Uniform Distribution (on the interval [a, b])
• (a+b)/2 is the mean
• $(b-a)/\sqrt{12}$ is the standard deviation
Example of a Discrete Distribution

• Dirichlet Discrete Distribution: a categorical distribution
(with finitely many possible states). It can be written as:

$$D[\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_\tau]$$

• The probability of observing state i is:

$$P(X = i) = \frac{\alpha_i}{\sum_{j=1}^{\tau} \alpha_j}$$
Probability Axioms
• For any propositions A and B
Conditional probability

• P(A | K) – conditional probability or posterior probability
• For example, P(A=caries | K=toothache) = 0.8 means that:
• given that a toothache is all I know, the chance of caries (as seen by me) is 80%.
• P(A | K) is a vector of two elements, each of which has two elements, given
A = <caries, not caries> and K = <toothache, not toothache>
• For instance, P(A | K) = <<0.8; 0.2>; <0.01; 0.99>>
• If we know more, e.g. I know that I have caries, then
• P(caries | toothache, caries) = 1
• Notes:
1) A belief may remain valid but become useless
2) The new evidence may be useless:
P(caries | toothache, “Corinthians lost the game”) = P(caries | toothache) = 0.8
Note the relevance of domain knowledge to any inference process!
Conditional Probability: Basic Axioms

$$P(A \mid B) = \frac{P(A, B)}{P(B)}$$
• Or we can write it as:

$$P(A, B) = P(A \mid B)\,P(B)$$
• And we know that (sum-out):

$$P(A) = \sum_i P(A, B_i)$$
• Then

$$P(A) = \sum_i P(A \mid B_i)\,P(B_i)$$
Chain Rule

$$P(X_1, X_2, X_3, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, X_2, X_3, \ldots, X_{i-1})$$

• Demonstration:

$$\begin{aligned}
P(X_1, \ldots, X_n) &= P(X_n \mid X_1, \ldots, X_{n-1})\,P(X_1, \ldots, X_{n-1}) \\
&= P(X_n \mid X_1, \ldots, X_{n-1})\,P(X_{n-1} \mid X_1, \ldots, X_{n-2})\,P(X_1, \ldots, X_{n-2}) \\
&= \cdots \\
&= P(X_n \mid X_1, \ldots, X_{n-1})\,P(X_{n-1} \mid X_1, \ldots, X_{n-2}) \cdots P(X_1) \\
&= \prod_{i=1}^{n} P(X_i \mid X_1, \ldots, X_{i-1})
\end{aligned}$$
Bayes' Rule

$$P(H \mid e) = \frac{P(e \mid H)\,P(H)}{P(e)}$$
P(H): prior probability of the hypothesis

P(e): prior probability of the evidence

P(H|e): posterior probability of the hypothesis

P(e|H): probability of observing evidence e given H
Why is it relevant?
Cause and Effect
• We usually observe an effect and try to identify its cause.

• So we want to know P(Cause | Effect), i.e. the probability of each
possible cause.

• However, it is usually easier to determine P(Effect | Cause)
than P(Cause | Effect), and:

$$P(\text{Cause} \mid \text{Effect}) = \frac{P(\text{Effect} \mid \text{Cause})\,P(\text{Cause})}{P(\text{Effect})}$$
Casino Example

• In a casino, the croupier calls out 12!

• Was it a dice game or a roulette wheel?

• The questions are: P(roulette | 12) = ? and P(dice | 12) = ?

• We know that:

$$P(\text{dice} \mid 12) = \frac{P(12 \mid \text{dice})\,P(\text{dice})}{P(12)}$$

• P(12 | dice), P(12 | roulette): easier to model... how?

• P(dice), P(roulette): how to estimate?
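One possible way to answer the two questions is sketched below. Every number in it is an assumption made for illustration and does not come from the slides: the dice game is taken to be two fair six-sided dice (so a total of 12 has probability 1/36), the roulette wheel is taken to have 37 equally likely pockets, and the priors are taken to be equal because nothing is known about how many tables of each kind the casino has.

```python
from fractions import Fraction

# Hypothetical likelihoods P(12 | game); see the assumptions stated above.
likelihood = {
    "dice": Fraction(1, 36),      # both of two fair dice show a six
    "roulette": Fraction(1, 37),  # one pocket out of 37
}
# Hypothetical priors P(game): equal, for lack of better information.
prior = {"dice": Fraction(1, 2), "roulette": Fraction(1, 2)}

# P(12) by summing out the game, then Bayes' rule for each game.
p_12 = sum(likelihood[game] * prior[game] for game in prior)
posterior = {game: likelihood[game] * prior[game] / p_12 for game in prior}

for game, value in posterior.items():
    print(game, value, round(float(value), 4))
```

Under these assumptions the two games are almost equally likely, so the prior (for instance, the share of dice tables in the casino) would dominate the answer.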

Another example: Meningitis
• Let's assume that a fraction 0.8 of the people with meningitis (M) present a stiff neck
(S), that the probability of meningitis is 1 in 10,000 and that the probability of a
stiff neck is 0.1.
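As a worked instance of Bayes' rule with the three numbers above (writing M for meningitis and S for stiff neck), the probability of meningitis given a stiff neck would be:

$$P(M \mid S) = \frac{P(S \mid M)\,P(M)}{P(S)} = \frac{0.8 \times 0.0001}{0.1} = 0.0008$$

So even though most meningitis patients have a stiff neck, only about 1 in 1,250 people with a stiff neck would be expected to have meningitis under these figures.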
Calculating the probability of the evidence
• Suppose we wish to compute the probability of the observed
evidence, say P(E=e), and A has possible values a_1, ..., a_m. We
can apply Bayes' rule for each value of A:

$$P(A = a_1 \mid E = e) = P(E = e \mid A = a_1)\,P(A = a_1) / P(E = e)$$
$$\ldots$$
$$P(A = a_m \mid E = e) = P(E = e \mid A = a_m)\,P(A = a_m) / P(E = e)$$

• Adding these up and noting that $\sum_i P(A = a_i \mid E = e) = 1$:

$$\sum_i P(A = a_i \mid E = e) = 1 = \sum_i P(E = e \mid A = a_i)\,P(A = a_i) / P(E = e)$$

• Then:

$$P(E = e) = \sum_i P(E = e \mid A = a_i)\,P(A = a_i)$$
Calculating the probability of the evidence - 2
• Since $P(E = e) = \sum_i P(E = e \mid A = a_i)\,P(A = a_i)$

• The division by P(E=e) can be seen as a normalization factor
α in the equation below, for any a_k:

$$P(A = a_k \mid E = e) = \frac{P(E = e \mid A = a_k)\,P(A = a_k)}{\sum_i P(E = e \mid A = a_i)\,P(A = a_i)} = \alpha\,P(E = e \mid A = a_k)\,P(A = a_k)$$

• In vector notation, we can write:

$$\mathbf{P}(A \mid E = e) = \alpha\,\mathbf{P}(E = e \mid A)\,\mathbf{P}(A)$$
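The vector form can be sketched in Python as an element-wise product followed by normalization. The helper names are illustrative, and the numbers reuse the breast cancer example (A = <cancer, no cancer>, evidence = positive test).

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1 (this is the factor alpha)."""
    total = sum(weights)
    return [w / total for w in weights]


def posterior_vector(likelihoods, priors):
    """Return (P(A | E=e), P(E=e)) from P(E=e | A) and P(A), given as aligned lists."""
    unnormalized = [l * p for l, p in zip(likelihoods, priors)]
    return normalize(unnormalized), sum(unnormalized)


# P(Pos | A) and P(A) for A = <cancer, no cancer>, taken from Example 1.
post, p_evidence = posterior_vector(likelihoods=[0.9, 0.2], priors=[0.01, 0.99])
print([round(p, 4) for p in post], round(p_evidence, 3))
```

Note that P(E=e) never has to be known in advance: it falls out as the sum of the unnormalized entries.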
Inference from Full Joint Distributions

• Typically, we are interested in the posterior joint distribution of the query variables Y
given specific values e for the evidence variables E

• Let the hidden variables be H = X - Y - E

• Then the required summation of joint entries is done by summing out the hidden variables:

$$P(Y \mid E = e) = \alpha\,P(Y, E = e) = \alpha \sum_h P(Y, E = e, H = h)$$

• The terms in the summation are joint entries because Y, E and H together exhaust the set of random
variables
• Obvious problems
1. Worst-case time complexity O(d^n), where d is the number of possible values of a variable
2. Space complexity O(d^n) to store the joint distribution
3. How do we find the numbers (probabilities) for O(d^n) entries?
• n – number of variables
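Summing out hidden variables from a full joint table can be sketched as follows. The joint distribution over three Boolean variables A, B and C is entirely hypothetical (the slides give no numbers for it); the only constraint respected is that the eight entries sum to 1.

```python
VARS = ("A", "B", "C")

# Hypothetical joint distribution P(A, B, C); keys are (a, b, c) truth values.
JOINT = {
    (True, True, True): 0.10, (True, True, False): 0.05,
    (True, False, True): 0.15, (True, False, False): 0.10,
    (False, True, True): 0.05, (False, True, False): 0.20,
    (False, False, True): 0.05, (False, False, False): 0.30,
}


def query(joint, variables, target, evidence):
    """Return P(target | evidence) by summing joint entries over the hidden variables."""
    t = variables.index(target)
    unnormalized = {True: 0.0, False: 0.0}
    for assignment, p in joint.items():
        # Keep only the joint entries that agree with the evidence.
        if all(assignment[variables.index(v)] == val for v, val in evidence.items()):
            unnormalized[assignment[t]] += p
    alpha = 1.0 / sum(unnormalized.values())
    return {value: alpha * p for value, p in unnormalized.items()}


p_a_given_b = query(JOINT, VARS, "A", {"B": True})
print(p_a_given_b)
```

Here C plays the role of the hidden variable: it is neither queried nor observed, so its two values are simply added together. The loop visits every joint entry, which is exactly the O(d^n) cost listed above.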
Inference from Full Joint Distributions - 2

• Inference from the full joint distribution can compute any
conditional probability, even when hidden variables are involved.
• But it requires a large amount of space to store the distribution and
even more data to build it.

• Bayesian networks make distributions easier to build and store.
Interpretations of Probabilities
• There are two main views about how to understand probabilities. One
asserts that probabilities are fundamentally properties of
non-deterministic physical systems. This view is particularly associated with
frequentism.

• Popper (195) observed that the frequency interpretation, precise
though it was, fails to accommodate our intuition that probabilities of
singular events exist and are meaningful.
• Do we need to toss a coin infinitely many (or at least many) times to make statements about the
probability of it landing heads in one specific toss?

• The alternative view of probability is to think of probabilities as
reporting our subjective degrees of belief. This view was expressed by
Thomas Bayes (1958) and Pierre Simon de Laplace (1951).
Principal Principle and Conditionalization
• Principal Principle: whenever you learn that the physical
probability of an outcome is r, set your subjective probability
for that outcome to r.
• This is really just common sense: you may think that the probability of a
friend shaving his head is 0.01, but if you learn that he will do so if
and only if a fair coin yet to be flipped lands heads, you will revise
your subjective probability to 0.5.

• Conditionalization: after applying Bayes' theorem
to obtain P(h|e), adopt that as your degree of belief in h, i.e.
Bel(h) = P(h|e).
Belief Network (Bayesian Network)
• A simple graphical notation for conditional independence
assertions, and hence for compact specification of full joint
distributions

• Syntax:
• a set of nodes, one node per variable
• a directed acyclic graph (a link means “directly influences”)
• a conditional probability distribution (CPD) for each node given its
parents:
• P(Xi | Parents(Xi))
• In the simplest case, the conditional distribution is represented as a
conditional probability table (CPT) giving the distribution over Xi for
each combination of parent values
Example: Is it an Earthquake or a Burglar?
Example - 2
Markov Blanket
A Very Simple Method to Build Bayesian Networks
Example
Another Example: Car Diagnosis
Another Example: Car Insurance
• Problem: estimate the expected costs (Medical, Liability,
Property) given some information (gray nodes)
I-map, D-map and Perfect Map
• I-map: if all direct dependencies in the system being modeled
are explicitly shown via arcs, the BN is said to be an
Independence-map (I-map for short).

• D-map: if every arc in a BN corresponds to a direct
dependence in the system, then the BN is said to be a
Dependence-map (D-map for short).

• A BN that is both an I-map and a D-map is said to be a
perfect map.
Inference in Bayesian Networks
• Given a network, we should be able to draw inferences from it,
that is:

• Answer simple queries, P(X | E=e)

• Or conjunctive queries: P(Xi, Xj | E=e)
• Using the fact: P(Xi, Xj | E=e) = P(Xi | E=e) P(Xj | Xi, E=e)

• Inference can be carried out from the full joint
distribution or by enumeration
Inference with the Full Joint Distribution: Example
For example, to find P(A|b) we have

$$\mathbf{P}(A \mid b) = \frac{\mathbf{P}(A, b)}{P(b)} = \left\langle \frac{P(a, b)}{P(b)};\ \frac{P(\neg a, b)}{P(b)} \right\rangle = \alpha \langle P(a, b);\ P(\neg a, b) \rangle$$

$$= \alpha \langle P(a, b, c) + P(a, b, \neg c);\ P(\neg a, b, c) + P(\neg a, b, \neg c) \rangle$$

Note that α can be seen as a normalization factor for the vector of the requested
probability distribution P(A|b). Its explicit calculation can therefore be avoided
by simply normalizing <P(a,b); P(¬a,b)>.
Inference in Bayesian Networks
Inference by Enumeration
• Enumeration is inefficient (e.g. it computes P(j|a)P(m|a) repeatedly), but it can be improved by
storing the values already computed (dynamic programming)
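Computing the unnormalized P(b | j, m)

Values are combined bottom-up; the four columns correspond to the combinations of E and A.

                  e, a       e, ¬a      ¬e, a      ¬e, ¬a
X1                0.95       0.05       0.94       0.06
X2                0.9        0.01       0.9        0.01
X3                0.7        0.05       0.7        0.05
X1·X2·X3          0.5985     0.000025   0.5922     0.00003
Sum over A        0.598525              0.59223
× P(E)            × 0.002               × 0.998
                  = 0.001197            = 0.591046
Sum over E        0.5922426
× P(b)            × 0.001
Unnormalized P(b | j, m) = 0.0005922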

Computing the unnormalized P(¬b | j, m)

                  e, a       e, ¬a      ¬e, a      ¬e, ¬a
X1                0.29       0.71       0.001      0.999
X2                0.9        0.01       0.9        0.01
X3                0.7        0.05       0.7        0.05
Product           0.1827     0.000355   0.00063    0.0005
Sum over A        0.183055              0.00113
× P(E)            × 0.002               × 0.998
                  = 0.000366            = 0.001127
Sum over E        0.001493
× P(¬b)           × 0.999
Unnormalized P(¬b | j, m) = 0.001492

Normalized values of P(b | j, m) and P(¬b | j, m)

$$P(b \mid j, m) = \frac{0.0005922}{0.0005922 + 0.001492} = 0.2841$$

$$P(\neg b \mid j, m) = \frac{0.001492}{0.0005922 + 0.001492} = 0.7159$$
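The two calculation tables and the normalization step can be reproduced with a short enumeration in Python. This is a sketch under stated assumptions: the CPT numbers are copied from the tables, but the slides do not say which of rows X2 and X3 belongs to j and which to m, so the assignment below (X2 = P(j | A), X3 = P(m | A)) is a guess. It does not affect the result, because only the product of the two rows is used.

```python
# CPT values as they appear in the calculation tables.
P_B = 0.001  # P(b)
P_E = 0.002  # P(e)
P_A = {      # P(a | B, E), keyed by (b, e)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}
P_J = {True: 0.9, False: 0.01}  # assumed to be row X2: P(j | A)
P_M = {True: 0.7, False: 0.05}  # assumed to be row X3: P(m | A)


def prob(p_true, value):
    """Probability of a Boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true


def unnormalized(b):
    """P(B=b) * sum_e P(e) * sum_a P(a | b, e) * P(j | a) * P(m | a)."""
    total = 0.0
    for e in (True, False):
        inner = 0.0
        for a in (True, False):
            inner += prob(P_A[(b, e)], a) * P_J[a] * P_M[a]
        total += prob(P_E, e) * inner
    return prob(P_B, b) * total


scores = {b: unnormalized(b) for b in (True, False)}
alpha = 1.0 / sum(scores.values())
posterior = {b: alpha * s for b, s in scores.items()}
print(scores, posterior)
```

The unnormalized scores come out at about 0.000592 and 0.001492, and the posterior at about 0.284 and 0.716, which agrees with the slide values up to rounding of the intermediate results.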
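Enumeration Algorithm
Inference by Enumeration
• The enumeration algorithm computes a
conditional probability distribution
• P(query variable | known evidence)

• It is also possible to answer conjunctive queries using
the fact: P(Xi, Xj | e) = P(Xi | e) P(Xj | Xi, e)

• Proof?
Proof
By the definition of conditional probability, P(Xi, Xj | e) = P(Xi, Xj, e) / P(e), and
since P(Xi, Xj, e) = P(Xj | Xi, e) P(Xi, e):
P(Xi, Xj | e) = P(Xj | Xi, e) P(Xi, e) / P(e) = P(Xj | Xi, e) P(Xi | e)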
Inference by Enumeration
• As noted, enumeration tends to recompute some values several
times

• Part of this rework can be eliminated with
dynamic programming. Several algorithms apply; one
of the most widely used is the variable elimination
algorithm
• Basically, values already computed are stored in a
table and looked up when needed again. These techniques
are called exact inference and can be computationally
expensive for complex networks
• An alternative is approximate inference algorithms,
which rely on sampling from the
network to perform inference
• randomized sampling algorithms, also called Monte Carlo algorithms

• More information: Russell and Norvig, ch. 14

Learning in Probabilistic Models
• Learning in Bayesian networks is the process of determining the
network topology (i.e. its directed graph) and the conditional
probability tables

• Problems?
• How do we determine the topology?
• How do we estimate the probabilities?
• How complex are these tasks?
• That is, how many topologies and how many probabilities would need to be
determined?
Size of the Conditional Probability Tables and of the
Full Joint Distribution
• Let us assume that each variable is influenced by at most k other variables
(naturally, k < n, the total number of variables).

• Assuming Boolean variables, each conditional probability table (CPT) will have at
most 2^k entries (probabilities). Hence there will be at most n·2^k
entries in total.

• In the full joint distribution, by contrast, there will be 2^n entries. For example, for n=30
with at most five parents (k=5), this means 960 entries instead of more than a billion (2^30).
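The two counts in this example can be checked directly. The function names below are illustrative only.

```python
def cpt_entries(n, k):
    """Upper bound on total CPT entries: n Boolean nodes, at most k parents each."""
    return n * 2**k


def joint_entries(n):
    """Number of entries in the full joint distribution over n Boolean variables."""
    return 2**n


print(cpt_entries(30, 5))  # 960
print(joint_entries(30))   # 1073741824
```

The network count grows linearly in n for a fixed k, whereas the joint count doubles with every added variable.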
Number of Entries in the Joint Distribution
and in the Bayesian Network - 2
• In domains where each variable can be directly influenced by
all the others, the network is fully connected and requires
a number of entries of the same order as the full joint distribution

• However, if such a dependence is weak, the additional
complexity in the network may not be worth the small gain in
accuracy

• As a rule, if we stick to a causal model we end up having
to specify fewer numbers, and the numbers
will often be easier to obtain (Russell and Norvig, 2013, p.
453)

• Causal models are those specified in the cause-to-effect direction,
i.e. P(effect|cause) rather than P(cause|effect), the latter being what is usually
needed for diagnosis
Simplifying the Representation of Conditional
Probability Tables (CPTs)
• We have seen that the number of entries in a CPT grows
exponentially with the number of parents
• For the binary case with k parents, the CPT of a node has 2^k probabilities to
be determined

• Let us look at two approaches that simplify the network by
adopting simplifying assumptions
• Naïve Bayes and
• Noisy-OR
Naïve Bayes
• A particular, simple class of Bayesian networks is
called Naïve Bayes
• It is simple because it assumes conditional independence
among all the variables X given the variable Class
• It is sometimes also called the Bayes classifier,
because it is often used as a first approach
to classification
Naïve Bayes - 2
• The simple topology has the advantage of a concise
representation of the full joint distribution.
• Since every node has at most one parent, the CPT of each node X
has only two entries, and the class node has one entry. Hence
(2n-1) entries for the whole network. Naïve Bayes is linear in
the number of nodes (n)!
• “In practice, naive Bayes systems can work
surprisingly well...” p. 438
Example: Should I Play Tennis?
Ex    Outlook    Temperature  Humidity  Wind    PlayTennis
X1    Sunny      Hot          High      Weak    NO
X2    Sunny      Hot          High      Strong  NO
X3    Overcast   Hot          High      Weak    YES
X4    Rainy      Mild         High      Weak    YES
X5    Rainy      Cool         Normal    Weak    YES
X6    Rainy      Cool         Normal    Strong  NO
X7    Overcast   Cool         Normal    Strong  YES
X8    Sunny      Mild         High      Weak    NO
X9    Sunny      Cool         Normal    Weak    YES
X10   Rainy      Mild         Normal    Weak    YES
X11   Sunny      Mild         Normal    Strong  YES
X12   Overcast   Mild         High      Strong  YES
X13   Overcast   Hot          Normal    Weak    YES
X14   Rainy      Mild         High      Strong  NO
Using the Naïve Bayes approach

• The inference-by-enumeration method seen earlier is applicable!

• The probabilities are estimated from the training set
Counts and probabilities estimated
from the training set

• P(Play=yes | Outlook=sunny, Temp=cool, Hum=high, Wind=true) =

• Zero! Is this reasonable? How can it be fixed?

• One solution: the Laplace estimator (Laplace smoothing). Let V be
the number of possible values of A; P(A|B) is then estimated as:
• P(A=a|B=b) = [N(A=a,B=b)+1]/[N(B=b)+V]
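A minimal naive Bayes sketch over the tennis training set, using the Laplace estimator exactly as written above. The attribute values are the English translations of the table entries, and the function names (`conditional`, `classify`) are illustrative. As a concrete zero-count case taken from this table, no "Overcast" day has PlayTennis = No, so the unsmoothed estimate of P(Outlook=Overcast | No) is 0.

```python
from collections import Counter

# Training set: (Outlook, Temperature, Humidity, Wind, PlayTennis), rows X1..X14.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rainy", "Mild", "High", "Strong", "No"),
]
N_FEATURES = 4
class_counts = Counter(row[-1] for row in DATA)
# V in the Laplace formula: number of distinct values of each attribute.
n_values = [len({row[i] for row in DATA}) for i in range(N_FEATURES)]


def conditional(i, value, label, smooth=True):
    """Estimate P(attribute_i = value | class = label) from counts."""
    joint = sum(1 for row in DATA if row[i] == value and row[-1] == label)
    if smooth:
        return (joint + 1) / (class_counts[label] + n_values[i])
    return joint / class_counts[label]


def classify(example, smooth=True):
    """Return the normalized P(class | example) under the naive Bayes assumption."""
    scores = {}
    for label, count in class_counts.items():
        score = count / len(DATA)  # prior P(class)
        for i, value in enumerate(example):
            score *= conditional(i, value, label, smooth)
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


raw = conditional(0, "Overcast", "No", smooth=False)      # 0 / 5
smoothed = conditional(0, "Overcast", "No", smooth=True)  # (0 + 1) / (5 + 3)
print(raw, smoothed)
print(classify(("Sunny", "Cool", "High", "Strong")))
```

Without smoothing, a single zero count forces the whole product for that class to zero regardless of the other attributes; with the Laplace estimator the estimate becomes 1/8 and the other evidence can still be weighed.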
Building Compact Conditional
Distributions
• Some problems can be modeled with a
Noisy-OR approach. The technique relies on two
assumptions:
• All the causes that can trigger a variable are listed (a
general “other” cause can be added)
• That is, P(Fever | F,F,F) = 0
• Whatever makes a parent variable “fail” to trigger the child
variable (effect) is conditionally independent across parents. Example: whatever prevents the flu from
causing fever in someone is independent of whatever prevents a cold
from causing fever.
• That is, P(not Fever | Cold, Flu, Malaria) = P(not Fever | Cold) P(not Fever | Flu) P(not
Fever | Malaria)
• Example:
• P(not Fever | Malaria) = 0.1
• P(not Fever | Flu) = 0.2
• P(not Fever | Cold) = 0.6
Noisy-OR

• $P(X \mid u_1, \ldots, u_j, \neg u_{j+1}, \ldots, \neg u_k) = \left\langle 1 - \prod_{i=1}^{j} q_i;\ \prod_{i=1}^{j} q_i \right\rangle$

• q_i is the probability that cause i fails to trigger X
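The Noisy-OR formula can be turned into a full conditional probability table with a few lines of Python. The sketch below uses the three failure probabilities from the fever example (0.6 for cold, 0.2 for flu, 0.1 for malaria); the dictionary and function names are illustrative.

```python
from itertools import product

# q_i = P(not fever | only cause i present), from the fever example.
Q = {"cold": 0.6, "flu": 0.2, "malaria": 0.1}


def noisy_or(active_causes):
    """Return (P(fever), P(not fever)) given the causes that are present."""
    p_not = 1.0
    for cause in active_causes:
        p_not *= Q[cause]  # each present cause fails independently
    return 1.0 - p_not, p_not


# Full CPT: 2^3 rows generated from only three parameters.
for flags in product([False, True], repeat=len(Q)):
    active = [cause for cause, on in zip(Q, flags) if on]
    p_fever, p_not_fever = noisy_or(active)
    print(active, round(p_fever, 3), round(p_not_fever, 3))
```

With no cause present the product is empty, so P(fever) = 0, which matches the first assumption (all causes are listed). In general k parameters replace the 2^k rows of an unrestricted CPT.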