Probability Review & Markov Models: CSE 473: Artificial Intelligence
Daniel Weld
University of Washington
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]
Outline
§ Probability
§ Random Variables
§ Joint and Marginal Distributions
§ Conditional Distribution
§ Product Rule, Chain Rule, Bayes’ Rule
§ Inference
§ Independence & Conditional Independence
§ … Markov Models
Joint Distributions
§ A joint distribution over a set of random variables X1, X2, …, Xn specifies a probability for each assignment (or outcome): P(X1 = x1, X2 = x2, …, Xn = xn)
§ Must obey: P(x1, x2, …, xn) ≥ 0, and the probabilities over all assignments must sum to 1

T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3
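As an aside (not from the slides), a table this small maps naturally onto a Python dict; a minimal sketch, with joint as an illustrative name:

# The T/W joint distribution from the table above, stored as a dict
# mapping each assignment (outcome) to its probability.
joint = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}

# Every entry must be non-negative, and the whole table must sum to 1.
assert all(p >= 0 for p in joint.values())
assert abs(sum(joint.values()) - 1.0) < 1e-9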
Marginal Distributions
§ Marginal distributions are sub-tables which eliminate variables
§ Marginalization (summing out): combine collapsed rows by adding, e.g. P(t) = Σ_w P(t, w)

Joint P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

Marginal P(T):
T     P
hot   0.5
cold  0.5

Marginal P(W):
W     P
sun   0.6
rain  0.4
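A minimal Python sketch of summing out, using the same dict representation as above (marginal is an illustrative helper, not course code):

from collections import defaultdict

joint = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def marginal(joint, index):
    # Sum out every variable except the one at position `index`.
    result = defaultdict(float)
    for assignment, p in joint.items():
        result[assignment[index]] += p
    return dict(result)

print(marginal(joint, 0))  # P(T) -> {'hot': 0.5, 'cold': 0.5}
print(marginal(joint, 1))  # P(W) -> approximately {'sun': 0.6, 'rain': 0.4}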
Conditional Distributions
§ Conditional distributions are probability distributions over some variables given fixed values of others: P(a | b) = P(a, b) / P(b)

Joint P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

P(W | T = hot):
W     P
sun   0.8
rain  0.2

P(W | T = cold):
W     P
sun   0.4
rain  0.6

Normalization Trick
§ To get P(W | T = t): select the joint entries consistent with T = t, then normalize so they sum to one, e.g. P(sun | hot) = 0.4 / (0.4 + 0.1) = 0.8
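The same trick in code, a sketch under the dict representation used above (conditional_w_given_t is an illustrative name):

joint = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def conditional_w_given_t(joint, t):
    # Select the entries consistent with the evidence T = t ...
    selected = {w: p for (t_, w), p in joint.items() if t_ == t}
    # ... then normalize so the selected probabilities sum to 1.
    z = sum(selected.values())
    return {w: p / z for w, p in selected.items()}

print(conditional_w_given_t(joint, "hot"))   # {'sun': 0.8, 'rain': 0.2}
print(conditional_w_given_t(joint, "cold"))  # {'sun': 0.4, 'rain': 0.6}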
Probabilistic Inference
§ Probabilistic inference = “compute a desired probability from other known probabilities (e.g. conditional from joint)”
Inference by Enumeration
§ General case:
§ Evidence variables: E1 … Ek = e1 … ek
§ Query* variable: Q
§ Hidden variables: H1 … Hr
(together, these are all the variables)
§ We want: P(Q | e1 … ek)
(* works fine with multiple query variables, too)
§ Step 1: Select the entries consistent with the evidence
§ Step 2: Sum out H to get the joint of Query and evidence:
P(Q, e1 … ek) = Σ over h1 … hr of P(Q, h1 … hr, e1 … ek)
§ Step 3: Normalize:
Z = Σ over q of P(q, e1 … ek), then P(Q | e1 … ek) = (1/Z) × P(Q, e1 … ek)
Example: Inference by Enumeration
P(W = sun | S = winter)?
1. Select data consistent with evidence

S       T     W     P
summer  hot   sun   0.30
summer  hot   rain  0.05
summer  cold  sun   0.10
summer  cold  rain  0.05
winter  hot   sun   0.10
winter  hot   rain  0.05
winter  cold  sun   0.15
winter  cold  rain  0.20
Example: Inference by Enumeration
P(W = sun | S = winter)?
1. Select data consistent with evidence
2. Marginalize away hidden variables (sum out temperature)
3. Normalize

S       T     W     P
summer  hot   sun   0.30
summer  hot   rain  0.05
summer  cold  sun   0.10
summer  cold  rain  0.05
winter  hot   sun   0.10
winter  hot   rain  0.05
winter  cold  sun   0.15
winter  cold  rain  0.20

After steps 1 and 2:
S       W     P
winter  sun   0.25
winter  rain  0.25

After step 3: P(W = sun | S = winter) = 0.25 / (0.25 + 0.25) = 0.5
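To make the three steps concrete, here is a hedged Python sketch of inference by enumeration over the table above; the variable positions (S, T, W) = (0, 1, 2) and the name infer are assumptions for illustration:

joint = {
    ("summer", "hot",  "sun"): 0.30, ("summer", "hot",  "rain"): 0.05,
    ("summer", "cold", "sun"): 0.10, ("summer", "cold", "rain"): 0.05,
    ("winter", "hot",  "sun"): 0.10, ("winter", "hot",  "rain"): 0.05,
    ("winter", "cold", "sun"): 0.15, ("winter", "cold", "rain"): 0.20,
}

def infer(joint, query_index, evidence):
    # `evidence` maps variable positions to required values.
    unnormalized = {}
    for assignment, p in joint.items():
        # Step 1: select entries consistent with the evidence.
        if all(assignment[i] == v for i, v in evidence.items()):
            # Step 2: sum out the hidden variables.
            q = assignment[query_index]
            unnormalized[q] = unnormalized.get(q, 0.0) + p
    # Step 3: normalize.
    z = sum(unnormalized.values())
    return {q: p / z for q, p in unnormalized.items()}

print(infer(joint, query_index=2, evidence={0: "winter"}))
# -> {'sun': 0.5, 'rain': 0.5}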
Inference by Enumeration
§ Computational problems?
§ Worst-case time complexity O(d^n), for n variables with d values each
§ Space complexity O(d^n) to store the joint distribution
Don’t be Fooled
§ It may look cute…
[image: baby dragon]
The Sword of Conditional Independence!
X ⊥⊥ Y | Z
Means: ∀x, y, z: P(x, y | z) = P(x | z) P(y | z)
Or, equivalently: ∀x, y, z: P(x | z, y) = P(x | z)
Preview: Bayes Nets Encode Joint Distributions
§ A set of nodes, one per variable
§ A directed, acyclic graph, e.g. parents A1 … An pointing to a node X
§ A conditional distribution P(X | A1 … An) for each node given its parents; the joint then factors as a product of these local distributions
Preview: Dynamic Bayes Nets (DBNs) - Ghosts
§ We want to track multiple variables over time, using multiple sources of evidence
§ Idea: repeat a fixed Bayes net structure at each time step
§ Generalization of Hidden Markov Models (HMMs), which are themselves a generalization of Markov Models
Ghostbusters, Revisited
§ Let’s say we have two distributions:
§ Prior distribution over ghost location: P(G)
§ Let’s say this is uniform
§ Sensor reading model: P(R | G)
§ Given: we know what our sensors do
§ R = reading color measured at (1,1)
§ E.g. P(R = yellow | G=(1,1)) = 0.1
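A hedged sketch of the resulting update, P(G | r) ∝ P(r | G) P(G); the 2×2 grid and all sensor-model numbers below are made up for illustration — only P(R = yellow | G = (1,1)) = 0.1 comes from the slide:

# Uniform prior P(G) over a hypothetical 2x2 grid of locations.
locations = [(x, y) for x in range(2) for y in range(2)]
prior = {g: 1.0 / len(locations) for g in locations}

def sensor_model(reading, g):
    # Hypothetical P(R = reading | G = g) for a two-color sensor.
    if reading == "yellow":
        return 0.1 if g == (1, 1) else 0.3
    return 0.9 if g == (1, 1) else 0.7

def posterior(prior, reading):
    # Bayes' rule: weight the prior by the likelihood, then normalize.
    unnormalized = {g: sensor_model(reading, g) * prior[g] for g in prior}
    z = sum(unnormalized.values())
    return {g: p / z for g, p in unnormalized.items()}

print(posterior(prior, "yellow"))
# (1,1) gets posterior 0.1; the ghost is probably elsewhere.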
Model for a Tiny Ghostbuster
§ Random Variables
§ Location of ghost, G. Values = {L1, L2}
§ Sensor reading S1 at location L1
§ Can marginalize the joint to get P(S1 | distance = 0)
The Product Rule
§ Sometimes we have conditional distributions but want the joint:
P(x, y) = P(x | y) P(y)
Bayes’ Rule
§ Two ways to factor a joint distribution over two variables: P(x, y) = P(x | y) P(y) = P(y | x) P(x)
§ Dividing, we get: P(x | y) = P(y | x) P(x) / P(y)
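A tiny numeric check of both rules; the rain/traffic numbers are illustrative, not from the slides:

p_rain = 0.3                  # P(R = rain)
p_heavy_given_rain = 0.8      # P(T = heavy | R = rain)
p_heavy_given_dry = 0.2       # P(T = heavy | R = dry)

# Product rule: P(T = heavy, R = rain) = P(T = heavy | R = rain) P(R = rain)
p_joint_heavy_rain = p_heavy_given_rain * p_rain

# Total probability: P(T = heavy) summed over both values of R.
p_heavy = p_heavy_given_rain * p_rain + p_heavy_given_dry * (1 - p_rain)

# Bayes' rule: P(R = rain | T = heavy) = P(T = heavy | R = rain) P(R = rain) / P(T = heavy)
print(p_joint_heavy_rain / p_heavy)  # approximately 0.632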
Independence
§ Two variables X and Y are independent in a joint distribution if: P(X, Y) = P(X) P(Y), i.e. ∀x, y: P(x, y) = P(x) P(y)
§ Says the joint distribution factors into a product of two simple ones
§ Usually variables aren’t independent!
Independence
P(A ∧ B) = P(A) P(B)
[Venn diagram: the space of all outcomes (True), with overlapping regions A, B, and their intersection A ∧ B]
Example: Independence
§ N fair, independent coin flips: P(X1, X2, …, Xn) = P(X1) P(X2) … P(Xn), where each P(Xi) puts probability 0.5 on heads and 0.5 on tails
Example: Independence?
Marginals:
T     P
hot   0.5
cold  0.5

W     P
sun   0.6
rain  0.4

Product of marginals, P2(T, W) = P(T) P(W):
T     W     P
hot   sun   0.3
hot   rain  0.2
cold  sun   0.3
cold  rain  0.2

Actual joint, P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

P(T, W) ≠ P(T) P(W), so T and W are not independent.
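The same comparison in code, a sketch using the tables above:

# Check T ⊥ W by comparing the joint to the product of its marginals.
joint = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

p_t = {"hot": 0.5, "cold": 0.5}
p_w = {"sun": 0.6, "rain": 0.4}

independent = all(abs(joint[(t, w)] - p_t[t] * p_w[w]) < 1e-9
                  for (t, w) in joint)
print(independent)  # False: e.g. 0.4 != 0.5 * 0.6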
Conditional Independence
Conditional Independence
§ Unconditional (absolute) independence very rare
Conditional Independence
[Venn diagram with labeled region probabilities for events A and B:]
P(A) = (.25 + .5)/2 = .375
P(B) = .75
P(A | B) = (.25 + .25 + .5)/3 = .3333
Since P(A | B) ≠ P(A), A and B are not independent here.
A, B Conditionally Independent Given C
P(A, B | C) = P(A | C) P(B | C)
[Venn diagram: events A, B, C, with the overlap B ∧ C highlighted]
Conditional Independence
§ What about this domain:
§ Fire
§ Smoke
§ Alarm
§ Alarm is conditionally independent of Fire given Smoke: P(Alarm | Fire, Smoke) = P(Alarm | Smoke)
Conditional Independence
§ What about this domain:
§ Traffic
§ Umbrella
§ Raining
§ Umbrella is conditionally independent of Traffic given Raining: P(U | R, T) = P(U | R)
“I am a BIG joint distribution!”
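A sketch that makes this checkable: build a joint over (Raining, Traffic, Umbrella) that factors through R by construction (all numbers hypothetical), then verify P(u | r, t) = P(u | r) for every value:

p_r = {"rain": 0.3, "dry": 0.7}
p_t_given_r = {"rain": {"heavy": 0.8, "light": 0.2},
               "dry":  {"heavy": 0.2, "light": 0.8}}
p_u_given_r = {"rain": {"yes": 0.9, "no": 0.1},
               "dry":  {"yes": 0.1, "no": 0.9}}

# Joint built as P(r) P(t | r) P(u | r): conditionally independent by design.
joint = {(r, t, u): p_r[r] * p_t_given_r[r][t] * p_u_given_r[r][u]
         for r in p_r for t in ("heavy", "light") for u in ("yes", "no")}

def cond_u(r, t=None):
    # P(U | R=r), or P(U | R=r, T=t), by selection and normalization.
    sel = {}
    for (r_, t_, u), p in joint.items():
        if r_ == r and (t is None or t_ == t):
            sel[u] = sel.get(u, 0.0) + p
    z = sum(sel.values())
    return {u: p / z for u, p in sel.items()}

for r in p_r:
    for t in ("heavy", "light"):
        assert all(abs(cond_u(r, t)[u] - cond_u(r)[u]) < 1e-9
                   for u in ("yes", "no"))
print("U is conditionally independent of T given R")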
Probability Recap
§ Conditional probability: P(x | y) = P(x, y) / P(y)
§ Product rule: P(x, y) = P(x | y) P(y)
§ Chain rule: P(X1, X2, …, Xn) = P(X1) P(X2 | X1) P(X3 | X1, X2) … = ∏_{i=1}^{n} P(Xi | X1, …, Xi−1)
§ Bayes rule: P(x | y) = P(y | x) P(x) / P(y)
Markov Models
Reasoning over Time or Space
§ Often, we want to reason about a sequence of observations (e.g. speech recognition, robot localization, medical monitoring)
§ Need to introduce time (or space) into our models
Markov Models
§ Value of X at a given time is called the state
X1 → X2 → X3 → X4
§ Parameters: transition probabilities (dynamics) P(Xt | Xt−1) specify how the state evolves over time, plus the initial state probabilities P(X1)
Joint Distribution of a Markov Model
X1 → X2 → X3 → X4
§ Joint distribution:
P(X1, X2, X3, X4) = P(X1) P(X2 | X1) P(X3 | X2) P(X4 | X3)
§ More generally:
P(X1, X2, …, XT) = P(X1) P(X2 | X1) P(X3 | X2) … P(XT | XT−1)
                 = P(X1) ∏_{t=2}^{T} P(Xt | Xt−1)
§ Questions to be resolved:
§ Does this indeed define a joint distribution?
§ Can every joint distribution be factored this way, or are we making some assumptions about the joint distribution by using this factorization?
§ From the chain rule, every joint distribution over X1, X2, X3, X4 can be written as:
P(X1, X2, X3, X4) = P(X1) P(X2 | X1) P(X3 | X1, X2) P(X4 | X1, X2, X3)
§ Assuming X3 ⊥⊥ X1 | X2 and X4 ⊥⊥ X1, X2 | X3 recovers exactly the Markov factorization above.
Chain Rule and Markov Models
X1 → X2 → X3 → X4
§ From the chain rule, every joint distribution over X1, X2, …, XT can be written as:
P(X1, X2, …, XT) = P(X1) ∏_{t=2}^{T} P(Xt | X1, X2, …, Xt−1)
§ So, if we assume that for all t:
Xt ⊥⊥ X1, …, Xt−2 | Xt−1
we get:
P(X1, X2, …, XT) = P(X1) ∏_{t=2}^{T} P(Xt | Xt−1)
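A sketch of scoring a sequence under this factorization, using a two-state weather chain with hypothetical numbers:

initial = {"sun": 0.5, "rain": 0.5}                  # P(X1)
transition = {"sun":  {"sun": 0.9, "rain": 0.1},     # P(Xt | Xt-1 = sun)
              "rain": {"sun": 0.3, "rain": 0.7}}     # P(Xt | Xt-1 = rain)

def sequence_probability(states):
    # P(x1, ..., xT) = P(x1) * product over t of P(xt | xt-1).
    p = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p

print(sequence_probability(["sun", "sun", "rain", "rain"]))
# = 0.5 * 0.9 * 0.1 * 0.7 = 0.0315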