
CSE 473: Artificial Intelligence


Probability Review… → Markov Models

Daniel Weld
University of Washington
[These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Outline
§ Probability
§ Random Variables
§ Joint and Marginal Distributions
§ Conditional Distribution
§ Product Rule, Chain Rule, Bayes’ Rule
§ Inference
§ Independence & Conditional Independence
§ … Markov Models

§ You’ll need all this stuff A LOT for the next few weeks, so make sure you go over it now!

Joint Distributions
§ A joint distribution over a set of random variables X1, …, Xn specifies a
probability P(x1, …, xn) for each assignment (or outcome):

T    W    P
hot  sun  0.4
hot  rain 0.1
cold sun  0.2
cold rain 0.3

§ Must obey: 0 ≤ P(x1, …, xn) ≤ 1, and the probabilities of all assignments sum to 1

§ Size of joint distribution if n variables with domain sizes d? dⁿ entries

§ For all but the smallest distributions, impractical to write out!
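As a concrete sketch (plain Python; the dict representation and the `table_size` helper are my own, not from the slides), a joint distribution can be stored as a map from assignments to probabilities, and the dⁿ blow-up is easy to see:

```python
# The joint distribution P(T, W) from the table above, stored as a dict
# mapping each full assignment (outcome) to its probability.
joint = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}

# Every outcome gets a probability in [0, 1], and they must sum to one.
assert all(0.0 <= p <= 1.0 for p in joint.values())
assert abs(sum(joint.values()) - 1.0) < 1e-9

# With n variables, each with domain size d, the table needs d**n rows:
def table_size(n, d):
    return d ** n

print(table_size(2, 2))    # this tiny table: 4 rows
print(table_size(30, 2))   # 30 binary variables: over a billion rows
```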

Marginal Distributions
§ Marginal distributions are sub-tables which eliminate variables
§ Marginalization (summing out): Combine collapsed rows by adding

Joint P(T, W):        Marginals:
T    W    P           T    P          W    P
hot  sun  0.4         hot  0.5        sun  0.6
hot  rain 0.1         cold 0.5        rain 0.4
cold sun  0.2
cold rain 0.3
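A minimal sketch of summing out, assuming the same dict representation of the joint used above (the `marginalize` helper is illustrative, not from the slides):

```python
from collections import defaultdict

# Joint P(T, W) from the table above.
joint = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def marginalize(joint, keep):
    """Sum out every variable except the one at tuple position `keep`."""
    marginal = defaultdict(float)
    for assignment, p in joint.items():
        marginal[assignment[keep]] += p   # combine collapsed rows by adding
    return dict(marginal)

print(marginalize(joint, 0))  # P(T): hot 0.5, cold 0.5
print(marginalize(joint, 1))  # P(W): sun ≈ 0.6, rain ≈ 0.4
```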

Conditional Distributions
§ Conditional distributions are probability distributions over some variables
given fixed values of others

Joint Distribution P(T, W):     Conditional Distributions:
T    W    P                     P(W | T = hot):   sun 0.8, rain 0.2
hot  sun  0.4                   P(W | T = cold):  sun 0.4, rain 0.6
hot  rain 0.1
cold sun  0.2
cold rain 0.3

Normalization Trick

SELECT the joint probabilities matching the evidence, then NORMALIZE the
selection (make it sum to one).

Joint P(T, W):       Selected (T = cold):     Normalized P(W | T = cold):
T    W    P          T    W    P              W    P
hot  sun  0.4        cold sun  0.2            sun  0.4
hot  rain 0.1        cold rain 0.3            rain 0.6
cold sun  0.2
cold rain 0.3

§ Why does this work? Sum of selection is P(evidence)! (P(T=cold), here)
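The select-then-normalize trick, sketched in the same dict representation (the `condition` helper is my own illustration and assumes a two-variable joint):

```python
# Joint P(T, W) from the table above.
joint = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
         ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}

def condition(joint, var, value):
    """P(other variable | var=value) for a two-variable joint."""
    # SELECT the joint probabilities matching the evidence.
    selected = {a: p for a, p in joint.items() if a[var] == value}
    # NORMALIZE: the divisor Z is exactly P(evidence).
    z = sum(selected.values())
    return {a[1 - var]: p / z for a, p in selected.items()}

print(condition(joint, 0, "cold"))  # P(W | T=cold): sun 0.4, rain 0.6
```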

Probabilistic Inference
§ Probabilistic inference =
“compute a desired probability from other known
probabilities (e.g. conditional from joint)”

§ We generally compute conditional probabilities


§ P(on time | no reported accidents) = 0.90
§ These represent the agent’s beliefs given the evidence

§ Probabilities change with new evidence:


§ P(on time | no accidents, 5 a.m.) = 0.95
§ P(on time | no accidents, 5 a.m., raining) = 0.80
§ Observing new evidence causes beliefs to be updated

Inference by Enumeration
§ General case:
  § Evidence variables: E1 … Ek = e1 … ek
  § Query* variable: Q
  § Hidden variables: H1 … Hr
  (together: all the variables)
§ We want: P(Q | e1 … ek)
  * Works fine with multiple query variables, too

§ Step 1: Select the entries consistent with the evidence
§ Step 2: Sum out H to get joint of Query and evidence
§ Step 3: Normalize (multiply by 1/Z, where Z is the sum from Step 2)
Example: Inference by Enumeration
P(W=sun | S=winter)?
1. Select data consistent with evidence

S      T    W    P
summer hot  sun  0.30
summer hot  rain 0.05
summer cold sun  0.10
summer cold rain 0.05
winter hot  sun  0.10
winter hot  rain 0.05
winter cold sun  0.15
winter cold rain 0.20

Example: Inference by Enumeration
P(W=sun | S=winter)?
1. Select data consistent with evidence
2. Marginalize away hidden variables (sum out temperature)

S      T    W    P
summer hot  sun  0.30
summer hot  rain 0.05
summer cold sun  0.10
summer cold rain 0.05
winter hot  sun  0.10
winter hot  rain 0.05
winter cold sun  0.15
winter cold rain 0.20

Example: Inference by Enumeration
P(W=sun | S=winter)?
1. Select data consistent with evidence
2. Marginalize away hidden variables (sum out temperature)
3. Normalize

After selecting (S=winter) and summing out T:

S      W    P
winter sun  0.25
winter rain 0.25

Example: Inference by Enumeration
P(W=sun | S=winter)?
1. Select data consistent with evidence
2. Marginalize away hidden variables (sum out temperature)
3. Normalize

After normalizing:

S      W    P
winter sun  0.50
winter rain 0.50
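The three steps above can be sketched in Python (the `infer` helper is a hypothetical illustration, not from the slides; the joint is the season/temperature/weather table used in this example):

```python
from collections import defaultdict

joint = {
    ("summer", "hot", "sun"): 0.30, ("summer", "hot", "rain"): 0.05,
    ("summer", "cold", "sun"): 0.10, ("summer", "cold", "rain"): 0.05,
    ("winter", "hot", "sun"): 0.10, ("winter", "hot", "rain"): 0.05,
    ("winter", "cold", "sun"): 0.15, ("winter", "cold", "rain"): 0.20,
}

def infer(joint, query, evidence):
    """P(query variable | evidence), evidence maps tuple positions to values."""
    dist = defaultdict(float)
    for assignment, p in joint.items():
        # Step 1: keep only entries consistent with the evidence.
        if all(assignment[i] == v for i, v in evidence.items()):
            # Step 2: sum out the hidden variables onto the query variable.
            dist[assignment[query]] += p
    # Step 3: normalize (divide by Z = P(evidence)).
    z = sum(dist.values())
    return {value: p / z for value, p in dist.items()}

print(infer(joint, query=2, evidence={0: "winter"}))
# P(W | S=winter): sun 0.5, rain 0.5
```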

Inference by Enumeration

§ Computational problems?
§ Worst-case time complexity O(dⁿ)
§ Space complexity O(dⁿ) to store the joint distribution

Don’t be Fooled
§ It may look cute…

https://fc08.deviantart.net/fs71/i/2010/258/4/4/baby_dragon__charles_by_imsorrybuti-d2yti11.png


The Sword of Conditional Independence!

“Slay the Basilisk!”   “I am a BIG joint distribution!”

harrypotter.wikia.com/

Means: P(x, y | z) = P(x | z) P(y | z)

Or, equivalently: P(x | y, z) = P(x | z)

A Brief Trip Forward in Time…


Preview: Bayes Nets Encode Joint Distributions
§ A set of nodes, one per variable X
§ A directed, acyclic graph (parents A1, …, An with arcs into X)
§ A conditional distribution for each node
  § A collection of distributions over X, one for each combination of parents’ values
  § CPT: conditional probability table
  § Description of a noisy “causal” process

A Bayes net = Topology (graph) + Local Conditional Probabilities

Benefits: Smaller, Allows Fast Inference, Learnable!

Preview: Example Bayes Net - Car

Preview: Dynamic Bayes Nets (DBNs) - Ghosts
§ We want to track multiple variables over time, using
multiple sources of evidence
§ Idea: Repeat a fixed Bayes net structure at each time
§ Generalization of Hidden Markov Models (HMMs)
§ Itself a generalization of Markov Models

§ Variables from time t may condition on those from t-1

(Diagram: ghost variables G1a, G1b; G2a, G2b; G3a, G3b at t = 1, 2, 3, each
conditioning on the previous time step, with evidence nodes E1a, E1b, E2a,
E2b, E3a, E3b below.)

Back to Our Own Universe… (for now)


Ghostbusters, Revisited
§ Let’s say we have two distributions:
§ Prior distribution over ghost location: P(G)
§ Let’s say this is uniform
§ Sensor reading model: P(R | G)
§ Given: we know what our sensors do
§ R = reading color measured at (1,1)
§ E.g. P(R = yellow | G=(1,1)) = 0.1

§ We can calculate the posterior distribution P(G|r) over ghost locations
given a reading using Bayes’ rule:
[Demo: Ghostbuster – with probability (L12D2) ]

What’s Our Probabilistic Model?


§ Random Variables
  § Location of Ghost. Values = {L1,1, L1,2, …, L6,10}
  § Sensor value at locations S1,1, …, S6,10. Values = {R, O, Y, G}
§ Joint Distribution
  § Too big to write down: 60 · 4⁶⁰ ≈ 7.98 × 10³⁷ entries
  § Here’s a schema for a conditional distribution specifying part of it:

    P(red | 3)   P(orange | 3)   P(yellow | 3)   P(green | 3)
    0.05         0.15            0.50            0.30
    ...
    P(red | 0)   P(orange | 0)   P(yellow | 0)   P(green | 0)
    0.70         0.15            0.10            0.05

Model for a Tiny Ghostbuster
§ Random Variables
§ Location of Ghost, G. Values = {L1, L2}

§ Sensor value at locations S1, S2 with values {R, O, Y, G}


§ Joint Distribution: for each ghost location (G = L1, G = L2), a 4×4 table
  over sensor values S1 × S2, each in {R, O, Y, G}
  (Diagram: select the G = L1 table, then sum over S2)
§ Can marginalize to get P(S1 | distance = 0)

Video of Demo Ghostbusters with Probability

The Product Rule
§ Sometimes have conditional distributions but want the joint:

  P(x, y) = P(x | y) P(y)

The Chain Rule

§ More generally, can always write any joint distribution as an incremental
product of conditional distributions:

  P(x1, x2, …, xn) = ∏ᵢ P(xi | x1, …, xi−1)
Bayes’ Rule

§ Two ways to factor a joint distribution over two variables:

  P(x, y) = P(x | y) P(y) = P(y | x) P(x)

That’s my rule!

§ Dividing, we get:

  P(x | y) = P(y | x) P(x) / P(y)

§ Why is this at all helpful?
  § Lets us build one conditional from its reverse
  § Often one conditional is tricky but the other one is simple
  § Foundation of many systems we’ll see later (e.g. ASR, MT)

§ In the running for most important AI equation!
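A one-line sketch of building one conditional from its reverse (the rain/clouds numbers are made up for illustration):

```python
def bayes_rule(p_y_given_x, p_x, p_y):
    """P(x | y) = P(y | x) P(x) / P(y)."""
    return p_y_given_x * p_x / p_y

# Hypothetical numbers: suppose P(rain) = 0.4, P(clouds) = 0.6, and the
# "easy" direction P(clouds | rain) = 0.9 is known.
p_rain_given_clouds = bayes_rule(0.9, 0.4, 0.6)
print(p_rain_given_clouds)  # 0.9 * 0.4 / 0.6 ≈ 0.6
```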

Independence
§ Two variables are independent in a joint distribution if:

  P(X, Y) = P(X) P(Y), i.e. for all x, y: P(x, y) = P(x) P(y)
§ Says the joint distribution factors into a product of two simple ones
§ Usually variables aren’t independent!

§ Can use independence as a modeling assumption


§ Independence can be a simplifying assumption
§ Empirical joint distributions: at best “close” to independent
§ What could we assume for {Weather, Traffic, Cavity}?

§ Independence is like something from CSPs: what?

Independence

P(A∧B) = P(A)P(B)

(Venn diagram: regions A, B, and A∧B inside the sample space)

© Daniel S. Weld

Example: Independence
§ N fair, independent coin flips:

H 0.5 H 0.5 H 0.5


T 0.5 T 0.5 T 0.5

Example: Independence?

P(T):  hot 0.5, cold 0.5        P(W):  sun 0.6, rain 0.4

P1(T, W):             P2(T, W) = P(T) P(W):
T    W    P           T    W    P
hot  sun  0.4         hot  sun  0.3
hot  rain 0.1         hot  rain 0.2
cold sun  0.2         cold sun  0.3
cold rain 0.3         cold rain 0.2
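A brute-force check of the definition, assuming the dict joint representation used above (P1 is the original table, P2 the factored one; the `independent` helper is illustrative):

```python
from collections import defaultdict

def independent(joint, tol=1e-9):
    """True iff P(t, w) = P(t) P(w) for every assignment."""
    pt, pw = defaultdict(float), defaultdict(float)
    for (t, w), p in joint.items():   # compute both marginals
        pt[t] += p
        pw[w] += p
    return all(abs(p - pt[t] * pw[w]) <= tol for (t, w), p in joint.items())

p1 = {("hot", "sun"): 0.4, ("hot", "rain"): 0.1,
      ("cold", "sun"): 0.2, ("cold", "rain"): 0.3}
p2 = {("hot", "sun"): 0.3, ("hot", "rain"): 0.2,
      ("cold", "sun"): 0.3, ("cold", "rain"): 0.2}

print(independent(p1))  # False: 0.4 != 0.5 * 0.6
print(independent(p2))  # True: every entry factors as P(T) P(W)
```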

Conditional Independence

Conditional Independence
§ Unconditional (absolute) independence very rare

§ Conditional independence is our most basic and robust form of knowledge
about uncertain environments.

§ X is conditionally independent of Y given Z (written X ⊥ Y | Z)
  if and only if:  P(x, y | z) = P(x | z) P(y | z)  for all x, y, z
  or, equivalently, if and only if:  P(x | y, z) = P(x | z)
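The definition can be checked numerically. Below is a sketch on a small hypothetical joint over (X, Y, Z), built as a product P(Z) P(X|Z) P(Y|Z) so that conditional independence holds by construction (all names and numbers are my own illustration):

```python
from collections import defaultdict

def cond_independent(joint, tol=1e-9):
    """True iff P(x, y | z) = P(x | z) P(y | z) for all (x, y, z)."""
    pz, pxz, pyz = defaultdict(float), defaultdict(float), defaultdict(float)
    for (x, y, z), p in joint.items():
        pz[z] += p
        pxz[(x, z)] += p
        pyz[(y, z)] += p
    return all(
        abs(p / pz[z] - (pxz[(x, z)] / pz[z]) * (pyz[(y, z)] / pz[z])) <= tol
        for (x, y, z), p in joint.items()
    )

# Hypothetical P(Z), P(X|Z), P(Y|Z); the joint is their product.
p_z = {"z0": 0.5, "z1": 0.5}
p_x = {"z0": {"x0": 0.9, "x1": 0.1}, "z1": {"x0": 0.2, "x1": 0.8}}
p_y = {"z0": {"y0": 0.6, "y1": 0.4}, "z1": {"y0": 0.3, "y1": 0.7}}
joint = {(x, y, z): p_z[z] * p_x[z][x] * p_y[z][y]
         for z in p_z for x in p_x[z] for y in p_y[z]}

print(cond_independent(joint))  # True by construction
```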

Conditional Independence

Are A & B independent? Compare P(A|B) with P(A):

  P(A)   = (.25 + .5)/2 = .375
  P(B)   = .75
  P(A|B) = (.25 + .25 + .5)/3 = .3333

Since P(A|B) ≠ P(A), A and B are not independent.

A, B Conditionally Independent Given C

P(A|B,C) = P(A|C)      (C = striped)

  P(A|¬C)   = .5
  P(A|B,¬C) = .5

Conditional Independence
§ What about this domain:
§ Fire
§ Smoke
§ Alarm

Conditional Independence
§ What about this domain:
§ Traffic
§ Umbrella
§ Raining

(Diagram: R with arcs to U and T)

What is Conditional Independence?

“I am a BIG joint distribution!”   “Slay the Basilisk!”

harrypotter.wikia.com/

Probability Recap
§ Conditional probability:  P(x | y) = P(x, y) / P(y)

§ Product rule:  P(x, y) = P(x | y) P(y)

§ Chain rule:  P(x1, …, xn) = P(x1) P(x2 | x1) … = ∏ᵢ P(xi | x1, …, xi−1)

§ Bayes rule:  P(x | y) = P(y | x) P(x) / P(y)

§ X, Y independent if and only if:  ∀x, y: P(x, y) = P(x) P(y)

§ X and Y are conditionally independent given Z if and only if:
  ∀x, y, z: P(x, y | z) = P(x | z) P(y | z)

Markov Models

Reasoning over Time or Space

§ Often, we want to reason about a sequence of observations


§ Speech recognition
§ Robot localization
§ User attention
§ Medical monitoring

§ Need to introduce time (or space) into our models

Markov Models
§ Value of X at a given time is called the state

X1 → X2 → X3 → X4

§ Parameters: called transition probabilities or dynamics, specify how the
state evolves over time (also, initial state probabilities)
§ Stationarity assumption: transition probabilities the same at all times
  § Means P(X5 | X4) = P(X12 | X11) etc.
§ Same as MDP transition model, but no choice of action

Joint Distribution of a Markov Model

X1 → X2 → X3 → X4

§ Joint distribution:
  P(X1, X2, X3, X4) = P(X1) P(X2|X1) P(X3|X2) P(X4|X3)
§ More generally:
  P(X1, X2, …, XT) = P(X1) P(X2|X1) P(X3|X2) … P(XT|XT−1)
                   = P(X1) ∏_{t=2}^{T} P(Xt | Xt−1)
§ Questions to be resolved:
§ Does this indeed define a joint distribution?
§ Can every joint distribution be factored this way, or are we making some assumptions
about the joint distribution by using this factorization?
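The factored joint is cheap to evaluate. A sketch with hypothetical sun/rain dynamics (the transition numbers are illustrative, not from the slides):

```python
def markov_joint(initial, transition, states):
    """P(x1, ..., xT) = P(x1) * product over t of P(x_t | x_{t-1})."""
    p = initial[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= transition[prev][cur]
    return p

initial = {"sun": 0.5, "rain": 0.5}                # P(X1)
transition = {"sun":  {"sun": 0.9, "rain": 0.1},   # P(X_t | X_{t-1}),
              "rain": {"sun": 0.3, "rain": 0.7}}   # same at all times

print(markov_joint(initial, transition, ["sun", "sun", "rain"]))
# 0.5 * 0.9 * 0.1 ≈ 0.045
```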

Chain Rule and Markov Models

X1 → X2 → X3 → X4

§ From the chain rule, every joint distribution over X1, X2, X3, X4 can be
written as:

  P(X1, X2, X3, X4) = P(X1) P(X2|X1) P(X3|X1, X2) P(X4|X1, X2, X3)

§ And, if we assume that

  X3 ⊥ X1 | X2   and   X4 ⊥ X1, X2 | X3

this formula simplifies to

  P(X1, X2, X3, X4) = P(X1) P(X2|X1) P(X3|X2) P(X4|X3)

Chain Rule and Markov Models

X1 → X2 → X3 → X4

§ From the chain rule, every joint distribution over X1, X2, …, XT can be
written as:

  P(X1, X2, …, XT) = P(X1) ∏_{t=2}^{T} P(Xt | X1, X2, …, Xt−1)

§ So, if we assume that for all t:

  Xt ⊥ X1, …, Xt−2 | Xt−1

we get

  P(X1, X2, …, XT) = P(X1) ∏_{t=2}^{T} P(Xt | Xt−1)
