Dealing With Uncertainty P(X|E): Probability Theory, the Foundation of Statistics
History
Concerns
Future: what is the likelihood that a student will get a CS job given his grades?
Current: what is the likelihood that a person has cancer given his symptoms?
Past: what is the likelihood that Marilyn Monroe committed suicide?
Combining evidence.
Always: Representation & Inference
Basic Idea
Attach degrees of belief to propositions.
Theorem: probability theory is the best way to do this.
If someone assigns degrees of belief differently, you can play a betting game against him and win his money.
Probability Models: Basic Questions
What are they?
Analogous to constraint models, with probabilities on each table entry.
Random Variable
Intuition: a variable whose values belong to a known set of values, the domain.
Math: a non-negative function on a domain (called the sample space) whose values sum to 1.
Boolean RV: John has a cavity.
cavity domain = {true, false}
Cross-Product RV
If X is an RV with values x1,...,xn and Y is an RV with values y1,...,ym, then
Z = X × Y is an RV with n*m values <x1,y1>, ..., <xn,ym>.
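A minimal sketch of this construction in Python (the variable names and domains are illustrative):

from itertools import product

# Domains of the two random variables
X_vals = ["x1", "x2", "x3"]          # n = 3
Y_vals = ["y1", "y2"]                # m = 2

# Z = X x Y has one value per pair <x, y>: n*m values in total
Z_vals = list(product(X_vals, Y_vals))
print(len(Z_vals))   # 6
print(Z_vals)        # [('x1', 'y1'), ('x1', 'y2'), ..., ('x3', 'y2')]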
p = .6 (exact Binomial(10, .6) probabilities vs. empirical frequencies from 10, 100, and 1000 samples):

k       0      1     2     3     4     5     6     7     8     9     10
Exact   .0001  .001  .010  .042  .111  .200  .250  .214  .120  .040  .005
10      .0     .0    .0    .0    .2    .1    .6    .1    .0    .0    .0
100     .0     .0    .01   .04   .05   .24   .22   .16   .18   .09   .01
1000    .0     .002  .011  .042  .117  .200  .246  .231  .108  .035  .008
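This table (and the p = .5 table that follows) can be reproduced by simulation; a sketch in Python (the seed is arbitrary, so the frequencies will not match the table digit for digit):

import random
from collections import Counter

def sample_binomial(n, p):
    # One draw: count successes in n independent flips of a p-coin
    return sum(random.random() < p for _ in range(n))

random.seed(0)
for num_samples in (10, 100, 1000):
    counts = Counter(sample_binomial(10, 0.6) for _ in range(num_samples))
    print(num_samples, [round(counts[k] / num_samples, 3) for k in range(11)])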
p = .5 (exact Binomial(10, .5) probabilities vs. empirical frequencies from 10, 100, and 1000 samples):

k       0      1     2     3     4     5     6     7     8     9     10
Exact   .0009  .009  .043  .117  .205  .246  .205  .117  .043  .009  .0009
10      .0     .0    .0    .1    .2    .0    .3    .3    .1    .0    .0
100     .0     .01   .07   .13   .24   .28   .15   .08   .04   .0    .0
1000    .002   .011  .044  .101  .231  .218  .224  .118  .046  .009  .001
Mixture Model
[Figure: histogram over the values 0-10 for a mixture distribution]
Continuous Probability
If RV X has values in R, then a prob distribution for X is a non-negative real-valued function p such that the integral of p over R is 1 (called a probability density function).
Standard distributions are uniform, normal (Gaussian), Poisson, etc.
May resort to an empirical distribution if we can't compute analytically, i.e., use a histogram.
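A sketch of that empirical fallback: approximate a density with a normalized histogram (the data source, range, and bin count here are arbitrary choices):

import random

random.seed(0)
# Pretend these draws come from a distribution we cannot handle analytically
data = [random.gauss(0.0, 1.0) for _ in range(10000)]

lo, hi, bins = -4.0, 4.0, 40
width = (hi - lo) / bins
counts = [0] * bins
for x in data:
    if lo <= x < hi:
        counts[int((x - lo) / width)] += 1
# Normalize so the bar areas sum to 1; density[i] approximates p at bin i
density = [c / (len(data) * width) for c in counts]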
Marginalization
Given the joint probability for X and Y, you can compute everything.
Joint probability to individual probabilities:
P(X=x) = sum over all y of P(X=x, Y=y)
Conditioning is similar:
P(X=x) = sum over all y of P(X=x | Y=y) * P(Y=y)
Marginalization Example
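The original example figure is not reproduced here; as a stand-in, a minimal sketch that marginalizes a hypothetical joint table:

# Hypothetical joint distribution P(X, Y) as a dict over pairs
joint = {
    ("x0", "y0"): 0.14, ("x0", "y1"): 0.56,
    ("x1", "y0"): 0.27, ("x1", "y1"): 0.03,
}

# P(X = x) = sum over all y of P(X = x, Y = y)
def marginal_X(joint):
    p = {}
    for (x, y), prob in joint.items():
        p[x] = p.get(x, 0.0) + prob
    return p

print(marginal_X(joint))   # {'x0': 0.7, 'x1': 0.3}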
Conditional Probability
P(X=x | Y=y) = P(X=x, Y=y)/P(Y=y).
Intuition: use simple examples.
A 1-card hand: X = the card's value, Y = the card's suit.
P(X=ace | Y=heart) = 1/13
Also P(X=ace, Y=heart) = 1/52 and P(Y=heart) = 1/4,
so P(X=ace, Y=heart) / P(Y=heart) = (1/52) / (1/4) = 1/13.
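These card probabilities are easy to check by brute-force enumeration; a small sketch:

from itertools import product

values = ["ace"] + [str(n) for n in range(2, 11)] + ["jack", "queen", "king"]
suits = ["heart", "diamond", "club", "spade"]
deck = list(product(values, suits))   # 52 equally likely cards

p_ace_and_heart = sum(1 for v, s in deck if v == "ace" and s == "heart") / 52
p_heart = sum(1 for v, s in deck if s == "heart") / 52
print(p_ace_and_heart / p_heart)      # 0.0769... = 1/13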
Formula
Shorthand: P(X|Y) = P(X,Y)/P(Y).
Product Rule: P(X,Y) = P(X|Y) * P(Y)
Bayes Rule: P(X|Y) = P(Y|X) * P(X) / P(Y)
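Bayes Rule as a one-line helper; the example numbers are taken from the Conditional Example that follows (P(B=0|A=1) = .9, P(A=1) = .3, P(B=0) = .41):

def bayes(p_y_given_x, p_x, p_y):
    # P(X|Y) = P(Y|X) * P(X) / P(Y)
    return p_y_given_x * p_x / p_y

print(bayes(0.9, 0.3, 0.41))   # P(A=1 | B=0) = .27/.41, about .66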
Conditional Example
P(A=0) = .7, P(A=1) = .3

P(B|A):
        B=0   B=1
A=0     .2    .8
A=1     .9    .1

P(A,B) = P(B,A)
P(B,A) = P(B|A) * P(A)
P(A,B) = P(A|B) * P(B)
P(A|B) = P(B|A) * P(A) / P(B)

Joint P(A,B), exact vs. empirical frequencies from 10, 100, and 1000 samples:

          P(A,B)   10    100   1000
A=0,B=0   .14      .1    .18   .14
A=0,B=1   .56      .6    .55   .56
A=1,B=0   .27      .2    .24   .24
A=1,B=1   .03      .1    .03   .06

P(B=0) = P(B=0,A=0) + P(B=0,A=1) = .14 + .27 = .41
Simulation
Given the prob for A and the prob for B given A:
First, choose a value for A according to its prob.
Now use the conditional table to choose a value for B with the correct probability.
That constructs one world.
Repeat lots of times and count the number of times A=0 & B=0, A=0 & B=1, etc.
Turn counts into probabilities.
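A sketch of this simulation for the example above (the seed and number of repetitions are arbitrary):

import random
from collections import Counter

P_A = {0: 0.7, 1: 0.3}
P_B_given_A = {0: {0: 0.2, 1: 0.8},   # row P(B | A=0)
               1: {0: 0.9, 1: 0.1}}   # row P(B | A=1)

def sample(dist):
    # Draw one value from a {value: probability} table
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value   # guard against floating-point round-off

random.seed(0)
counts = Counter()
for _ in range(1000):
    a = sample(P_A)               # choose a value for A from its prob
    b = sample(P_B_given_A[a])    # then choose B from P(B | A=a)
    counts[(a, b)] += 1           # one constructed world

# Turn counts into probabilities
print({world: n / 1000 for world, n in counts.items()})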
P(X1,X2,X3) = P(X1) * P(X2|X1) * P(X3|X1,X2).
Note: these equations make no assumptions!
The last equation is called the Chain or Product Rule.
Can pick any ordering of the variables.
Markov Models
MM1: depends only on the previous time step.
P(X1,...,Xn) = P(X1) * P(X2|X1) * ... * P(Xn|Xn-1).
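A sketch of sampling an MM1 chain under this factorization (the states and transition table are illustrative):

import random

P_X1 = {"sunny": 0.6, "rainy": 0.4}               # P(X1)
P_next = {"sunny": {"sunny": 0.8, "rainy": 0.2},  # P(Xt | Xt-1 = sunny)
          "rainy": {"sunny": 0.4, "rainy": 0.6}}  # P(Xt | Xt-1 = rainy)

def sample(dist):
    # Draw one value from a {value: probability} table
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r < acc:
            return value
    return value

random.seed(0)
chain = [sample(P_X1)]
for _ in range(9):
    chain.append(sample(P_next[chain[-1]]))   # depends only on the previous time
print(chain)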