Eisner-Probability How To Use Prob
Goals of this lecture
[Diagram: “Paul Revere” → probability model → 0.9]
• Past performance?
– Revere’s won 90% of races with clear weather
• Hypothetical performance?
– If he ran the race in many parallel universes …
• Subjective strength of belief?
– Would pay up to 90 cents for chance to win $1
• Output of some computable formula?
– Ok, but then which formulas should we trust?
p(X | Y) versus q(X | Y)
600.465 – Intro to NLP – J. Eisner 5
p is a function on event sets
p(win | clear) ≡ p(win, clear) / p(clear)
[Venn diagram: within All Events (races), the sets “weather’s clear” and “Paul Revere wins” overlap]
p measures total probability of a set of events.
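The definition of p(win | clear) can be sketched in code by treating p as a function on sets of events. This is a minimal illustration, assuming a made-up event space of 12 equally likely races; the data and the names `all_events`, `clear`, and `win` are hypothetical, chosen so that the result matches the 0.9 used earlier.

```python
from fractions import Fraction

# Hypothetical event space: 12 races, each given as (weather_clear, revere_wins).
outcomes = ([(True, True)] * 9 + [(True, False)] * 1
            + [(False, True)] * 1 + [(False, False)] * 1)

# Events are race indices; event sets are Python sets of indices.
all_events = set(range(len(outcomes)))
clear = {i for i in all_events if outcomes[i][0]}
win = {i for i in all_events if outcomes[i][1]}

def p(event_set):
    """Total probability of a set of events (equal weight per race is assumed)."""
    return Fraction(len(event_set), len(all_events))

# p(win | clear) = p(win, clear) / p(clear); the comma is set intersection.
p_win_given_clear = p(win & clear) / p(clear)
print(p_win_given_clear)
```

Exact fractions are used so that the ratio is not blurred by floating-point rounding.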
Required Properties of p (axioms)
[Venn diagram: within All Events (races), the sets “weather’s clear” and “Paul Revere wins” overlap]
p measures total probability of a set of events.
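For reference, the standard requirements on a function p that measures the total probability of a set of events can be written as follows. This is a sketch in conventional notation; the symbol E for the set of all events and the names X and Y for arbitrary event sets are assumptions introduced here.

```latex
\begin{align*}
p(\emptyset) &= 0, \qquad p(E) = 1 \\
p(X) &\le p(Y) \quad \text{for any } X \subseteq Y \\
p(X \cup Y) &= p(X) + p(Y) \quad \text{provided } X \cap Y = \emptyset
\end{align*}
```

For example, the last property gives p(win, clear) + p(win, not clear) = p(win), since the two sets are disjoint and their union is the set of races that Paul Revere wins.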
Commas denote conjunction
p(Paul Revere wins, Valentine places, Epitaph shows | weather’s clear)
What happens as we add conjuncts to the left of the bar?
• probability can only decrease
• numerator of historical estimate likely to go to zero:
  (# times Revere wins AND Val places… AND weather’s clear) / (# times weather’s clear)
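The effect of adding conjuncts to the left of the bar can be seen with a small count-based estimate. This is only a sketch: the race records are invented for illustration, “Horse D” is a placeholder name, and the helper `estimate` is hypothetical.

```python
# Hypothetical race records: (weather_clear, winner, place, show).
records = [
    (True, "Paul Revere", "Valentine", "Epitaph"),
    (True, "Paul Revere", "Valentine", "Horse D"),
    (True, "Paul Revere", "Epitaph", "Valentine"),
    (True, "Valentine", "Paul Revere", "Epitaph"),
    (False, "Paul Revere", "Valentine", "Epitaph"),
]

def estimate(conjuncts, given, data):
    """Historical estimate: (# times all conjuncts AND given) / (# times given)."""
    denom = [r for r in data if given(r)]
    numer = [r for r in denom if all(c(r) for c in conjuncts)]
    return len(numer) / len(denom)

def clear(r): return r[0]
def wins(r): return r[1] == "Paul Revere"
def places(r): return r[2] == "Valentine"
def shows(r): return r[3] == "Epitaph"

e1 = estimate([wins], clear, records)
e2 = estimate([wins, places], clear, records)
e3 = estimate([wins, places, shows], clear, records)
print(e1, e2, e3)  # each added conjunct can only shrink the numerator
```

The denominator stays fixed at the number of clear-weather races, while the numerator counts an ever more specific outcome, so with realistic amounts of data it quickly reaches zero.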
[Diagram: Trigram Model (defined in terms of parameters like t_{h,o,r} and t_{o,r,s}). English param values → definition of p → compute p(X). Polish param values → definition of q → compute q(X). p(X) and q(X) are compared.]
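The idea of one trigram model with two sets of parameter values, one defining p and one defining q, can be sketched as below. Everything numeric here is invented for illustration. The sketch also assumes that a parameter t_{a,b,c} is the probability of letter c following letters a, b, that sequences are padded with a start symbol "^", and that unseen trigrams get a small floor value; none of these details are given above.

```python
import math

def trigram_logprob(text, params, floor=1e-6):
    """Log-probability of a letter sequence under a trigram model.

    params maps (a, b, c) to the assumed parameter t_{a,b,c}.
    Trigrams missing from params get the (assumed) floor probability.
    """
    padded = "^^" + text  # hypothetical start-of-sequence padding
    total = 0.0
    for i in range(2, len(padded)):
        key = (padded[i - 2], padded[i - 1], padded[i])
        total += math.log(params.get(key, floor))
    return total

# Made-up parameter values: same model form, two different settings.
english_params = {("^", "^", "h"): 0.05, ("^", "h", "o"): 0.2,
                  ("h", "o", "r"): 0.3, ("o", "r", "s"): 0.25}
polish_params = {("^", "^", "h"): 0.01, ("^", "h", "o"): 0.1,
                 ("h", "o", "r"): 0.05}

X = "hors"
log_p = trigram_logprob(X, english_params)  # log p(X)
log_q = trigram_logprob(X, polish_params)   # log q(X)
guess = "English" if log_p > log_q else "Polish"
print(guess)
```

Log-probabilities are summed rather than multiplying raw probabilities, which avoids underflow on long sequences.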
What is “X” in p(X)?
• Element of some implicit “event space”
• e.g., race
• e.g., sentence
• What if event is a whole text?
• p(text)
= p(sentence 1, sentence 2, …)
= p(sentence 1)
* p(sentence 2 | sentence 1)
* …
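The factorization of p(text) above is an instance of the chain rule. Written out in general form, with s_i standing for sentence i and n for the number of sentences (both symbols are assumptions introduced here):

```latex
p(s_1, s_2, \ldots, s_n) \;=\; \prod_{i=1}^{n} p(s_i \mid s_1, \ldots, s_{i-1})
```

The first factor, with i = 1, has nothing to the right of the bar and is simply p(s_1).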
What is “X” in “p(X)”?
• Element of some implicit “event space”
• e.g., race, sentence, text …
• Suppose an event is a sequence of letters:
p(horses)
• p(weather’s clear)
• Event is a race
• weather’s clear is true or false of the event
• So p(weather’s clear)
= p(weather’s clear(Event)=true)
picks out the set of events with clear weather
p(win | clear) ≡ p(win, clear) / p(clear)
[Venn diagram: within All Events (races), the sets “weather’s clear” and “Paul Revere wins” overlap]
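Reading “weather’s clear” as a function that is true or false of each event can be made concrete as follows. This is a minimal sketch with an invented four-race event space; the function names `weathers_clear` and `paul_revere_wins` are hypothetical, and equal weight per race is assumed.

```python
from fractions import Fraction

# Hypothetical event space: each event is one race.
races = [
    {"weather": "clear", "winner": "Paul Revere"},
    {"weather": "clear", "winner": "Paul Revere"},
    {"weather": "clear", "winner": "Valentine"},
    {"weather": "rain", "winner": "Paul Revere"},
]

def weathers_clear(event):
    """True or false of the event, like weather's clear(Event)."""
    return event["weather"] == "clear"

def paul_revere_wins(event):
    return event["winner"] == "Paul Revere"

def p(*predicates, events=races):
    """Probability of the set of events that every predicate picks out."""
    picked = [e for e in events if all(pred(e) for pred in predicates)]
    return Fraction(len(picked), len(events))

p_clear = p(weathers_clear)
p_win_given_clear = p(paul_revere_wins, weathers_clear) / p_clear
print(p_clear, p_win_given_clear)
```

Passing several predicates to `p` plays the role of the comma in p(win, clear): only events on which all of them are true are counted.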
Random Variables:
What is “variable” in “p(variable=value)”?
p(W1 = horses)
  where word vector W is a function of the event (the sentence), just as character vector X is.
= p(Wi = horses | i=1)
≈ p(Wi = horses) = 7.2e-5
  The independence assumption says that sentence-initial words w1 are just like all other words wi (this gives us more data to use).
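The two estimates being compared, one from sentence-initial words only and one pooled over all word positions, can be sketched on a toy corpus. The corpus and the helper `W` are hypothetical; the numbers it produces are unrelated to the 7.2e-5 figure above.

```python
# Hypothetical toy corpus: each event is a sentence, given as a list of words.
corpus = [
    ["horses", "run", "fast"],
    ["the", "horses", "eat", "hay"],
    ["the", "race", "starts"],
    ["wild", "horses", "roam"],
]

def W(sentence, i):
    """Random variable W_i: the i-th word (1-based) of the sentence event."""
    return sentence[i - 1]

# Direct estimate of p(W1 = horses): uses only sentence-initial words.
first_words = [W(s, 1) for s in corpus]
p_initial = first_words.count("horses") / len(first_words)

# Pooled estimate of p(Wi = horses): uses every word position.
all_words = [w for s in corpus for w in s]
p_pooled = all_words.count("horses") / len(all_words)

print(len(first_words), len(all_words))  # the pooled estimate draws on more data
```

The pooled estimate is only as good as the independence assumption: if sentence-initial words behave differently from other words, it trades bias for lower variance.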