1. Probability and Inference
What is Computational Statistics, anyway?
http://wpressutexas.net/forum
• Last year, tried Wiki format
– didn’t work very well: people reluctant to initiate new pages
• This year, we’ll try Forum format
– you should register using same email as on sign-up sheet
– start threads or add comments under Lecture Slides or Other Topics
– add comments to Course Administration topics
What should you learn in this course?
• A lot of conventional statistics at a 1st year graduate level
– mostly by practical example, not proving theorems
– but you should also learn to read the statistics and/or machine learning
and/or pattern recognition textbook literature
• A lot about real, as opposed to idealized, data sets
– we’ll supply and discuss some
– you can also use and/or share your own
• A bunch of important computational algorithms
– often stochastic
• Some bioinformatics, especially genomics
– although that is not the main point of the course
• Some programming methodology
– e.g., data parallel methods, notated in MATLAB but more general in
concept
– A computer with either MATLAB or Octave (free) is required.
Laws of Probability
“There is this thing called probability. It obeys the laws of an
axiomatic system. When identified with the real world, it gives
(partial) information about the future.”
• What axiomatic system?
• How to identify it with the real world?
– Bayesian or frequentist viewpoints are somewhat different
“mappings” from axiomatic probability theory to the real world
– yet both are useful
Axioms:
I. P(A) ≥ 0 for an event A
II. P(Ω) = 1, where Ω is the set of all possible outcomes
III. If A ∩ B = ∅, then P(A ∪ B) = P(A) + P(B)
Example of a theorem:
Theorem: P(∅) = 0
Proof: A ∩ ∅ = ∅ and A ∪ ∅ = A, so
P(A) = P(A ∪ ∅) = P(A) + P(∅), q.e.d.
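A quick numerical illustration (a MATLAB/Octave sketch, not part of the axiomatic development; the die and the two events are arbitrary choices): for disjoint events, estimated frequencies should add, as Axiom III requires.

% Illustrative check of Axiom III with a fair die:
% A = {1,2} and B = {5,6} are disjoint, so P(A or B) = P(A) + P(B) = 4/6.
N = 1e6;                  % number of simulated rolls
r = ceil(6*rand(N,1));    % uniform draws from {1,...,6}
inA = (r <= 2);           % trials where A occurred
inB = (r >= 5);           % trials where B occurred (disjoint from A)
fprintf('P(A)+P(B) ~ %.4f   P(A or B) ~ %.4f\n', ...
        mean(inA) + mean(inB), mean(inA | inB));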
Additivity or “Law of Or-ing”
P(A ∪ B) = P(A) + P(B) − P(AB)
“Law of Exhaustion”
If the Hi are mutually exclusive and exhaustive, then Σi P(Hi) = 1.
Multiplicative Rule or “Law of And-ing”
P(AB) = P(A)P(B|A) = P(B)P(A|B)      (read “|” as “given”)
P(B|A) = P(AB) / P(A)
the “conditional probability” of B given A: renormalize the outcome space to the cases where A occurs
Similarly, for multiple And-ing:
P(ABC) = P(A)P(B|A)P(C|AB)
Independence:
Events A and B are independent if
P(A|B) = P(A)
so P(AB) = P(B)P(A|B) = P(A)P(B)
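Both ideas in a small MATLAB/Octave sketch (the dice events are arbitrary choices for illustration): conditioning renormalizes the outcome space to the trials where A occurred, and independent events factor.

% Conditional probability as renormalization, with two fair dice:
N = 1e6;
d1 = ceil(6*rand(N,1));            % first die
d2 = ceil(6*rand(N,1));            % second die
A = (d1 == 6);                     % A: first die shows 6
B = (d1 + d2 >= 10);               % B: the sum is at least 10
PBgivenA = mean(A & B) / mean(A);  % P(AB)/P(A); exact value is 1/2
fprintf('P(B|A) ~ %.3f   P(B) ~ %.3f\n', PBgivenA, mean(B));
% Independence: the parity of the second die is independent of A,
% so P(AC) factors into P(A)P(C).
C = (mod(d2,2) == 1);              % C: second die is odd
fprintf('P(AC) ~ %.4f   P(A)P(C) ~ %.4f\n', mean(A & C), mean(A)*mean(C));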
A symmetric die has
P(1) = P(2) = . . . = P(6) = 1/6
Why? Because Σi P(i) = 1 and P(i) = P(j) for all i, j.
Not because of “frequency of occurrence in N trials”.
That comes later!
Law of Total Probability or “Law of de-Anding”
P(B) = P(BH1) + P(BH2) + . . . = Σi P(BHi)
P(B) = Σi P(B|Hi)P(Hi)
“How to put Humpty-Dumpty back together again.”
Example: A barrel has 3 minnows and 2 trout, with
equal probability of being caught. Minnows must
be thrown back. Trout we keep.
What is the probability that the 2nd fish caught is a
trout?
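The answer, worked by the Law of de-Anding (a MATLAB/Octave sketch of the arithmetic; here H1 = “first catch is a trout” and H2 = “first catch is a minnow”):

% P(2nd is trout) = P(T|H1)P(H1) + P(T|H2)P(H2)
pH1   = 2/5;    % 1st catch is a trout; it is kept, leaving 3 minnows, 1 trout
pT_H1 = 1/4;    % ...then the 2nd catch is the one remaining trout
pH2   = 3/5;    % 1st catch is a minnow; it goes back, leaving 3 minnows, 2 trout
pT_H2 = 2/5;    % ...then the 2nd catch is a trout
pT2 = pH1*pT_H1 + pH2*pT_H2    % = 1/10 + 6/25 = 17/50 = 0.34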
Bayes Theorem
Thomas Bayes (1702 - 1761)
P(Hi|B) = P(Hi B) / P(B)                         (Law of And-ing)
        = P(B|Hi)P(Hi) / Σj P(B|Hj)P(Hj)         (Law of de-Anding)
We usually write this as
P(Hi|B) ∝ P(B|Hi)P(Hi)
Let’s work a couple of examples using Bayes Law:
P(Hi|T) ∝ P(T|Hi)P(Hi)
so,
P(H1|T) = (2/5 · 1/5) / (2/5 · 1/5 + 1/5 · 1/5 + 0 · 3/5) = 2/3
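The same update as one normalized vector operation (a MATLAB/Octave sketch; the priors and likelihoods are the numbers in the computation above):

% posterior = likelihood .* prior, then normalize
prior = [1/5, 1/5, 3/5];                      % P(H_i)
like  = [2/5, 1/5, 0];                        % P(T|H_i)
post  = like .* prior / sum(like .* prior)    % = [2/3, 1/3, 0]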
Example: The Monty Hall or Let’s Make a Deal Problem
• Three doors
• Car (prize) behind one door
• You pick a door, but don’t open it yet
• Monty then opens one of the other doors, always revealing no
car (he knows where it is)
• You now get to switch doors if you want
• Should you?
• Most people reason: Two remaining doors were equiprobable
before, and nothing has changed. So doesn’t matter whether
you switch or not.
• Marilyn vos Savant (“highest IQ person in the world”) famously
thought otherwise (Parade magazine, 1990)
• No one seems to care what Monty Hall thought!
Hi = car behind door i, i = 1, 2, 3
Wlog, you pick door 2 (relabeling).
Wlog, Monty opens door 3 (relabeling).
P(Hi|O3) ∝ P(O3|Hi)P(Hi)
P(H1|O3) ∝ 1 · 1/3 = 2/6
P(H2|O3) ∝ 1/2 · 1/3 = 1/6     (the 1/2 encodes our ignorance of Monty’s preference between doors 1 and 3)
P(H3|O3) ∝ 0 · 1/3 = 0
Normalizing, P(H1|O3) = 2/3 and P(H2|O3) = 1/3, so switching doubles your chance of winning.
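A Monte Carlo check (a MATLAB/Octave sketch; as above, when Monty has a choice he is assumed to open either door with probability 1/2):

% Simulate the game: switching should win about 2/3 of the time.
N = 1e5;
wins_stay = 0;  wins_switch = 0;
for k = 1:N
    car  = ceil(3*rand);      % car placed uniformly at random
    pick = ceil(3*rand);      % your initial pick
    doors = 1:3;
    % Monty opens a door that is neither your pick nor the car:
    openable = doors(doors ~= pick & doors ~= car);
    monty = openable(ceil(numel(openable)*rand));
    % The one remaining closed door you could switch to:
    other = doors(doors ~= pick & doors ~= monty);
    wins_stay   = wins_stay   + (pick  == car);
    wins_switch = wins_switch + (other == car);
end
fprintf('stay: %.3f   switch: %.3f\n', wins_stay/N, wins_switch/N);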
Exegesis on Monty Hall
Bayesian viewpoint:
Probabilities are modified by data. This
makes them intrinsically subjective,
because different observers have
access to different amounts of data
(including their “background information”
or “background knowledge”).
Commutativity/Associativity of Evidence
P(Hi|D1 D2) desired
We see D1:
P(Hi|D1) ∝ P(D1|Hi)P(Hi)
Then, we see D2 (the old posterior is now the prior!):
P(Hi|D1 D2) ∝ P(D2|Hi D1)P(Hi|D1)
But,
P(Hi|D1 D2) ∝ P(D2|Hi D1)P(D1|Hi)P(Hi) = P(D1 D2|Hi)P(Hi)
This expression is symmetrical in D1 and D2, so we get the same answer regardless of the order in which we see the data.
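A tiny MATLAB/Octave demonstration with hypothetical numbers (it assumes, for simplicity, that the data are conditionally independent given Hi, so P(D2|Hi D1) = P(D2|Hi)):

% Updating on D1 then D2 gives the same posterior as D2 then D1.
prior = [0.5, 0.3, 0.2];                  % P(H_i), hypothetical
L1 = [0.8, 0.4, 0.1];                     % P(D1|H_i), hypothetical
L2 = [0.3, 0.6, 0.9];                     % P(D2|H_i), hypothetical
bayes = @(p, L) p .* L / sum(p .* L);     % one Bayes update, normalized
post12 = bayes(bayes(prior, L1), L2);
post21 = bayes(bayes(prior, L2), L1);
disp([post12; post21])                    % the two rows agree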
Bayes Law is a “calculus of inference”, often better (and
certainly more self-consistent) than folk wisdom.
“All crows are black” ⇔ “All non-black things are non-crows”
I.J. Good, “The White Shoe is a Red Herring” (1966)