ML 05
Tom M. Mitchell
Evaluating Hypotheses
Sample error, true error
Confidence intervals for observed hypothesis error
Estimators
Binomial distribution, Normal distribution, Central Limit Theorem
Paired t tests
Comparing learning methods
Two Definitions of Error
The true error of hypothesis h with respect to target function f and distribution D is the probability that h will misclassify an instance drawn at random according to D:
    errorD(h) ≡ Pr over x drawn from D that f(x) ≠ h(x)
The sample error of h with respect to target function f and data sample S is the fraction of the n examples in S that h misclassifies:
    errorS(h) ≡ (1/n) Σ over x in S of δ(f(x) ≠ h(x))
where δ(f(x) ≠ h(x)) is 1 if f(x) ≠ h(x), and 0 otherwise.
Example
Hypothesis h misclassifies 12 of the 40 examples in S:
errorS(h) = 12 / 40 = .30
What is errorD(h)?
Estimators
Experiment:
1. choose sample S of size n according to distribution D
2. measure errorS(h)
errorS(h) is a random variable (i.e., the result of an experiment)
errorS(h) is an unbiased estimator for errorD(h)
Given observed errorS(h), what can we conclude about errorD(h)?
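The experiment above can be simulated. A minimal sketch, assuming a synthetic distribution in which a hypothetical hypothesis h truly misclassifies 30% of instances (TRUE_ERROR, error_S, and the sample size are illustration choices, not from the slides):

```python
import random

# Assumed setup: drawing an instance from D either is misclassified by h
# (probability TRUE_ERROR = errorD(h)) or classified correctly.
TRUE_ERROR = 0.30

def error_S(n, rng):
    """Measure the sample error of h on one random sample S of size n."""
    misclassified = sum(1 for _ in range(n) if rng.random() < TRUE_ERROR)
    return misclassified / n

rng = random.Random(0)
# errorS(h) varies from sample to sample (it is a random variable), but
# its average over many independent samples converges to errorD(h),
# which is what "unbiased estimator" means.
estimates = [error_S(40, rng) for _ in range(10_000)]
mean_estimate = sum(estimates) / len(estimates)
print(f"mean of errorS over 10000 runs: {mean_estimate:.3f}")
```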
Confidence Intervals
If
– S contains n examples, drawn independently of h and of each other
– n ≥ 30
Then, with approximately N% probability, errorD(h) lies in the interval
    errorS(h) ± zN sqrt( errorS(h)(1 - errorS(h)) / n )
where
N%   50%   68%   80%   90%   95%   98%   99%
zN   0.67  1.00  1.28  1.64  1.96  2.33  2.58
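The interval can be computed directly. A minimal sketch using the zN table from the slide and the earlier example numbers (12 errors out of n = 40, so errorS(h) = 0.30):

```python
import math

# zN table from the slide: confidence level (%) -> zN.
Z = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def confidence_interval(error_s, n, level=95):
    """N% interval: errorS(h) +- zN * sqrt(errorS(h)(1 - errorS(h)) / n)."""
    half_width = Z[level] * math.sqrt(error_s * (1 - error_s) / n)
    return error_s - half_width, error_s + half_width

lo, hi = confidence_interval(0.30, 40, level=95)
print(f"95% CI: ({lo:.3f}, {hi:.3f})")  # → 95% CI: (0.158, 0.442)
```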
errorS(h) is a Random Variable
Rerun the experiment with different randomly drawn S (of size n).
Probability of observing r misclassified examples:
    P(r) = (n choose r) errorD(h)^r (1 - errorD(h))^(n - r)
Binomial Probability Distribution
For X binomially distributed with parameters n and p (e.g., X = number of misclassified examples, with p = errorD(h)):
Expected, or mean value of X, E[X], is E[X] = np
Variance of X is Var(X) = np(1 - p)
Standard deviation of X, σX, is σX = sqrt( np(1 - p) )
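A minimal sketch of these Binomial formulas, with p standing in for errorD(h) and the example's n = 40 (the pmf should also sum to 1 over r = 0..n):

```python
import math

def binom_pmf(r, n, p):
    """P(r) = C(n, r) * p^r * (1-p)^(n-r): probability of exactly r misclassifications."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 40, 0.30
mean = n * p                   # E[X] = np
variance = n * p * (1 - p)     # Var(X) = np(1 - p)
std_dev = math.sqrt(variance)  # sigma_X = sqrt(np(1 - p))
total = sum(binom_pmf(r, n, p) for r in range(n + 1))  # should be ~1
```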
Normal Distribution Approximates Binomial
errorS(h) follows a Binomial distribution, with mean μ = errorD(h) and standard deviation
    σ = sqrt( errorD(h)(1 - errorD(h)) / n )
For sufficiently large n, this is closely approximated by a Normal distribution with the same mean and standard deviation.
Normal Probability Distribution (1/2)
    p(x) = (1 / sqrt(2π σ^2)) e^( -(x - μ)^2 / (2σ^2) )
The probability that X will fall into the interval (a, b) is given by
    ∫ from a to b of p(x) dx
Expected, or mean value of X, E[X], is E[X] = μ
Variance of X is Var(X) = σ^2
Standard deviation of X, σX, is σX = σ
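These quantities can be evaluated with the standard library alone: a sketch using Φ(x) = (1 + erf(x/√2))/2 for the Normal CDF, so the interval probability is Φ(b) − Φ(a):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density p(x) = (1 / sqrt(2*pi*sigma^2)) * exp(-(x - mu)^2 / (2*sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) via the error function (math.erf is stdlib)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b) = integral of p(x) dx from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(prob_between(-1.96, 1.96))  # ~0.95, matching the zN table
```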
Normal Probability Distribution (2/2)
N% of area (probability) lies in μ ± zN σ
N%   50%   68%   80%   90%   95%   98%   99%
zN   0.67  1.00  1.28  1.64  1.96  2.33  2.58
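The table can be checked numerically: for a Normal distribution, the mass within μ ± zσ is erf(z/√2). A short sketch verifying each entry:

```python
import math

# zN table from the slide: confidence level (%) -> zN.
table = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

for pct, z in table.items():
    mass = math.erf(z / math.sqrt(2))  # P(|X - mu| <= z * sigma)
    print(f"zN = {z:.2f} covers {100 * mass:.1f}% of the mass (table says {pct}%)")
```

Each computed mass agrees with the tabulated level to within the rounding of the two-digit zN values.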
Confidence Intervals, More Correctly
If
– S contains n examples, drawn independently of h and of each other
– n ≥ 30
Then, with approximately 95% probability, errorS(h) lies in the interval
    errorD(h) ± 1.96 sqrt( errorD(h)(1 - errorD(h)) / n )
Equivalently, errorD(h) lies in the interval
    errorS(h) ± 1.96 sqrt( errorD(h)(1 - errorD(h)) / n )
which is approximately
    errorS(h) ± 1.96 sqrt( errorS(h)(1 - errorS(h)) / n )
Central Limit Theorem
Consider a set of independent, identically distributed random variables Y1 . . . Yn, all governed by an arbitrary probability distribution with mean μ and finite variance σ^2. Define the sample mean
    Ȳ ≡ (1/n) Σ from i=1 to n of Yi
Central Limit Theorem: as n → ∞, the distribution governing Ȳ approaches a Normal distribution, with mean μ and variance σ^2 / n.
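The theorem can be illustrated by simulation. A sketch with a deliberately non-Normal choice for the Yi (uniform on [0, 1], so μ = 0.5 and σ² = 1/12; the distribution and sample counts are illustration choices): the sample mean Ȳ should center on μ with variance near σ²/n.

```python
import random
import statistics

rng = random.Random(0)
n = 30  # size of each sample whose mean we take

# Draw many independent sample means Ybar, each from n uniform Yi.
sample_means = [statistics.fmean(rng.random() for _ in range(n)) for _ in range(20_000)]

print(f"mean of Ybar:     {statistics.fmean(sample_means):.4f}  (mu = 0.5)")
print(f"variance of Ybar: {statistics.variance(sample_means):.5f} (sigma^2/n = {1 / 12 / n:.5f})")
```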
Calculating Confidence Intervals
1. Pick parameter p to estimate
– errorD(h)
2. Choose an estimator
– errorS(h)
3. Determine probability distribution that governs estimator
– errorS(h) governed by Binomial distribution, approximated by Normal when n ≥ 30
4. Find interval (L, U) such that N% of probability mass falls in the interval
– Use table of zN values
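The four steps above can be collapsed into one function. A minimal sketch taking the raw count r of misclassifications (the function name and 90% example are illustration choices):

```python
import math

# zN table from the slide: confidence level (%) -> zN.
Z_TABLE = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def error_confidence_interval(r, n, level=95):
    """N% interval (L, U) for errorD(h), given r misclassified out of n."""
    if n < 30:
        raise ValueError("Normal approximation assumes n >= 30")  # step 3 caveat
    p_hat = r / n                                # step 2: the estimator errorS(h)
    sigma = math.sqrt(p_hat * (1 - p_hat) / n)   # step 3: its approximate std dev
    z = Z_TABLE[level]                           # step 4: table lookup
    return p_hat - z * sigma, p_hat + z * sigma

L, U = error_confidence_interval(12, 40, level=90)
```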
Difference Between Hypotheses
Test h1 on sample S1 (size n1), test h2 on sample S2 (size n2).
1. Pick parameter to estimate
    d ≡ errorD(h1) - errorD(h2)
2. Choose an estimator
    d̂ ≡ errorS1(h1) - errorS2(h2)
3. Determine probability distribution that governs estimator
    d̂ is approximately Normally distributed, with mean d and standard deviation
    σ ≈ sqrt( errorS1(h1)(1 - errorS1(h1))/n1 + errorS2(h2)(1 - errorS2(h2))/n2 )
4. Find interval (L, U) such that N% of probability mass falls in the interval
    d̂ ± zN σ
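A minimal sketch of this estimate and its interval (the sample errors and sizes below are made-up illustration numbers, not from the slides):

```python
import math

def difference_interval(e1, n1, e2, n2, z=1.96):
    """N% interval for d = errorD(h1) - errorD(h2), from errors e1, e2 on samples of size n1, n2."""
    d_hat = e1 - e2
    sigma = math.sqrt(e1 * (1 - e1) / n1 + e2 * (1 - e2) / n2)
    return d_hat - z * sigma, d_hat + z * sigma

lo, hi = difference_interval(0.30, 40, 0.20, 50, z=1.96)
print(f"95% interval for d: ({lo:.3f}, {hi:.3f})")  # → 95% interval for d: (-0.080, 0.280)
```

Since this interval contains 0, the observed difference between h1 and h2 could plausibly be due to sampling chance alone.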
Paired t test to compare hA, hB
1. Partition data into k disjoint test sets T1, T2, . . ., Tk of equal size, where this size is at least 30.
2. For i from 1 to k, do
    δi ← errorTi(hA) - errorTi(hB)
3. Return the value δ̄, where
    δ̄ ≡ (1/k) Σ from i=1 to k of δi
N% confidence interval estimate for the true difference: δ̄ ± t(N, k-1) · sδ̄, where t(N, k-1) comes from a t table with k-1 degrees of freedom and
    sδ̄ ≡ sqrt( (1 / (k(k-1))) Σ from i=1 to k of (δi - δ̄)^2 )
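A minimal sketch of this procedure, computing δ̄ and sδ̄ from per-partition errors (the error values below are made-up illustration numbers, and the t(N, k-1) lookup is left to a t table):

```python
import math

def paired_t_estimate(errors_a, errors_b):
    """Return (delta_bar, s_delta_bar) from hA's and hB's errors on k paired test sets."""
    k = len(errors_a)
    deltas = [a - b for a, b in zip(errors_a, errors_b)]       # delta_i per test set
    delta_bar = sum(deltas) / k                                # mean difference
    s_delta_bar = math.sqrt(
        sum((d - delta_bar) ** 2 for d in deltas) / (k * (k - 1))
    )
    return delta_bar, s_delta_bar

errors_a = [0.12, 0.15, 0.10, 0.14, 0.13]  # errorTi(hA), illustration only
errors_b = [0.10, 0.13, 0.11, 0.12, 0.10]  # errorTi(hB), illustration only
delta_bar, s = paired_t_estimate(errors_a, errors_b)
# Interval: delta_bar +- t(N, k-1) * s, with t from a t table (k-1 = 4 dof here).
```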
Comparing learning algorithms LA and LB (3/3)
Notice we’d like to use the paired t test on δ̄ to obtain a confidence interval
but this is not really correct, because the training sets in this algorithm are not independent (they overlap!)
more correct to view the algorithm as producing an estimate of
    E over S drawn from D0 of [ errorD(LA(S)) - errorD(LB(S)) ]
instead of
    E over S drawn from D of [ errorD(LA(S)) - errorD(LB(S)) ]
but even this approximation is better than no comparison
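The k-fold comparison procedure being critiqued above can be sketched as follows. Everything concrete here is a stand-in: the toy learners, the synthetic data, and the function names are illustration choices, not part of the slides — only the shape (disjoint folds, train on the complement, average the error differences) follows the procedure.

```python
import random

def compare_learners(data, learn_a, learn_b, k=5):
    """Average error difference of LA vs LB over k disjoint test partitions of the available data D0."""
    folds = [data[i::k] for i in range(k)]  # k disjoint partitions
    deltas = []
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        h_a, h_b = learn_a(train), learn_b(train)  # training sets overlap across i!
        err_a = sum(h_a(x) != y for x, y in test) / len(test)
        err_b = sum(h_b(x) != y for x, y in test) / len(test)
        deltas.append(err_a - err_b)
    # Estimates E over S drawn from D0 of [errorD(LA(S)) - errorD(LB(S))].
    return sum(deltas) / k

# Toy stand-in learners: majority-class predictor vs. always-True predictor.
def learn_majority(train):
    majority = sum(y for _, y in train) * 2 >= len(train)
    return lambda x: majority

def learn_true(train):
    return lambda x: True

rng = random.Random(1)
data = [(rng.random(), rng.random() < 0.7) for _ in range(200)]  # synthetic D0
delta_bar = compare_learners(data, learn_majority, learn_true, k=5)
```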