Evaluating Hypotheses
• Binomial distribution, Normal distribution, Central Limit Theorem
• Paired t-tests
• Comparing Learning Methods

Problems Estimating Error
1. Bias: Is errorS(h) an optimistically biased estimate of errorD(h)?

   bias \equiv E[errorS(h)] - errorD(h)

   For an unbiased estimate, h and S must be chosen independently.
2. Variance: Even with unbiased S, errorS(h) may still vary from errorD(h).
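A minimal simulation sketch (not from the slides) of the variance problem: even when S is drawn independently of h, so that errorS(h) is unbiased, individual estimates still scatter around errorD(h). The true error of 0.30 and sample size n = 40 are assumed values.

    # Sketch: distribution of errorS(h) for a hypothesis with assumed
    # true error errorD(h) = 0.30, measured on samples of size n = 40.
    import random

    random.seed(0)
    true_error = 0.30        # errorD(h), assumed for the simulation
    n = 40                   # size of each independently drawn sample S

    estimates = []
    for _ in range(1000):
        misclassified = sum(1 for _ in range(n) if random.random() < true_error)
        estimates.append(misclassified / n)      # errorS(h) for this S

    mean_est = sum(estimates) / len(estimates)
    print(f"mean errorS(h): {mean_est:.3f}")     # unbiased: close to 0.30
    print(f"spread: [{min(estimates):.3f}, {max(estimates):.3f}]")  # variance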
Confidence Intervals
If
• S contains n examples, drawn independently of h and each other
• n ≥ 30
Then
• With approximately 95% probability, errorD(h) lies in the interval

  errorS(h) \pm 1.96 \sqrt{\frac{errorS(h)(1 - errorS(h))}{n}}

errorS(h) is a Random Variable
• Rerun the experiment with different randomly drawn S (size n)
• Probability of observing r misclassified examples:

  P(r) = \frac{n!}{r!(n-r)!} errorD(h)^r (1 - errorD(h))^{n-r}

[Figure: Binomial distribution for n = 40, p = 0.3; P(r) vs. r]
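A short sketch of the interval above: given the number of test examples n and the measured sample error, compute the two-sided 95% bound (1.96 is the z value for 95% confidence).

    # Sketch: 95% confidence interval for errorD(h) from the slide's formula.
    from math import sqrt

    def ci_95(error_s, n):
        """Two-sided ~95% interval for errorD(h), valid for n >= 30."""
        margin = 1.96 * sqrt(error_s * (1 - error_s) / n)
        return error_s - margin, error_s + margin

    # Example (assumed numbers): h misclassifies 12 of n = 40 test examples.
    low, high = ci_95(12 / 40, 40)
    print(f"errorD(h) in [{low:.3f}, {high:.3f}] with ~95% probability")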
Binomial Probability Distribution

  P(r) = \frac{n!}{r!(n-r)!} p^r (1 - p)^{n-r}

Probability P(r) of r heads in n coin flips, if p = Pr(heads)
• Expected, or mean value of X: E[X] \equiv \sum_{i=0}^{n} i P(i) = np
• Variance of X: Var(X) \equiv E[(X - E[X])^2] = np(1 - p)
• Standard deviation of X: \sigma_X \equiv \sqrt{E[(X - E[X])^2]} = \sqrt{np(1 - p)}

[Figure: Binomial distribution for n = 40, p = 0.3; P(r) vs. r]

Normal Probability Distribution

  p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}

The probability that X will fall into the interval (a, b) is given by \int_a^b p(x)\,dx
• Expected, or mean value of X: E[X] = \mu
• Variance of X: Var(X) = \sigma^2
• Standard deviation of X: \sigma_X = \sigma

[Figure: Normal density p(x) for x in [-3, 3]]
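A sketch verifying the Binomial formulas numerically and comparing them to the Normal density with matching mean and variance; math.comb supplies the n!/(r!(n-r)!) coefficient. The n = 40, p = 0.3 values come from the figure above.

    # Sketch: Binomial moments and the Normal approximation, stdlib only.
    from math import comb, exp, pi, sqrt

    n, p = 40, 0.3
    P = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

    mean = sum(r * P[r] for r in range(n + 1))               # n*p = 12
    var = sum((r - mean) ** 2 * P[r] for r in range(n + 1))  # n*p*(1-p) = 8.4

    def normal_pdf(x, mu, sigma):
        return exp(-0.5 * ((x - mu) / sigma) ** 2) / sqrt(2 * pi * sigma ** 2)

    # For n = 40 the Normal curve already tracks the Binomial pmf closely.
    print(mean, var)
    print(P[12], normal_pdf(12, mean, sqrt(var)))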
The sample error errorS(h) follows a Binomial distribution; for n ≥ 30 it is well approximated by a Normal distribution with standard deviation

  \sigma_{errorS(h)} \approx \sqrt{\frac{errorS(h)(1 - errorS(h))}{n}}
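In practice errorD(h) is unknown, so the standard deviation above substitutes the observed errorS(h) for errorD(h). A small sketch of that substitution, with assumed values for both quantities:

    # Sketch: sigma of errorS(h). The exact Binomial value needs errorD(h);
    # the usable approximation substitutes the observed errorS(h).
    from math import sqrt

    n = 40
    error_d = 0.30   # true error (unknown in practice; assumed here)
    error_s = 0.25   # one observed sample error (assumed)

    exact = sqrt(error_d * (1 - error_d) / n)
    approx = sqrt(error_s * (1 - error_s) / n)
    print(f"exact sigma = {exact:.4f}, approximation = {approx:.4f}")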
Confidence Intervals, More Correctly
If
• S contains n examples, drawn independently of h and each other
• n ≥ 30
Then
• With approximately 95% probability, errorS(h) lies in the interval

  errorD(h) \pm 1.96 \sqrt{\frac{errorD(h)(1 - errorD(h))}{n}}

• equivalently, errorD(h) lies in the interval

  errorS(h) \pm 1.96 \sqrt{\frac{errorD(h)(1 - errorD(h))}{n}}

• which is approximately

  errorS(h) \pm 1.96 \sqrt{\frac{errorS(h)(1 - errorS(h))}{n}}

Calculating Confidence Intervals
1. Pick the parameter p to estimate
   • errorD(h)
2. Choose an estimator
   • errorS(h)
3. Determine the probability distribution that governs the estimator
   • errorS(h) is governed by a Binomial distribution, approximated by a Normal distribution when n ≥ 30
4. Find the interval (L, U) such that N% of the probability mass falls in the interval
   • Use a table of zN values
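Generalizing step 4 beyond 95%: a sketch using the standard two-sided zN constants (the table of zN values the slide refers to).

    # Sketch: N% two-sided confidence interval for errorD(h).
    from math import sqrt

    Z = {80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}  # zN table

    def confidence_interval(error_s, n, level=95):
        margin = Z[level] * sqrt(error_s * (1 - error_s) / n)
        return error_s - margin, error_s + margin

    # Example with assumed values: errorS(h) = 0.30 on n = 40 examples.
    print(confidence_interval(0.30, 40, level=90))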
Comparing Learning Algorithms LA and LB
What we would like to estimate:

  E_{S \subset D}[errorD(LA(S)) - errorD(LB(S))]

where L(S) is the hypothesis output by learner L using training set S; i.e., the expected difference in true error between hypotheses output by learners LA and LB, when trained using randomly selected training sets S drawn according to distribution D.
But, given limited data D0, what is a good estimator?
• Could partition D0 into training set S0 and test set T0, and measure

  errorT0(LA(S0)) - errorT0(LB(S0))

• even better, repeat this many times and average the results (next slide; see the sketch after these two slides)

Comparing Learning Algorithms LA and LB (continued)
Notice we would like to use the paired t test on δ to obtain a confidence interval.
But this is not really correct, because the training sets in this algorithm are not independent (they overlap!).
It is more correct to view the algorithm as producing an estimate of

  E_{S \subset D0}[errorD(LA(S)) - errorD(LB(S))]

instead of

  E_{S \subset D}[errorD(LA(S)) - errorD(LB(S))]

but even this approximation is better than no comparison.
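A sketch of the repeated-partition estimator described above. The learner objects and their train()/predict() interface are hypothetical placeholders, not an API from the slides; each iteration holds out a test set, trains both learners on the rest, and records the error difference δ.

    # Sketch: average error difference between two learners over k random
    # partitions of the available data D0. Learner interface is hypothetical.
    import random

    def error_rate(h, test_set):
        """Fraction of test examples (x, y) that hypothesis h misclassifies."""
        return sum(1 for x, y in test_set if h.predict(x) != y) / len(test_set)

    def mean_error_difference(learner_a, learner_b, data, k=30, test_frac=0.2):
        data = list(data)                   # avoid mutating the caller's D0
        deltas = []
        for _ in range(k):
            random.shuffle(data)
            cut = int(len(data) * test_frac)
            test, train = data[:cut], data[cut:]
            h_a, h_b = learner_a.train(train), learner_b.train(train)
            deltas.append(error_rate(h_a, test) - error_rate(h_b, test))
        return sum(deltas) / k              # estimate of the expected difference

Note the caveat from the second slide: the k training sets overlap, so the k differences are not independent samples.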