Evaluating Hypotheses
• Sample error, true error
• Confidence intervals for observed hypothesis error
• Estimators
• Binomial distribution, Normal distribution,
Central Limit Theorem
• Paired t-tests
• Comparing Learning Methods
Problems Estimating Error
1. Bias: If S is training set, errorS(h) is optimistically
biased
For unbiased estimate, h and S must be chosen
independently
2. Variance: Even with unbiased S, errorS(h) may
still vary from errorD(h)
bias ≡ E[errorS(h)] − errorD(h)
Two Definitions of Error
The true error of hypothesis h with respect to target function f
and distribution D is the probability that h will misclassify
an instance drawn at random according to D.
The sample error of h with respect to target function f and
data sample S is the proportion of examples h misclassifies
errorD(h) ≡ Pr_{x ∈ D}[ f(x) ≠ h(x) ]

errorS(h) ≡ (1/n) Σ_{x ∈ S} δ( f(x) ≠ h(x) )

where δ( f(x) ≠ h(x) ) is 1 if f(x) ≠ h(x), and 0 otherwise

How well does errorS(h) estimate errorD(h)?
Example
Hypothesis h misclassifies 12 of 40 examples in S.
What is errorD(h)?
errorS(h) = 12/40 = .30
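As a quick sanity check, the sample-error computation can be written out directly; this is a minimal sketch (function and variable names are illustrative, not from the slides):

```python
def sample_error(f_values, h_values):
    """Fraction of examples on which hypothesis h disagrees with target f."""
    assert len(f_values) == len(h_values)
    mistakes = sum(1 for f, h in zip(f_values, h_values) if f != h)
    return mistakes / len(f_values)

# 40 examples, 12 of them misclassified, as in the slide's example
f_vals = [1] * 40
h_vals = [0] * 12 + [1] * 28
print(sample_error(f_vals, h_vals))  # 0.3
```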
Estimators
Experiment:
1. Choose sample S of size n according to
distribution D
2. Measure errorS(h)
errorS(h) is a random variable (i.e., result of an
experiment)
errorS(h) is an unbiased estimator for errorD(h)
Given observed errorS(h) what can we conclude
about errorD(h)?
Confidence Intervals
If
• S contains n examples, drawn independently of h and each
other
• n ≥ 30
Then
• With approximately N% probability, errorD(h) lies in
interval

errorS(h) ± zN · sqrt( errorS(h)(1 − errorS(h)) / n )

where

N%:  50%   68%   80%   90%   95%   98%   99%
zN:  0.67  1.00  1.28  1.64  1.96  2.33  2.53
Confidence Intervals
If
• S contains n examples, drawn independently of h and each
other
• n ≥ 30
Then
• With approximately 95% probability, errorD(h) lies in
interval

errorS(h) ± 1.96 · sqrt( errorS(h)(1 − errorS(h)) / n )
errorS(h) is a Random Variable
• Rerun experiment with different randomly drawn S (size n)
• Probability of observing r misclassified examples:
[Figure: Binomial distribution for n = 40, p = 0.3]

P(r) = ( n! / (r!(n − r)!) ) · errorD(h)^r · (1 − errorD(h))^(n − r)
Binomial Probability Distribution
[Figure: Binomial distribution for n = 40, p = 0.3]

P(r) = ( n! / (r!(n − r)!) ) · p^r · (1 − p)^(n − r)

P(r) = probability of r heads in n coin flips, if p = Pr(heads)

• Expected, or mean value of X:
  E[X] ≡ Σ_{i=0}^{n} i · P(i) = np

• Variance of X:
  Var(X) ≡ E[(X − E[X])²] = np(1 − p)

• Standard deviation of X:
  σX ≡ sqrt( E[(X − E[X])²] ) = sqrt( np(1 − p) )
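The binomial mean and variance identities can be verified numerically; a short sketch (helper names are mine):

```python
import math

def binomial_pmf(r, n, p):
    """P(r): probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 40, 0.3  # the distribution plotted on this slide
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
var = sum((r - mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))

print(mean)  # ≈ np = 12
print(var)   # ≈ np(1 - p) = 8.4
```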
Normal Probability Distribution
Normal distribution with mean 0, standard deviation 1
[Figure: density curve plotted over x ∈ [−3, 3]]

p(x) = ( 1 / sqrt(2πσ²) ) · e^( −(1/2)((x − μ)/σ)² )

• The probability that X will fall into the interval (a, b) is
given by ∫_a^b p(x) dx

• Expected, or mean value of X:  E[X] = μ
• Variance of X:  Var(X) = σ²
• Standard deviation of X:  σX = σ
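The interval probability ∫_a^b p(x) dx has no closed form, but it can be evaluated with the standard error function; a sketch for the Normal case (function names are mine):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a Normal(mu, sigma^2) variable, via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): the integral of p(x) over (a, b)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(round(interval_probability(-1.28, 1.28), 2))  # about 0.80
print(round(interval_probability(-1.96, 1.96), 2))  # about 0.95
```

These two values match the zN table used for confidence intervals earlier.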
Normal Distribution Approximates Binomial
errorS(h) follows a Binomial distribution, with
• mean  μ_errorS(h) = errorD(h)
• standard deviation  σ_errorS(h) = sqrt( errorD(h)(1 − errorD(h)) / n )

Approximate this by a Normal distribution with
• mean  μ_errorS(h) = errorD(h)
• standard deviation  σ_errorS(h) ≈ sqrt( errorS(h)(1 − errorS(h)) / n )
Normal Probability Distribution
[Figure: standard Normal density plotted over x ∈ [−3, 3]]

80% of area (probability) lies in μ ± 1.28σ
N% of area (probability) lies in μ ± zN·σ

N%:  50%   68%   80%   90%   95%   98%   99%
zN:  0.67  1.00  1.28  1.64  1.96  2.33  2.53
Confidence Intervals, More Correctly
If
• S contains n examples, drawn independently of h and each
other
• n ≥ 30
Then
• With approximately 95% probability, errorS(h) lies in
interval

  errorD(h) ± 1.96 · sqrt( errorD(h)(1 − errorD(h)) / n )

• equivalently, errorD(h) lies in interval

  errorS(h) ± 1.96 · sqrt( errorD(h)(1 − errorD(h)) / n )

• which is approximately

  errorS(h) ± 1.96 · sqrt( errorS(h)(1 − errorS(h)) / n )
Calculating Confidence Intervals
1. Pick parameter p to estimate
• errorD(h)
2. Choose an estimator
• errorS(h)
3. Determine probability distribution that governs estimator
• errorS(h) governed by Binomial distribution, approximated
by Normal when n ≥ 30
4. Find interval (L,U) such that N% of probability mass falls
in the interval
• Use table of zN values
Central Limit Theorem
Consider a set of independent, identically distributed random
variables Y1, ..., Yn, all governed by an arbitrary probability
distribution with mean μ and finite variance σ². Define the sample
mean

  Ȳ ≡ (1/n) Σ_{i=1}^{n} Yi

Central Limit Theorem: As n → ∞, the distribution governing Ȳ
approaches a Normal distribution, with mean μ and variance σ²/n.
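The theorem is easy to see in simulation: sample means of a decidedly non-Normal distribution (uniform on [0, 1), with μ = 0.5 and σ² = 1/12) cluster around μ with variance close to σ²/n. A sketch with illustrative constants:

```python
import random
import statistics

random.seed(0)
n = 30          # size of each sample (the n in the theorem)
trials = 2000   # number of sample means Y-bar to draw

means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(trials)]

print(statistics.fmean(means))     # close to mu = 0.5
print(statistics.variance(means))  # close to sigma^2 / n = (1/12)/30
```

A histogram of `means` would look bell-shaped even though each underlying Yi is uniform.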
Difference Between Hypotheses
Test h1 on sample S1, test h2 on sample S2

1. Pick parameter to estimate
• d ≡ errorD(h1) − errorD(h2)
2. Choose an estimator
• d̂ ≡ errorS1(h1) − errorS2(h2)
3. Determine probability distribution that governs estimator
• σ_d̂ ≈ sqrt( errorS1(h1)(1 − errorS1(h1))/n1 + errorS2(h2)(1 − errorS2(h2))/n2 )
4. Find interval (L,U) such that N% of probability mass falls
in the interval
• d̂ ± zN · sqrt( errorS1(h1)(1 − errorS1(h1))/n1 + errorS2(h2)(1 − errorS2(h2))/n2 )
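The four steps above can be sketched as one function; the example numbers are hypothetical, not from the slides:

```python
import math

def difference_interval(e1, n1, e2, n2, z=1.96):
    """Approximate confidence interval for errorD(h1) - errorD(h2), given
    errors e1, e2 measured on independent samples of size n1, n2."""
    d_hat = e1 - e2
    sigma = math.sqrt(e1 * (1 - e1) / n1 + e2 * (1 - e2) / n2)
    return d_hat - z * sigma, d_hat + z * sigma

# e.g. h1 misclassifies 30% of 100 examples, h2 misclassifies 20% of 100
low, high = difference_interval(0.30, 100, 0.20, 100)
print(round(low, 3), round(high, 3))
```

For these numbers the interval straddles zero, so the observed 10-point gap is not significant at the 95% level.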
Paired t test to Compare hA,hB
1. Partition data into k disjoint test sets T1, T2, ..., Tk of equal
size, where this size is at least 30.
2. For i from 1 to k, do
• δi ← errorTi(hA) − errorTi(hB)
3. Return the value δ̄, where

  δ̄ ≡ (1/k) Σ_{i=1}^{k} δi

N% confidence interval estimate for δ:

  δ̄ ± t_{N,k−1} · s_δ̄

  s_δ̄ ≡ sqrt( (1 / (k(k − 1))) Σ_{i=1}^{k} (δi − δ̄)² )

Note δi is approximately Normally distributed.
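Steps 2–3 and the interval estimate can be sketched as follows; the δi values are hypothetical, and the t value 2.776 (two-sided 95%, k − 1 = 4 degrees of freedom) is a standard table entry:

```python
import math

def paired_t_interval(deltas, t_value):
    """Return (delta_bar, lower, upper) for the interval
    delta_bar +/- t_{N,k-1} * s_delta_bar."""
    k = len(deltas)
    delta_bar = sum(deltas) / k
    s = math.sqrt(sum((d - delta_bar) ** 2 for d in deltas) / (k * (k - 1)))
    return delta_bar, delta_bar - t_value * s, delta_bar + t_value * s

# hypothetical per-test-set differences errorTi(hA) - errorTi(hB), k = 5
deltas = [0.05, 0.02, 0.04, 0.03, 0.06]
t_95_4dof = 2.776  # two-sided 95% t value, k - 1 = 4 degrees of freedom
d_bar, low, high = paired_t_interval(deltas, t_95_4dof)
print(round(d_bar, 3), round(low, 3), round(high, 3))
```

Here the whole interval lies above zero, so hA's error is significantly higher than hB's at the 95% level for these (made-up) δi.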
Comparing Learning Algorithms LA and LB
1. Partition data D0 into k disjoint test sets T1, T2, ..., Tk of equal
size, where this size is at least 30.
2. For i from 1 to k, do
use Ti for the test set, and the remaining data for training set Si
• Si ← {D0 − Ti}
• hA ← LA(Si)
• hB ← LB(Si)
• δi ← errorTi(hA) − errorTi(hB)
3. Return the value δ̄, where

  δ̄ ≡ (1/k) Σ_{i=1}^{k} δi
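The procedure can be sketched end to end. The two "learners" below are deliberately trivial stand-ins (a majority-class rule vs. a constant rule) just to exercise the partitioning and bookkeeping; real LA and LB would be actual learning algorithms:

```python
def compare_learners(learner_a, learner_b, data, k):
    """k-fold paired comparison: returns delta_bar, the mean of
    errorTi(hA) - errorTi(hB) over k disjoint test sets Ti."""
    fold = len(data) // k
    deltas = []
    for i in range(k):
        test = data[i * fold:(i + 1) * fold]                 # Ti
        train = data[:i * fold] + data[(i + 1) * fold:]      # Si = D0 - Ti
        h_a, h_b = learner_a(train), learner_b(train)
        err_a = sum(h_a(x) != y for x, y in test) / len(test)
        err_b = sum(h_b(x) != y for x, y in test) / len(test)
        deltas.append(err_a - err_b)
    return sum(deltas) / k

# toy learners: LA predicts the majority label of its training set,
# LB always predicts 0, ignoring the data entirely
def learner_a(train):
    majority = round(sum(y for _, y in train) / len(train))
    return lambda x: majority

def learner_b(train):
    return lambda x: 0

# toy dataset: 63 examples labeled 1 followed by 27 labeled 0
data = [(x, 1) for x in range(63)] + [(x, 0) for x in range(63, 90)]
print(compare_learners(learner_a, learner_b, data, k=3))
# negative delta_bar: LA had lower average error on these folds
```

Note the 30-example minimum per test set from step 1 is ignored here for brevity; the toy folds are exactly 30 examples each only by construction.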
Comparing Learning Algorithms LA and LB
What we would like to estimate:

  E_{S⊂D}[ errorD(LA(S)) − errorD(LB(S)) ]

where L(S) is the hypothesis output by learner L using
training set S

i.e., the expected difference in true error between hypotheses output
by learners LA and LB, when trained using randomly selected
training sets S drawn according to distribution D.

But, given limited data D0, what is a good estimator?
• Could partition D0 into training set S0 and test set T0, and
measure

  errorT0(LA(S0)) − errorT0(LB(S0))

• Even better, repeat this many times and average the results
(next slide)
Comparing Learning Algorithms LA and LB
Notice we would like to use the paired t test on δ̄ to
obtain a confidence interval

But not really correct, because the training sets in
this algorithm are not independent (they overlap!)

More correct to view algorithm as producing an
estimate of

  E_{S⊂D0}[ errorD(LA(S)) − errorD(LB(S)) ]

instead of

  E_{S⊂D}[ errorD(LA(S)) − errorD(LB(S)) ]

but even this approximation is better than no
comparison