



Methods for estimating the error probability of a designed classification system using techniques such as error counting, resubstitution, holdout, and leave-one-out. The document also covers the statistical properties of these estimators and their reliability for small data sets.
Typology: Slides




SYSTEM EVALUATION
The goal is to estimate the error probability of the designed classification system.
Let $M$ be the number of classes.
Let $N_i$ be the number of data points in class $\omega_i$ available for testing, and $N = \sum_{i=1}^{M} N_i$ the total number of test points.
Let $P_i$ be the error probability for class $\omega_i$.
The classifier is assumed to have been designed using another, independent data set.
Assuming that the feature vectors in the test data set are independent, the probability of $k_i$ vectors from $\omega_i$ being in error is
$$\text{prob}\{k_i \text{ in } N_i \text{ wrongly classified}\} = \binom{N_i}{k_i} P_i^{k_i} (1 - P_i)^{N_i - k_i}$$
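The binomial probability above can be evaluated numerically. The following is a minimal sketch, not part of the original slides: the function name `prob_k_errors` and the values `n_i = 20` and `p_i = 0.1` are hypothetical and chosen only for illustration.

```python
from math import comb


def prob_k_errors(k, n, p):
    """Probability that exactly k of n independent test vectors of one
    class are misclassified, when each is misclassified with probability p."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values: 20 test vectors in the class, true error probability 0.1.
n_i, p_i = 20, 0.1
pmf = [prob_k_errors(k, n_i, p_i) for k in range(n_i + 1)]

for k in range(5):
    print(f"prob of {k} errors out of {n_i}: {pmf[k]:.4f}")
```

The probabilities over all possible error counts sum to one, which is a quick sanity check on the implementation.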
Since the $P_i$'s are not known, estimate $P_i$ by maximizing the above binomial distribution with respect to $P_i$. It turns out that
$$\hat{P}_i = \frac{k_i}{N_i}$$
Thus, count the errors and divide by the total number of test points in the class.
Total probability of error:
$$\hat{P} = \sum_{i=1}^{M} P(\omega_i) \frac{k_i}{N_i}$$
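The error-counting estimate can be sketched in a few lines. This is an illustrative sketch rather than the slides' own code: the function name `error_counting_estimate` and the label lists are hypothetical, and when no priors are supplied the class priors are assumed to be estimated by the class frequencies in the test set.

```python
from collections import Counter


def error_counting_estimate(y_true, y_pred, priors=None):
    """Return (per_class, total): the per-class estimates k_i / N_i and the
    total error estimate, i.e. the prior-weighted sum of k_i / N_i."""
    n_per_class = Counter(y_true)
    k_per_class = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    per_class = {c: k_per_class[c] / n for c, n in n_per_class.items()}
    if priors is None:
        # Assumption: priors are estimated by the class frequencies N_i / N.
        n_total = len(y_true)
        priors = {c: n / n_total for c, n in n_per_class.items()}
    total = sum(priors[c] * per_class[c] for c in per_class)
    return per_class, total


# Hypothetical test labels and classifier decisions for two classes.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0, 1, 1]

per_class, total = error_counting_estimate(y_true, y_pred)
_, total_equal_priors = error_counting_estimate(y_true, y_pred, {0: 0.5, 1: 0.5})
print(per_class, total, total_equal_priors)
```

With frequency-based priors the total reduces to the overall fraction of misclassified test points; supplying different priors reweights the per-class estimates.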
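Resubstitution method: Use the same data for training and testing. It underestimates the error. The estimate improves for large $N$ and large $N/l$ ratios, where $N$ is the number of data points and $l$ is the dimension of the feature vectors.
Holdout method: Given $N$ data points, divide them into:
$N_1$ training points
$N_2$ test points
Drawback: less data for both training and testing.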
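The contrast between the two methods can be illustrated with a small experiment. The sketch below makes several assumptions that are not in the slides: a 1-nearest-neighbor classifier stands in for "the classifier", the data are synthetic two-dimensional Gaussian samples, and the names `make_data`, `error_rate`, and the split size `n1 = 140` are hypothetical.

```python
import random


def nearest_neighbor_predict(train, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    return min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[1]


def error_rate(train, test):
    """Fraction of test points misclassified by a 1-NN rule built on train."""
    errors = sum(nearest_neighbor_predict(train, x) != y for x, y in test)
    return errors / len(test)


def make_data(n_per_class, rng):
    """Synthetic two-class data: overlapping 2-D Gaussians (illustrative only)."""
    data = []
    for label, center in ((0, 0.0), (1, 1.5)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


rng = random.Random(0)
data = make_data(100, rng)  # N = 200 points in total

# Resubstitution: train and test on the same data.
resub_error = error_rate(data, data)

# Holdout: N1 points for training, the remaining N2 = N - N1 for testing.
n1 = 140
train, test = data[:n1], data[n1:]
holdout_error = error_rate(train, test)

print("resubstitution:", resub_error)
print("holdout:", holdout_error)
```

For a 1-NN rule every training point is its own nearest neighbor, so the resubstitution error is zero here, an extreme case of the optimistic bias. The holdout estimate avoids that bias at the cost of training on fewer points.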
Leave-one-out method. The steps:
Choose one sample out of the $N$.
Train the classifier using the remaining $N - 1$ samples.
Test the classifier using the selected sample.
Count an error if it is misclassified.
Repeat the above by excluding a different sample each time.
Compute the error probability by averaging the counted errors.
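The leave-one-out steps can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the slides' implementation: a 1-nearest-neighbor rule plays the role of the classifier, and the function names and the tiny one-dimensional data set are hypothetical.

```python
def nearest_neighbor_predict(train, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    return min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))[1]


def leave_one_out_error(data):
    """Leave-one-out error estimate for a 1-NN rule."""
    errors = 0
    for i, (x, y) in enumerate(data):
        remaining = data[:i] + data[i + 1:]        # exclude sample i
        if nearest_neighbor_predict(remaining, x) != y:
            errors += 1                            # count a misclassification
    return errors / len(data)                      # average over all N trials


# Hypothetical data set: (feature vector, label) pairs.
data = [
    ((0.0,), 0), ((0.1,), 0), ((0.4,), 0),
    ((1.0,), 1), ((1.2,), 1), ((0.5,), 1),
]

loo_error = leave_one_out_error(data)
print("leave-one-out error estimate:", loo_error)
```

Each sample is used for testing exactly once and for training in all the other rounds, so no data are wasted. The price is that the classifier is retrained $N$ times.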