
Two Modifications of CNN

IVAN TOMEK

Manuscript received March 12, 1976; revised June 9, 1976. The author is with the Department of Computer Science, Acadia University, Wolfville, NS, Canada B0P 1X0.

Abstract: The condensed nearest-neighbor (CNN) method chooses samples randomly. This results in a) retention of unnecessary samples and b) occasional retention of internal rather than boundary samples. Two modifications of CNN are presented which remove these disadvantages by considering only points close to the boundary. Performance is illustrated by an example.

INTRODUCTION

The condensed nearest-neighbor (CNN) method [1] is a method of preprocessing the design set for pattern recognition. It is based on the nearest-neighbor (NN) rule [2]. Its purpose is to reduce the size of the original design set D (the set of samples with known membership) by eliminating certain samples without significantly affecting the performance of NN classification: the NN rule used with E (the new design set, E ⊂ D) should give almost the same result as NN used with D.

CNN works as follows:
If Q contains XiXi, then an e is placed below col 1 and col 2, a) pass v- 1,
and XiXiI' = I is added to A. However, Q = XLXiyJ*Q` C b) choose x E D randomly, D(1) = D - {x}, E = {x},
XiXiy-*Q` XiXiQ = XiXI' = 1. Therefore, Q c I, which is c) D (pass + 1) = 0, count v- 0,
a contradiction since Q is an FPI. Therefore, Q cannot contain d) choose x E D (pass) randomly, classify x by NN using E,
XiXi. So an e is placed below col 1 and col 2, and I is not added e) if classification found in d) agrees with actual membership
to A. of x
Therefore, any nonessential FPI of F will not appear in the then D(pass + 1) = D(pass + 1) U {x}
function resulting from the algorithm of this paper. Q.E.D. else E = F u {x }, count v- count + 1,
f) D(pass)= D(pass) - {x},
V. CONCLUSIONS g) if D(pass) 0 go to d),
h) if count = 0
In this paper we have described an algorithm which finds then end of algorithm
precisely the set of essential fuzzy prime implicants of a fuzzy else pass v- pass + 1, go to b).
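
In modern terms, the loop transfers a sample into E whenever the current E misclassifies it, and stops after a complete pass with no transfers. The following is a minimal Python sketch of steps a)-h) under that reading; the function and variable names (nn_classify, cnn_condense, and the example data X, y) are ours, not the paper's.

```python
import numpy as np

def nn_classify(x, E_X, E_y):
    """Classify x by the 1-NN rule using the current design set E."""
    d = np.linalg.norm(E_X - x, axis=1)
    return E_y[np.argmin(d)]

def cnn_condense(X, y, rng=None):
    """CNN steps a)-h): grow E until a full pass makes no transfer."""
    rng = np.random.default_rng(rng)
    order = rng.permutation(len(X))
    E = [int(order[0])]                    # b) E = {x}, x chosen randomly
    pool = [int(i) for i in order[1:]]     # b) D(1) = D - {x}
    while True:
        count = 0                          # c) count <- 0
        next_pool = []
        rng.shuffle(pool)                  # d) samples are drawn randomly
        for i in pool:
            if nn_classify(X[i], X[E], y[E]) == y[i]:
                next_pool.append(i)        # e) correct: keep for D(pass+1)
            else:
                E.append(i)                # e) wrong: transfer x to E
                count += 1
        pool = next_pool                   # f), g) the pass is exhausted
        if count == 0:                     # h) no transfers: stop
            return np.array(E)

# Example: two well-separated Gaussian blobs; E is typically much
# smaller than D, which is the point of the condensation.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
print(len(cnn_condense(X, y, rng=1)), "of", len(X), "samples retained")
```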
It is clear that CNN has the following properties. It generates a design set E 1) which is a subset of the original design set and 2) which classifies (by the NN rule) all samples in D correctly. Property 1) usually means that E is much smaller than D and thus computationally much better suited for NN classification: it requires less storage and computation. Property 2) indicates that NN classification with E is very similar (although not necessarily identical) to NN classification with D. This is especially true when D is "representative" (by this we mean that the number of samples and their distribution are such that the approximation of the "true" underlying probability distribution by the relative frequency of samples is "good").

The disadvantage of CNN is that it processes samples from D randomly: it moves them from D to E quite randomly at the beginning and less so later on (when it tends to take samples closer to the boundary). This means that E contains a) interior samples which could be eliminated completely without any change in performance and b) samples which define a boundary on E but not on D (i.e., samples not essential in D become boundary points in E). Point a) implies that E is larger than necessary; point b) causes an undesirable shift between boundaries.

The ideal method of reduction of D would work essentially as CNN but would use only points close to the decision boundary to generate E. Unfortunately, the true decision boundary is by definition unknown. The next best approach is to use only those points which generate the piecewise-linear decision boundary in D (as given by the application of the NN rule). Even this is difficult.
Fig. 1. Choice of boundary points (Method 1).

Fig. 2. Points represented by full symbols should be chosen for the final design set. They are not retained by Method 1.

Fig. 3. Step e) in Method 1. y is the nearest neighbor of x (y ∈ F).

Two even less ideal methods will be described in the remainder of this text. It will be shown that they are considerable improvements upon CNN. They are both based on intuitive approximations of the notion of a boundary point.

ORDERED CNN

Two methods will now be described which differ from CNN in that only samples with certain properties are considered for the reduced design set E.

Method 1

Let x ∈ D and let y be its nearest neighbor from the opposite class (y = nno(x)). Then y must be close to the decision boundary (Fig. 1). These points form the basis of our first modification of CNN. It has to be noted that they are not by themselves sufficient in all situations. If we used only them, points from set A (Fig. 1) would never be found, and yet they are necessary for E to classify all of D correctly. To discover points in A we have to proceed indirectly. If at a given stage of the generation of E a sample z is classified incorrectly (because no points from A are in E), take the nearest neighbor of z in E (u in Fig. 1). It belongs to the "opposite" class (since z is classified incorrectly and u is its nearest neighbor). Now find the nearest neighbor of u which classifies z correctly; this is v ∈ A.

Let us note that while this approach generates an E which a) classifies D correctly and b) contains only boundary points, it still does not guarantee that all the desirable boundary points are included (Fig. 2). The algorithm is as follows:

a) pass = 1,
b) choose x ∈ D randomly, find y = nno(x), D(1) = D − {y}, E = {y}, F = ∅,
c) D(pass + 1) = ∅, count = 0,
d) choose x ∈ D(pass) randomly and classify it by NN using E,

e) if the classification found in d) agrees with the actual membership of x,
   then D(pass + 1) = D(pass + 1) ∪ {x}
   else if x ∈ F
      then E = E ∪ {x}, F = F − {x}
      else classify x by F;
         if the classification agrees
         then E = E ∪ {x}, F = F − {x}
         else find z = nno(x), z ∈ D(pass), and assign it to F: F = F ∪ {z}; next find u such that dist(u,z) = min_{v ∈ A} dist(v,z), where A = {w | w ∈ D(pass), dist(x,w) < dist(x,z), class(w) = class(x)}, and set E = E ∪ {u}.

Steps f), g), and h) are the same as in CNN.

A part of step e) is illustrated in Fig. 3. It is clear that Method 1 guarantees that E classifies all samples of D correctly. (Note that set A in step e) is always nonempty, since x ∈ A.)
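
The scanned text leaves two small points ambiguous: whether F starts empty (we read the garbled initialization in step b) as F = ∅) and what happens to a sample x that is still misclassified after u has been added to E (we return it to D(pass + 1) so that it is retried). Under those readings, and reusing nn_classify from the CNN sketch above, Method 1 might be sketched as follows; all names are ours.

```python
import numpy as np

def nno(i, X, y, candidates):
    """Nearest neighbor of X[i], among `candidates`, from the opposite class."""
    opp = [j for j in candidates if y[j] != y[i]]
    d = np.linalg.norm(X[opp] - X[i], axis=1)
    return opp[int(np.argmin(d))]

def method1_condense(X, y, rng=None):
    rng = np.random.default_rng(rng)
    idx = list(range(len(X)))
    x0 = int(rng.integers(len(X)))          # b) pick x at random ...
    first = nno(x0, X, y, idx)              # b) ... and seed E with y = nno(x)
    E, F = [first], set()                   # b) E = {y}, F = empty (our reading)
    pool = [i for i in idx if i != first]   # b) D(1) = D - {y}
    while True:
        count, next_pool = 0, []
        rng.shuffle(pool)                   # d) random order within the pass
        for i in pool:
            if i in E:
                continue                    # already transferred as some u
            if nn_classify(X[i], X[E], y[E]) == y[i]:
                next_pool.append(i)         # e) correct: keep for D(pass+1)
            elif i in F:
                E.append(i); F.discard(i); count += 1
            elif F and nn_classify(X[i], X[list(F)], y[list(F)]) == y[i]:
                E.append(i); count += 1     # e) x agrees with F: accept x
            else:
                if all(y[j] == y[i] for j in pool):
                    E.append(i); count += 1 # degenerate case, not in the text
                    continue
                z = nno(i, X, y, pool)      # boundary partner of x in D(pass)
                F.add(z)
                dxz = np.linalg.norm(X[i] - X[z])
                A = [w for w in pool if w not in E and y[w] == y[i]
                     and np.linalg.norm(X[i] - X[w]) < dxz]  # x itself is in A
                u = min(A, key=lambda v: float(np.linalg.norm(X[v] - X[z])))
                E.append(u); count += 1     # keep the boundary point u
                if u != i:
                    next_pool.append(i)     # retry x next pass (our reading)
        pool = [i for i in next_pool if i not in E]
        if count == 0:                      # f), g), h) as in CNN
            return np.array(E)
```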
Method 2

The method works as CNN, but instead of moving to E samples from the complete D, only a subset of D (C ⊂ D) is used. Subset C is chosen according to the flowchart in Fig. 4. (It is assumed that x(i), i = 1, ..., N are all the samples of D belonging to class 1 and y(i), i = 1, ..., M are all the samples of D belonging to class 2.) See also Fig. 5.

Fig. 4. Flowchart for Method 2 (notation adapted from [3]). C is the set of pairs of samples accepted for the final design set. C = ∅ at the beginning.

Fig. 5. Example of application of Method 2: the upper pair (X,Y) is accepted as a member of C; the lower pair is not.
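
The flowchart of Fig. 4 is not reproduced here. From Fig. 5 and the Appendix proof, the acceptance test appears to be that a pair (x, y = nno(x)) enters C when no other sample of D lies inside the sphere S(x,y) whose diameter is the segment joining x and y. The sketch below uses that reconstruction; treat the test as our reading rather than a verbatim transcription of Fig. 4. It reuses nno, nn_classify, cnn_condense, X, and y from the sketches above.

```python
import numpy as np

def method2_select_C(X, y):
    """Collect the subset C of accepted boundary pairs (our reading of Fig. 4)."""
    n = len(X)
    C = set()
    for i in range(n):
        j = nno(i, X, y, list(range(n)))   # nearest opposite-class sample
        mid = 0.5 * (X[i] + X[j])          # midpoint of the candidate pair
        r = 0.5 * np.linalg.norm(X[i] - X[j])
        empty = all(np.linalg.norm(X[k] - mid) >= r
                    for k in range(n) if k not in (i, j))
        if empty:                          # no sample inside S(x,y): accept
            C.update((i, j))
    return sorted(C)

# Usage: run CNN drawing only from the boundary subset C instead of all
# of D (cnn_condense returns indices relative to the C-subarray). Because
# membership in C depends only on fixed pairwise distances, its choice is
# order independent, as the Conclusion notes; the Appendix theorem can be
# spot-checked by verifying that 1-NN over C classifies every sample of D.
C = method2_select_C(X, y)
E2 = cnn_condense(X[C], y[C], rng=1)
ok = all(nn_classify(X[k], X[C], y[C]) == y[k] for k in range(len(X)))
print(len(C), "pair points selected; theorem holds:", ok)
```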
EXPERIMENTAL RESULTS

For illustration and comparison with other methods, an example presented in [4] has been repeated. Results are shown in Fig. 6. The original design set consists of two classes of 200 two-dimensional samples each. The distribution is uniform, with irregular decision boundaries as indicated.

Fig. 6. Comparison of the proposed methods with two others. Example taken from [4], with two classes with uniform distributions separated by the indicated boundary. The original set (400 samples) is not shown. Results of processing: (a) Method 1 (only subset C is shown). (b) Method 2. (c) Method from [4]. (d) CNN.
CONCLUSION

Both proposed modifications work better than CNN, since a) the resulting design set is smaller and b) the retained boundary points are better chosen (close to the decision boundary).

Method 2 has another potentially important property: it finds pairs of boundary points which participate in the formation of the (piecewise-linear) boundary. This information might be very useful in the development of more powerful methods of classification by piecewise-linear classifiers. Such methods could use these pairs to generate progressively simpler descriptions of acceptably accurate approximations of the original, completely specified boundaries. Note also that the choice of subset C is order independent.

Unlike the other existing methods of reduction of the design set [4], [5], the proposed methods explicitly seek to find boundary points. This results in retaining points closer to the boundary and also in retaining fewer points.

Wilson [6] introduced an editing method intended to improve classification. As a by-product, this method also insignificantly reduces the size of the design set (by eliminating most samples on the "wrong side" of the boundary). When, however, the edited set is processed by CNN or other methods, the result is a reduction much more significant than that attainable on the original set. The reason for this is that the edited design set is much "cleaner" than the original one. This is also discussed in [7].

APPENDIX

Theorem: All points in D are correctly classified by the NN rule using subset C generated by Method 2.

Proof:
a) All points from C are classified correctly.
b) Let x1 ∈ D − C be classified incorrectly, and let y ∈ C be its nearest neighbor. Assume x1 ∈ class 1, so that y ∈ class 2. Let z = 0.5(x1 + y) (Fig. 7). Since x1 ∉ C, there must be a point x2 ∈ D closer to z than x1 (otherwise x1 would belong to C by definition). x2 ∈ C is impossible, since y is the nearest to x1 of all points of C and dist(x2,x1) < dist(y,x1).

Now either
1) x2 ∈ class 2. Since x2 ∉ C, this means that there is another point x3 ∈ D inside S(x1,x2) ⊂ S(x1,y); otherwise x2 ∈ C, which is a contradiction. (S(x1,x2) is the sphere centered at 0.5(x1 + x2) with radius 0.5 dist(x1,x2).) This either leads us back to the beginning of argument 1) with x2 replaced by x3, or to case 2). In both situations we arrive at a contradiction due to the finite size of D.
2) x2 ∈ class 1, and we are in the same situation as at the beginning of the proof, with x1 replaced by x2 but dist(x2,y) < dist(x1,y). By induction and the finiteness of D this leads to a contradiction.

This completes the proof.

Fig. 7. Illustration for proof in the Appendix.

REFERENCES

[1] P. E. Hart, "The condensed nearest neighbor rule," IEEE Trans. Inform. Theory, vol. IT-14, pp. 515-516, May 1968.
[2] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: John Wiley & Sons, 1973.
[3] R. H. Hammond, W. B. Rogers, and B. Hauck, Jr., Introduction to FORTRAN IV. New York: McGraw-Hill, 1976.
[4] G. L. Ritter et al., "An algorithm for a selective nearest neighbor decision rule," IEEE Trans. Inform. Theory, vol. IT-21, pp. 665-669, Nov. 1975.
[5] G. W. Gates, "The reduced nearest neighbor decision rule," IEEE Trans. Inform. Theory, vol. IT-18, pp. 431-433, May 1972.
[6] D. L. Wilson, "Asymptotic properties of nearest neighbor rules using edited data," IEEE Trans. Systems, Man, and Cybernetics, vol. SMC-2, pp. 408-421, July 1972.
[7] I. Tomek, "An experiment with the edited nearest neighbor rule," IEEE Trans. Systems, Man, and Cybernetics, vol. SMC-6, pp. 448-452, June 1976.
a) All points from C are classified correctly. University, Durham, NC 27706.
